
NLP refactoring - Stage 2 #368

Merged
merged 124 commits into from Feb 21, 2020

Conversation

@ekmb ekmb commented Feb 13, 2020

Signed-off-by: Evelina Bakhturina ebakhturina@nvidia.com

Stage 2 of NLP refactoring:

  • Cleaned up and restructured functions and files in the nlp collection.
  • Cleaned up losses:
    ++ Added a weighting option to LossAggregatorNM
    ++ Moved LossAggregatorNM to losses.py in the common backend
    ++ Split JointIntentSlotLoss into two separate common losses and removed it
    ++ Merged MaskedLanguageModelingLossNM, PaddedSmoothedCrossEntropyLossNM, and SmoothedCrossEntropyLoss into a unified SmoothedCrossEntropyLoss
    ++ Renamed QuestionAnsweringLoss to the more general SpanningLoss
    ++ Renamed TRADEMaskedCrossEntropy to the more general MaskedXEntropyLoss
    ++ Removed TokenClassificationLoss, CrossEntropyLoss3D, and JointIntentSlotLoss
    ++ Added weighting and masking support to CrossEntropyLossNM
    ++ Added dynamic port sizes to CrossEntropyLossNM
    ++ Renamed CrossEntropyLoss to CrossEntropyLossNM to prevent confusion with PyTorch's CrossEntropyLoss
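The weighting option added to LossAggregatorNM presumably combines the member losses as a weighted sum. A minimal sketch of that idea, assuming weights default to 1.0 (illustrative names, not the actual NeMo API):

```python
def aggregate_losses(losses, weights=None):
    """Combine scalar losses into one value as a weighted sum.

    Hypothetical sketch in the spirit of LossAggregatorNM's
    new `weights` option; not NeMo's actual implementation.
    """
    if weights is None:
        weights = [1.0] * len(losses)  # unweighted sum by default
    if len(weights) != len(losses):
        raise ValueError("need exactly one weight per loss")
    return sum(w * l for w, l in zip(weights, losses))
```

For example, losses 2.0 and 4.0 with weights 0.5 and 0.25 aggregate to 2.0.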

Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>
Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>

lgtm-com bot commented Feb 13, 2020

This pull request introduces 2 alerts and fixes 1 when merging 8be0691 into f072029 - view on LGTM.com

new alerts:

  • 2 for 'import *' may pollute namespace

fixed alerts:

  • 1 for Module imports itself

Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>

lgtm-com bot commented Feb 13, 2020

This pull request introduces 7 alerts and fixes 1 when merging ce70f26 into f072029 - view on LGTM.com

new alerts:

  • 5 for First parameter of a method is not named 'self'
  • 2 for 'import *' may pollute namespace

fixed alerts:

  • 1 for Module imports itself

Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>
Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>
Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>
Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>

lgtm-com bot commented Feb 14, 2020

This pull request introduces 9 alerts and fixes 1 when merging 0820752 into f072029 - view on LGTM.com

new alerts:

  • 5 for First parameter of a method is not named 'self'
  • 2 for 'import *' may pollute namespace
  • 1 for Explicit export is not defined
  • 1 for Implicit string concatenation in a list

fixed alerts:

  • 1 for Module imports itself

@ekmb ekmb marked this pull request as ready for review February 14, 2020 00:25
Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>

lgtm-com bot commented Feb 14, 2020

This pull request introduces 9 alerts and fixes 1 when merging 5b74599 into f072029 - view on LGTM.com

new alerts:

  • 5 for First parameter of a method is not named 'self'
  • 2 for 'import *' may pollute namespace
  • 1 for Explicit export is not defined
  • 1 for Implicit string concatenation in a list

fixed alerts:

  • 1 for Module imports itself

Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>

lgtm-com bot commented Feb 14, 2020

This pull request introduces 9 alerts and fixes 1 when merging 60f6e3c into 142bed9 - view on LGTM.com

new alerts:

  • 5 for First parameter of a method is not named 'self'
  • 2 for 'import *' may pollute namespace
  • 1 for Explicit export is not defined
  • 1 for Implicit string concatenation in a list

fixed alerts:

  • 1 for Module imports itself

Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>

lgtm-com bot commented Feb 14, 2020

This pull request introduces 7 alerts and fixes 1 when merging 4428f37 into 142bed9 - view on LGTM.com

new alerts:

  • 5 for First parameter of a method is not named 'self'
  • 2 for 'import *' may pollute namespace

fixed alerts:

  • 1 for Module imports itself

Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>

lgtm-com bot commented Feb 14, 2020

This pull request introduces 2 alerts and fixes 1 when merging 8b1d72d into 142bed9 - view on LGTM.com

new alerts:

  • 2 for First parameter of a method is not named 'self'

fixed alerts:

  • 1 for Module imports itself

Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>

lgtm-com bot commented Feb 14, 2020

This pull request introduces 2 alerts and fixes 1 when merging 79cb8f0 into 142bed9 - view on LGTM.com

new alerts:

  • 2 for First parameter of a method is not named 'self'

fixed alerts:

  • 1 for Module imports itself

Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>

lgtm-com bot commented Feb 14, 2020

This pull request introduces 2 alerts and fixes 1 when merging 04311df into 142bed9 - view on LGTM.com

new alerts:

  • 2 for First parameter of a method is not named 'self'

fixed alerts:

  • 1 for Module imports itself

Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>
Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>

lgtm-com bot commented Feb 14, 2020

This pull request introduces 2 alerts and fixes 1 when merging 563b69b into 142bed9 - view on LGTM.com

new alerts:

  • 2 for First parameter of a method is not named 'self'

fixed alerts:

  • 1 for Module imports itself

ekmb and others added 3 commits February 14, 2020 14:29
Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>
Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>
Merged PaddedSmoothedCrossEntropyLossNM with MaskedLanguageModelingLossNM into unified SmoothedCrossEntropyLossNM.
Moved SmoothedCrossEntropyLoss into the file for SmoothedCrossEntropyLossNM.
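The unified SmoothedCrossEntropyLossNM presumably applies label smoothing to per-token cross entropy. A minimal pure-Python sketch of the underlying formula for a single token (illustrative only; the real module operates on batched tensors, and smoothing variants differ in how the mass is spread):

```python
import math

def smoothed_cross_entropy(log_probs, target, smoothing=0.1):
    """Cross entropy with label smoothing for a single token.

    Hypothetical sketch, not NeMo's implementation.
    log_probs: log-probabilities over the vocabulary.
    target:    index of the gold token.
    The gold class keeps 1 - smoothing of the target mass;
    the remaining `smoothing` is spread uniformly over all classes.
    """
    n = len(log_probs)
    uniform = smoothing / n
    loss = 0.0
    for i, lp in enumerate(log_probs):
        weight = (1.0 - smoothing) + uniform if i == target else uniform
        loss -= weight * lp
    return loss
```

With smoothing set to 0.0 this reduces to ordinary negative log-likelihood of the gold token.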

Signed-off-by: VahidooX <vnoroozi@nvidia.com>
Signed-off-by: VahidooX <vnoroozi@nvidia.com>

lgtm-com bot commented Feb 20, 2020

This pull request introduces 2 alerts and fixes 1 when merging 9caa0c6 into 49bf035 - view on LGTM.com

new alerts:

  • 1 for Unused import
  • 1 for Module is imported with 'import' and 'import from'

fixed alerts:

  • 1 for Module imports itself

Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>

lgtm-com bot commented Feb 20, 2020

This pull request introduces 1 alert and fixes 1 when merging 88a95b5 into 49bf035 - view on LGTM.com

new alerts:

  • 1 for Module is imported with 'import' and 'import from'

fixed alerts:

  • 1 for Module imports itself

@okuchaiev okuchaiev self-requested a review February 20, 2020 20:30
@okuchaiev okuchaiev (Member) left a comment


Looks good to me. Could you please fix a few minor issues with the docstrings for the loss modules?

examples/nlp/glue_benchmark/glue_benchmark_with_bert.py
nemo/backends/pytorch/common/losses.py
VahidooX and others added 3 commits February 20, 2020 13:38
Signed-off-by: VahidooX <vnoroozi@nvidia.com>
Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>

lgtm-com bot commented Feb 20, 2020

This pull request introduces 1 alert and fixes 1 when merging ac9f023 into 553b0ae - view on LGTM.com

new alerts:

  • 1 for Module is imported with 'import' and 'import from'

fixed alerts:

  • 1 for Module imports itself

okuchaiev previously approved these changes Feb 20, 2020
Signed-off-by: VahidooX <vnoroozi@nvidia.com>

lgtm-com bot commented Feb 20, 2020

This pull request introduces 1 alert and fixes 1 when merging 47b8f5c into 553b0ae - view on LGTM.com

new alerts:

  • 1 for Module is imported with 'import' and 'import from'

fixed alerts:

  • 1 for Module imports itself

Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>

lgtm-com bot commented Feb 20, 2020

This pull request introduces 1 alert and fixes 1 when merging bf2d7b2 into 553b0ae - view on LGTM.com

new alerts:

  • 1 for Module is imported with 'import' and 'import from'

fixed alerts:

  • 1 for Module imports itself

Signed-off-by: VahidooX <vnoroozi@nvidia.com>
Signed-off-by: VahidooX <vnoroozi@nvidia.com>

lgtm-com bot commented Feb 20, 2020

This pull request fixes 1 alert when merging 5745b56 into 553b0ae - view on LGTM.com

fixed alerts:

  • 1 for Module imports itself

Signed-off-by: VahidooX <vnoroozi@nvidia.com>

lgtm-com bot commented Feb 20, 2020

This pull request introduces 1 alert and fixes 1 when merging 6afa104 into 553b0ae - view on LGTM.com

new alerts:

  • 1 for Module is imported with 'import' and 'import from'

fixed alerts:

  • 1 for Module imports itself

@tkornuta-nvidia tkornuta-nvidia merged commit 46045f9 into master Feb 21, 2020
@tkornuta-nvidia tkornuta-nvidia deleted the nlp_refactoring_stage2 branch February 21, 2020 00:12
"""Returns definitions of module input ports."""
return {
    "logits": NeuralType(('B', 'T', 'D'), LogitsType()),
Shouldn't we rename this to log_probabilities?



- class TRADEMaskedCrossEntropy(LossNM):
+ class MaskedXEntropyLoss(LossNM):
Can we be consistent in using either XEntropy or CrossEntropy? I vote for CrossEntropy.

Changed it to MaskedLogLoss.

Why not MaskedCrossEntropyLoss?
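Whatever the final name (MaskedXEntropyLoss, MaskedLogLoss, or MaskedCrossEntropyLoss), the module averages token-level negative log-likelihood over unmasked positions only, so padding does not dilute the loss. A minimal sketch of that behavior (hypothetical function and names, not NeMo's API, which operates on tensors):

```python
import math

def masked_cross_entropy(log_probs, targets, mask):
    """Average negative log-likelihood over unmasked positions.

    log_probs: per-position lists of log-probabilities.
    targets:   gold class index for each position.
    mask:      1 to count the position, 0 to ignore (e.g. padding).
    """
    total, count = 0.0, 0
    for lp, t, m in zip(log_probs, targets, mask):
        if m:
            total -= lp[t]   # NLL of the gold class at this position
            count += 1
    return total / max(count, 1)  # avoid division by zero if all masked
```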

Comment on lines +42 to +62
def _compute_softmax(scores):
"""Compute softmax probability over raw logits."""
if not scores:
return []

max_score = None
for score in scores:
if max_score is None or score > max_score:
max_score = score

exp_scores = []
total_sum = 0.0
for score in scores:
x = math.exp(score - max_score)
exp_scores.append(x)
total_sum += x

probs = []
for score in exp_scores:
probs.append(score / total_sum)
return probs
When would we ever want to do this without going through numpy or torch?
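For reference, the same numerically stable softmax is only a few lines with NumPy. A hedged equivalent of the hand-rolled loop above (assuming NumPy is acceptable as a dependency at this call site):

```python
import numpy as np

def compute_softmax_np(scores):
    """Numerically stable softmax, equivalent to the pure-Python loop."""
    if len(scores) == 0:
        return []
    s = np.asarray(scores, dtype=np.float64)
    e = np.exp(s - s.max())   # subtract max so exp cannot overflow
    return (e / e.sum()).tolist()
```

The max subtraction plays the same role as the explicit max_score search in the original version.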
