
NLP refactoring - Stage 2 #368

Merged
merged 124 commits into from Feb 21, 2020

Conversation

@ekmb ekmb commented Feb 13, 2020

Signed-off-by: Evelina Bakhturina ebakhturina@nvidia.com

Stage 2 of NLP refactoring:

  • Cleaned up and restructured functions and files in the nlp collection.
  • Cleaned up losses:
    ++ Added a weighting option to LossAggregatorNM
    ++ Moved LossAggregatorNM to losses.py in the common backend
    ++ Split JointIntentSlotLoss into two separate common losses and removed it
    ++ Merged MaskedLanguageModelingLossNM, PaddedSmoothedCrossEntropyLossNM, and SmoothedCrossEntropyLoss into a unified SmoothedCrossEntropyLoss
    ++ Renamed QuestionAnsweringLoss to the more general SpanningLoss
    ++ Renamed TRADEMaskedCrossEntropy to the more general MaskedXEntropyLoss
    ++ Removed TokenClassificationLoss, CrossEntropyLoss3D, and JointIntentSlotLoss
    ++ Added weighting and masking support to CrossEntropyLossNM
    ++ Added dynamic port sizes to CrossEntropyLossNM
    ++ Renamed CrossEntropyLoss to CrossEntropyLossNM to prevent confusion with PyTorch's CrossEntropyLoss
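The weighting option added to LossAggregatorNM presumably combines the member losses as a weighted sum. A minimal sketch of that idea, assuming weights default to 1.0 (illustrative names, not the actual NeMo API):

```python
def aggregate_losses(losses, weights=None):
    """Combine scalar losses into one value as a weighted sum.

    Hypothetical sketch in the spirit of LossAggregatorNM's
    new `weights` option; not NeMo's actual implementation.
    """
    if weights is None:
        weights = [1.0] * len(losses)  # unweighted sum by default
    if len(weights) != len(losses):
        raise ValueError("need exactly one weight per loss")
    return sum(w * l for w, l in zip(weights, losses))
```

For example, losses 2.0 and 4.0 with weights 0.5 and 0.25 aggregate to 2.0.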

Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>
Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>

lgtm-com bot commented Feb 13, 2020

This pull request introduces 2 alerts and fixes 1 when merging 8be0691 into f072029 - view on LGTM.com

new alerts:

  • 2 for 'import *' may pollute namespace

fixed alerts:

  • 1 for Module imports itself

Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>

lgtm-com bot commented Feb 13, 2020

This pull request introduces 7 alerts and fixes 1 when merging ce70f26 into f072029 - view on LGTM.com

new alerts:

  • 5 for First parameter of a method is not named 'self'
  • 2 for 'import *' may pollute namespace

fixed alerts:

  • 1 for Module imports itself

Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>
Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>
Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>
Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>

lgtm-com bot commented Feb 14, 2020

This pull request introduces 9 alerts and fixes 1 when merging 0820752 into f072029 - view on LGTM.com

new alerts:

  • 5 for First parameter of a method is not named 'self'
  • 2 for 'import *' may pollute namespace
  • 1 for Explicit export is not defined
  • 1 for Implicit string concatenation in a list

fixed alerts:

  • 1 for Module imports itself

@ekmb ekmb marked this pull request as ready for review February 14, 2020 00:25
Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>

lgtm-com bot commented Feb 14, 2020

This pull request introduces 9 alerts and fixes 1 when merging 5b74599 into f072029 - view on LGTM.com

new alerts:

  • 5 for First parameter of a method is not named 'self'
  • 2 for 'import *' may pollute namespace
  • 1 for Explicit export is not defined
  • 1 for Implicit string concatenation in a list

fixed alerts:

  • 1 for Module imports itself

Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>

lgtm-com bot commented Feb 14, 2020

This pull request introduces 9 alerts and fixes 1 when merging 60f6e3c into 142bed9 - view on LGTM.com

new alerts:

  • 5 for First parameter of a method is not named 'self'
  • 2 for 'import *' may pollute namespace
  • 1 for Explicit export is not defined
  • 1 for Implicit string concatenation in a list

fixed alerts:

  • 1 for Module imports itself

Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>

lgtm-com bot commented Feb 14, 2020

This pull request introduces 7 alerts and fixes 1 when merging 4428f37 into 142bed9 - view on LGTM.com

new alerts:

  • 5 for First parameter of a method is not named 'self'
  • 2 for 'import *' may pollute namespace

fixed alerts:

  • 1 for Module imports itself

Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>

lgtm-com bot commented Feb 14, 2020

This pull request introduces 2 alerts and fixes 1 when merging 8b1d72d into 142bed9 - view on LGTM.com

new alerts:

  • 2 for First parameter of a method is not named 'self'

fixed alerts:

  • 1 for Module imports itself

Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>

lgtm-com bot commented Feb 14, 2020

This pull request introduces 2 alerts and fixes 1 when merging 79cb8f0 into 142bed9 - view on LGTM.com

new alerts:

  • 2 for First parameter of a method is not named 'self'

fixed alerts:

  • 1 for Module imports itself

Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>

lgtm-com bot commented Feb 14, 2020

This pull request introduces 2 alerts and fixes 1 when merging 04311df into 142bed9 - view on LGTM.com

new alerts:

  • 2 for First parameter of a method is not named 'self'

fixed alerts:

  • 1 for Module imports itself

Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>
Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>

lgtm-com bot commented Feb 14, 2020

This pull request introduces 2 alerts and fixes 1 when merging 563b69b into 142bed9 - view on LGTM.com

new alerts:

  • 2 for First parameter of a method is not named 'self'

fixed alerts:

  • 1 for Module imports itself

ekmb and others added 3 commits February 14, 2020 14:29
Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>
Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>
Merged PaddedSmoothedCrossEntropyLossNM with MaskedLanguageModelingLossNM into unified SmoothedCrossEntropyLossNM.
Moved SmoothedCrossEntropyLoss into the file for SmoothedCrossEntropyLossNM.
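The unified SmoothedCrossEntropyLossNM presumably applies label smoothing to per-token cross entropy. A minimal pure-Python sketch of the underlying formula for a single token (illustrative only; the real module operates on batched tensors, and smoothing variants differ in how the mass is spread):

```python
import math

def smoothed_cross_entropy(log_probs, target, smoothing=0.1):
    """Cross entropy with label smoothing for a single token.

    Hypothetical sketch, not NeMo's implementation.
    log_probs: log-probabilities over the vocabulary.
    target:    index of the gold token.
    The gold class keeps 1 - smoothing of the target mass;
    the remaining `smoothing` is spread uniformly over all classes.
    """
    n = len(log_probs)
    uniform = smoothing / n
    loss = 0.0
    for i, lp in enumerate(log_probs):
        weight = (1.0 - smoothing) + uniform if i == target else uniform
        loss -= weight * lp
    return loss
```

With smoothing set to 0.0 this reduces to ordinary negative log-likelihood of the gold token.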

Signed-off-by: VahidooX <vnoroozi@nvidia.com>
Signed-off-by: VahidooX <vnoroozi@nvidia.com>

lgtm-com bot commented Feb 20, 2020

This pull request introduces 2 alerts and fixes 1 when merging 9caa0c6 into 49bf035 - view on LGTM.com

new alerts:

  • 1 for Unused import
  • 1 for Module is imported with 'import' and 'import from'

fixed alerts:

  • 1 for Module imports itself

Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>

lgtm-com bot commented Feb 20, 2020

This pull request introduces 1 alert and fixes 1 when merging 88a95b5 into 49bf035 - view on LGTM.com

new alerts:

  • 1 for Module is imported with 'import' and 'import from'

fixed alerts:

  • 1 for Module imports itself

@okuchaiev okuchaiev self-requested a review February 20, 2020 20:30
@okuchaiev okuchaiev (Member) left a comment


Looks good to me. Could you please fix a few minor issues with the docstrings for the loss modules?

examples/nlp/glue_benchmark/glue_benchmark_with_bert.py
nemo/backends/pytorch/common/losses.py
VahidooX and others added 3 commits February 20, 2020 13:38
Signed-off-by: VahidooX <vnoroozi@nvidia.com>
Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>

lgtm-com bot commented Feb 20, 2020

This pull request introduces 1 alert and fixes 1 when merging ac9f023 into 553b0ae - view on LGTM.com

new alerts:

  • 1 for Module is imported with 'import' and 'import from'

fixed alerts:

  • 1 for Module imports itself

okuchaiev previously approved these changes Feb 20, 2020
Signed-off-by: VahidooX <vnoroozi@nvidia.com>

lgtm-com bot commented Feb 20, 2020

This pull request introduces 1 alert and fixes 1 when merging 47b8f5c into 553b0ae - view on LGTM.com

new alerts:

  • 1 for Module is imported with 'import' and 'import from'

fixed alerts:

  • 1 for Module imports itself

Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>

lgtm-com bot commented Feb 20, 2020

This pull request introduces 1 alert and fixes 1 when merging bf2d7b2 into 553b0ae - view on LGTM.com

new alerts:

  • 1 for Module is imported with 'import' and 'import from'

fixed alerts:

  • 1 for Module imports itself

Signed-off-by: VahidooX <vnoroozi@nvidia.com>
Signed-off-by: VahidooX <vnoroozi@nvidia.com>

lgtm-com bot commented Feb 20, 2020

This pull request fixes 1 alert when merging 5745b56 into 553b0ae - view on LGTM.com

fixed alerts:

  • 1 for Module imports itself

Signed-off-by: VahidooX <vnoroozi@nvidia.com>

lgtm-com bot commented Feb 20, 2020

This pull request introduces 1 alert and fixes 1 when merging 6afa104 into 553b0ae - view on LGTM.com

new alerts:

  • 1 for Module is imported with 'import' and 'import from'

fixed alerts:

  • 1 for Module imports itself

@tkornuta-nvidia tkornuta-nvidia merged commit 46045f9 into master Feb 21, 2020
@tkornuta-nvidia tkornuta-nvidia deleted the nlp_refactoring_stage2 branch February 21, 2020 00:12
"""Returns definitions of module input ports."""
return {
    "logits": NeuralType(('B', 'T', 'D'), LogitsType()),
Shouldn't we rename this to log_probabilities?



- class TRADEMaskedCrossEntropy(LossNM):
+ class MaskedXEntropyLoss(LossNM):
Can we be consistent in using either XEntropy or CrossEntropy? I vote for CrossEntropy.

Changed it to MaskedLogLoss.

Why not MaskedCrossEntropyLoss?
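Whatever the final name (MaskedXEntropyLoss, MaskedLogLoss, or MaskedCrossEntropyLoss), the module averages token-level negative log-likelihood over unmasked positions only, so padding does not dilute the loss. A minimal sketch of that behavior (hypothetical function and names, not NeMo's API, which operates on tensors):

```python
import math

def masked_cross_entropy(log_probs, targets, mask):
    """Average negative log-likelihood over unmasked positions.

    log_probs: per-position lists of log-probabilities.
    targets:   gold class index for each position.
    mask:      1 to count the position, 0 to ignore (e.g. padding).
    """
    total, count = 0.0, 0
    for lp, t, m in zip(log_probs, targets, mask):
        if m:
            total -= lp[t]   # NLL of the gold class at this position
            count += 1
    return total / max(count, 1)  # avoid division by zero if all masked
```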

Comment on lines +42 to +62
def _compute_softmax(scores):
"""Compute softmax probability over raw logits."""
if not scores:
return []

max_score = None
for score in scores:
if max_score is None or score > max_score:
max_score = score

exp_scores = []
total_sum = 0.0
for score in scores:
x = math.exp(score - max_score)
exp_scores.append(x)
total_sum += x

probs = []
for score in exp_scores:
probs.append(score / total_sum)
return probs
When would we ever want to do this without going through numpy or torch?
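For reference, the same numerically stable softmax is only a few lines with NumPy. A hedged equivalent of the hand-rolled loop above (assuming NumPy is acceptable as a dependency at this call site):

```python
import numpy as np

def compute_softmax_np(scores):
    """Numerically stable softmax, equivalent to the pure-Python loop."""
    if len(scores) == 0:
        return []
    s = np.asarray(scores, dtype=np.float64)
    e = np.exp(s - s.max())   # subtract max so exp cannot overflow
    return (e / e.sum()).tolist()
```

The max subtraction plays the same role as the explicit max_score search in the original version.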
