Dialogue task #3884

Merged
merged 155 commits into from Apr 29, 2022

Zhilin123 (Collaborator) commented Mar 25, 2022

What does this PR do?

Adds various functionality to the dialogue domain in NeMo.

Collection: NLP

Changelog

  1. Support Zero Shot Intent Recognition
  2. Further refactored Dialogue module
  3. Implement Dialogue GPT Generation Model
  4. Support MS Marco Data Processor
  5. Implement Dialogue S2S Generation Model (HF fully supported, Megatron training supported, inference pending integration of common generation API)
  6. Support System Response Generation using user utterance and system slots based on SGD dataset
  7. Support Design Data Processor
  8. Integrate HF BART-based classifier into the zero shot intent model
  9. Implement Dialogue Nearest Neighbour Model
  10. Refactor Dialogue SGD Data Processor to make interface with models cleaner
  11. Update Nearest Neighbour Model and ZeroShotIntentModel to support SGD dataset and ZeroShot Datasets
  12. Support Mellon QA Data Processor
  13. Add Documentation and Tutorial

See details in the NVIDIA-only dev log

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 
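
The PR description leaves the usage snippet empty. As a rough illustration of the idea behind a nearest-neighbour intent model (changelog item 9), here is a minimal, self-contained sketch. It does not use NeMo's API and is not the implementation in this PR: the function names, the example intents, and the bag-of-words cosine similarity are all assumptions made for this example, standing in for whatever text representation the actual model uses.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase, whitespace-tokenize, and count tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_intent(utterance, intent_descriptions):
    """Return the intent whose description is most similar to the utterance."""
    query = bag_of_words(utterance)
    return max(
        intent_descriptions,
        key=lambda intent: cosine(query, bag_of_words(intent_descriptions[intent])),
    )


# Hypothetical intents and descriptions, made up for this example.
INTENTS = {
    "book_flight": "book a flight ticket to a city",
    "check_weather": "check the weather forecast for a city",
    "play_music": "play a song or music",
}

print(nearest_intent("what is the weather forecast tomorrow", INTENTS))
```

Because the intent is chosen by similarity to a description rather than by a trained classification head, new intents can be added by extending the dictionary, which is the property that makes this family of approaches attractive for zero shot settings.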

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items you can still open "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)

Zhilin123 and others added 30 commits January 26, 2022 14:45
Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>
…_init__.py

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

lgtm-com bot commented Apr 26, 2022

This pull request introduces 3 alerts and fixes 5 when merging c3c3f25 into 75c1668 - view on LGTM.com

new alerts:

  • 1 for Signature mismatch in overriding method
  • 1 for Unused local variable
  • 1 for Unused import

fixed alerts:

  • 5 for Unused import


lgtm-com bot commented Apr 26, 2022

This pull request introduces 3 alerts and fixes 5 when merging 4fe591b into 75c1668 - view on LGTM.com

new alerts:

  • 1 for Signature mismatch in overriding method
  • 1 for Unused local variable
  • 1 for Unused import

fixed alerts:

  • 5 for Unused import


lgtm-com bot commented Apr 26, 2022

This pull request introduces 3 alerts and fixes 5 when merging 4926bd1 into 75c1668 - view on LGTM.com

new alerts:

  • 1 for Signature mismatch in overriding method
  • 1 for Unused local variable
  • 1 for Unused import

fixed alerts:

  • 5 for Unused import


lgtm-com bot commented Apr 26, 2022

This pull request introduces 3 alerts and fixes 5 when merging a761b7b into b93a64a - view on LGTM.com

new alerts:

  • 1 for Signature mismatch in overriding method
  • 1 for Unused local variable
  • 1 for Unused import

fixed alerts:

  • 5 for Unused import


lgtm-com bot commented Apr 27, 2022

This pull request introduces 3 alerts and fixes 5 when merging c5ac004 into d97e0d3 - view on LGTM.com

new alerts:

  • 1 for Signature mismatch in overriding method
  • 1 for Unused local variable
  • 1 for Unused import

fixed alerts:

  • 5 for Unused import

…apex

Signed-off-by: Zhilin Wang <zhilinw@nvidia.com>

lgtm-com bot commented Apr 27, 2022

This pull request introduces 3 alerts and fixes 5 when merging 943cf59 into 6b6e881 - view on LGTM.com

new alerts:

  • 1 for Signature mismatch in overriding method
  • 1 for Unused local variable
  • 1 for Unused import

fixed alerts:

  • 5 for Unused import


lgtm-com bot commented Apr 27, 2022

This pull request introduces 3 alerts and fixes 5 when merging 1b5766d into da1b56c - view on LGTM.com

new alerts:

  • 1 for Signature mismatch in overriding method
  • 1 for Unused local variable
  • 1 for Unused import

fixed alerts:

  • 5 for Unused import


lgtm-com bot commented Apr 27, 2022

This pull request introduces 3 alerts and fixes 5 when merging 690bdab into 0d052c8 - view on LGTM.com

new alerts:

  • 1 for Signature mismatch in overriding method
  • 1 for Unused local variable
  • 1 for Unused import

fixed alerts:

  • 5 for Unused import

carolmanderson (Contributor) left a comment

Thanks for the changes and the tutorial notebook. This is looking good. One thing I think is important to flesh out is what the different pretrained language model options are and how the naming works. Maybe it is obvious to us that 'gpt2' is going to rely on the eponymous HuggingFace gpt2 model, but I'm not sure it'll be obvious to a new person approaching this framework. Similarly, for models on NGC: how can they determine which models are compatible, how does the naming work, do they need to download the model, etc.?

I have a couple of suggestions for future improvements to the notebook (not for this PR). The real power of your approach is its modularity, especially the ability to swap out language models with different architectures for the same task. I think it would be cool to show that off in the notebook by adding an example where you train on the same data/task with a different model type. I would also like to see some examples of the input data displayed, as in some of the other NeMo tutorials, which sometimes include a cell that displays a few training examples to give the user a peek. Finally, you could add more information about inference to the tutorial, for example by reloading a model that was previously fine-tuned in the notebook and running inference with it.

docs/source/nlp/dialogue.rst (resolved)
tutorials/nlp/Dialogue.ipynb (outdated, resolved)
tutorials/nlp/Dialogue.ipynb (outdated, resolved)
tutorials/nlp/Dialogue.ipynb (resolved)
tutorials/nlp/Dialogue.ipynb (outdated, resolved)
tutorials/nlp/Dialogue.ipynb (resolved)
tutorials/nlp/Dialogue.ipynb (outdated, resolved)
docs/source/nlp/dialogue.rst (outdated, resolved)
Zhilin123 and others added 2 commits April 28, 2022 15:10
carolmanderson previously approved these changes Apr 28, 2022

lgtm-com bot commented Apr 28, 2022

This pull request introduces 4 alerts and fixes 5 when merging fb32ff0 into f776442 - view on LGTM.com

new alerts:

  • 1 for Signature mismatch in overriding method
  • 1 for Unused local variable
  • 1 for Unused import
  • 1 for Unreachable code

fixed alerts:

  • 5 for Unused import


lgtm-com bot commented Apr 28, 2022

This pull request introduces 2 alerts and fixes 5 when merging e15d655 into f776442 - view on LGTM.com

new alerts:

  • 1 for Signature mismatch in overriding method
  • 1 for Unused local variable

fixed alerts:

  • 5 for Unused import


lgtm-com bot commented Apr 28, 2022

This pull request introduces 2 alerts and fixes 5 when merging 24adbd5 into 655ff80 - view on LGTM.com

new alerts:

  • 1 for Signature mismatch in overriding method
  • 1 for Unused local variable

fixed alerts:

  • 5 for Unused import

@Zhilin123 Zhilin123 merged commit 58ff608 into main Apr 29, 2022

6 participants