Scoring and Analysis of Adapter Experiments #5187

Merged
merged 9 commits into from
Oct 20, 2022

Member

@shan18 shan18 commented Oct 18, 2022

What does this PR do ?

Added the script for scoring and analysis of the adapter experiments.

Collection: asr

Changelog

  • Created a new script: examples/asr/asr_adapters/scoring_and_analysis.py
  • Updated README: examples/asr/asr_adapters/README.md

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

Who can review?

Anyone in the NeMo ASR Team

Collaborator

@titu1994 titu1994 left a comment

Overall looks good. It requires a bit of shuffling of the docstring, plus more comments on which args users must change and what the format of the CSV file should be.

examples/asr/asr_adapters/README.md
Usage:
python scoring_and_analysis.py \
    --csv <path to cleaned result csv file> \
    --dataset_type_column <column in csv with the dataset types>
Collaborator

Move the basic usage documentation inside this script.

Member Author

Done
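
For readers following this thread, here is a minimal sketch of how the two flags in the usage text could be consumed. It is not the code added in this PR. Marking both flags as required, the helper `group_rows_by_type`, and the column names `name`, `split`, and `wer` are all assumptions made for illustration; only `--csv` and `--dataset_type_column` come from the usage text.

```python
import argparse
import csv
import io
from collections import defaultdict


def build_parser() -> argparse.ArgumentParser:
    # Only the two flags from the usage text; marking them required is an assumption.
    parser = argparse.ArgumentParser(description="Sketch of the scoring CLI.")
    parser.add_argument("--csv", required=True, help="Path to the cleaned result CSV file.")
    parser.add_argument(
        "--dataset_type_column",
        required=True,
        help="Column in the CSV with the dataset types.",
    )
    return parser


def group_rows_by_type(csv_file, column):
    # Hypothetical helper: bucket the CSV rows by the value found in `column`.
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        groups[row[column]].append(row)
    return dict(groups)


# Demo on in-memory data; the column names "name", "split" and "wer" are made up.
args = build_parser().parse_args(["--csv", "results.csv", "--dataset_type_column", "split"])
demo = io.StringIO("name,split,wer\nexp1,original,0.10\nexp1,adapted,0.08\nexp2,original,0.11\n")
groups = group_rows_by_type(demo, args.dataset_type_column)
print({dataset_type: len(rows) for dataset_type, rows in groups.items()})
```

Passing an explicit argument list to `parse_args` keeps the sketch runnable without a real command line or file on disk.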

@@ -172,8 +172,9 @@ def main(cfg: TranscriptionConfig) -> TranscriptionConfig:
    if hasattr(asr_model, 'change_decoding_strategy'):
        # Check if ctc or rnnt model
        if hasattr(asr_model, 'joint'):  # RNNT model
            rnnt_decoding = RNNTDecodingConfig(fused_batch_size=-1, compute_langs=cfg.compute_langs)
            asr_model.change_decoding_strategy(rnnt_decoding)
            cfg.rnnt_decoding.fused_batch_size = -1
Collaborator

Rebase your pr, I've merged the other pr with these changes.

Member Author

This is done
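
The hunk in this thread carries no +/- markers, so the following is only one plausible reading of it, consistent with the commit message about the decoding strategy not being updated: building a fresh decoding config discards whatever the user set, while adjusting `cfg.rnnt_decoding` in place keeps it. The classes below are hypothetical stand-ins, not NeMo's `RNNTDecodingConfig` or `TranscriptionConfig`, and the `strategy` field and its values are illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class RNNTDecodingSketch:
    # Hypothetical stand-in for a decoding config; `strategy` is illustrative.
    strategy: str = "greedy_batch"
    fused_batch_size: int = 16


@dataclass
class TranscriptionSketch:
    # Hypothetical stand-in for the script-level config.
    rnnt_decoding: RNNTDecodingSketch = field(default_factory=RNNTDecodingSketch)


def decoding_from_fresh_config(cfg: TranscriptionSketch) -> RNNTDecodingSketch:
    # Builds a new config and never looks at cfg.rnnt_decoding,
    # so any user override is silently dropped.
    return RNNTDecodingSketch(fused_batch_size=-1)


def decoding_from_user_config(cfg: TranscriptionSketch) -> RNNTDecodingSketch:
    # Adjusts the user's config in place, so other fields survive.
    cfg.rnnt_decoding.fused_batch_size = -1
    return cfg.rnnt_decoding


cfg = TranscriptionSketch(rnnt_decoding=RNNTDecodingSketch(strategy="beam"))
print(decoding_from_fresh_config(cfg).strategy)  # the user's "beam" choice is lost
print(decoding_from_user_config(cfg).strategy)   # the user's "beam" choice is kept
```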

titu1994 previously approved these changes Oct 20, 2022
Collaborator

@titu1994 titu1994 left a comment

Looks great ! Thanks !

@titu1994
Collaborator

Rebase, then force push

@titu1994 titu1994 merged commit 85fc659 into NVIDIA:main Oct 20, 2022
XuesongYang pushed a commit that referenced this pull request Oct 20, 2022
* Fixed bug in transcribe_speech.py where decoding strategy was not being updated.

Signed-off-by: Shantanu Acharya <shantanua@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Created the script to calculate scores and perform analysis on the grid of experiments

Signed-off-by: Shantanu Acharya <shantanua@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Include the basic usage doc inside the scoring script

* Update the docstrings in the adapters scoring script

Signed-off-by: Shantanu Acharya <shantanua@nvidia.com>

Signed-off-by: Shantanu Acharya <shantanua@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

update
1-800-BAD-CODE pushed a commit to 1-800-BAD-CODE/NeMo that referenced this pull request Nov 13, 2022
Signed-off-by: 1-800-bad-code <shane.carroll@utsa.edu>
hainan-xv pushed a commit to hainan-xv/NeMo that referenced this pull request Nov 29, 2022
Signed-off-by: Hainan Xu <hainanx@nvidia.com>