Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ASR evaluator #5728

Merged
merged 38 commits into from Jan 11, 2023
Merged

ASR evaluator #5728

merged 38 commits into from Jan 11, 2023

Conversation

fayejf
Copy link
Collaborator

@fayejf fayejf commented Jan 3, 2023

What does this PR do ?

A tool to thoroughly evaluate an ASR model.

Collection: ASR

Changelog

  • Simple step to evaluate a model in all three modes currently supported by NeMo: offline, chunked, and offline_by_chunked.
  • On-the-fly data augmentation (silence, noise, etc.,) for ASR robustness evaluation.
  • Investigate the model's performance by detailed insertion, deletion, and substitution error rates for each and all samples.
  • Evaluate models' reliability on different target groups such as gender, and audio length if metadata is presented.

Usage

python asr_evaluator.py \
engine.pretrained_name="stt_en_conformer_transducer_large" \
engine.inference_mode.mode="offline" \
engine.test_ds.augmentor.noise.manifest_path=<manifest file for noise data>

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.

Additional Information

Note because the PR is getting bigger, on-the-fly augmentation for buffered inference mode would be included in another PR soon.

Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
@github-actions github-actions bot added the ASR label Jan 3, 2023
fayejf and others added 2 commits January 3, 2023 13:26
Signed-off-by: fayejf <36722593+fayejf@users.noreply.github.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
@fayejf fayejf marked this pull request as ready for review January 6, 2023 07:02
fayejf and others added 2 commits January 6, 2023 14:21
Copy link
Collaborator

@stevehuang52 stevehuang52 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good work, just a few small comments. We should make sure that scripts inside nemo.collection do not depend on scripts in examples/ and functions in NeMo be language-agnostic.

@@ -182,6 +186,9 @@ def transcribe(
'channel_selector': channel_selector,
}

if augmentor_config:
config['augmentor_config'] = augmentor_config
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'augmentor_config' can be simplified to just 'augmentor', since it's part of the config variable we can drop the _config suffix.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make sense. updated.

if (cfg.model_path and cfg.pretrained_name) or (not cfg.model_path and not cfg.pretrained_name):
raise ValueError("Please specify either cfg.model_path or cfg.pretrained_name!")

if cfg.inference_mode.mode == "offline":
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cfg.inference_mode.mode seems a bit redundant, can we change to just cfg.inference.mode?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated.

f"model_stride={cfg.inference_mode.model_stride} ",
shell=True,
check=True,
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unsafe for things inside NeMo collections to depend on NeMo example scripts, since examples may not be installed or can be lost/modified easily. If this function has to use something from the example folder, it's better to move the whole function to an example scripts other than putting it in NeMo

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good points! Moved eval_utils.py to utils.py in tools/asr_evaluator

f"eval_config_yaml={temp_eval_config_yaml_file} ",
shell=True,
check=True,
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as before, It's unsafe for things inside NeMo collections to depend on NeMo example scripts

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

cfg.output_filename = model_name + "-" + dataset_name + "-" + mode_name + ".json"

temp_eval_config_yaml_file = "temp_eval_config.yaml"
with open(temp_eval_config_yaml_file, "w") as f:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider (but not mandatory) using tempfile.NamedTemporaryFile, which automatically manages temp files:

import tempfile
with tempfile.NamedTemporaryFile(mode='w', encoding='utf-8') as f:
    OmegaConf.save(cfg, f)
    f.seek(0)  # reset file pointer
    subprocess.run(
        xxxxxxx
        eval_config_yaml=f.name
    )

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea! updated

return cfg, total_res


def target_metadata_wer(manifest: str, target: str, meta_cfg: DictConfig, eval_metric: str = "wer",) -> dict:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe cal_target_metadata_wer?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated


words = sample["words"]
wer_each_class[target_class]["words"] += words
wer_each_class[target_class]["errors"] += words * sample[eval_metric]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to check whether the input eval_metric is supported, and raise exceptions if it's not valid

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

logging.info(f"metadata '{target}' does not present in manifest. Skipping! ")
return None

values = ['samples', 'words', 'errors', 'inss', 'dels', 'subs']
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add some doctrings on each value to explain what each one means.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated in function docstring

logging.info(f'Hydra config: {OmegaConf.to_yaml(cfg)}')

# Set and save random seed and git hash for reproducibility
random.seed(cfg.env.random_seed)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about numpy and pytorch random seeds?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated. Need to fix rng in perturb in another PR.

)
cfg = run_chunked_inference(cfg)

elif cfg.inference_mode.mode == "offline_by_chunked":
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the fundamental difference between "chunked" and "offline_by_chunked"? It seems that the only difference is only that "offline_by_chunked" will set default parameters while "chunked" will not. Could you please add more docstrings to explain the different use cases?

Copy link
Collaborator Author

@fayejf fayejf Jan 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what you said is totally correct. Added more explanation in docstring.

fayejf and others added 4 commits January 9, 2023 11:40
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
fayejf and others added 7 commits January 9, 2023 12:18
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
tools/asr_evaluator/asr_evaluator.py Fixed Show fixed Hide fixed
tools/asr_evaluator/utils.py Fixed Show fixed Hide fixed
fayejf and others added 2 commits January 9, 2023 17:02
Signed-off-by: fayejf <fayejf07@gmail.com>
fayejf and others added 3 commits January 10, 2023 10:47
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
- **ENGINE**. To conduct ASR inference.
- **ANALYST**. To evaluate model performance based on predictions.

In Analyst, you can evaluate on all metadata if it presents in manifest. For exmaple, you can evaluate the peformance of model based on duration of each sample, such as how's the model peforms on samples smaller than 5s and longer than 5s by [[0,5][5,100000]] and get wer/cer of each slot. Or how's the model performs on postive (happy, laugh) or neural (neural) or negative mood (sad) as below. And if you set save_wer_per_class=True, it will calculate wer for all (i.e. above 5 classes + cry) classes presented in the data.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you mean "neutral"? The meanings for metadata needs more explanation. For example, we can say that, with the following config, we can calculate WERs for audios in different interval groups, where each group (in seconds) is defined by [[0,2],[2,5],[5,10],[10,20],[20,100000]]. Also, we calculate the WERs for three groups of emotions, where each group is defined by [['happy','laugh'],['neural'],['sad']] (words in the same group will be treated as the same).


emotion:
enable: True
slot: [['happy','laugh'],['neural'],['sad']] # we could have 'cry' in data but not in slot we focus on.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you mean "neutral"?

fayejf and others added 2 commits January 10, 2023 15:41
Signed-off-by: fayejf <fayejf07@gmail.com>
Copy link
Collaborator

@stevehuang52 stevehuang52 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@fayejf fayejf merged commit c6b1ea5 into main Jan 11, 2023
@fayejf fayejf deleted the asr_evaluator_engine branch January 11, 2023 16:57
erastorgueva-nv pushed a commit that referenced this pull request Jan 12, 2023
* backbone

Signed-off-by: fayejf <fayejf07@gmail.com>

* engineer and analyzer

Signed-off-by: fayejf <fayejf07@gmail.com>

* offline_by_chunked

Signed-off-by: fayejf <fayejf07@gmail.com>

* test_ds wip

Signed-off-by: fayejf <fayejf07@gmail.com>

* temp remove inference

Signed-off-by: fayejf <fayejf07@gmail.com>

* mandarin yaml

Signed-off-by: fayejf <fayejf07@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* augmentor and a few updates

Signed-off-by: fayejf <fayejf07@gmail.com>

* address alerts and revert unnecessary changes

Signed-off-by: fayejf <fayejf07@gmail.com>

* Add readme

Signed-off-by: fayejf <fayejf07@gmail.com>

* rename

Signed-off-by: fayejf <fayejf07@gmail.com>

* typo fix

Signed-off-by: fayejf <fayejf07@gmail.com>

* small fix

Signed-off-by: fayejf <fayejf07@gmail.com>

* add missing header

Signed-off-by: fayejf <fayejf07@gmail.com>

* rename augmentor_config to augmentor

Signed-off-by: fayejf <fayejf07@gmail.com>

* raname inference_mode to inference

Signed-off-by: fayejf <fayejf07@gmail.com>

* move utils.py

Signed-off-by: fayejf <fayejf07@gmail.com>

* update temp file

Signed-off-by: fayejf <fayejf07@gmail.com>

* make wer cer clear

Signed-off-by: fayejf <fayejf07@gmail.com>

* seed_everything

Signed-off-by: fayejf <fayejf07@gmail.com>

* fix missing rn augmentor_config in rnnt

Signed-off-by: fayejf <fayejf07@gmail.com>

* fix rnnt transcribe

Signed-off-by: fayejf <fayejf07@gmail.com>

* add more docstring and style fix

Signed-off-by: fayejf <fayejf07@gmail.com>

* address codeQL

Signed-off-by: fayejf <fayejf07@gmail.com>

* reflect comments

Signed-off-by: fayejf <fayejf07@gmail.com>

* update readme

Signed-off-by: fayejf <fayejf07@gmail.com>

* clearer

Signed-off-by: fayejf <fayejf07@gmail.com>

Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
titu1994 pushed a commit to titu1994/NeMo that referenced this pull request Mar 24, 2023
* backbone

Signed-off-by: fayejf <fayejf07@gmail.com>

* engineer and analyzer

Signed-off-by: fayejf <fayejf07@gmail.com>

* offline_by_chunked

Signed-off-by: fayejf <fayejf07@gmail.com>

* test_ds wip

Signed-off-by: fayejf <fayejf07@gmail.com>

* temp remove inference

Signed-off-by: fayejf <fayejf07@gmail.com>

* mandarin yaml

Signed-off-by: fayejf <fayejf07@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* augmentor and a few updates

Signed-off-by: fayejf <fayejf07@gmail.com>

* address alerts and revert unnecessary changes

Signed-off-by: fayejf <fayejf07@gmail.com>

* Add readme

Signed-off-by: fayejf <fayejf07@gmail.com>

* rename

Signed-off-by: fayejf <fayejf07@gmail.com>

* typo fix

Signed-off-by: fayejf <fayejf07@gmail.com>

* small fix

Signed-off-by: fayejf <fayejf07@gmail.com>

* add missing header

Signed-off-by: fayejf <fayejf07@gmail.com>

* rename augmentor_config to augmentor

Signed-off-by: fayejf <fayejf07@gmail.com>

* raname inference_mode to inference

Signed-off-by: fayejf <fayejf07@gmail.com>

* move utils.py

Signed-off-by: fayejf <fayejf07@gmail.com>

* update temp file

Signed-off-by: fayejf <fayejf07@gmail.com>

* make wer cer clear

Signed-off-by: fayejf <fayejf07@gmail.com>

* seed_everything

Signed-off-by: fayejf <fayejf07@gmail.com>

* fix missing rn augmentor_config in rnnt

Signed-off-by: fayejf <fayejf07@gmail.com>

* fix rnnt transcribe

Signed-off-by: fayejf <fayejf07@gmail.com>

* add more docstring and style fix

Signed-off-by: fayejf <fayejf07@gmail.com>

* address codeQL

Signed-off-by: fayejf <fayejf07@gmail.com>

* reflect comments

Signed-off-by: fayejf <fayejf07@gmail.com>

* update readme

Signed-off-by: fayejf <fayejf07@gmail.com>

* clearer

Signed-off-by: fayejf <fayejf07@gmail.com>

Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants