
add exportable mel spec #5512

Merged — titu1994 merged 2 commits into NVIDIA:main on Nov 28, 2022
Conversation

1-800-BAD-CODE (Contributor)

Signed-off-by: shane carroll <shane.carroll@utsa.edu>

What does this PR do?

AudioToMelSpectrogramPreprocessor accepts a bool argument use_torchaudio which, if True, switches the featurizer to a torchaudio-based extractor that produces the same features but with an exportable graph.

The new preprocessor mimics the old implementation closely enough that it can be swapped into pre-trained models, which can then be exported.

The preprocessor can be exported to JIT; ONNX export is blocked by pytorch/pytorch#81075
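
For illustration, a minimal sketch of enabling the flag when constructing the preprocessor directly (the features value is an assumption for this sketch; other arguments keep their defaults):

from nemo.collections.asr.modules.audio_preprocessing import AudioToMelSpectrogramPreprocessor

# Build the preprocessor with the new flag; all other options keep their defaults.
preprocessor = AudioToMelSpectrogramPreprocessor(
    sample_rate=16000,
    features=80,           # assumed value for this sketch
    use_torchaudio=True,   # select the exportable torchaudio-based featurizer
)
print(type(preprocessor.featurizer))  # the torchaudio-backed featurizer when the flag is set

The full usage script below does the same thing via the config of a pre-trained model instead of constructing the module by hand.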

Collection: ASR

Changelog

Add a use_torchaudio option to AudioToMelSpectrogramPreprocessor which selects the featurizer implementation.

Add a class FilterbankFeaturesTA, analogous to FilterbankFeatures, built on torchaudio (a minimal sketch of the idea follows).
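
For intuition, here is a minimal sketch of a torchaudio-based log-mel featurizer. This is an illustration of the approach, not the PR's actual FilterbankFeaturesTA code; the class name and parameter values are assumptions:

import torch
import torchaudio


class LogMelFeaturizerSketch(torch.nn.Module):
    # Illustrative only: the real FilterbankFeaturesTA also handles options
    # such as dithering, pre-emphasis, normalization, and length computation.
    def __init__(self, sample_rate: int = 16000, n_fft: int = 512,
                 win_length: int = 400, hop_length: int = 160, n_mels: int = 80):
        super().__init__()
        # torchaudio's MelSpectrogram is composed of exportable ops, which is
        # what makes the resulting graph TorchScript-friendly.
        self.mel = torchaudio.transforms.MelSpectrogram(
            sample_rate=sample_rate, n_fft=n_fft, win_length=win_length,
            hop_length=hop_length, n_mels=n_mels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Log compression with a small floor for numerical stability.
        return torch.log(self.mel(x) + 1e-9)

A scripted version of such a module (torch.jit.script(LogMelFeaturizerSketch())) exports cleanly, which is the property this PR relies on.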

Usage

The following script will:

  1. Load a pre-trained English Conformer
  2. Create a copy of the model's preprocessor, but with a torchaudio backend
  3. Compare new and old preprocessor outputs
  4. Export the preprocessor to JIT
  5. Compare JIT and PyTorch outputs
  6. Swap the pre-trained model's preprocessor for the new one and check that the WER matches the old preprocessor's

from pathlib import Path

import torch
import hydra
from pytorch_lightning import seed_everything
from omegaconf import open_dict

from nemo.utils import logging
from nemo.utils.nemo_logging import Logger
from nemo.collections.asr.metrics.wer import word_error_rate
from nemo.collections.asr.models import ASRModel
from nemo.collections.asr.modules.audio_preprocessing import AudioToMelSpectrogramPreprocessor


# Optionally use a seed
# seed_everything(42)

logging.set_verbosity(Logger.CRITICAL)

# Get a pre-trained ASR model to compare preprocessors. Any model with a mel spec extractor should work.
m: ASRModel = ASRModel.from_pretrained("stt_en_conformer_ctc_small", map_location=torch.device("cpu"))
m.eval()
old_preprocessor = m.preprocessor

# Extract preprocessor config and set the flag to use the torchaudio-based extractor; keep all other arguments the same
new_config = m.cfg.preprocessor
with open_dict(new_config):
    new_config.use_torchaudio = True
# Instantiate an instance that uses torchaudio on the backend
new_preprocessor: AudioToMelSpectrogramPreprocessor = hydra.utils.instantiate(config=new_config)
new_preprocessor.eval()
print(f"New preprocessor featurizer type: {type(new_preprocessor.featurizer)}")

# Export the torchaudio preprocessor and load it back in as a `ScriptModule`.
new_preprocessor.export("tmp.pt")
jit_preprocessor = torch.jit.load("tmp.pt")

# Generate random input
batch_size = 4
max_length = 16000
signals = torch.randn(size=[batch_size, max_length])
lengths = torch.randint(low=200, high=max_length, size=[batch_size])
lengths[0] = max_length

# Extract features with all preprocessors
old_feats, old_feat_lens = old_preprocessor(input_signal=signals, length=lengths)
new_feats, new_feat_lens = new_preprocessor(input_signal=signals, length=lengths)
jit_feats, jit_feat_lens = jit_preprocessor(input_signal=signals, length=lengths)

# Make sure new output matches old output
# Need to relax the tolerance from defaults. We will check WER also, as an alternative verification of correctness.
rel_tolerance = 1e-2
abs_tolerance = 1e-4
torch.testing.assert_allclose(actual=new_feats, expected=old_feats, atol=abs_tolerance, rtol=rel_tolerance)
# Zero tolerance for integer lengths.
torch.testing.assert_allclose(actual=new_feat_lens, expected=old_feat_lens, atol=0, rtol=0)

print(f"Output comparison passed with relative tolerance {rel_tolerance} and absolute tolerance {abs_tolerance}.")

# Make sure JIT output matches PyTorch output
# Keep tolerance at defaults for JIT comparison
torch.testing.assert_allclose(actual=jit_feats, expected=new_feats)
torch.testing.assert_allclose(actual=jit_feat_lens, expected=new_feat_lens, atol=0, rtol=0)
print(f"Jit comparison passed with default tolerance.")

# To run a WER check you'll need to comment out some assumptions in the CTC transcribe method, as addressed in https://github.com/NVIDIA/NeMo/pull/2762 
# print("Testing WER with old/new preprocessor with some LibriSpeech data")
# We only need audio files; we're comparing model outputs to each other, not references
# dev_other_dir = "/path/to/LibriSpeech/dev-other"
# num_files_to_use = 100
# audio_files = [str(x) for x in Path(dev_other_dir).rglob("*.flac")]
# audio_files = audio_files[:num_files_to_use]
# print("Transcribing with the baseline model")
# old_output = m.transcribe(audio_files)
# m.preprocessor = new_preprocessor
# print("Transcribing after switching the preprocessor")
# new_output = m.transcribe(audio_files)
# wer = word_error_rate(hypotheses=new_output, references=old_output)
# print(f"WER with {len(audio_files)} audio files using old vs. new preprocessor is {wer}")

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items you can still open a "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs to various areas.

Additional Information

  • Related to: this issue comes up once in a while and is generally brushed aside.

Signed-off-by: shane carroll <shane.carroll@utsa.edu>
github-actions bot added the ASR label on Nov 27, 2022
@titu1994 (Collaborator)

Thanks for your awesome PR! I'll review it today.

@titu1994 (Collaborator)

This is fantastic! I was wondering if we could simply subclass and override the methods of the older code, but I don't think it's necessary. This is much cleaner, though it does support only a subset of the other featurizer's functionality.

I'll send a PR later today to add some unit tests like the ones you posted above, to ensure the values from the two implementations remain the same.

@titu1994 titu1994 merged commit 21b088b into NVIDIA:main Nov 28, 2022
hainan-xv pushed a commit to hainan-xv/NeMo that referenced this pull request Nov 29, 2022
* add exportable mel spec

Signed-off-by: shane carroll <shane.carroll@utsa.edu>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: shane carroll <shane.carroll@utsa.edu>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>
hainan-xv pushed the same commit to hainan-xv/NeMo, referencing this pull request, again on Nov 29, 2022 and on Dec 5, 2022.
andrusenkoau pushed a commit with the same message (signed off by andrusenkoau <andrusenkoau@gmail.com>) to andrusenkoau/NeMo, referencing this pull request, on Jan 5, 2023.