
Lhotse dataloading: RIR augmentation and nemo/tarred input support for RIR and noise aug #9109

Merged: pzelasko merged 4 commits into main from lhotse-aug-rir-noise on May 6, 2024

@pzelasko pzelasko commented May 3, 2024

What does this PR do?

Extensions in Lhotse dataloading augmentations:

  • RIR augmentation support (currently requires lhotse RecordingSet input; see e.g. https://github.com/lhotse-speech/lhotse/blob/master/lhotse/recipes/rir_noise.py).
  • Broader input type support for noise augmentation.
    • When a path is provided, the input type is auto-detected from its extension:
      • YAML input spec (.yaml)
      • lhotse non-tarred CutSet (.jsonl/.jsonl.gz)
      • NeMo non-tarred manifest (any other extension)
    • It is also possible to provide a dict with the same keys as used for specifying the main training data, e.g.:
noise_path:
  manifest_filepath: my_noises/manifest__OP_0..4_CL_.json
  tarred_audio_filepaths: my_noises/audio__OP_0..4_CL_.tar
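The auto-detection rules listed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not NeMo's actual implementation: the function name `guess_noise_input_kind` and the returned labels are hypothetical, and the helper the PR actually imports (`guess_parse_cutset`) may behave differently in its details.

```python
from collections.abc import Mapping
from pathlib import Path
from typing import Union


def guess_noise_input_kind(inp: Union[str, Path, Mapping]) -> str:
    """Hypothetical helper: classify a noise_path value by the rules in the PR description."""
    # A dict-like value is treated as a full data spec
    # (same keys as the main training data config).
    if isinstance(inp, Mapping):
        return "config_dict"

    name = str(inp)
    # YAML input spec.
    if name.endswith(".yaml"):
        return "yaml_input_spec"
    # Non-tarred lhotse CutSet manifest.
    if name.endswith((".jsonl", ".jsonl.gz")):
        return "lhotse_cutset"
    # Any other extension is assumed to be a non-tarred NeMo manifest.
    return "nemo_manifest"
```

In the dict example above, `_OP_0..4_CL_` is NeMo's escaped spelling of the brace range `{0..4}`, so each entry refers to five shards numbered 0 to 4.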

Collection: ASR

pzelasko added 2 commits May 3, 2024 13:25
…r RIR and noise aug

Signed-off-by: Piotr Żelasko <petezor@gmail.com>
Signed-off-by: Piotr Żelasko <petezor@gmail.com>
@pzelasko pzelasko requested a review from KunalDhawan May 3, 2024 17:32
@github-actions github-actions bot added the common label May 3, 2024
  from omegaconf import DictConfig, OmegaConf

- from nemo.collections.common.data.lhotse.cutset import read_cutset_from_config
+ from nemo.collections.common.data.lhotse.cutset import guess_parse_cutset, read_cutset_from_config

CodeQL code scanning notice (cyclic import): Import of module nemo.collections.common.data.lhotse.cutset begins an import cycle.
Signed-off-by: Piotr Żelasko <petezor@gmail.com>
Signed-off-by: Piotr Żelasko <petezor@gmail.com>
@pzelasko pzelasko added Run CICD and removed Run CICD labels May 3, 2024

It's intended to be used in a generic context where we are not sure which way the user will specify the inputs.
"""
from nemo.collections.common.data.lhotse.dataloader import make_structured_with_schema_warnings

CodeQL code scanning notice (cyclic import): Import of module nemo.collections.common.data.lhotse.dataloader begins an import cycle.

@KunalDhawan KunalDhawan left a comment

Thanks, LGTM!

# III) common options
keep_excessive_supervisions: bool = True # when a cut is truncated in the middle of a supervision, should we keep them.
# e. RIR augmentation (synthetic RIR if rir_path is None)
# at the moment supports only Lhotse recording manifests, e.g. https://github.com/lhotse-speech/lhotse/blob/master/lhotse/recipes/rir_noise.py
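The "synthetic RIR if rir_path is None" rule in the comment above can be illustrated with a small stand-alone sketch. Only `rir_path` appears in this excerpt; the field names `rir_enabled` and `rir_prob`, the class `RirOptions`, and the function `describe_rir_source` are assumptions made for illustration and are not taken from the PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RirOptions:
    """Hypothetical stand-in for the RIR-related fields of the dataloading config."""

    rir_enabled: bool = False       # assumed name: turns the augmentation on or off
    rir_path: Optional[str] = None  # from the excerpt: None means synthetic RIRs
    rir_prob: float = 0.5           # assumed name: probability of applying RIR to a cut


def describe_rir_source(opts: RirOptions) -> str:
    """Report where impulse responses would come from under these options."""
    if not opts.rir_enabled:
        return "disabled"
    if opts.rir_path is None:
        # No manifest given: fall back to synthetically generated RIRs.
        return "synthetic"
    # Otherwise the path is expected to point at a lhotse recording manifest.
    return f"recordings:{opts.rir_path}"
```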

Just opening a discussion thread here: it would be great to support other alternatives in the future.

@pzelasko pzelasko May 6, 2024

Yes, I'll aim to follow up shortly on this. I'll need to release a new lhotse version that extends the API to support CutSet RIR inputs first (as NeMo manifests can be adapted to that), and I didn't want to block this work on that.

FWIW, constructing a recording set is a one-liner: from lhotse import RecordingSet; RecordingSet.from_dir("my-rirs", pattern="*.wav").to_file("rirs.jsonl.gz")
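The one-liner above, spelled out as a small function, might look like the sketch below. The wrapper name `build_rir_manifest` and its defaults are hypothetical; the lhotse calls are the same `RecordingSet.from_dir` and `to_file` used in the comment.

```python
from pathlib import Path


def build_rir_manifest(rir_dir: str, out_path: str = "rirs.jsonl.gz", pattern: str = "*.wav") -> Path:
    """Scan rir_dir for audio files matching pattern and write a lhotse RecordingSet manifest."""
    # Imported lazily so that this module can be loaded without lhotse installed.
    from lhotse import RecordingSet

    # Build one Recording entry per matching audio file.
    recordings = RecordingSet.from_dir(rir_dir, pattern=pattern)
    # Serialize the manifest; the format is inferred from the file extension.
    recordings.to_file(out_path)
    return Path(out_path)
```

Calling `build_rir_manifest("my-rirs")` would write `rirs.jsonl.gz`, which could then be supplied as `rir_path`.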

@pzelasko pzelasko merged commit 957c988 into main May 6, 2024
@pzelasko pzelasko deleted the lhotse-aug-rir-noise branch May 6, 2024 18:03
rohitrango pushed a commit to rohitrango/NeMo that referenced this pull request Jun 25, 2024
…r RIR and noise aug (NVIDIA-NeMo#9109)

* Lhotse dataloading: RIR augmentation and nemo/tarred input support for RIR and noise aug

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix for RIR: currently lhotse requires RecordingSet

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Unit tests and fixes

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Unit test for RIR

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <petezor@gmail.com>