Augmentation refactoring and torchaudio SoX effects support #124

pzelasko · 2020-11-11T16:24:51Z

TL;DR

Changing the data augmentation APIs in Lhotse to accept a callable with a signature like: def augment_fn(audio: Union[torch.Tensor, np.ndarray], sampling_rate: int) -> np.ndarray
Mirroring WavAugment capabilities with torchaudio.sox_effects
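
A minimal sketch of a callable that fits this signature, for illustration only. The `Gain` class and its `gain_db` parameter are hypothetical and not part of Lhotse. The type annotation omits `torch.Tensor` to keep the example free of a PyTorch dependency; `np.asarray` is assumed to be enough to convert the input.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Gain:
    """Hypothetical augmentation: scale the waveform by a fixed gain in dB."""

    gain_db: float

    def __call__(self, audio: np.ndarray, sampling_rate: int) -> np.ndarray:
        # sampling_rate is unused here, but kept to match the callable signature.
        samples = np.asarray(audio, dtype=np.float32)
        scale = 10.0 ** (self.gain_db / 20.0)
        return (samples * scale).astype(np.float32)


# Any object with this call signature could be passed as the augmentation.
augment_fn = Gain(gain_db=-6.0)
augmented = augment_fn(np.ones((1, 16000), dtype=np.float32), sampling_rate=16000)
```

A class instance is used instead of a nested function so that the callable can be pickled when it is sent to worker processes.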


pzelasko · 2020-11-11T18:17:45Z

It's good to review.

pzelasko · 2020-11-11T18:24:15Z

@freewym please check this out - it will be small but breaking for the recipe you're creating, but hopefully makes the whole setup way easier (no need to compile libsox, wavaugment, etc. just install the latest pytorch + torchaudio with anaconda) and gets us rid of various quirks with multiprocessing.

mthrok

Looks good regarding the usage of torchaudio's Sox Effects.

vincentqb · 2020-11-11T19:07:14Z

(glad to see the migration here! cc facebookresearch/WavAugment#16)

freewym · 2020-11-11T23:10:10Z

> @freewym please check this out - it will be small but breaking for the recipe you're creating, but hopefully makes the whole setup way easier (no need to compile libsox, wavaugment, etc. just install the latest pytorch + torchaudio with anaconda) and gets us rid of various quirks with multiprocessing.

Cool. I think I am supposed to install PyTorch 1.7 in order to test it. I will do that later today.

mthrok · 2020-11-11T23:37:05Z

lhotse/augmentation/__init__.py

+from .common import AugmentFn
+from .wavaugment import *
+
+if str(_torchaudio.__version__) >= '0.7.0':


FYI: If torchaudio hits 0.10.0 release in the future (we do not know if we will move to the major 1.0 release or not), this could produce a wrong result.

$ python
>>> '0.7.0' > '0.10.0'
True

A future-proof way would be to split the version string and compare the major and minor versions as numbers.

Well spotted!

packaging.version may be helpful in this case.
See https://stackoverflow.com/questions/11887762/how-do-i-compare-version-numbers-in-python
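
A minimal sketch of the split-and-compare approach suggested above, using only the standard library. The helper names `major_minor` and `is_at_least` are hypothetical and are not taken from the PR; `packaging.version`, mentioned in this thread, would be an alternative if that package is available.

```python
import re
from typing import Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) as integers from a string such as '0.10.0'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_at_least(version: str, minimum: str) -> bool:
    """Compare versions numerically instead of lexicographically."""
    return major_minor(version) >= major_minor(minimum)


print(is_at_least("0.10.0", "0.7.0"))  # numeric comparison: True
print("0.10.0" >= "0.7.0")             # string comparison: False
```

Tuples of integers compare element by element, so (0, 10) is correctly ordered after (0, 7), which the plain string comparison gets wrong.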

freewym · 2020-11-11T23:37:08Z

lhotse/features/base.py

@@ -10,7 +10,7 @@
 import torch

 from lhotse.audio import Recording
-from lhotse.augmentation import WavAugmenter
+from lhotse.augmentation import AugmentFn, WavAugmenter


WavAugmenter is no longer useful (?). If that is the case, all appearances of WavAugmenter should be removed from this file.

Good point. I want to keep it for now so that people with PyTorch older than 1.7 can also use it, but I will probably add a deprecation warning...

freewym · 2020-11-12T00:19:24Z

lhotse/augmentation/torchaudio.py

+    ]
+
+
+def reverb(sampling_rate: int) -> List[List[str]]:


Is it possible to make such functions more general, so that they accept more arguments? For example, the lower and upper bounds of the room size could be passed into this function.

Sure, I'll change that

Actually I'd rather make a follow-up PR later on, as I'm not sure which parameters it makes sense to tweak and how general they should be. If we want to tweak everything it's simpler to just write your own chain...

(I'm open to suggestions)
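
As one possible shape for such a follow-up, here is a hedged sketch of a reverb chain with configurable room-size bounds. The function name `reverb_chain` and all of its parameters are hypothetical and do not reflect the code in this PR. It assumes the SoX `reverb` argument order (reverberance, HF-damping, room-scale, each a percentage) and only builds the effect list; it does not call torchaudio.

```python
import random
from typing import List


def reverb_chain(
    min_room_scale: float = 0.0,
    max_room_scale: float = 100.0,
    reverberance: float = 50.0,
    hf_damping: float = 50.0,
) -> List[List[str]]:
    """Hypothetical helper: a SoX 'reverb' effect with a random room scale."""
    if not 0.0 <= min_room_scale <= max_room_scale <= 100.0:
        raise ValueError("room scale bounds must satisfy 0 <= min <= max <= 100")
    room_scale = random.uniform(min_room_scale, max_room_scale)
    # Assumed SoX argument order: reverberance, HF-damping, room-scale.
    return [["reverb", str(reverberance), str(hf_damping), f"{room_scale:.1f}"]]


effects = reverb_chain(min_room_scale=20.0, max_room_scale=40.0)
```

Exposing only the bounds keeps the function small; anything more specific is probably easier to express as a custom chain, as noted above.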

freewym · 2020-11-12T00:20:26Z

lhotse/augmentation/torchaudio.py

+    end: Union[int, float]
+
+    def sample(self):
+        return random.uniform(self.start, self.end)


Can we use numpy random functions? It may be easier for me to seed them from outside.

Would using this function help (note that it also makes the random cut ID creation deterministic)?

https://github.com/lhotse-speech/lhotse/blob/master/lhotse/utils.py#L33

OK, never mind. I will seed it in my own code.

I'll change it to numpy anyway, I guess more people are used to seeding numpy than random in training loops
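
A minimal sketch of what the numpy-based variant might look like, assuming `RandomValue` is a dataclass with the `start` and `end` fields shown in the diff above. The final code in the PR may differ.

```python
from dataclasses import dataclass
from typing import Union

import numpy as np


@dataclass
class RandomValue:
    start: Union[int, float]
    end: Union[int, float]

    def sample(self) -> float:
        # Draws from numpy's global random state, so np.random.seed() controls it.
        return float(np.random.uniform(self.start, self.end))


# Seeding from the outside, e.g. at the start of a training script.
np.random.seed(42)
first = RandomValue(0.9, 1.1).sample()
np.random.seed(42)
second = RandomValue(0.9, 1.1).sample()
print(first == second)  # True: the same seed gives the same draw
```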

freewym · 2020-11-12T05:35:34Z

lhotse/augmentation/torchaudio.py

+    return [
+        # Random speed perturbation factor between 0.9x and 1.1x the original speed
+        ['speed', RandomValue(0.9, 1.1)],
+        ['rate', sampling_rate],  # Resample back to the original sampling rate (speed changes it)


It looks like this line makes the run hang. It works without this line.

Edit: actually, it does not hang. It terminated with:

  File "/export/fs04/a07/ywang/fairseq4/espresso/tools/lhotse/lhotse/cut.py", line 1311, in compute_and_store_features
    executor.submit(
  File "/export/b03/ywang/anaconda3/lib/python3.8/concurrent/futures/process.py", line 629, in submit
    raise BrokenProcessPool(self._broken)
concurrent.futures.process.BrokenProcessPool: A child process terminated abruptly, the process pool is not usable anymore

😫

I'll have a look

I was able to replicate the issue and added a unit test that triggers it. I submitted the issue to torchaudio here: pytorch/audio#1021

BTW, if I replace RandomValue() above with a function _get_value(factor) that simply returns factor, the run hangs as well. Do you have any clue about the cause?

Is the function defined as a closure (i.e. within another function) and captures some variable outside of its scope? That could explain it... Otherwise, I don't know.

@freewym some good news: if you create the executor like ProcessPoolExecutor(..., mp_context=multiprocessing.get_context("spawn")), it solves the segfault/hanging problem. Could you try? If it works, I'll go on and merge this.

Credits to @mthrok for suggesting this

(to make it clear: it works for me on the grid, on my mac, and in GitHub CI, so it should be okay)

Yeah, it works!
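
The workaround above can be sketched as follows. The helper name `make_spawn_executor` is hypothetical, and the built-in `abs` stands in for the real feature-extraction task; only `ProcessPoolExecutor` and `multiprocessing.get_context` are taken from the discussion.

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor


def make_spawn_executor(num_workers: int) -> ProcessPoolExecutor:
    """Create a process pool whose workers are started with 'spawn'."""
    # 'spawn' starts fresh interpreters instead of forking the parent process.
    ctx = multiprocessing.get_context("spawn")
    return ProcessPoolExecutor(max_workers=num_workers, mp_context=ctx)


if __name__ == "__main__":
    # The guard matters with 'spawn': workers re-import the main module.
    with make_spawn_executor(2) as executor:
        print(list(executor.map(abs, [-1, -2, 3])))  # [1, 2, 3]
```

With "spawn", the submitted callable and its arguments have to be picklable, which is one more reason to avoid closures for augmentation functions.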


pzelasko added 5 commits November 9, 2020 19:36

Test for wav augment + parallel executor

f3700df

Make the test conditional on WavAugment being installed

4de6057

Merge branch 'master' into feature/augmentation-v2

c626357

Refactor the data augmentation APIs to accept a generic callable

50e9b61

Add torchaudio data augmentation support

4d45676

pzelasko mentioned this pull request Nov 11, 2020

Inconsistent SoX speed behaviour compared to WavAugment pytorch/audio#1019

Closed

pzelasko linked an issue Nov 11, 2020 that may be closed by this pull request

Data augmentation with Torchaudio #100

Closed

pzelasko mentioned this pull request Nov 11, 2020

Data augmentation with Torchaudio #100

Closed

pzelasko added 4 commits November 11, 2020 12:59

Resolve the issue with mismatched num_samples after speed

fa15e0a

Update audio augmentation docs

4a54cee

Update highlighted examples (forgot to change output_dir to storage i…

ac426ad

…n a previous PR)

Reduce the number of randomized tests for speed

2e1c39a

pzelasko changed the title ~~[WIP] Augmentation refactoring and torchaudio SoX effects support~~ Augmentation refactoring and torchaudio SoX effects support Nov 11, 2020

pzelasko added this to the v0.2 milestone Nov 11, 2020

pzelasko added the breaking-change label Nov 11, 2020

mthrok reviewed Nov 11, 2020

freewym reviewed Nov 11, 2020

freewym reviewed Nov 12, 2020

pzelasko added 5 commits November 12, 2020 10:42

Address code reviews

ba156ca

Merge branch 'master' into feature/augmentation-v2

28bc3f8

Merge branch 'feature/augmentation-v2' into feature/augmentation-refa…

b99ecc7

…ctoring

Expose all augmentation-related imports in top-level lhotse module

6aa7b0f

Merge branch 'feature/augmentation-refactoring' of https://github.com…

7234dab

…/pzelasko/lhotse into feature/augmentation-refactoring

pzelasko added 3 commits November 12, 2020 10:55

Update the test that was hanging with augment to use torchaudio

1b0ebf1

Workaround for the multiprocessing bug

62988a0

Fix imports shadowing

ba685cd

pzelasko merged commit d977170 into master Nov 13, 2020

pzelasko mentioned this pull request Nov 13, 2020

Update Lhotse's augmentation API usage k2-fsa/snowfall#5

Closed

pzelasko deleted the feature/augmentation-refactoring branch July 1, 2021 01:22
