Looping audio (#136)

Summary:

## Related Issue
Fixes #129

- [x] I have read CONTRIBUTING.md to understand how to contribute to this repository :)

<Please summarize what you are trying to achieve, what changes you made, and how they achieve the desired result.>

Preliminary loop augmentation for the audio modality, yet to add stereo `test_loop_audio.wave`

## Unit Tests
If your changes touch the `audio` module, please run all of the `audio` tests and paste the output here. Likewise for `image`, `text`, & `video`. If your changes could affect behavior in multiple modules, please run the tests for all potentially affected modules. If you are unsure of which modules might be affected by your changes, please just run all the unit tests.

### Audio
```bash
python -m unittest discover -s augly/tests/audio_tests/ -p "*"
......./home/adityaprasad/.local/lib/python3.8/site-packages/librosa/util/utils.py:2099: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here. Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  np.dtype(np.float): np.complex,
/home/adityaprasad/.local/lib/python3.8/site-packages/librosa/util/utils.py:2099: DeprecationWarning: `np.complex` is a deprecated alias for the builtin `complex`. To silence this warning, use `complex` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.complex128` here. Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  np.dtype(np.float): np.complex,
/home/adityaprasad/.local/lib/python3.8/site-packages/librosa/util/utils.py:869: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here. Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  mag = np.abs(S).astype(np.float)
........../home/adityaprasad/.local/lib/python3.8/site-packages/librosa/core/spectrum.py:1223: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here. Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  time_steps = np.arange(0, D.shape[1], rate, dtype=np.float)
/home/adityaprasad/.local/lib/python3.8/site-packages/librosa/util/utils.py:2099: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here. Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  np.dtype(np.float): np.complex,
/home/adityaprasad/.local/lib/python3.8/site-packages/librosa/util/utils.py:2099: DeprecationWarning: `np.complex` is a deprecated alias for the builtin `complex`. To silence this warning, use `complex` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.complex128` here. Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  np.dtype(np.float): np.complex,
/home/adityaprasad/.local/lib/python3.8/site-packages/librosa/core/spectrum.py:1223: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here. Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  time_steps = np.arange(0, D.shape[1], rate, dtype=np.float)
/home/adityaprasad/.local/lib/python3.8/site-packages/librosa/util/utils.py:869: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here. Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  mag = np.abs(S).astype(np.float)
...................................................
----------------------------------------------------------------------
Ran 68 tests in 4.976s

OK
```

### Image
```bash
python -m unittest discover -s augly/tests/image_tests/ -p "*_test.py"
# Or `python -m unittest discover -s augly/tests/image_tests/ -p "*.py"` to run pytorch test too (must install `torchvision` to run)
```

### Text
```bash
python -m unittest discover -s augly/tests/text_tests/ -p "*"
```

### Video
```bash
python -m unittest discover -s augly/tests/video_tests/ -p "*"
```

### All
```bash
python -m unittest discover -s augly/tests/ -p "*"
```

## Other testing
If applicable, test your changes and paste the output here. For example, if your changes affect the requirements/installation, then test installing augly in a fresh conda env, then make sure you are able to import augly & run the unit test

Pull Request resolved: #136

Reviewed By: zpapakipos

Differential Revision: D31821541

Pulled By: jbitton

fbshipit-source-id: 241fbca68a52f20cb6924d4d753f77daa4ed6f6d
facebookresearch · Oct 22, 2021 · 785e35b
1 parent 263284f
commit 785e35b
Showing 11 changed files with 130 additions and 0 deletions.
diff --git a/.github/workflows/test_python.yml b/.github/workflows/test_python.yml
@@ -10,6 +10,8 @@ jobs:
     steps:
       - uses: actions/checkout@v2
       - uses: actions/setup-python@v2
+        with:
+          python-version: '3.9'
       - run: sudo apt-get update
       - run: sudo apt-get install --fix-missing ffmpeg python3-soundfile
       - run: pip install pyre-check pytest torchvision

diff --git a/augly/assets/tests/audio/speech_commands_expected_output/mono/test_loop.wav b/augly/assets/tests/audio/speech_commands_expected_output/mono/test_loop.wav
diff --git a/augly/assets/tests/audio/speech_commands_expected_output/stereo/test_loop.wav b/augly/assets/tests/audio/speech_commands_expected_output/stereo/test_loop.wav
diff --git a/augly/audio/__init__.py b/augly/audio/__init__.py
@@ -12,6 +12,7 @@
     high_pass_filter,
     insert_in_background,
     invert_channels,
+    loop,
     low_pass_filter,
     normalize,
     peaking_equalizer,
@@ -33,6 +34,7 @@
     high_pass_filter_intensity,
     insert_in_background_intensity,
     invert_channels_intensity,
+    loop_intensity,
     low_pass_filter_intensity,
     normalize_intensity,
     peaking_equalizer_intensity,
@@ -54,6 +56,7 @@
     HighPassFilter,
     InsertInBackground,
     InvertChannels,
+    Loop,
     LowPassFilter,
     Normalize,
     PeakingEqualizer,
@@ -77,6 +80,7 @@
     "high_pass_filter",
     "insert_in_background",
     "invert_channels",
+    "loop",
     "low_pass_filter",
     "normalize",
     "peaking_equalizer",
@@ -97,6 +101,7 @@
     "HighPassFilter",
     "InsertInBackground",
     "InvertChannels",
+    "Loop",
     "LowPassFilter",
     "Normalize",
     "OneOf",
@@ -117,6 +122,7 @@
     "high_pass_filter_intensity",
     "insert_in_background_intensity",
     "invert_channels_intensity",
+    "loop_intensity",
     "low_pass_filter_intensity",
     "normalize_intensity",
     "peaking_equalizer_intensity",

diff --git a/augly/audio/functional.py b/augly/audio/functional.py
@@ -617,6 +617,53 @@ def invert_channels(
     return audutils.ret_and_save_audio(aug_audio, output_path, sample_rate)
 
 
+def loop(
+    audio: Union[str, np.ndarray],
+    sample_rate: int = DEFAULT_SAMPLE_RATE,
+    n: int = 1,
+    output_path: Optional[str] = None,
+    metadata: Optional[List[Dict[str, Any]]] = None,
+) -> Tuple[np.ndarray, int]:
+    """
+    Loops the audio 'n' times
+
+    @param audio: the path to the audio or a variable of type np.ndarray that
+        will be augmented
+
+    @param sample_rate: the audio sample rate of the inputted audio
+
+    @param n: the number of times the audio will be looped
+
+    @param output_path: the path in which the resulting audio will be stored. If None,
+        the resulting np.ndarray will still be returned
+
+    @param metadata: if set to be a list, metadata about the function execution
+        including its name, the source & dest duration, sample rates, etc. will be
+        appended to the inputted list. If set to None, no metadata will be appended
+
+    @returns: the augmented audio array and sample rate
+    """
+    assert isinstance(n, int) and n >= 0, "Expected 'n' to be a nonnegative integer"
+    audio, sample_rate = audutils.validate_and_load_audio(audio, sample_rate)
+
+    aug_audio = audio
+    for _ in range(n):
+        aug_audio = np.append(aug_audio, audio, axis=(0 if audio.ndim == 1 else 1))
+
+    audutils.get_metadata(
+        metadata=metadata,
+        function_name="loop",
+        audio=audio,
+        sample_rate=sample_rate,
+        dst_audio=aug_audio,
+        dst_sample_rate=sample_rate,
+        output_path=output_path,
+        n=n,
+    )
+
+    return audutils.ret_and_save_audio(aug_audio, output_path, sample_rate)
+
+
 def low_pass_filter(
     audio: Union[str, np.ndarray],
     sample_rate: int = DEFAULT_SAMPLE_RATE,
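
The `loop` function added above appends the clip to itself `n` times, so the result plays `n + 1` times in total. The following is a minimal standalone sketch of that concatenation, not AugLy's code: `loop_array` is a hypothetical helper, and it assumes multi-channel audio is laid out as `(channels, samples)`, so time is always the last axis. The final lines check the arithmetic behind the `loop` entry in `expected_metadata.json` (a 0.853375 s clip at 32000 Hz looped once), assuming duration is sample count divided by sample rate.

```python
import numpy as np


def loop_array(audio: np.ndarray, n: int = 1) -> np.ndarray:
    """Hypothetical helper: repeat `audio` so it plays n + 1 times in total."""
    assert isinstance(n, int) and n >= 0, "Expected 'n' to be a nonnegative integer"
    # axis=-1 is axis 0 for mono (samples,) and axis 1 for (channels, samples),
    # which matches the axis choice made in the diff above.
    return np.concatenate([audio] * (n + 1), axis=-1)


mono = np.array([0.1, 0.2, 0.3])
stereo = np.array([[0.1, 0.2], [0.3, 0.4]])

looped_mono = loop_array(mono, n=2)      # three copies along the sample axis
looped_stereo = loop_array(stereo, n=1)  # two copies per channel

# Duration check: 27308 samples at 32000 Hz is 0.853375 s.
sample_rate = 32000
clip = np.zeros(27308)
dst_duration = loop_array(clip, n=1).shape[-1] / sample_rate
```

Building the list once and concatenating in a single call avoids the repeated reallocation that appending inside a Python loop incurs; the output is the same either way.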

diff --git a/augly/audio/intensity.py b/augly/audio/intensity.py
@@ -100,6 +100,13 @@ def invert_channels_intensity(metadata: Dict[str, Any], **kwargs) -> float:
     return 0.0 if metadata["src_num_channels"] == 1 else 100.0
 
 
+def loop_intensity(n: int = 1, **kwargs) -> float:
+    assert isinstance(n, int) and n >= 0, "Expected 'n' to be a nonnegative integer"
+
+    max_num_loops = 100
+    return min((n / max_num_loops) * 100.0, 100.0)
+
+
 def low_pass_filter_intensity(cutoff_hz: float = 500.0, **kwargs) -> float:
     assert (
         isinstance(cutoff_hz, (float, int)) and cutoff_hz >= 0
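
To make the scaling in `loop_intensity` concrete, here is a small standalone sketch of the same formula: intensity grows linearly with `n` and saturates at 100.0 once `n` reaches the cap of 100 loops. The name `loop_intensity_sketch` and the `max_num_loops` keyword are illustrative only; in the diff the cap is a local constant.

```python
def loop_intensity_sketch(n: int = 1, max_num_loops: int = 100) -> float:
    """Illustrative copy of the formula above; not the AugLy function itself."""
    assert isinstance(n, int) and n >= 0, "Expected 'n' to be a nonnegative integer"
    # Linear in n, clamped so that very large n cannot exceed 100.0.
    return min((n / max_num_loops) * 100.0, 100.0)


# Sample points: no loops, the unit-test case, halfway, and beyond the cap.
intensities = {n: loop_intensity_sketch(n) for n in (0, 1, 50, 250)}
```

With the default cap, `n=1` gives an intensity of about 1.0, which is the value the intensity unit test and the expected metadata both use.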

diff --git a/augly/audio/transforms.py b/augly/audio/transforms.py
@@ -450,6 +450,39 @@ def apply_transform(
         return F.invert_channels(audio, sample_rate, metadata=metadata)
 
 
+class Loop(BaseTransform):
+    def __init__(self, n: int = 1, p: float = 1.0):
+        """
+        @param n: the number of times the audio will be looped
+
+        @param p: the probability of the transform being applied; default value is 1.0
+        """
+        super().__init__(p)
+        self.n = n
+
+    def apply_transform(
+        self,
+        audio: np.ndarray,
+        sample_rate: int,
+        metadata: Optional[List[Dict[str, Any]]] = None,
+    ) -> Tuple[np.ndarray, int]:
+        """
+        Loops the audio 'n' times
+
+        @param audio: the path to the audio or a variable of type np.ndarray that
+            will be augmented
+
+        @param sample_rate: the audio sample rate of the inputted audio
+
+        @param metadata: if set to be a list, metadata about the function execution
+            including its name, the source & dest duration, sample rates, etc. will be
+            appended to the inputted list. If set to None, no metadata will be appended
+
+        @returns: the augmented audio array and sample rate
+        """
+        return F.loop(audio, sample_rate, self.n, metadata=metadata)
+
+
 class LowPassFilter(BaseTransform):
     def __init__(self, cutoff_hz: float = 500.0, p: float = 1.0):
         """
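
The `Loop` class above only stores `n` and delegates to `F.loop`; the probability `p` is handled by the base class. The sketch below illustrates that split with stand-in classes. `SketchTransform` and `SketchLoop` are hypothetical names, and the gating logic is an assumption about how a probability-gated transform typically works, not a copy of AugLy's `BaseTransform`.

```python
import random
from typing import Tuple

import numpy as np


class SketchTransform:
    """Stand-in base class (an assumption, not AugLy's BaseTransform)."""

    def __init__(self, p: float = 1.0):
        assert 0.0 <= p <= 1.0, "p must be a probability"
        self.p = p

    def __call__(self, audio: np.ndarray, sample_rate: int) -> Tuple[np.ndarray, int]:
        # random.random() is in [0, 1), so p=1.0 always applies the transform
        # and p=0.0 never does.
        if random.random() >= self.p:
            return audio, sample_rate
        return self.apply_transform(audio, sample_rate)

    def apply_transform(self, audio: np.ndarray, sample_rate: int) -> Tuple[np.ndarray, int]:
        raise NotImplementedError


class SketchLoop(SketchTransform):
    def __init__(self, n: int = 1, p: float = 1.0):
        super().__init__(p)
        self.n = n

    def apply_transform(self, audio: np.ndarray, sample_rate: int) -> Tuple[np.ndarray, int]:
        # Play the clip n + 1 times; the sample rate is unchanged.
        return np.concatenate([audio] * (self.n + 1), axis=-1), sample_rate


audio = np.arange(4, dtype=np.float32)
always, sr_always = SketchLoop(n=2, p=1.0)(audio, 16000)
never, sr_never = SketchLoop(n=2, p=0.0)(audio, 16000)
```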

diff --git a/augly/tests/audio_tests/functional_unit_test.py b/augly/tests/audio_tests/functional_unit_test.py
@@ -37,6 +37,9 @@ def test_insert_in_background(self):
     def test_invert_channels(self):
         self.evaluate_function(audaugs.invert_channels)
 
+    def test_loop(self):
+        self.evaluate_function(audaugs.loop, n=1)
+
     def test_low_pass_filter(self):
         self.evaluate_function(audaugs.low_pass_filter, cutoff_hz=500)
 

diff --git a/augly/tests/audio_tests/intensity_unit_test.py b/augly/tests/audio_tests/intensity_unit_test.py
@@ -52,6 +52,10 @@ def test_invert_channels_intensity(self):
         intensity = audaugs.invert_channels_intensity(metadata={"src_num_channels": 2})
         self.assertAlmostEqual(intensity, 100.0)
 
+    def test_loop_intensity(self):
+        intensity = audaugs.loop_intensity(metadata={}, n=1)
+        self.assertAlmostEqual(intensity, 1.0)
+
     def test_low_pass_filter_intensity(self):
         intensity = audaugs.low_pass_filter_intensity(metadata={}, cutoff_hz=500.0)
         self.assertAlmostEqual(intensity, 97.5)

diff --git a/augly/tests/audio_tests/transforms_unit_test.py b/augly/tests/audio_tests/transforms_unit_test.py
@@ -87,6 +87,9 @@ def test_InsertInBackground(self):
     def test_InvertChannels(self):
         self.evaluate_class(audaugs.InvertChannels(), fname="invert_channels")
 
+    def test_Loop(self):
+        self.evaluate_class(audaugs.Loop(n=1), fname="loop")
+
     def test_LowPassFilter(self):
         self.evaluate_class(
             audaugs.LowPassFilter(cutoff_hz=500), fname="low_pass_filter"
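
Method names matter in this test file because `unittest`'s default loader only collects methods whose names start with `test`. The standalone illustration below uses a throwaway `NamingDemo` class (hypothetical, unrelated to AugLy) to show which methods the loader picks up.

```python
import unittest


class NamingDemo(unittest.TestCase):
    def test_Loop(self):
        # Collected: the name starts with the default prefix "test".
        self.assertTrue(True)

    def Loop(self):
        # Not collected by the default loader, so this body never runs.
        self.fail("unreachable under default discovery")


# Ask the default loader which methods it would run for this class.
collected = list(unittest.TestLoader().getTestCaseNames(NamingDemo))
```

A method without the prefix is silently skipped rather than reported as an error, so the overall test count is the only visible hint that it did not run.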

diff --git a/augly/utils/expected_output/audio_tests/expected_metadata.json b/augly/utils/expected_output/audio_tests/expected_metadata.json
@@ -377,6 +377,31 @@
         ]
       }
     ],
+    "loop": [
+      { 
+        "dst_duration": 1.70675,
+        "dst_num_channels": 1,
+        "dst_sample_rate": 32000,
+        "dst_segments": [
+          {
+            "start": 0.0,
+            "end": 0.853375
+          }
+        ],
+        "intensity": 1.0,
+        "n": 1,
+        "name": "loop",
+        "output_path": null,
+        "src_duration": 0.853375,
+        "src_num_channels": 1,
+        "src_sample_rate": 32000,
+        "src_segments": [
+          {
+            "start": 0.0,
+            "end": 0.853375
+          }
+        ]
+      }
+    ],
     "low_pass_filter": [
       {
         "alpha": 0.08939812957705753,