Add TorchScript-able "info" func to sox_io backend #728

mthrok · 2020-06-18T15:51:56Z

This is part of a series of PRs adding the new "sox_io" backend (#726), and it depends on #718.

This PR adds an `info` function to the "sox_io" backend, which allows users to fetch some metadata of an audio file.
At the moment, the following information is retrieved:

- Number of samples in the audio file
- Sampling rate
- Number of channels
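To make the three fields listed above concrete, here is a minimal sketch that reads the same kind of metadata from a WAV file using only Python's standard `wave` module. It is not the "sox_io" implementation (which goes through libsox and TorchScript); the names `WavInfo`, `wav_info` and `write_silence` are hypothetical and exist only for this illustration.

```python
import wave
from typing import NamedTuple


class WavInfo(NamedTuple):
    """Hypothetical container for illustration; not torchaudio's return type."""
    sample_rate: int
    num_frames: int      # samples per channel, as reported by the WAV header
    num_channels: int


def wav_info(path: str) -> WavInfo:
    """Read metadata from the WAV header without decoding the audio data."""
    with wave.open(path, "rb") as f:
        return WavInfo(f.getframerate(), f.getnframes(), f.getnchannels())


def write_silence(path: str, sample_rate: int, num_channels: int, num_frames: int) -> None:
    """Helper for the example: write a 16-bit PCM WAV file filled with zeros."""
    with wave.open(path, "wb") as f:
        f.setnchannels(num_channels)
        f.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        f.setframerate(sample_rate)
        f.writeframes(b"\x00\x00" * num_channels * num_frames)
```

Note that this sketch counts frames (samples per channel); whether a backend reports per-channel or total samples is exactly the kind of discrepancy discussed in #618.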

This implementation ended up being aligned with what was suggested in #618 (comment).

Compared to the original "sox" backend, which exposed all of sox_signalinfo_t and sox_encodinginfo_t, this is limited, but exposing all of the sox internals was not helpful either. For the details of these structures, see sox_signalinfo_t and sox_encodinginfo_t.

codecov · 2020-06-19T17:10:46Z

Codecov Report

Merging #728 into master will increase coverage by 0.02%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #728      +/-   ##
==========================================
+ Coverage   88.97%   88.99%   +0.02%     
==========================================
  Files          31       32       +1     
  Lines        2522     2527       +5     
==========================================
+ Hits         2244     2249       +5     
  Misses        278      278

Impacted Files	Coverage Δ
torchaudio/backend/sox_io_backend.py	`100.00% <100.00%> (ø)`

Last update f8eac89...74459d9.

vincentqb

LGTM

vincentqb · 2020-06-19T17:11:35Z

.circleci/unittest/linux/scripts/setup_env.sh

@@ -34,4 +34,7 @@ printf "* Installing dependencies (except PyTorch)\n"
 conda env update --file "${this_dir}/environment.yml" --prune

 # 4. Build codecs
-build_tools/setup_helpers/build_third_party.sh
+# build_tools/setup_helpers/build_third_party.sh


nit: Can you elaborate on why you are changing this default mechanic as part of this pr?

The static build of libsox (which is what we ship as binary distribution) does not contain codecs for ogg/vorbis.
Using libsox-fmt-all we can test ogg/vorbis types too.

So we can still build third parties by using BUILD_SOX=1, but by default, we link with the static build. Is that correct?

When BUILD_SOX=0, _torchaudio is linked against the libsox found on the system. Whether the linking is static or dynamic depends on what is found.

vincentqb · 2020-06-19T17:16:27Z

test/common_utils.py

@@ -87,6 +89,33 @@ def set_audio_backend(backend):
    torchaudio.set_audio_backend(be)


+class TempDirMixin:


not for this PR: just as is done below in a prior pr, what's the advantage of not inheriting PytorchTestCase here?

To avoid inheriting PytorchTestCase multiple times in a test module.
TempDirMixin is designed to be composable, so that each test case can choose whether or not to use it.
This decision should not collide with the fact that we require each test case to inherit PytorchTestCase.
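The composability argument above can be sketched as follows. This is a simplified illustration, not the `TempDirMixin` added in this PR: the method name `get_temp_path` and the per-test `setUp`/`tearDown` lifecycle are assumptions, and plain `unittest.TestCase` stands in for PytorchTestCase.

```python
import os
import tempfile
import unittest


class TempDirMixin:
    """Mixin that provides a temporary directory; it deliberately has no
    TestCase base, so the concrete test class decides what to inherit."""

    def setUp(self):
        super().setUp()
        self._temp_dir = tempfile.TemporaryDirectory()

    def tearDown(self):
        self._temp_dir.cleanup()
        super().tearDown()

    def get_temp_path(self, *parts):
        # Hypothetical helper: build a path inside the temporary directory.
        return os.path.join(self._temp_dir.name, *parts)


class ExampleTest(TempDirMixin, unittest.TestCase):
    """The test case opts in to the mixin and inherits TestCase exactly once."""

    def test_write(self):
        path = self.get_temp_path("sample.txt")
        with open(path, "w") as f:
            f.write("ok")
        self.assertTrue(os.path.exists(path))
```

Because the mixin comes first in the base list, its `setUp` and `tearDown` cooperate with the test case base through `super()`, and a test case that does not need temporary files simply leaves the mixin out.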

vincentqb · 2020-06-19T19:11:02Z

test/sox_io_backend/test_info.py

+        ['float32', 'int32', 'int16', 'uint8'],
+        [8000, 16000],
+        [1, 2],


nit: this is nice :) is there a way to pass keyword arguments with parameterized?

I do not know. Please refer to the doc https://pypi.org/project/parameterized/
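For context, the three lists in the diff excerpt read like axes of a test matrix. The sketch below assumes they are combined as a Cartesian product (the excerpt does not show how they are consumed) and shows, in plain Python, one way such cases can be turned into keyword arguments. The names `cases`, `kwargs_cases` and `describe` are hypothetical, and this does not use or describe the `parameterized` API.

```python
import itertools

dtypes = ['float32', 'int32', 'int16', 'uint8']
sample_rates = [8000, 16000]
num_channels = [1, 2]

# Every combination of the three axes: 4 * 2 * 2 cases.
cases = list(itertools.product(dtypes, sample_rates, num_channels))

# The same cases expressed as keyword-argument dictionaries.
kwargs_cases = [
    dict(dtype=d, sample_rate=r, num_channels=c) for d, r, c in cases
]


def describe(dtype, sample_rate, num_channels):
    """Hypothetical stand-in for a test body that takes keyword arguments."""
    return f"{dtype}_{sample_rate}_{num_channels}"


names = [describe(**kw) for kw in kwargs_cases]
```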

vincentqb · 2020-06-19T19:21:55Z

test/kaldi_compatibility_impl.py

@@ -61,7 +55,7 @@ def assert_equal(self, output, *, expected, rtol=None, atol=None):
        expected = expected.to(dtype=self.dtype, device=self.device)
        self.assertEqual(output, expected, rtol=rtol, atol=atol)

-    @unittest.skipIf(_not_available('apply-cmvn-sliding'), '`apply-cmvn-sliding` not available')
+    @common_utils.skipIfNoExec('apply-cmvn-sliding')


nit: again, i'm not sure why this is changing as part of this pr?

To avoid having two duplicate implementations.

kaldi_compatibility_impl had the original version of the skipIfNoExec implementation (the _not_available func combined with unittest.skipIf). That implementation is now promoted to a common utility because the new sox_io test uses it too.
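A decorator of the kind described above (an availability check combined with `unittest.skipIf`) could look roughly like this. It is a sketch under the assumption that availability means "the executable is on `PATH`"; the actual helper in `common_utils` may differ, and the command name used in the example test is made up.

```python
import shutil
import unittest


def skipIfNoExec(cmd):
    """Skip the decorated test when the executable `cmd` is not found on PATH."""
    return unittest.skipIf(shutil.which(cmd) is None, f'`{cmd}` is not available')


class ExampleTest(unittest.TestCase):
    @skipIfNoExec('no-such-command-for-illustration-12345')
    def test_needs_external_tool(self):
        # Only runs when the (made-up) command exists, which it should not.
        self.fail('this test should have been skipped')
```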

mthrok · 2020-06-19T20:22:50Z

thanks!

In pytorch#728, the Linux unit tests switched to the libsox provided by apt. For CPU jobs this is fine because all the job steps share the same Docker container, but on the GPU job each job step runs a script in a new Docker container, so the libsox installed in one step is not available to the subsequent steps. To fix this, this PR moves the installation of libsox and sox to the Docker build.

This is part of a series of PRs adding the new "sox_io" backend (#726), and it depends on #718 and #728.

This PR adds the `load` function to the "sox_io" backend. It is tested on the following audio formats:

- `wav`
- `mp3`
- `flac`
- `ogg/vorbis` *

By default, the "sox_io" backend returns a Tensor with `float32` dtype and shape `[channel, time]`. The samples are normalized to fit in the range `[-1.0, 1.0]`.

Unlike the existing "sox" backend, the new `load` function can handle WAV files natively. When the input is a WAV file with an integer sample type (32-bit signed integer, 16-bit signed integer, or 8-bit unsigned integer), passing `normalize=False` makes this function return an integer Tensor whose samples span the whole range of the corresponding dtype: `int32` for `32-bit PCM`, `int16` for `16-bit PCM`, and `uint8` for `8-bit PCM`. This behavior follows [scipy.io.wavfile.read](https://docs.scipy.org/doc/scipy/reference/generated/scipy.io.wavfile.read.html).

The `normalize` parameter has no effect for other formats; the `load` function always returns normalized values as a `float32` Tensor.

__* Note__ The current binary distribution of torchaudio does not contain the `ogg/vorbis` and `opus` codecs. To handle these files, one needs to build torchaudio from source with the proper codecs available on the system.

__Note 2__ As of this PR, `scipy` is a required module for running the tests.
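To illustrate the difference between normalized and raw integer samples, the sketch below maps integer PCM values into `[-1.0, 1.0]`. The scale factors (division by 2^31 or 2^15, and re-centering 8-bit unsigned values around 128) are a common convention assumed here for illustration; the description above states only the target range, not the exact scaling used by the backend. `normalize_pcm` is a hypothetical helper, not a torchaudio function.

```python
def normalize_pcm(value: int, dtype: str) -> float:
    """Map one integer PCM sample to a float in [-1.0, 1.0).

    Assumed convention: signed types are divided by the magnitude of their
    most negative value; uint8 is first re-centered around 128.
    """
    if dtype == 'int32':
        return value / 2147483648.0  # 2 ** 31
    if dtype == 'int16':
        return value / 32768.0       # 2 ** 15
    if dtype == 'uint8':
        return (value - 128) / 128.0
    raise ValueError(f'unsupported dtype: {dtype}')
```

With `normalize=False`, the values on the left-hand side of this mapping (for example the full `int16` range from -32768 to 32767) are what the returned integer Tensor would contain.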

This is part of a series of PRs adding the new "sox_io" backend (#726), and it depends on #718, #728 and #731.

This PR adds the `save` function to the "sox_io" backend, which can save a Tensor to a file in the following audio formats:

- `wav`
- `mp3`
- `flac`
- `ogg/vorbis`

mthrok force-pushed the get-info branch from 8bdfdb7 to 2e7dd83 Compare June 18, 2020 15:52

mthrok changed the title ~~Add TorchScript-able info func to sox_io backend~~ Add TorchScript-able "info" func to sox_io backend Jun 18, 2020

mthrok force-pushed the get-info branch 2 times, most recently from 9b76f9f to 5e970ee Compare June 18, 2020 19:10

mthrok mentioned this pull request Jun 18, 2020

Add TorchScript-based SoX I/O backend #726

Merged

6 tasks

mthrok force-pushed the get-info branch from 5e970ee to f473bc4 Compare June 18, 2020 20:03

This was referenced Jun 18, 2020

Add TorchScript-able "load" func to sox_io backend #731

Merged

Add TorchScript-able "save" func to sox_io backend #732

Merged

Add SignalInfo typedef, and extension module #718

Merged

Add info for sox_io backend

5064139

mthrok force-pushed the get-info branch from f473bc4 to 5064139 Compare June 18, 2020 22:06

mthrok marked this pull request as ready for review June 18, 2020 22:59

mthrok requested a review from vincentqb June 18, 2020 23:00

mthrok mentioned this pull request Jun 18, 2020

Info length and rate returns different values for different backends #618

Closed

Improve test readability

76ad592

mthrok added 3 commits June 19, 2020 17:13

rename test module

8a6717f

Rename test

93c63f4

Remove unused imports

d2b323d

vincentqb approved these changes Jun 19, 2020

Fix test docstring

74459d9

mthrok merged commit 88fccd1 into pytorch:master Jun 19, 2020

mthrok deleted the get-info branch June 19, 2020 20:22

mthrok mentioned this pull request Jun 22, 2020

Bake libsox in test base Docker image #739

Merged
