Add TorchScript-able "load" func to sox_io backend #731

mthrok · 2020-06-18T18:38:34Z

This is part of the series of PRs that add the new "sox_io" backend (#726). It depends on #718 and #728.

This PR adds the load function to the "sox_io" backend. It is tested on the following audio formats:

wav
mp3
flac
ogg/vorbis *

By default, the "sox_io" backend returns a Tensor with float32 dtype and shape [channel, time]. The samples are normalized to fit in the range [-1.0, 1.0].

Unlike the existing "sox" backend, the new load function handles WAV files natively. When the input is a WAV file with an integer sample type (32-bit signed integer, 16-bit signed integer or 8-bit unsigned integer), passing normalize=False makes the function return an integer Tensor whose samples are expressed within the whole range of the corresponding dtype: an int32 Tensor for 32-bit PCM, int16 for 16-bit PCM and uint8 for 8-bit PCM. This behavior follows scipy.io.wavfile.read. The normalize parameter has no effect for other formats; for those, the load function always returns normalized values as a float32 Tensor.
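To make the distinction between raw integer samples and normalized samples concrete, here is a minimal sketch that uses only the Python standard library. It is not the torchaudio implementation: `write_int16_wav` and `read_int16_wav` are hypothetical helpers, only mono 16-bit PCM is handled, and a single divisor of 32768 is assumed for scaling.

```python
import os
import struct
import tempfile
import wave


def write_int16_wav(path, samples, sample_rate=8000):
    """Write mono 16-bit signed PCM samples to a WAV file (hypothetical helper)."""
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        f.setframerate(sample_rate)
        f.writeframes(struct.pack(f"<{len(samples)}h", *samples))


def read_int16_wav(path, normalize=True):
    """Read mono 16-bit PCM and return (samples, sample_rate)."""
    with wave.open(path, "rb") as f:
        assert f.getsampwidth() == 2, "this sketch only handles 16-bit PCM"
        raw = f.readframes(f.getnframes())
        sample_rate = f.getframerate()
    ints = list(struct.unpack(f"<{len(raw) // 2}h", raw))
    if not normalize:
        # Analogous to normalize=False: keep the full int16 range.
        return ints, sample_rate
    # Analogous to the default: scale into [-1.0, 1.0].
    return [s / 32768.0 for s in ints], sample_rate


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tone.wav")
    write_int16_wav(path, [0, 16384, -32768, 32767])
    raw, rate = read_int16_wav(path, normalize=False)
    scaled, _ = read_int16_wav(path, normalize=True)
```

Here `raw` keeps the original integer values, while `scaled` holds the same samples as floats inside [-1.0, 1.0].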

* Note: The current binary distribution of torchaudio does not contain the ogg/vorbis and opus codecs. To handle these files, one needs to build torchaudio from source with the proper codecs installed on the system.

Note 2: As of this PR, scipy is a required module for running the tests.

codecov · 2020-06-19T15:31:34Z

Codecov Report

Merging #731 into master will increase coverage by 0.02%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #731      +/-   ##
==========================================
+ Coverage   89.21%   89.24%   +0.02%     
==========================================
  Files          32       32              
  Lines        2513     2519       +6     
==========================================
+ Hits         2242     2248       +6     
  Misses        271      271

Impacted Files	Coverage Δ
torchaudio/backend/sox_io_backend.py	`100.00% <100.00%> (ø)`

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 0f0d0af...036bf8a.

test/sox_io_backend/test_load.py

vincentqb · 2020-06-23T19:45:09Z

test/sox_io_backend/test_torchscript.py

+
+        script_path = self.get_temp_path('load_func')
+        torch.jit.script(py_load_func).save(script_path)
+        ts_load_func = torch.jit.load(script_path)


woohoo!! :D
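For readers unfamiliar with the round trip exercised in the snippet above, here is a minimal sketch of the same script, save and load pattern. It assumes a hypothetical stand-in function, `to_float`, rather than the real load function, so that it runs without any audio file or SoX build.

```python
import os
import tempfile

import torch


def to_float(x: torch.Tensor) -> torch.Tensor:
    # Hypothetical stand-in for the real function: scale int16 samples.
    return x.to(torch.float32) / 32768.0


with tempfile.TemporaryDirectory() as tmp:
    script_path = os.path.join(tmp, "to_float.zip")
    # Compile the Python function to TorchScript and serialize it.
    torch.jit.script(to_float).save(script_path)
    # Deserialize it; the result no longer depends on the Python source.
    ts_to_float = torch.jit.load(script_path)

samples = torch.tensor([0, 16384, -32768], dtype=torch.int16)
expected = to_float(samples)
restored = ts_to_float(samples)
```

If the function is TorchScript-able, `restored` should match `expected` exactly, which is the property the test checks for the load function.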

vincentqb · 2020-06-23T19:46:09Z

torchaudio/backend/sox_io_backend.py

+    return tensor, signal_info.get_sample_rate()
+
+
+load_wav = load


woohoo!! :D

torchaudio/csrc/sox_utils.cpp

vincentqb · 2020-06-24T22:29:54Z

test/sox_io_backend/common.py

+        pass
+    elif tensor.dtype == torch.int32:
+        tensor = tensor.to(torch.float32)
+        tensor[tensor > 0] /= 2147483647.


nit: is there a way of fetching these numbers instead of hard coding them? It's in the tests, so I don't worry much about it though.
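One possible way to fetch these numbers is `torch.iinfo`, which reports the minimum and maximum of an integer dtype. The sketch below is only an illustration under that assumption: `normalize_int` is a hypothetical helper, not the function in test/sox_io_backend/common.py, and it covers signed integer dtypes only.

```python
import torch


def normalize_int(tensor: torch.Tensor) -> torch.Tensor:
    """Scale a signed integer tensor to float32 in [-1.0, 1.0] (sketch)."""
    info = torch.iinfo(tensor.dtype)  # exposes .min and .max for integer dtypes
    out = tensor.to(torch.float32)
    # Positive and negative samples use different divisors because the
    # integer range is asymmetric (for int32: 2147483647 vs. 2147483648).
    out[out > 0] /= float(info.max)
    out[out < 0] /= -float(info.min)
    return out


int16_samples = torch.tensor([0, 32767, -32768], dtype=torch.int16)
int16_normalized = normalize_int(int16_samples)

int32_samples = torch.tensor([0, 2147483647, -2147483648], dtype=torch.int32)
int32_normalized = normalize_int(int32_samples)
```

With this approach the same helper works for int16 and int32 without spelling out either limit.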

torchaudio/csrc/sox_utils.cpp

torchaudio/backend/sox_io_backend.py

torchaudio/csrc/register.cpp

vincentqb · 2020-06-24T22:49:48Z

torchaudio/csrc/sox_utils.cpp

+  // So make sure to create a new copy after processing samples.
+  if (normalize || dtype == torch::kFloat32) {
+    t = t.to(torch::kFloat32);
+    t *= (t > 0) / 2147483647. + (t < 0) / 2147483648.;


nit: this one is not in the tests though :)

What do you mean? The combinations of float32 and normalize=true|false are tested everywhere.

I meant the 2147483647 :)

It is tested.

I meant something different: I have seen these hardcoded values appear in a few places in this PR (e.g. above). Most of them are in tests, so I don't worry much about them. This hardcoded value is not in a test file though, so I would have preferred if it were not hardcoded.
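As a worked illustration of where the two constants come from, the sketch below derives them from the bit width and applies the same asymmetric scaling to single values. It is plain Python on scalars in double precision, not the C++ tensor code under review, and `normalize_int32` is a hypothetical name.

```python
INT32_MAX = 2**31 - 1   # 2147483647, largest signed 32-bit value
INT32_MIN = -(2**31)    # -2147483648, smallest signed 32-bit value


def normalize_int32(sample: int) -> float:
    """Map a signed 32-bit sample to [-1.0, 1.0] with asymmetric divisors."""
    if sample > 0:
        return sample / INT32_MAX
    if sample < 0:
        return sample / -INT32_MIN
    return 0.0


top = normalize_int32(INT32_MAX)
bottom = normalize_int32(INT32_MIN)
# With a single divisor of 2**31, the positive end would fall just short of 1.0.
single_divisor_top = INT32_MAX / 2**31
```

Using two divisors lets both ends of the integer range land exactly on the ends of [-1.0, 1.0].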

vincentqb

Overall, LGTM!

nit: there's a C++ lint error :)

vincentqb · 2020-06-25T22:27:26Z

test/sox_io_backend/sox_utils.py

@@ -51,4 +54,17 @@ def gen_audio_file(
        command += ['vol', f'-{attenuation}dB']
    print(' '.join(command))
    subprocess.run(command, check=True)
-    subprocess.run(['soxi', path], check=True)


nit: why is this removed at this time?

The command above it, sox -V, prints the same information, so it turned out to be redundant.

mthrok · 2020-06-25T22:55:50Z

nit: there's a C++ lint error :)

Yeah, I was waiting for the approval first, as I have six branches that depend on this and had to rebase all of them.

mthrok · 2020-06-25T22:55:56Z

thanks!


mthrok force-pushed the load branch from 42ce2c6 to a38dc98 Compare June 18, 2020 19:10

mthrok mentioned this pull request Jun 18, 2020

Add TorchScript-based SoX I/O backend #726

Merged

6 tasks

mthrok force-pushed the load branch 2 times, most recently from eeaa2c7 to 6da4ff4 Compare June 18, 2020 20:52

mthrok mentioned this pull request Jun 18, 2020

Add TorchScript-able "save" func to sox_io backend #732

Merged

mthrok force-pushed the load branch 4 times, most recently from cae9321 to df42c5f Compare June 19, 2020 15:18

mthrok force-pushed the load branch 20 times, most recently from 99462ec to 500733f Compare June 23, 2020 18:02

mthrok force-pushed the load branch 5 times, most recently from 6b1134b to 1c7af65 Compare June 23, 2020 19:25

vincentqb reviewed Jun 23, 2020

test/sox_io_backend/test_load.py (outdated, resolved)

vincentqb reviewed Jun 23, 2020

mthrok marked this pull request as ready for review June 23, 2020 19:45

vincentqb reviewed Jun 23, 2020

torchaudio/backend/sox_io_backend.py

vincentqb reviewed Jun 23, 2020

torchaudio/csrc/sox_utils.cpp (outdated, resolved)

mthrok force-pushed the load branch from 1c7af65 to fe3d546 Compare June 23, 2020 23:37

mthrok requested a review from cpuhrsch June 24, 2020 00:37

mthrok force-pushed the load branch from 01c657d to 577dfcf Compare June 24, 2020 21:27

vincentqb reviewed Jun 24, 2020

torchaudio/csrc/sox_utils.cpp (resolved)

vincentqb reviewed Jun 24, 2020

torchaudio/backend/sox_io_backend.py (resolved)

vincentqb reviewed Jun 24, 2020

torchaudio/csrc/register.cpp (resolved)

vincentqb reviewed Jun 24, 2020

vincentqb approved these changes Jun 25, 2020

Add load function

036bf8a

mthrok force-pushed the load branch from 4d0f7c0 to 036bf8a Compare June 25, 2020 23:00

mthrok merged commit 793eeab into pytorch:master Jun 25, 2020

mthrok deleted the load branch June 25, 2020 23:17

mpc001 pushed a commit to mpc001/audio that referenced this pull request Aug 4, 2023

loss detach (pytorch#731)

391be73

loss detach

mpc001 pushed a commit to mpc001/audio that referenced this pull request Aug 4, 2023

Revert "loss detach (pytorch#731)" (pytorch#758)

69d2798

This reverts commit 391be73.
