Backend switch #355

Merged
merged 33 commits into from
Dec 19, 2019

Conversation

Contributor

@vincentqb vincentqb commented Nov 26, 2019

Introduce a backend switch in a similar way to torchvision from pytorch/vision#153.

  • Offer an option to change backend to load files, as in torchvision (https://pytorch.org/docs/stable/torchvision/index.html).
  • Import sox only when sox is used at runtime, e.g. sox_effects or loading files.
  • Offer wrapper function to switch between backends for load/save. (Maintained current interface to avoid BC-breaking change.)
  • Move sox functions from __init__ to torchaudio.sox_backend. (torchaudio.sox_effects does not change.)
  • Add deprecation warning if calling sox functions using for instance torchaudio.initialize_sox.
  • Offer pysoundfile to experiment with new interface.
  • Add tests for soundfile backend.
  • Add test to load/save with different backends
  • Fix the librosa test failure appearing here, if it is not just flaky? See: deactivate failing test #372
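A minimal sketch of the "import only when used" and deprecation-warning items above. It uses the standard-library `json` module as a stand-in for the real backend extension, and the names `_lazy_import`, `load_text` and `initialize_backend` are hypothetical, not torchaudio functions.

```python
import importlib
import warnings


def _lazy_import(name):
    """Import `name` on first use; later calls are served from sys.modules."""
    return importlib.import_module(name)


def load_text(payload, backend_module="json"):
    # The backend is resolved inside the call, so a missing or broken
    # backend fails here and never at package import time.
    return _lazy_import(backend_module).loads(payload)


def initialize_backend(backend_module="json"):
    """Deprecated top-level alias kept for backward compatibility (hypothetical)."""
    warnings.warn(
        "initialize_backend is deprecated; call the backend module directly.",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not at this wrapper
    )
    return _lazy_import(backend_module)
```

With this layout, importing the package succeeds even when the backend is absent; only the call that needs the backend raises `ImportError`.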

For later:

  • Add libsndfile directly (pysoundfile uses numpy arrays when wrapping libsndfile).
  • Add decorator to restrict which backend is supported by given function, see comment.
  • Make sure mechanism works in parallel context (e.g. could add a backend parameter to functions needing it)
  • Make sure mechanism is torchscriptable (e.g. torchscript currently does not support global variable)
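One way the "backend parameter" idea for parallel contexts might look, as a sketch only: an explicit argument takes precedence over the process-wide default, so a worker does not depend on global state. The loaders here are placeholders that return tuples, and `_LOADERS` and `_default_backend` are assumed names.

```python
_default_backend = "sox"  # process-wide default, read when no backend is passed

# Placeholder loaders; real ones would decode audio.
_LOADERS = {
    "sox": lambda path: ("sox", path),
    "soundfile": lambda path: ("soundfile", path),
}


def load(path, backend=None):
    """Use the explicit backend if given, else fall back to the default."""
    name = _default_backend if backend is None else backend
    try:
        loader = _LOADERS[name]
    except KeyError:
        raise ValueError("unknown backend: {!r}".format(name)) from None
    return loader(path)
```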

Add to release notes:

  • SoxEffectsChain.EFFECTS_AVAILABLE replaced by SoxEffectsChain().EFFECTS_AVAILABLE

Comparison of backends
Internal doc

Fixes #329 by offering other backends as options. As such, this is also a first step toward addressing #357.
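For readers skimming the thread, the switch described above can be pictured roughly as follows. The function names `set_audio_backend` and `get_audio_backend` appear in this PR; the bodies below are an assumed, simplified sketch and not the merged implementation.

```python
_SUPPORTED_BACKENDS = ("sox", "soundfile")
_audio_backend = "sox"  # module-level state holding the current choice


def set_audio_backend(backend):
    """Select the package used to load and save audio."""
    global _audio_backend
    if backend not in _SUPPORTED_BACKENDS:
        raise RuntimeError(
            "Invalid backend {!r}; options are {}".format(backend, _SUPPORTED_BACKENDS)
        )
    _audio_backend = backend


def get_audio_backend():
    """Return the name of the currently selected backend."""
    return _audio_backend
```

Because the state is a module-level global, this form is exactly what the "For later" items about parallel contexts and TorchScript are concerned with.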

@vincentqb vincentqb changed the title from "Sox" to "Backend switch" Nov 27, 2019
encodinginfo,
filetype)
if get_audio_backend() == "sox":
waveform, sample_rate = sox_backend.load(
Contributor Author

@vincentqb vincentqb Dec 3, 2019

Another way to branch would be by doing a conditional import but with the same name

if get_audio_backend() == "sox":
    from sox_backend import load
elif get_audio_backend() == "soundfile":
    from _soundfile_backend import load
else:
    raise NotImplementedError

waveform, sample_rate = load(
    filepath,
    out=out,
    normalization=normalization,
    channels_first=channels_first,
    num_frames=num_frames,
    offset=offset,
    filetype=filetype,
)

Thoughts?

torchvision uses the backend switch like so but most of the time simply uses PIL.

Contributor

You might be able to save on these if statements if you overwrite the load functions etc. when switching backends, but I'd almost chalk that up as a performance optimization, so it's not necessary yet.
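A sketch of that "overwrite on switch" idea, assuming placeholder loaders and a `SimpleNamespace` standing in for the package namespace (`audio`, `_sox_load` and `_soundfile_load` are hypothetical names):

```python
import types


def _sox_load(path):
    return ("sox", path)  # placeholder for the sox-based loader


def _soundfile_load(path):
    return ("soundfile", path)  # placeholder for the soundfile-based loader


_IMPLEMENTATIONS = {"sox": _sox_load, "soundfile": _soundfile_load}

audio = types.SimpleNamespace()  # stand-in for the package namespace


def set_audio_backend(backend):
    # Rebind once at switch time, so each call to audio.load is branch-free.
    audio.load = _IMPLEMENTATIONS[backend]


set_audio_backend("sox")
```

One trade-off worth noting: a caller who did `load = audio.load` before the switch keeps the old function, whereas the if-based dispatch always follows the current setting.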

@vincentqb
Contributor Author

@cpuhrsch -- For this PR, I'd focus on the interface for the user, and leave a new backend for later. This still addresses the main issue of completely blocking the import of torchaudio when there is an issue with sox. Thoughts?

"""
Specifies the package used to load.
Args:
backend (string): Name of the backend. one of {'sox'}.
Contributor

nit: add soundfile

Having references to the sources of those backends in the docstring could be useful too.

encodinginfo,
filetype)
if get_audio_backend() == "sox":
waveform, sample_rate = sox_backend.load(
Contributor

You could still assign to a local load function and then move these branches higher (which will save on indentations) and make the code more readable

Contributor Author

Good point

@cpuhrsch
Contributor

cpuhrsch commented Dec 3, 2019

Looks good so far!

From what I gather from the PR description there are no BC-breaking changes introduced here?

test/test.py Outdated
x_sine_part, _ = torchaudio.load(
input_sine_path, num_frames=num_frames, offset=offset
)
l1_error = (
Contributor

I don't want to be "that guy", but these code format changes make this harder to review. You could make them the last commit and continue to maintain that, or you could send a separate PR later on. They also introduce a lot of meaningless git blame changes.

Contributor Author

Yeah, (1) this was definitely not meant to be in, and (2) this was not quite ready for review :)

filetype=None):
r"""Saves a tensor of an audio signal to disk as a standard format like mp3, wav, etc.
if get_audio_backend() == "sox":
from torchaudio import sox_backend
Contributor

I'd carefully make sure that these repeated imports aren't expensive due to some kind of initialization code in sox.

Contributor

or soundfile for that matter

Contributor Author

One way of avoiding repeated imports is to import at the beginning and catch import errors, as done in vision. However, local tests on my Mac don't show a cost to repeated imports; they seem to properly fetch the cached version:

In [1]: %timeit import torchaudio                                                                  
The slowest run took 17.73 times longer than the fastest. This could mean that an intermediate result is being cached.
637 ns ± 1.07 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [2]: %timeit import _torch_sox                                                                  
76.3 ns ± 1.84 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
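The caching behaviour behind those timings can be checked directly. The sketch below uses the standard-library `json` module as a stand-in for `_torch_sox`, which is an assumption made only so the example runs anywhere.

```python
import importlib
import sys
import timeit

first = importlib.import_module("json")
second = importlib.import_module("json")

# After the first import the module object lives in sys.modules, and every
# later import statement returns that same object without re-running it.
same_object = first is second
cached = "json" in sys.modules

# Average cost of an already-cached import statement, in seconds.
per_import = timeit.timeit("import json", number=100_000) / 100_000
print("cached import: {:.1f} ns".format(per_import * 1e9))
```

Any initialization code in a module runs only on the first import, so the concern raised above applies to that first call rather than to the repeated ones.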

src = src.contiguous()
_torch_sox.write_audio_file(filepath, src, signalinfo, encodinginfo, filetype)

def save_encinfo(*args, **kwargs):
Contributor

The current docs for encinfo don't even reference the function in the example. This is a strange one.


def sox_encodinginfo_t(*args, **kwargs):
Contributor

Encoding information, signal information, etc. are very useful functions in general. We could also think about adding some backend-independent interfaces, but maybe after the release.

num_frames=0,
offset=0,
filetype=None,
):
Contributor

I notice you're repeating the docstring here, but you call into the load backends separately. I don't think the user will be able to see this unless she looks at this function, which is part of a private backend. Here, assigning a backend function to the module function could help us actually choose the correct docstring at runtime.

However, the static documentation won't be able to do that.

So instead I'd say it makes sense to have a single docstring for our load function and then reference it here.

Contributor Author

@vincentqb vincentqb Dec 13, 2019

When you say "reference", I assume you mean

"""See torchaudio.save"""

Unless you meant copying the docstring from another function with something like functools.wraps() or @functools.docs-decorator as you mentioned in this comment?

Contributor

Yes that's correct.

@vincentqb
Contributor Author

I can't reproduce the librosa error locally.

❯ conda create -n librosa-conda python=3.7
❯ conda activate librosa-conda
❯ conda install -c pytorch pytorch
❯ conda install -c conda-forge sox pysoundfile librosa
❯ conda install backports.tempfile
❯ python setup.py clean --all
❯ MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ NO_CUDA=1 python setup.py install

❯ python test/test_transforms.py
----------------------------------------------------------------------
Ran 23 tests in 0.845s

OK
❯ python test/test.py                   
----------------------------------------------------------------------
Ran 7 tests in 0.164s

OK

@vincentqb vincentqb marked this pull request as ready for review December 5, 2019 23:14
test/test.py Outdated
@@ -171,5 +300,21 @@ def test_5_get_info(self):
self.assertEqual(si.rate, rate)
self.assertEqual(ei.bits_per_sample, precision)

torchaudio.set_audio_backend(self.default_audio_backend)

def _test_5_get_info_soundfile(self):
Contributor

I think there's repetition again?

Contributor

There isn't because info returns a different struct depending on the backend.

test/test.py Outdated
self.assertEqual(si.channels, channels)
self.assertEqual(si.frames, samples)
self.assertEqual(si.samplerate, rate)
si_precision = _extract_digits(si.subtype)
Contributor

If this is necessary, we should at least write it down as a TODO to make this consistent. That can be done via a wrapper class that uses __getattribute__ etc. to align these attributes consistently.

Contributor

It should make sense to standardize on sox since otherwise we'll introduce BC-breaking changes to support this new backend.
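A rough sketch of such a wrapper, standardizing on the sox-style names used in the test excerpt (`rate`, `precision`) on top of a soundfile-style info object (`samplerate`, `subtype`). The class name `SoxStyleInfo` is hypothetical, the `_extract_digits` body is an assumed reimplementation of the helper named in the test, and a `SimpleNamespace` stands in for the real info struct.

```python
import re
from types import SimpleNamespace


def _extract_digits(text):
    """Return the first run of digits in `text` as an int."""
    match = re.search(r"\d+", text)
    if match is None:
        raise ValueError("no digits in {!r}".format(text))
    return int(match.group())


class SoxStyleInfo:
    """Expose a soundfile-style info object under sox-style attribute names."""

    _RENAMED = {"rate": "samplerate"}  # sox-style name -> soundfile-style name

    def __init__(self, info):
        self._info = info

    @property
    def precision(self):
        # Derived from the subtype string, as the test does by hand.
        return _extract_digits(self._info.subtype)

    def __getattr__(self, name):
        # Called only when normal lookup fails; guard against recursion
        # if _info itself is missing (e.g. during copying).
        if name == "_info":
            raise AttributeError(name)
        return getattr(self._info, self._RENAMED.get(name, name))


info = SoxStyleInfo(
    SimpleNamespace(samplerate=16000, channels=2, frames=100, subtype="PCM_16")
)
```

The sketch deliberately leaves `length` unmapped, since whether it should equal `frames` or a per-channel total is a semantic question the wrapper alone cannot settle.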

@@ -242,6 +280,11 @@ def sox_signalinfo_t():
>>> si.precision = 16
>>> si.length = 0
"""

if get_audio_backend() != "sox":
Contributor

You could create a generic decorator to write this less often

import functools

def _backend_guard(backends):
    def decorator(fn):
        @functools.wraps(fn)  # keeps fn's name and docstring on the wrapper
        def _fn(*args, **kwargs):
            if get_audio_backend() not in backends:
                raise RuntimeError(
                    "fn {} requires backend to be one of {}".format(fn.__name__, backends))
            return fn(*args, **kwargs)
        return _fn
    return decorator

@_backend_guard(['sox'])
def sox_signalinfo_t():
    ...

Contributor Author

Nice, but let's make that a separate PR.

Contributor Author

Alright, added to standardize import error. :)

return save_encinfo(filepath, src, channels_first, si)

if get_audio_backend() == "sox":
func = _sox_backend.save
Contributor

You could do

getattr(get_audio_backend_module(), 'save')

Do we want get_audio_backend() to return a module or a string that is the name?

A module can yield the name as well via introspection
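To make the module-versus-string question concrete, here is a sketch in which the getter returns a module and the name is recovered from it. The modules are built on the fly with `types.ModuleType` purely as stand-ins, and `get_audio_backend_module` is the hypothetical getter from the comment above, not an existing torchaudio function.

```python
import types


def _make_backend(name):
    """Build a throwaway module object that exposes a `save` function."""
    module = types.ModuleType(name)
    module.save = lambda path, data: (name, path, len(data))
    return module


_MODULES = {n: _make_backend(n) for n in ("sox_backend", "soundfile_backend")}
_current = "sox_backend"


def get_audio_backend_module():
    return _MODULES[_current]


def get_audio_backend():
    # The name is recovered from the module via introspection.
    return get_audio_backend_module().__name__


def save(path, data):
    # Dispatch by attribute lookup instead of an if/elif chain.
    return getattr(get_audio_backend_module(), "save")(path, data)
```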

test/test.py Outdated
with AudioBackendScope(backend2):
tensor2, sample_rate2 = torchaudio.load(output_path)

# tensor1 = tensor1.type(torch.FloatTensor)
Contributor

nit: maybe you wanted to remove these?

Contributor Author

Thanks for catching this!
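The `AudioBackendScope` used in the test excerpt above is not shown in this thread. A context manager along these lines would give the same usage; this is an assumed sketch with a simplified getter and setter, not the test suite's actual helper.

```python
_audio_backend = "sox"


def get_audio_backend():
    return _audio_backend


def set_audio_backend(backend):
    global _audio_backend
    _audio_backend = backend


class AudioBackendScope:
    """Temporarily switch the backend and restore the previous one on exit."""

    def __init__(self, backend):
        self.new_backend = backend
        self.previous_backend = None

    def __enter__(self):
        self.previous_backend = get_audio_backend()
        set_audio_backend(self.new_backend)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Restore even if the body raised; returning False re-raises.
        set_audio_backend(self.previous_backend)
        return False
```

Restoring in `__exit__` keeps one failing test from leaking its backend choice into the next, which is what the explicit `set_audio_backend(self.default_audio_backend)` calls elsewhere in the tests do by hand.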

Contributor

@cpuhrsch cpuhrsch left a comment

Great! Added a small nit

@vincentqb vincentqb merged commit 774ebc7 into pytorch:master Dec 19, 2019
@tadas-subonis

Is there any reason why

    # normalize if needed
    # _audio_normalization(out, normalization)

was commented out?

Successfully merging this pull request may close these issues.

Inconsistent length between info() and load() for MP3 files
4 participants