[Announcement] Improving I/O for correct and consistent experience #903

mthrok · 2020-09-10T23:12:11Z

tl;dr: how to migrate to new backend/interface in 0.7

If you are using torchaudio in Linux/macOS environments, please use torchaudio.set_audio_backend("sox_io") to adapt to the upcoming changes.
If you are in a Windows environment, please set torchaudio.USE_SOUNDFILE_LEGACY_INTERFACE = False and reload the backend to use the new interface.
Note that this ships with some bug-fixes for formats other than 16bit signed integer WAV, so you might experience some BC-breaking changes as described in the section below.
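
The opt-in above can be sketched as follows. This is a minimal sketch, assuming torchaudio 0.7: `pick_backend` and `opt_in_new_io` are hypothetical helper names, and "reload backend" is read here as calling `set_audio_backend("soundfile")` after the flag has been set. Only `set_audio_backend` and `USE_SOUNDFILE_LEGACY_INTERFACE` come from this announcement.

```python
import sys


def pick_backend(platform: str = sys.platform) -> str:
    """Return the backend name this announcement recommends for a platform."""
    # The announcement points Windows users at the "soundfile" backend.
    if platform.startswith("win"):
        return "soundfile"
    # Linux and macOS users are asked to move to "sox_io".
    return "sox_io"


def opt_in_new_io() -> str:
    """Switch torchaudio 0.7 to the new I/O interface (hypothetical helper)."""
    import torchaudio  # imported lazily so pick_backend works without torchaudio

    backend = pick_backend()
    if backend == "soundfile":
        # Assumption: the flag has to be set before the backend is (re)selected.
        torchaudio.USE_SOUNDFILE_LEGACY_INTERFACE = False
    torchaudio.set_audio_backend(backend)
    return backend
```

Calling `opt_in_new_io()` once at program start, before any `load`/`save`/`info` call, should be enough under these assumptions.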

News
[UPDATE] 2021/03/06

All the migration work has been completed on the master branch.

[UPDATE] 2021/02/12

Added bits_per_sample and encoding argument (replaced dtype) to save function.

[UPDATE] 2021/01/29

Added encoding to AudioMetaData

[UPDATE] 2021/01/22

Added format argument to load/info/save functions.
Added bits_per_sample to AudioMetaData.

[UPDATE] 2020/10/21

Added Description of "soundfile" backend legacy interface.

[UPDATE] 2020/09/18

Added migration guide for "soundfile" backend.
Moved the phase when "soundfile" backend signatures change from 0.9.0 to 0.8.0 so that they match with "sox_io" backend, which becomes default in 0.8.0.

[UPDATE] 2020/09/17

Added information on deprecation of native libsox structures such as signalinfo_t and encoding_t.

Improving I/O for correct and consistent experience

This is an announcement for users that we are making backward-incompatible changes to the I/O functions of torchaudio backends, starting with the 0.7.0 release and continuing through the 0.9.0 release.

What is affected?

Public APIs
- torchaudio.load
  - [Linux/macOS] By switching the default backend from "sox" backend to "sox_io" backend in 0.8.0, loading audio formats other than 16bit signed integer WAV returns the correct tensor.
  - [Linux/macOS/Windows] The signature of "soundfile" backend will be changed in 0.8.0 to match that of "sox_io" backend.
- torchaudio.save
  - [Linux/macOS] By switching to "sox_io" backend, saving audio files will no longer degrade the data. The supported formats will be restricted to the tested formats only. (Please refer to the doc for the supported formats.)
  - [Linux/macOS/Windows] The signature of "soundfile" backend will be changed in 0.8.0 to match that of "sox_io" backend.
- torchaudio.info
  - [Linux/macOS/Windows] The signature of "soundfile" backend will be changed in 0.8.0 to match that of "sox_io" backend.
- torchaudio.load_wav
  - will be removed in 0.9.0. (load function with normalize=False will provide the same functionality)
Internal APIs
The following functions/classes of "sox" backend were accidentally exposed and will be removed in 0.9.0. There is no replacement for them. Please use save/load/info functions.
- torchaudio.save_encinfo
  - will be removed in 0.9.0
- torchaudio.get_sox_signalinfo_t
  - will be removed in 0.9.0
- torchaudio.get_sox_encodinginfo_t
  - will be removed in 0.9.0
- torchaudio.get_sox_option_t
  - will be removed in 0.9.0
- torchaudio.get_sox_bool
  - will be removed in 0.9.0

The signatures of the other backends are not planned to be changed within this overhaul plan.

Classes
- torchaudio.SignalInfo and torchaudio.EncodingInfo
  - will be replaced with AudioMetaData in 0.8.0 for "soundfile" backend
  - will be removed in 0.9.0

Why

There are currently three backends in torchaudio. (Please refer to the documentation for the detail.)

"sox" backend is the original backend, which binds libsox with pybind11. The functionalities (load / save / info) of this backend are not well-tested and have a number of issues (see #726).

Fixing these issues in a backward-compatible manner is not straightforward. Therefore, while we were adding TorchScript-compatible I/O functions, we decided to deprecate this original "sox" backend and replace it with the new backend ("sox_io" backend), which is confirmed not to have those issues.

When switching the default backend for Linux/macOS from "sox" to "sox_io", we would also like to align the interface of the "soundfile" backend. Therefore, we introduced a new interface to the "soundfile" backend (rather than a new backend, to keep the number of public APIs small).

When / What Changes

The following is the timeline for the planned changes:

Phase	Expected Release	Expected Changes
1	0.7.0 (Oct 2020)	`"sox"` backend issues deprecation warning. ~~Add deprecation warning to sox backend #904~~ `"soundfile"` backend issues warning of expected signature change. ~~Add expected BC-breaking change warning to soundfile #906~~ Add the new interface to `"soundfile"` backend. ~~Add soundfile compatibility backend #922~~ `load_wav` functions of all backends are marked as deprecated. ~~Add deprecation warnings to load_wav functions #905~~
2	0.8.0 (March 2021)	[BC-Breaking] `"sox_io"` backend becomes default backend. Function signatures of `"soundfile"` backend are aligned with `"sox_io"` backend. ~~Switch the default backend to the ones with new interfaces #978~~ `get_sox_XXX` functions issue deprecation warning. ~~Add deprecation warnings to libsox specific functions #975~~
3	0.9.0	`"sox"` backend is removed. ~~Removed legacy backends from torchaudio #1311~~ The legacy interface of `"soundfile"` backend is removed. ~~Removed legacy backends from torchaudio #1311~~ [BC-Breaking] `load_wav` functions are removed from all backends. ~~BC-Breaking: Remove deprecated load_wav functions from backends #1362~~
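
For code that has to run across several of these releases, the timeline can be restated as a small lookup. This is a sketch only: `parse_version` and `io_status` are hypothetical helpers, they cover Linux/macOS and the versions in this plan (0.7 and later), and they encode nothing beyond the table above.

```python
from typing import Dict, Tuple


def parse_version(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '0.8.1'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def io_status(version: str) -> Dict[str, object]:
    """Summarize the planned I/O state on Linux/macOS for a torchaudio version."""
    v = parse_version(version)
    return {
        # Phase 2: "sox_io" becomes the default backend in 0.8.0.
        "default_backend": "sox_io" if v >= (0, 8) else "sox",
        # Phase 3: these are all removed in 0.9.0.
        "sox_backend_available": v < (0, 9),
        "soundfile_legacy_interface_available": v < (0, 9),
        "load_wav_available": v < (0, 9),
    }
```

In practice the version string would come from `torchaudio.__version__`.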

Planned signature changes of `"soundfile"` backend in 0.8.0

The following is the planned signature change of "soundfile" backend functions in 0.8.0 release.

`info` function

AudioMetaData implementation can be found here. The placement of the AudioMetaData might be changed.

~0.7.0	0.8.0
def info( filepath: str, ) -> Tuple[SignalInfo, EncodingInfo]	def info( filepath: str, format: Optional[str], ) -> AudioMetaData

Migration

The values returned from info function will be changed. Please use the corresponding new attributes.

~0.7.0

si, ei = torchaudio.info(filepath)
sample_rate = si.rate
num_frames = si.length
num_channels = si.channels
precision = si.precision
bits_per_sample = ei.bits_per_sample
encoding = ei.encoding

0.8.0

metadata = torchaudio.info(filepath)
sample_rate = metadata.sample_rate
num_frames = metadata.num_frames
num_channels = metadata.num_channels
bits_per_sample = metadata.bits_per_sample
encoding = metadata.encoding

Note: If the attribute you are using is missing, file a Feature Request issue.
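
Code that must work with both interfaces during the transition could normalize the return value of `info` with a small shim. This is a sketch under assumptions: `read_info` is a hypothetical helper, the attribute mapping is copied from the migration table above, and the `SimpleNamespace` objects are stand-ins for real torchaudio return values so the example runs without torchaudio.

```python
from types import SimpleNamespace
from typing import Any, Dict


def read_info(result: Any) -> Dict[str, Any]:
    """Normalize the return value of torchaudio.info across both interfaces."""
    if isinstance(result, tuple):
        # ~0.7.0 interface: (SignalInfo, EncodingInfo)
        si, ei = result
        return {
            "sample_rate": si.rate,
            "num_frames": si.length,
            "num_channels": si.channels,
            "bits_per_sample": ei.bits_per_sample,
            "encoding": ei.encoding,
        }
    # 0.8.0 interface: a single AudioMetaData object.
    # bits_per_sample/encoding were added later, so fall back to None.
    return {
        "sample_rate": result.sample_rate,
        "num_frames": result.num_frames,
        "num_channels": result.num_channels,
        "bits_per_sample": getattr(result, "bits_per_sample", None),
        "encoding": getattr(result, "encoding", None),
    }


# Stand-in objects (not real torchaudio classes) for demonstration.
legacy = (
    SimpleNamespace(rate=16000, length=32000, channels=1, precision=16),
    SimpleNamespace(bits_per_sample=16, encoding="PCM_S"),
)
modern = SimpleNamespace(
    sample_rate=16000, num_frames=32000, num_channels=1,
    bits_per_sample=16, encoding="PCM_S",
)
same = read_info(legacy) == read_info(modern)
```

With a real file, `read_info(torchaudio.info(filepath))` would replace the stand-ins.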

`load` function

~0.7.0

def load(
  filepath: str,
  # out: Optional[Tensor] = None,
      # To be removed.
      # Currently not used
      # Raise AssertionError if given
  normalization: Optional[bool] = True,
      # To be renamed to normalize.
      # Currently only accept True
      # Raise AssertionError if given
  channels_first: Optional[bool] = True,
  num_frames: int = 0,
  offset: int = 0,
      # To be renamed to frame_offset
  # signalinfo: SignalInfo = None,
      # To be removed
      # Currently not used
      # Raise AssertionError if given
  # encodinginfo: EncodingInfo = None,
      # To be removed
      # Currently not used
      # Raise AssertionError if given
  filetype: Optional[str] = None
      # To be removed
      # Currently not used
) -> Tuple[Tensor, int]

0.8.0

def load(
  filepath: str,
  frame_offset: int = 0,
  num_frames: int = -1,
  normalize: bool = True,
  channels_first: bool = True,
  format: Optional[str] = None,  # only required for file-like object input
) -> Tuple[Tensor, int]

Migration

Please change the argument names:

normalization -> normalize
offset -> frame_offset

~0.7.0

waveform, sample_rate = torchaudio.load(
    filepath,
    normalization=normalization,
    channels_first=channels_first,
    num_frames=num_frames,
    offset=offset,
)

0.8.0

waveform, sample_rate = torchaudio.load(
    filepath,
    frame_offset=frame_offset,
    num_frames=num_frames,
    normalize=normalization,
    channels_first=channels_first,
)
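
Where call sites pass keyword arguments through a wrapper, the rename can be applied in one place. This is a minimal sketch: `migrate_load_kwargs` is a hypothetical helper, and mapping `num_frames=0` to `-1` is an assumption inferred from the default changing from `0` to `-1` in the signatures above (both taken to mean "load all frames").

```python
from typing import Any, Dict

# Renames taken from the migration note above.
_RENAMED = {"normalization": "normalize", "offset": "frame_offset"}
# Arguments marked "To be removed" in the ~0.7.0 signature.
_REMOVED = {"out", "signalinfo", "encodinginfo", "filetype"}


def migrate_load_kwargs(old: Dict[str, Any]) -> Dict[str, Any]:
    """Translate ~0.7.0 `load` keyword arguments to their 0.8.0 names."""
    new: Dict[str, Any] = {}
    for name, value in old.items():
        if name in _REMOVED:
            continue  # dropped: the old signature did not use these anyway
        new[_RENAMED.get(name, name)] = value
    # Assumption: the old default 0 and the new default -1 both mean "all frames".
    if new.get("num_frames") == 0:
        new["num_frames"] = -1
    return new
```

A call then becomes `torchaudio.load(filepath, **migrate_load_kwargs(old_kwargs))`.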

`save` function

~0.7.0

def save(
  filepath: str,
  src: Tensor,
  sample_rate: int,
  precision: int = 16,
    # moved to `bits_per_sample` argument
  channels_first: bool = True
)

0.8.0

def save(
  filepath: str,
  src: Tensor,
  sample_rate: int,
  channels_first: bool = True,
  compression: Optional[float] = None,
    # Added only for compatibility.
    # soundfile does not support compression option
    # Raises Warning if not None
  format: Optional[str] = None,
  encoding: Optional[str] = None,
  bits_per_sample: Optional[int] = None,
)

Migration

~0.7.0

torchaudio.save(
    filepath,
    waveform,
    sample_rate,
    channels_first
)

0.8.0

torchaudio.save(
    filepath,
    waveform,
    sample_rate,
    channels_first,
    bits_per_sample=16,
)
# You can also designate audio format with `format` and configure the encoding with `compression` and `encoding`. See https://pytorch.org/audio/master/backend.html#save for the detail
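
The same approach works for `save`. In this sketch `migrate_save_kwargs` is a hypothetical helper: it renames `precision` to `bits_per_sample` and, when neither is given, fills in 16 so that the old default from the ~0.7.0 signature is preserved, as the migration example above does.

```python
from typing import Any, Dict


def migrate_save_kwargs(old: Dict[str, Any]) -> Dict[str, Any]:
    """Translate ~0.7.0 `save` keyword arguments to their 0.8.0 names."""
    new = dict(old)
    if "precision" in new:
        # `precision` moved to the `bits_per_sample` argument.
        new["bits_per_sample"] = new.pop("precision")
    # Keep the old default of 16 bits unless the caller chose a value.
    new.setdefault("bits_per_sample", 16)
    return new
```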

BC-breaking changes

Read and write operations on the formats other than WAV 16-bit signed integer were affected by small bugs.


* Add deprecation warning to sox backend Refer to #903

As a part of the "sox" backend sunset plan (#903), we add a "soundfile" backend that is compatible with the "sox_io" backend. No new public backend name is added. We provide a switch to change the interface/behavior of the "soundfile" backend.

This commit contains:
- The implementation of the new "soundfile" backend.
- The flag to switch the behavior of the "soundfile" backend (`torchaudio.USE_SOUNDFILE_LEGACY_INTERFACE`).
- Tests for the new backend and the switching mechanism.

The default behavior of the "soundfile" backend is not changed. Users who want to opt in to the new "soundfile" interface can do so by setting `torchaudio.USE_SOUNDFILE_LEGACY_INTERFACE = False` before changing the backend to "soundfile". In the 0.8.0 release, the "soundfile" backend will use this interface by default, and users can still use the legacy one with `torchaudio.USE_SOUNDFILE_LEGACY_INTERFACE = True`. In 0.9.0, the legacy interface is removed, and the `torchaudio.USE_SOUNDFILE_LEGACY_INTERFACE` flag will eventually be removed.

snakers4 · 2020-10-22T06:03:15Z

> Fixing these issues in backward-compatible manner is not straightforward. Therefore while we were adding TorchScript-compatible I/O functions, we decided to deprecate this original "sox" backend and replace it with the new backend ("sox_io" backend), which is confirmed not to have those issues.
>
> When we are switching the default backend for Linux/macOS from "sox" to "sox_io" backend, we would like to align the interface of "soundfile" backend, therefore, we introduced the new interface (not a new backend to reduce the number of public API) to "soundfile" backend.

Just a quick question, does it mean that since 0.7 or 0.8 we can include torchaudio.load inside of our jit-traced modules? Are you planning to support only Linux, or will you also have a list of binaries for some other platforms (i.e. mobile, raspberry pi)? With soundfile backend?

mthrok · 2020-10-22T19:39:17Z

Hi @snakers4

> does it mean that since 0.7 or 0.8 we can include torchaudio.load inside of our jit-traced modules?

Yes. Technically, you can already do it with 0.6; however, the corresponding library is not available in any form yet, so you cannot run it outside a Python application.
I have a prototype C++ app in my branch which depends on refactored torchaudio. The model I used can be found here

I plan to propose this to the team after the release work, but there is no fixed time frame for landing it yet, and I am not even sure I can land it.
This was an exercise to learn how much we can do with TorchScript, and I have found that the I/O capability is very limited. It can only load audio data from files. I intend to look into other ways to get tensor data (like passing memory objects to TorchScript), but it's not a top priority on my list.

> Are you planning to support only Linux, or will you also have a list of binaries for some other platforms (i.e. mobile, raspberry pi)?

We are considering adding an I/O module (not another backend but something like torchaudio.io) that works not just on Linux/macOS but also on Windows. We are thinking of binding a collection of cross-platform codec libraries. Mobile is not necessarily in our scope, because we do not have the infrastructure to test it and we have not seen demand for it yet. Hypothetically, if the refactored torchaudio lands, the build process will be CMake, so it will be easier for those familiar with CMake, but again, these plans are not finalized. We are trying to figure out a good "research to production" use case.

> With soundfile backend?

The Python "soundfile" package is not TorchScript compatible, so one of the things we are considering as part of the I/O module described above is to bind libsndfile directly.

snakers4 · 2020-10-24T12:52:01Z

Nice! This is probably months from becoming actually useful by end users like us, but this increases the value of pytorch ecosystem quite a bit

Btw, currently a vad in torch audio seems to be a port of some energy based algorithm

We are planning to make a public general torch-scriptable noise / voice / music VAD pre-trained on large voice / noise / music corpora

Guess we could collaborate on that

mthrok · 2020-10-26T20:36:02Z

@snakers4

> Nice! This is probably months from becoming actually useful by end users like us,

Ah, that's a very optimistic view, although that's what I am aiming for. I am working on an RFC with example usage, so that the community can respond. Then we will finalize the interface and start working on the implementation.

> but this increases the value of pytorch ecosystem quite a bit

Thanks, that's a nice reaction to have. One of the things we struggle with is getting a signal from the community, so feedback like that is really helpful. (and motivating for me ;) )

> Btw, currently a vad in torch audio seems to be a port of some energy based algorithm

The current VAD is basically a port of the sox implementation.

> We are planning to make a public general torch-scriptable noise / voice / music VAD pre-trained on large voice / noise / music corpora
>
> Guess we could collaborate on that

That's very interesting. Please keep us updated!

snakers4 · 2020-10-27T05:29:41Z

> One of the things we struggle is to get a signal from the community, so feedback like that is really helpful. (and motivating for me ;) )

the current state of audio is that there are no go-to tools / components, that would work on all platforms
there is record.js for browsers, but porting models to js is a pain now (looks like the only decent option is re-implementing from scratch in tf.js, onnx.js has very poor layer support)
ofc, you can go low-level and compile everything for each platform, but usually you care about your algorithms working properly in real life first

in real projects you basically need a VAD + STT + some post-processing
VAD ideally should be served on edge to improve user experience, whereas STT can be better served via an API (if you use OPUS e.g. traffic is negligible)
there is nothing stopping us from making our own VAD in PyTorch, but the actual audio reading part will be outside as well

for edge deployments we still need 2-4x size reduction in model size (which is already achievable) but as I mentioned there still is no easy way to run a pytorch model in a browser

> That's very interesting. Please keep us updated!

I will post an update here

Refer to #903 for the overview of planned I/O changes.
* Change the default backend from `"sox"(deprecated)` to `"sox_io"`
* Change the default interface of `"soundfile"` backend to the one identical to `"sox_io"` backend.
* Deprecate torchaudio.USE_SOUNDFILE_LEGACY_INTERFACE
* Update documentations
* Re-order backends (default first)
* Update overhaul timeline (removed 0.7.0)
* Simplify `"soundfile"` backend description

tbazin · 2020-11-05T16:20:41Z

This is great news, this will definitely improve trust and adoption of torchaudio 🙂 !

In line [151-160](https://github.com/pytorch/audio/blob/master/examples/pipeline_wav2letter/main.py#L151) and Line [437](https://github.com/pytorch/audio/blob/fb3ef9ba427acd7db3084f988ab55169fab14854/examples/pipeline_wav2letter/main.py#L437) of main.py, the default value of `dataset-root` and `dataset-folder-in-archive` will be None, which prevents `main.py` from knowing where the dataset actually is on the computer and loading it. Moreover, `n-hidden-channels 2000` has not been defined in `main.py`, so it needs to be removed.

Error log:

```bash
python main.py \
    --reduce-lr-valid \
    --dataset-train train-clean-100 train-clean-360 train-other-500 \
    --dataset-valid dev-clean \
    --batch-size 128 \
    --learning-rate .6 \
    --momentum .8 \
    --weight-decay .00001 \
    --clip-grad 0. \
    --gamma .99 \
    --hop-length 160 \
    --win-length 400 \
    --n-bins 13 \
    --normalize \
    --optimizer adadelta \
    --scheduler reduceonplateau \
    --epochs 30
/home/hoangtnm/anaconda3/envs/dl/lib/python3.7/site-packages/torchaudio/backend/utils.py:54: UserWarning: "sox" backend is being deprecated. The default backend will be changed to "sox_io" backend in 0.8.0 and "sox" backend will be removed in 0.9.0. Please migrate to "sox_io" backend. Please refer to pytorch#903 for the detail.
  '"sox" backend is being deprecated. '
INFO:root:Namespace(batch_size=128, checkpoint='', clip_grad=0.0, dataset_folder_in_archive=None, dataset_root=None, dataset_train=['train-clean-100', 'train-clean-360', 'train-other-500'], dataset_valid=['dev-clean'], decoder='greedy', distributed=False, epochs=30, eps=1e-08, freq_mask=0, gamma=0.99, hop_length=160, jit=False, learning_rate=0.6, momentum=0.8, n_bins=13, normalize=True, optimizer='adadelta', progress_bar=False, reduce_lr_valid=True, rho=0.95, scheduler='reduceonplateau', seed=0, start_epoch=0, time_mask=0, type='mfcc', weight_decay=1e-05, win_length=400, workers=0, world_size=8)
INFO:root:Start time: 2020-11-28 21:18:22.337478
/home/hoangtnm/anaconda3/envs/dl/lib/python3.7/site-packages/torchaudio/backend/utils.py:64: UserWarning: The interface of "soundfile" backend is planned to change in 0.8.0 to match that of "sox_io" backend and the current interface will be removed in 0.9.0. To use the new interface, do `torchaudio.USE_SOUNDFILE_LEGACY_INTERFACE = False` before setting the backend to "soundfile". Please refer to pytorch#903 for the detail.
  'The interface of "soundfile" backend is planned to change in 0.8.0 to '
Traceback (most recent call last):
  File "main.py", line 670, in <module>
    spawn_main(main, args)
  File "main.py", line 663, in spawn_main
    main(0, args)
  File "main.py", line 454, in main
    root=args.dataset_root,
  File "/media/aiteam/DATA/workspace/hoangtnm/audio/examples/pipeline_wav2letter/src/datasets.py", line 65, in split_process_vlsp2020asr
    return tuple(create(dataset) for dataset in datasets)
  File "/media/aiteam/DATA/workspace/hoangtnm/audio/examples/pipeline_wav2letter/src/datasets.py", line 65, in <genexpr>
    return tuple(create(dataset) for dataset in datasets)
  File "/media/aiteam/DATA/workspace/hoangtnm/audio/examples/pipeline_wav2letter/src/datasets.py", line 57, in create
    for tag, transform in zip(tags, transform_list)
  File "/media/aiteam/DATA/workspace/hoangtnm/audio/examples/pipeline_wav2letter/src/datasets.py", line 57, in <listcomp>
    for tag, transform in zip(tags, transform_list)
  File "/media/aiteam/DATA/workspace/hoangtnm/audio/examples/pipeline_wav2letter/src/datasets.py", line 15, in __init__
    self._path = os.path.join(root, url)
  File "/home/hoangtnm/anaconda3/envs/dl/lib/python3.7/posixpath.py", line 80, in join
    a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not NoneType
```

expectopatronum · 2020-12-14T10:24:27Z

This might be a stupid question, but should the warning UserWarning: "sox" backend is being deprecated. The default backend will be changed to "sox_io" backend in 0.8.0 and "sox" backend will be removed in 0.9.0. Please migrate to "sox_io" backend. Please refer to https://github.com/pytorch/audio/issues/903 for the detail. disappear after setting the backend?

I import torchaudio in the following way:

import torchaudio
torchaudio.set_audio_backend("sox_io")

but still get the above warning.

mthrok · 2020-12-14T13:27:33Z

Hi @expectopatronum

The warning is issued at the time import torchaudio is executed, where the default backend is set. I get that it's annoying and sorry for the confusion, but I really needed to raise a strong awareness as the sox backend was not handling data correctly.
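
For users who have already migrated and want a quiet import, the standard `warnings` module can filter this one message. This is a sketch, not a torchaudio feature: `ignore_sox_deprecation` is a hypothetical helper, and the message prefix is copied from the warning text quoted in this thread.

```python
import contextlib
import re
import warnings

# Beginning of the warning text quoted in this thread.
SOX_WARNING_PREFIX = '"sox" backend is being deprecated'


@contextlib.contextmanager
def ignore_sox_deprecation():
    """Suppress only the "sox" deprecation UserWarning inside the block."""
    with warnings.catch_warnings():
        # The message pattern is a regex matched at the start of the warning text.
        warnings.filterwarnings(
            "ignore",
            message=re.escape(SOX_WARNING_PREFIX),
            category=UserWarning,
        )
        yield


# Usage (only has an effect on torchaudio versions that emit the warning):
# with ignore_sox_deprecation():
#     import torchaudio
# torchaudio.set_audio_backend("sox_io")
```

Other warnings, including the "soundfile" interface notice, still pass through.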

ketanhdoshi · 2021-03-09T03:39:11Z

With torchaudio.load() in v0.8, the sox_io backend does not support 24-bit signed PCM audio files. Right now the only workaround is to switch back to the sox backend using torchaudio.set_audio_backend("sox").
Is 24-bit signed going to be supported in 0.9 before removing sox? Thanks!
It is not possible to convert the dataset I'm using to 16-bit or 32-bit.

Hi @ketanhdoshi

Thanks for the report. If it's causing you trouble, we will definitely support it.
Since PyTorch does not have a 24-bit int type, I need to think of a behavior for when normalize=False.
In your use case, are you loading data in float32 type?
Also if you can tell us a command to generate the same type you are dealing with (with tools like ffmpeg or sox), that will be helpful.

Thanks @mthrok. Yes, data is being loaded as float32. Here's an example of a dataset that has many sound files that I'm using that are in 24-bit signed format.

aelimame · 2021-03-13T23:36:57Z

> With torchaudio.load() in v0.8, the sox_io backend does not support 24-bit signed PCM audio files. Right now the only workaround is to switch back to the sox backend using torchaudio.set_audio_backend("sox").
> Is 24-bit signed going to be supported in 0.9 before removing sox? Thanks!
> It is not possible to convert the dataset I'm using to 16-bit or 32-bit.
>
> Hi @ketanhdoshi
> Thanks for the report. If it's causing you the trouble, we will definitely support it.
> Since PyTorch does not have 24-bit int type. I need to think of a behavior when normalize=False.
> In your use case, are you loading data in float32 type?
> Also if you can tell us a command to generate the same type you are dealing with (with tools like ffmpeg or sox), that will be helpful.
>
> Thanks @mthrok. Yes, data is being loaded as float32. Here's an example of a dataset that has many sound files that I'm using that are in 24-bit signed format.

I'm running into the same issue. I'm loading some 24bit audio files and sox_io fails to load them. I can use sox backend for now but would appreciate if 24bit format can be supported too in sox_io.

A good way to handle the normalize=False is to make it unsupported for this specific format given most of the time people would use normalize=True (at least that's what I do almost always). Another idea would be to convert the 24bit format automatically/internally to 32bit even if normalize=False.

Thanks
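
The second idea above, widening 24-bit samples to 32-bit, can be illustrated in plain Python. This is only a sketch of the arithmetic, not how torchaudio handles 24-bit files: `pcm24_to_int32` and `pcm24_to_float` are hypothetical helpers, and the input is assumed to be raw little-endian signed PCM as stored in WAV.

```python
from typing import List


def pcm24_to_int32(raw: bytes) -> List[int]:
    """Widen little-endian 24-bit signed PCM samples to the 32-bit range."""
    if len(raw) % 3 != 0:
        raise ValueError("24-bit PCM data must be a multiple of 3 bytes")
    samples = []
    for i in range(0, len(raw), 3):
        value = int.from_bytes(raw[i:i + 3], byteorder="little", signed=True)
        # Shift into the upper 24 bits of a 32-bit word; the low byte stays 0.
        samples.append(value << 8)
    return samples


def pcm24_to_float(raw: bytes) -> List[float]:
    """Normalize 24-bit signed PCM samples to floats in [-1.0, 1.0)."""
    return [sample / 2**31 for sample in pcm24_to_int32(raw)]
```

Because the widened values fit an int32, the normalize=False case would not need a dedicated 24-bit tensor type under this scheme.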

aelimame · 2021-03-17T01:26:33Z

@ketanhdoshi 24-bit support seems to have been added a couple days ago to the master branch #1389
I tested it (Nightly build) and seems to work for me!

mthrok · 2021-03-17T15:13:15Z

@aelimame @ketanhdoshi Sorry I forgot to let you know but we added 24-bit support.

It's nice to learn that it is working for you @aelimame.
@ketanhdoshi , please try the nightly build and see if it works. If not let us know.

mthrok · 2021-04-07T13:59:01Z

FYI: @ketanhdoshi @aelimame 24-bit support has been ported to release 0.8.1.

mthrok · 2021-06-15T17:31:58Z

Closing the issue as 0.9 is released which concludes the migration.
Thank you to all the people who gave feedback.

mthrok pinned this issue Sep 10, 2020

This was referenced Sep 10, 2020

Add deprecation warning to sox backend #904

Merged

Add deprecation warnings to load_wav functions #905

Merged

Add expected BC-breaking change warning to soundfile #906

Closed

Add tedlium dataset (all 3 releases) #882

Merged

mthrok added a commit that referenced this issue Sep 15, 2020

Add deprecation warning to sox backend (#904)

92b027b

* Add deprecation warning to sox backend Refer to #903

mthrok mentioned this issue Sep 17, 2020

torchaudio.sox_signalinfo_t is going to be deprecated. lhotse-speech/lhotse#79

Closed

mthrok mentioned this issue Sep 28, 2020

Add soundfile compatibility backend #922

Merged

1 task

vincentqb mentioned this issue Oct 2, 2020

[doc] Update backend docstring/documentation #935

Merged

This was referenced Oct 21, 2020

Expected changes in torchaudio.load snakers4/silero-models#25

Closed

Switch the default backend to the ones with new interfaces #978

Merged

mthrok changed the title ~~[Announcement] Replacing "sox" backend with "sox_io" backend~~ [Announcement] Overhauling I/O for correct and consistent experience Oct 21, 2020

vincentqb mentioned this issue Oct 21, 2020

Add deprecation warnings to libsox specific functions #975

Merged

mthrok changed the title ~~[Announcement] Overhauling I/O for correct and consistent experience~~ [Announcement] Improving I/O for correct and consistent experience Oct 21, 2020

mravanelli mentioned this issue Oct 29, 2020

Fix pooling segfault with pytorch1.7 speechbrain/speechbrain#378

Merged

vincentqb mentioned this issue Nov 5, 2020

1D tensor not supported in "sox_io" and new "soundfile" by save #1010

Closed

shanguanma mentioned this issue Nov 16, 2020

training model error k2-fsa/snowfall#17

Closed

iver56 mentioned this issue Nov 30, 2020

torchaudio compatibility fixes hbredin/torch-audiomentations#1

Merged

raj713335 mentioned this issue Dec 9, 2020

Getting an Assertion Error , i guess there is no function torchaudio.load() LearnedVector/A-Hackers-AI-Voice-Assistant#32

Closed

GregoryBetsey mentioned this issue Mar 26, 2021

Transcription error: wav file is empty BenAAndrew/Voice-Cloning-App#11

Closed

arthur465 mentioned this issue Apr 18, 2021

Training: list index out of range BenAAndrew/Voice-Cloning-App#24

Closed

myhrbeu mentioned this issue May 10, 2021

colab use upstream model could not run s3prl/s3prl#114

Closed

mthrok closed this as completed Jun 15, 2021

mthrok unpinned this issue Jun 15, 2021

hrahamim mentioned this issue Jul 1, 2021

convert_graph_to_onnx.py failing to run on Wav2Vec2 models huggingface/transformers#12456

Closed

2 tasks

Amels404 mentioned this issue Jul 9, 2021

Hyperparameter tuning with Conversational AI Models ray-project/ray#16878

Closed

myhrbeu mentioned this issue Jul 15, 2021

ModuleNotFoundError: No module named 'optimizers' s3prl/s3prl#150

Closed

meghmak13 mentioned this issue Sep 16, 2021

Quantization for CitriNet-1024-Gamma-0.25 NVIDIA/NeMo#2830

Closed

ghost mentioned this issue Sep 29, 2021

Errors in running unit test after development installation lhotse-speech/lhotse#408

Closed

Yingz-e mentioned this issue Sep 29, 2021

EOFError: Ran out of input ksanjeevan/crnn-audio-classification#22

Open

resurgo97 mentioned this issue Nov 29, 2021

Stuck in trainer.fit() openspeech-team/openspeech#127

Closed

adsf0078 mentioned this issue Dec 16, 2021

How to change cuda version espnet/espnet#3865

Closed

NathanJHLee mentioned this issue Feb 22, 2022

uing shard data type for librispeech, I got errors " WARNING error to parse" wenet-e2e/wenet#954

Closed

AdamMayor2018 mentioned this issue Apr 14, 2022

How to use diffrent obj model? facebookresearch/meshtalk#13

Closed

NK990 mentioned this issue May 30, 2022

Tacotron2.ipynb error Hydra NVIDIA/NeMo#4287

Closed

alexmehta added a commit to alexmehta/ABAW2020TNT-Modified that referenced this issue Jun 17, 2022

Fixed torchaudio for the newest version -- see pytorch/audio#903

8c52bb7

xinghua-qu mentioned this issue Jul 12, 2022

Could you please claim those augmentations that are differentiable? asteroid-team/torch-audiomentations#152

Closed

KajiMaCN mentioned this issue Oct 9, 2022

LazyFilter issues k2-fsa/icefall#608

Closed

Seoung-wook mentioned this issue Oct 13, 2022

I'm try to train with docker Alibaba-MIIL/AudioClassfication#13

Closed

jiangjin1999 mentioned this issue Oct 15, 2022

[Fairseq-Librispeech] 'AudioMetaData' object is not iterable" facebookresearch/fairseq#4434

Open

wangsheng3 mentioned this issue Oct 11, 2023

关于musan数据集的问题 TaoRuijie/ECAPA-TDNN#65

Closed
