torchaudio.load to optionally accept a target sample_rate (and maybe backend=) #2586

@vadimkantorov

🚀 The feature

For example, the Opus format supports resampling as part of decoding, but there is no standard, uniform way to set the sample rate at decode time.

sox always sets it to 48 kHz: https://github.com/dmkrepo/libsox/blob/master/src/opus.c#L114 (unofficial repo),
while the original opusdec first tries to copy it from the original source sample rate stored in the stream header: https://github.com/xiph/opus-tools/blob/master/src/opusdec.c#L897

Making sox do what opusdec does should probably be a feature request to sox and to ffmpeg. But torchaudio should probably support passing a forced sample_rate, with built-in resampling if the decoder supports it.
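
To make the request concrete, here is a rough sketch of the calling convention I have in mind. `load_with_sample_rate` and its `load_fn` / `resample_fn` parameters are hypothetical names, not existing torchaudio API. Note that this sketch resamples after decoding (via `torchaudio.functional.resample`), which is only a fallback; the actual proposal is for the decoder to do the resampling when it can.

```python
from typing import Any, Callable, Optional, Tuple


def load_with_sample_rate(
    path: str,
    sample_rate: Optional[int] = None,
    *,
    load_fn: Optional[Callable[[str], Tuple[Any, int]]] = None,
    resample_fn: Optional[Callable[[Any, int, int], Any]] = None,
) -> Tuple[Any, int]:
    """Hypothetical helper: load audio and return it at `sample_rate` if given."""
    if load_fn is None or resample_fn is None:
        # Imported lazily so the helper can be tried with stand-in callables.
        import torchaudio

        load_fn = load_fn or torchaudio.load
        resample_fn = resample_fn or torchaudio.functional.resample

    waveform, native_rate = load_fn(path)

    # No target requested, or the file is already at the target rate.
    if sample_rate is None or sample_rate == native_rate:
        return waveform, native_rate

    # Post-decode resampling (assumption: the decoder could not do it itself).
    return resample_fn(waveform, native_rate, sample_rate), sample_rate
```

With this shape, `load_with_sample_rate("speech.opus", sample_rate=16000)` would return a 16 kHz waveform regardless of which rate the backend picked at decode time (the file name is just a placeholder).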

It may also be a good idea to accept a backend= argument directly. This would avoid maintaining the backend as a global variable and eliminate the need for dataloader worker init code that sets the backend. (Personally, I think the global variable should be phased out in favor of an explicit argument with a default value.)
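
A minimal sketch of what an explicit backend argument could look like, with no global state involved. Everything here is hypothetical: `make_loader`, the `Loader` alias, and the backend names in the usage note are stand-ins for illustration, not torchaudio's actual dispatch code.

```python
from typing import Any, Callable, Dict, Optional, Tuple

# A backend is modeled as any callable: path -> (waveform, sample_rate).
Loader = Callable[[str], Tuple[Any, int]]


def make_loader(backends: Dict[str, Loader], default: str) -> Callable[..., Tuple[Any, int]]:
    """Build a load() whose backend is chosen per call instead of globally."""
    if default not in backends:
        raise ValueError(f"unknown default backend: {default!r}")

    def load(path: str, backend: Optional[str] = None) -> Tuple[Any, int]:
        # Fall back to the default only when the caller does not pick one.
        name = default if backend is None else backend
        try:
            loader = backends[name]
        except KeyError:
            raise ValueError(f"unknown backend: {name!r}") from None
        return loader(path)

    return load
```

Because the choice travels with each call, e.g. `load(path, backend="soundfile")`, a dataset's `__getitem__` behaves the same in every dataloader worker and no worker init function is needed to set the backend.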

Motivation, pitch

N/A

Alternatives

No response

Additional context

No response
