Audio decoding format issues #661

### 🐛 Describe the bug

Hello torchcodec team!

Thank you for the latest 0.3 release and the long-awaited support for ✨audio decoding✨! So far, `AudioDecoder` works like a charm, but I have two remarks that would make it even more pleasant to use:

1) `AudioSamples` shape format

Currently, the audio data in `AudioSamples` is stored in a tensor of shape `[num_channels, num_samples]`, whereas most audio libraries (including `sounddevice` and `soundfile`) use a `[num_samples, num_channels]` shape. It's just a transposition, but it would be nice to match the reference shape!
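
For reference, here is a minimal sketch of the transposition I currently apply by hand. The `data` tensor is only a stand-in I built for illustration (2 channels, 5 samples), not real decoder output:

```python
import torch

# Stand-in for the decoded samples: channels-first, shape [num_channels, num_samples].
data = torch.arange(10, dtype=torch.float32).reshape(2, 5)

# Swap the two dimensions to get the samples-first layout, [num_samples, num_channels].
# .contiguous() makes the result a dense copy rather than a strided view.
frames = data.transpose(0, 1).contiguous()

print(tuple(frames.shape))  # (5, 2)
```

Sample `i` of channel `c` moves from `data[c, i]` to `frames[i, c]`, so nothing is lost, but every caller has to remember to do it.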

2) Output format when `start_seconds` and `stop_seconds` are identical

When `start_seconds` and `stop_seconds` are set to the same value, the tensor stored in `AudioSamples` "loses" its `num_channels` dimension, which is set to 0. This led to a number of issues on my side. I think it would be nice to either raise the `Invalid start seconds` error when `start_seconds` equals `stop_seconds`, or keep the first dimension equal to the number of channels.
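
To illustrate the second option, here is a rough sketch of the workaround I would otherwise need on my side. `ensure_channel_dim` is a hypothetical helper of mine, not part of torchcodec, and the `(0, 0)` tensor is an assumed stand-in for the empty result described above:

```python
import torch


def ensure_channel_dim(data: torch.Tensor, num_channels: int) -> torch.Tensor:
    """Hypothetical helper: give an empty clip the shape [num_channels, 0]."""
    if data.numel() == 0:
        # Keep dtype and device, but restore the channel dimension.
        return data.new_zeros((num_channels, 0))
    return data


# Assumed stand-in for the empty result (first dimension collapsed to 0).
empty = torch.empty((0, 0))
fixed = ensure_channel_dim(empty, num_channels=2)

print(tuple(fixed.shape))  # (2, 0)
```

With a `[num_channels, 0]` result, downstream code that indexes or concatenates along the sample dimension keeps working without a special case.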

Thank you again for your work on audio decoding, and for taking my remarks into account!

### Versions

PyTorch version: 2.7.0
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 15.4.1 (arm64)
GCC version: Could not collect
Clang version: 16.0.0 (clang-1600.0.26.6)
CMake version: version 4.0.0
Libc version: N/A

Python version: 3.12.2 | packaged by conda-forge | (main, Feb 16 2024, 20:54:21) [Clang 16.0.6 ] (64-bit runtime)
Python platform: macOS-15.4.1-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Apple M4 Pro

Versions of relevant libraries:
[pip3] mypy-extensions==1.0.0
[pip3] numpy==2.2.4
[pip3] torch==2.7.0
[pip3] torchaudio==2.7.0
[pip3] torchcodec==0.3.0
[pip3] torchvision==0.22.0
[conda] libopenvino-pytorch-frontend 2024.4.0             h5833ebf_2    conda-forge
[conda] numpy                     2.2.4                    pypi_0    pypi
[conda] torch                     2.7.0                    pypi_0    pypi
[conda] torchaudio                2.7.0                    pypi_0    pypi
[conda] torchcodec                0.3.0                    pypi_0    pypi
