Feature request: Support interleaved stereo data #407

Open
mcclure opened this issue Sep 24, 2023 · 4 comments

@mcclure
Contributor

mcclure commented Sep 24, 2023

Although python-soundfile is usually used with NumPy, it does support non-NumPy use. A great feature for non-NumPy use would be an option to pass stereo data to functions such as SoundFile.write() in interleaved format instead of as a multidimensional array. Data is often already interleaved for various reasons, and (I could be wrong about this) I think that building a Python array by appending many values to a single list generates less garbage than producing a 2xMany list of lists.

@bastibe
Owner

bastibe commented Sep 30, 2023

There are the buffer_read, buffer_read_into, and buffer_write functions, which do essentially that. And anyway, a C-order NumPy array is internally in interleaved order as well.
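For the non-NumPy case, a minimal sketch of writing interleaved stereo through buffer_write might look like the following. The helper names interleave and write_interleaved are hypothetical, and the sketch assumes that array('h') holds 16-bit items (true on common platforms) and that buffer_write accepts any object supporting the buffer protocol.

```python
from array import array


def interleave(left, right):
    """Return one flat int16 array laid out as L0, R0, L1, R1, ..."""
    if len(left) != len(right):
        raise ValueError("both channels must have the same number of samples")
    flat = array('h')  # signed 16-bit items in native byte order (assumed 2 bytes)
    for l, r in zip(left, right):
        flat.append(l)
        flat.append(r)
    return flat


def write_interleaved(path, flat, samplerate=48000):
    """Hypothetical helper: write a flat interleaved int16 array as 16-bit stereo."""
    import soundfile as sf  # third-party; imported lazily so interleave() works without it

    with sf.SoundFile(path, 'w', samplerate=samplerate, channels=2,
                      subtype='PCM_16') as f:
        # dtype describes the data in the buffer, not the storage in the file.
        f.buffer_write(flat, dtype='int16')
```

A call such as write_interleaved('stereo.wav', interleave(left, right)) would then write the frames without building a list of lists first.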

@mcclure
Contributor Author

mcclure commented Sep 30, 2023

That's very useful, thanks. Can you clarify—
https://python-soundfile.readthedocs.io/en/0.11.0/#soundfile.SoundFile.buffer_write
What endianness properties should I expect buffer_read/buffer_write to have? Will it depend on format?

@bastibe
Owner

bastibe commented Sep 30, 2023

Yes, it entirely depends on the format. Have a look at https://libsndfile.github.io/libsndfile/api.html#raw for more information. I'd expect this to only work reasonably for uncompressed PCM-like formats.

@mgeier
Contributor

mgeier commented Sep 30, 2023

The situation is admittedly a bit complicated, so it is easy to be confused.

The parameters format and subtype (and endian as well) only matter for the storage of the audio data in the file.

The data that you are handling in your Python code is entirely independent of that. In the buffer_*() methods, the dtype argument specifies the data type that you are handling.

Here's a hopefully illustrative example:

>>> import soundfile as sf
>>> sf.write('myfile.aiff', [1.0], subtype='PCM_16', samplerate=48000)

This creates a 16-bit file containing the largest possible signal value.
The given float value is automatically converted to a 16-bit integer.

>>> f = sf.SoundFile('myfile.aiff')
>>> bytes(f.buffer_read(dtype='int16'))
b'\xff\x7f'

As you can see, you still have to specify the dtype when reading the data. In this case, we are reading the same data type that's stored in the file, but that is not required.
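Tying this back to the original interleaved-stereo request, here is a rough sketch of reading a multi-channel file without NumPy and splitting the flat buffer into channels. The helper names deinterleave and read_channels are hypothetical, and the sketch assumes that array('h') holds 16-bit items.

```python
from array import array


def deinterleave(flat, channels=2):
    """Split a flat interleaved array into one array per channel."""
    # Channel ch occupies every `channels`-th item, starting at index ch.
    return [flat[ch::channels] for ch in range(channels)]


def read_channels(path):
    """Hypothetical helper: read a whole file as int16 and split it by channel."""
    import soundfile as sf  # third-party; imported lazily so deinterleave() works without it

    with sf.SoundFile(path) as f:
        flat = array('h')  # native-endian signed 16-bit items (assumed 2 bytes)
        # The dtype is what we want to receive, independent of the file's subtype.
        flat.frombytes(bytes(f.buffer_read(dtype='int16')))
        return deinterleave(flat, f.channels)
```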

And to come back to the question about endianness: if you look at the file contents (e.g. with xxd myfile.aiff), you can see the sample value in the very last two bytes: 7fff.

AIFF files are stored in big-endian format, but as you can see above, we got the bytes in little-endian format. We are getting native endianness. If you run this on a big-endian system, you should get b'\x7f\xff' (but I didn't try this because I don't have a big-endian system).
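The difference between the two byte orders can be reproduced with the standard library alone, without touching an audio file. This is only an illustration of the byte layouts discussed above, not of what soundfile does internally.

```python
import struct
import sys

sample = 0x7FFF  # largest signed 16-bit value, as stored in the example file

big = struct.pack('>h', sample)     # big-endian, the order used inside the AIFF file
little = struct.pack('<h', sample)  # little-endian
native = struct.pack('=h', sample)  # whatever byte order this machine uses

print(sys.byteorder, big, little, native)
```

On a little-endian machine, native equals little, which matches the bytes returned by buffer_read in the example above.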

We can continue exploring:

>>> f.seek(0)
0
>>> bytes(f.buffer_read(dtype='int32'))
b'\x00\x00\xff\x7f'

We can read the value as a 32-bit integer, even though it is stored as a 16-bit integer in the file.
And the important thing is that libsndfile scales the value to the 32-bit range: we get 0x7fff0000, the 16-bit value moved into the upper two bytes, not 0x00007fff.
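The bytes shown above are consistent with the 16-bit value being shifted left by 16 bits. The sketch below reproduces that arithmetic with the standard library; treating the conversion as a plain shift is an assumption inferred from the output, not something stated in this thread.

```python
import struct

sample16 = 0x7FFF           # value stored in the 16-bit file
sample32 = sample16 << 16   # assumed scaling: move the value into the upper 16 bits

# Little-endian layout of the scaled value, as seen on a little-endian machine.
scaled_bytes = struct.pack('<i', sample32)
# What the bytes would look like if the value were NOT scaled.
unscaled_bytes = struct.pack('<i', sample16)

print(hex(sample32), scaled_bytes, unscaled_bytes)
```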

In summary:

What endianness properties should I expect buffer_read/buffer_write to have?

Native endianness.

Will it depend on format?

Nope.

mcclure added a commit to mcclure/python-soundfile that referenced this issue Oct 2, 2023
* Clarifies range of python integer input to write(), addressing issue bastibe#405.
* Clarifies, which came up in issue bastibe#407.
* Clarifies how to build the docs, which confused me while preparing this PR
  (Simply installing Sphinx is not enough)
mcclure added a commit to mcclure/python-soundfile that referenced this issue Oct 2, 2023
* Clarifies range of python integer input to write(), addressing issue bastibe#405.
* Clarifies endianness of buffer_write input, which came up in issue bastibe#407.
* Clarifies how to build the docs, which confused me while preparing this PR
  (Simply installing Sphinx is not enough)