Writing integer values through the non-numpy interface results in a wav of all 0s #405

mcclure · 2023-09-24T21:44:22Z

Consider this simple use of soundfile:

import soundfile as sf
SAMPLE_RATE = 48000
SHRT_MAX = 32767
with sf.SoundFile("testfile2.wav", mode = 'w', samplerate=SAMPLE_RATE, channels=2, subtype='PCM_16') as outfile:
    frames = []
    for i in range(SAMPLE_RATE):
        frames.append([i%SHRT_MAX, SHRT_MAX-(i%SHRT_MAX)])
    outfile.write(frames)

I expect this to create a wav containing a very slow sawtooth wave (it will not be audible but the waveform will be visible in an editor such as audacity) with a DC offset. Instead, it creates a wav in which all values are zero.

If I pass the values in as numpy rather than as python lists, it works and creates a usable, correct wav file:

import numpy as np
import soundfile as sf
SAMPLE_RATE = 48000
SHRT_MAX = 32767
with sf.SoundFile("testfile2.wav", mode = 'w', samplerate=SAMPLE_RATE, channels=2, subtype='PCM_16') as outfile:
    frames = []
    for i in range(SAMPLE_RATE):
        frames.append([i%SHRT_MAX, SHRT_MAX-(i%SHRT_MAX)])
    array = np.array(frames, dtype=np.int16)
    outfile.write(array)
    print(array)

It is possible that when using non-numpy input, Soundfile is expecting the values in some format (floats?) other than ints. If so, this is not documented. I assume PCM_16 values because I have specified PCM_16 subtype.

I have a non-numpy app in which I adopted python-soundfile because it had the ability to append to wavs. I understand python-soundfile is mostly intended for use with numpy, but it does advertise a non-numpy interface. Using without numpy, I encountered several inconvences, but this was the biggest problem because I could not resolve it by consulting the documentation. I ultimately had to just add numpy to my program.

The text was updated successfully, but these errors were encountered:

bastibe · 2023-09-30T11:29:23Z

In general, soundfile expects normalized floating point data between -1 and 1. In integer formats this is clearly not possible, so it scales between the appropriate minimum and maximum.

This is documented in soundfile.read, but not mentioned as clearly in soundfile.write. Would you like to contribute an improved documentation as a pull request?

mcclure · 2023-09-30T13:30:04Z

I am open to contributing a PR, but if I am going to write docs I need to be sure I understand the behavior.

When you say "scales between the appropriate minimum and maximum", you mean it scales from the range -1…1 to the minimum and maximum values of the output sound file's type?

This description surprises me because in my failed test above I was giving values outside the range -1…1 and they were being snapped to 0 rather than scaled or clipped to the -1…1 range. Is there special case handling for values outside -1…1?

bastibe · 2023-09-30T14:10:17Z

Integer values scale between the minimum and maximum allowable integer value. That's -2^15 ... (2^15 - 1) for int16, and -2^31 ... (2^31 - 1) for int32. Ignore the asymmetry between positive and negative values; IIRC, the scaling factor is actually always symmetrically 2^15 or 2^31, with the odd 2^15 clipped to 2^15-1.

Thus integer-1 scales 1/2^15, which is very close to zero.

But this only applies to numpy arrays with dtype int16 or int32. Python lists are converted to a float numpy array, which must be between -1 and 1.

mgeier · 2023-09-30T16:22:50Z

What @bastibe mentioned above is mostly correct, but Python lists are not necessarily converted to float arrays.

The code uses np.asarray(), which converts (nested) Python lists to NumPy arrays. If only integers are used, the dtype is chosen to be int64 by default (and we don't specify a dtype explicitly):

>>> import numpy as np
>>> np.asarray([32767]).dtype
dtype('int64')

This means that in your example, you are essentially writing 16-bit data to the least significant bytes of a 64-bit array.
Later, this 64-bit array is scaled down to PCM_16, losing all those insignificant bytes and leaving you with only zeros.

For comparison, in your NumPy example, you are writing your data to a 16-bit array, which is then not scaled down, leading to your expected result.

Using without numpy, I encountered several inconvences

I think the misunderstanding is that you think you are using it without NumPy, but as soon as you use the .write() method, you are indirectly using NumPy, even if you don't import NumPy in your own code.

I think this has been clarified already in #407: If you don't want to use NumPy, you should use the .buffer_*() methods.

mgeier · 2023-09-30T16:39:50Z

When you say "scales between the appropriate minimum and maximum", you mean it scales from the range -1…1 to the minimum and maximum values of the output sound file's type?

To get a clearer idea about the scaling and clipping behavior, you can have a look at the tests:

python-soundfile/tests/test_soundfile.py

Lines 256 to 271 in 0a8071d

    
           def test_write_float_data_to_pcm_file(file_inmemory): 
        
               float_to_clipped_int16 = [ 
        
                   (-1.0 - 2**-15, -2**15    ), 
        
                   (-1.0         , -2**15    ), 
        
                   (-1.0 + 2**-15, -2**15 + 1), 
        
                   ( 0.0         ,  0        ), 
        
                   ( 1.0 - 2**-14,  2**15 - 2), 
        
                   ( 1.0 - 2**-15,  2**15 - 1), 
        
                   ( 1.0         ,  2**15 - 1), 
        
               ] 
        
               written, expected = zip(*float_to_clipped_int16) 
        
               sf.write(file_inmemory, written, 44100, format='WAV', subtype='PCM_16') 
        
               file_inmemory.seek(0) 
        
               read, fs = sf.read(file_inmemory, dtype='int16') 
        
               assert np.all(read == expected) 
        
               assert fs == 44100

It also happens in the other direction, but that is a very exotic use case:

python-soundfile/tests/test_soundfile.py

Lines 952 to 974 in 0a8071d

    
           def test_read_int_data_from_float_file(file_inmemory): 
        
               """This is a very uncommon use case.""" 
        
               unnormalized_float_to_clipped_int16 = [ 
        
                   (-2.0**15 - 1 , -2**15), 
        
                   (-2.0**15     , -2**15), 
        
                   (-2.0**15 + 1 , -2**15 + 1), 
        
                   (-1.0         , -1), 
        
                   (-0.51        , -1), 
        
                   (-0.5         ,  0), 
        
                   ( 0.0         ,  0), 
        
                   ( 0.5         ,  0), 
        
                   ( 0.51        ,  1), 
        
                   ( 1.0         ,  1), 
        
                   ( 2.0**15 - 2 , 2**15 - 2), 
        
                   ( 2.0**15 - 1 , 2**15 - 1), 
        
                   ( 2.0**15     , 2**15 - 1), 
        
               ] 
        
               file_data, expected = zip(*unnormalized_float_to_clipped_int16) 
        
               sf.write(file_inmemory, file_data, 44100, format='WAV', subtype='FLOAT') 
        
               file_inmemory.seek(0) 
        
               read, fs = sf.read(file_inmemory, always_2d=False, dtype='int16') 
        
               assert np.all(read == expected) 
        
               assert fs == 44100

bastibe · 2023-09-30T17:05:21Z

Thank you for the clarification, @mgeier. I wasn't aware that numpy.ndarray converts an all-int array to int64. That's a surprising gotcha for our use case, actually.

So integer lists must be treated as int64, and scale between -2^63 and 2^63. That's kind of annoying.

mcclure · 2023-09-30T17:58:04Z

So integer lists must be treated as int64, and scale between -2^63 and 2^63. That's kind of annoying.

Hm, yeah, from a user perspective this is an unusual enough range that if it cannot be changed to something less surprising, it might actually be better for it to error on integer lists.

bastibe · 2023-10-01T08:20:04Z

I suppose we could work around this issue by forcing any non-numpy-array iterables to float64 in soundfile.py:1019. This would technically be a backwards-incompatible change, though, which I am somewhat weary to make.

Also, this is the first time I have heard of this issue, so I'd assume it is a rare occurrence. Nevertheless, it should be documented at the very least.

mcclure · 2023-10-01T14:17:19Z

I am going to attempt to send you one or more docs PRs today.

* Clarifies range of python integer input to write(), addressing issue bastibe#405. * Clarifies, which came up in issue bastibe#407. * Clarifies how to build the docs, which confused me while preparing this PR (Simply installing Sphinx is not enough)

* Clarifies range of python integer input to write(), addressing issue bastibe#405. * Clarifies endianness of buffer_write input, which came up in issue bastibe#407. * Clarifies how to build the docs, which confused me while preparing this PR (Simply installing Sphinx is not enough)

mcclure · 2023-10-04T19:37:45Z

I need clarification on something before I continue with #410.

The code uses np.asarray(), which converts (nested) Python lists to NumPy arrays. If only integers are used, the dtype is chosen to be int64 by default (and we don't specify a dtype explicitly)

This is not exactly what I see in testing.

Consider this sample program:

import soundfile
s = soundfile.SoundFile("outtest.wav", mode = 'w', samplerate=48000, channels=1, subtype='PCM_16')
s.write([0,1,(1<<16)-1,-(1<<31),(1<<31)-1])
print(s)

Result: Saves a wav file with 4 samples, first 2 are 0, third is minimum intensity, final is maximum intensity

import soundfile
s = soundfile.SoundFile("outtest.wav", mode = 'w', samplerate=48000, channels=1, subtype='PCM_16')
s.write([0,1,(1<<16)-1,-(1<<31),(1<<31)])
print(s)

Result:

Traceback (most recent call last):
  File "C:\Users\Andi\work\g\other\python-soundfile\soundfile.py", line 1328, in _check_dtype
    return _ffi_types[dtype]
           ~~~~~~~~~~^^^^^^^
KeyError: 'int64'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Andi\work\g\other\python-soundfile\64TEST.py", line 3, in <module>
    s.write([0,1,(1<<16)-1,-(1<<31),(1<<31)])
  File "C:\Users\Andi\work\g\other\python-soundfile\soundfile.py", line 1020, in write
    written = self._array_io('write', data, len(data))
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Andi\work\g\other\python-soundfile\soundfile.py", line 1342, in _array_io
    ctype = self._check_dtype(array.dtype.name)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Andi\work\g\other\python-soundfile\soundfile.py", line 1330, in _check_dtype
    raise ValueError("dtype must be one of {0!r} and not {1!r}".format(
ValueError: dtype must be one of ['float32', 'float64', 'int16', 'int32'] and not 'int64'

In other words, from this and other tests I conclude the behavior for python int inputs is:

If all values are in the range of 32-bit INT_MIN to INT_MAX inclusive, the values are scaled FROM int32s TO the output format
If at least one value is outside the 32-bit signed integer range, there is an error

This behavior is non-ideal, but it seems to at least be consistent (I did not test on non-win10 systems).

I will change my documentation in #410 to match this behavior, but before I do, does this sound right? Am I missing anything?

mgeier · 2023-10-05T18:48:55Z

What does this return?

>>> import numpy as np
>>> np.asarray([1]).dtype

mcclure · 2023-10-06T01:03:42Z

Okay.
I have some very bad news.
In my Windows 10 venv:

(VENV) C:\Users\Andi\work\g\other\python-soundfile>python
Python 3.11.5 (tags/v3.11.5:cce6ba9, Aug 24 2023, 14:38:34) [MSC v.1936 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> np.asarray([1]).dtype
dtype('int32')
>>> np.version.version
'1.26.0'

In my WSL (linux VM) venv on the same machine:

(WSLVENV) Andi@Alita:/mnt/c/Users/Andi/work/g/other/python-soundfile$ python3
Python 3.8.10 (default, May 26 2023, 14:05:08)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> np.asarray([1]).dtype
dtype('int64')
>>> np.version.version
'1.24.4'

I can't explain this. What does it do on your machine?

If the scale is potentially different (32 or 64 bit) depending on system/configuration, this suggests to me the int form simply should not be used and the documentation should say something like "if the input is a python int array the behavior is undefined".

Alternately, before @bastibe said he didn't want to change the behavior saying

This would technically be a backwards-incompatible change, though, which I am somewhat weary to make.

But, if it was actually broken before (per the tests I pasted above, even if np manages to assign the array an int64, python-soundfile rejects it) maybe you don't need to worry about backward compatibility. If it never worked it seems like logically there must not be existing code to break?

bastibe · 2023-10-06T15:46:02Z

Oh, this is indeed terrible news. Thank you very much for the analysis!

So essentially, there is no consistent option for handling lists of integers across platforms. We could force any non-numpy type to float, but this would break lists of integers entirely. I don't really see a good way forward beyond documenting that "any non-numpy data will be converted to numpy using np.asarray", and noting that using non-numpy integer data is a very bad idea indeed.

In fact, we should consider issuing a warning whenever np.asarray returns an integer type (and it wasn't already a numpy array beforehand). What are your thoughts on that?

mcclure mentioned this issue Oct 2, 2023

Documentation clarifications #410

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Writing integer values through the non-numpy interface results in a wav of all 0s #405

Writing integer values through the non-numpy interface results in a wav of all 0s #405

mcclure commented Sep 24, 2023 •

edited

Loading

bastibe commented Sep 30, 2023

mcclure commented Sep 30, 2023

bastibe commented Sep 30, 2023

mgeier commented Sep 30, 2023

mgeier commented Sep 30, 2023

bastibe commented Sep 30, 2023 •

edited

Loading

mcclure commented Sep 30, 2023

bastibe commented Oct 1, 2023

mcclure commented Oct 1, 2023

mcclure commented Oct 4, 2023 •

edited

Loading

mgeier commented Oct 5, 2023

mcclure commented Oct 6, 2023

bastibe commented Oct 6, 2023 •

edited

Loading

Writing integer values through the non-numpy interface results in a wav of all 0s #405

Writing integer values through the non-numpy interface results in a wav of all 0s #405

Comments

mcclure commented Sep 24, 2023 • edited Loading

bastibe commented Sep 30, 2023

mcclure commented Sep 30, 2023

bastibe commented Sep 30, 2023

mgeier commented Sep 30, 2023

mgeier commented Sep 30, 2023

bastibe commented Sep 30, 2023 • edited Loading

mcclure commented Sep 30, 2023

bastibe commented Oct 1, 2023

mcclure commented Oct 1, 2023

mcclure commented Oct 4, 2023 • edited Loading

mgeier commented Oct 5, 2023

mcclure commented Oct 6, 2023

bastibe commented Oct 6, 2023 • edited Loading

mcclure commented Sep 24, 2023 •

edited

Loading

bastibe commented Sep 30, 2023 •

edited

Loading

mcclure commented Oct 4, 2023 •

edited

Loading

bastibe commented Oct 6, 2023 •

edited

Loading