Soundfile read/write wav is not symmetric with default arguments #413

Open
jon-petter opened this issue Oct 16, 2023 · 3 comments

jon-petter commented Oct 16, 2023

I came across some unexpected behavior in soundfile (version 0.12.1) read/write.

If you have the following float array:

import numpy as np
data = np.array([0, -1, 0, 1, 0, -1, 0, 1, 0, -1, 0], dtype=np.float32)
print((data*(2**15 - 1)))

[     0. -32767.      0.  32767.      0. -32767.      0.  32767.      0. -32767.      0.]

If you now write and read this data to a wav file using soundfile write and read (with default arguments), you get:

import soundfile as sf
sf.write('test.wav', data, 44100)
data, sample_rate = sf.read('test.wav')
print((data*(2**15 - 1)))

[     0.         -32767.              0.          32766.00003052
      0.         -32767.              0.          32766.00003052
      0.         -32767.              0.        ]

So my pure, max-amplitude sine wave has been reduced in amplitude, and a tiny DC offset has been introduced.

I understand that, when writing to PCM16, there would be quantization artifacts, but I was not expecting the positive and negative sides of the signal to be scaled differently (to this extent).

Is this scaling applied in soundfile code, or in some of the libs it builds upon?

My main question: why is this asymmetric scaling not reversed when reading with soundfile.read() and dtype="float64"?

Owner

bastibe commented Oct 19, 2023

This is the unfortunate reality of integer numbers. The lowest possible 16-bit number is -2^15, but the highest possible is 2^15-1. When dealing with float inputs, you have to apply some scaling, and there is no correct answer.

  • Do you scale positive numbers differently from negative numbers? There will be (tiny) discontinuities at the zero crossings.
  • Do you scale to 2^15-1? Then you lose one value for negative numbers.
  • Do you scale to 2^15? Then you lose one value for positive numbers.

There's no right answer. But in reality, the differences between these options are imperceptible.
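To make the three options concrete, here is a minimal NumPy sketch. The `quantize` helper and its parameters are made up for illustration; this is not libsndfile's code, just the arithmetic of each choice applied to a few sample values.

```python
import numpy as np

INT16_MIN, INT16_MAX = -2**15, 2**15 - 1

def quantize(x, pos_scale, neg_scale):
    """Hypothetical helper: scale floats in [-1, 1] to int16.

    Positive and negative samples may use different scale factors.
    Results are rounded and then clipped to the int16 range.
    """
    x = np.asarray(x, dtype=np.float64)
    scaled = np.where(x >= 0, x * pos_scale, x * neg_scale)
    return np.clip(np.round(scaled), INT16_MIN, INT16_MAX).astype(np.int16)

x = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])

# Option 1: different factors per sign. Both extremes of int16 are used,
# but the step size differs on either side of zero.
split = quantize(x, pos_scale=2**15 - 1, neg_scale=2**15)

# Option 2: scale everything by 2**15 - 1. Symmetric, but -32768 is never used.
small = quantize(x, pos_scale=2**15 - 1, neg_scale=2**15 - 1)

# Option 3: scale everything by 2**15. +1.0 would become 32768, which does
# not fit into int16, so it is clipped to 32767.
large = quantize(x, pos_scale=2**15, neg_scale=2**15)

print("split:", split[0], split[-1])
print("small:", small[0], small[-1])
print("large:", large[0], large[-1])
```

Only the end points are printed, because those are where the three options actually differ.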

Soundfile does not implement this, but merely passes the data on to libsndfile, which does the transformation.

If you need a perfect float representation, you could always use a native float format, such as MAT5, or (IIRC) FLAC or WAV with the FLOAT subtype.

Author

jon-petter commented Oct 19, 2023

Yes, I understand that. I was mostly wondering why the scaling differs between write and read. So this is a libsndfile issue, then?

At least, this is the behavior I observe:

  • Write: different scaling factor for negative and positive values
  • Read: equal scaling factor for all values

Anyhow, I understand that I'm complaining about a 1/2**15 max quantization error vs. a 1/2**16 max error, and these differences, as you say, are probably imperceptible.

Owner

bastibe commented Oct 21, 2023

The problem is not that read and write are different, but that +1 is not representable. If you use values <1, it should be symmetric.
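A small NumPy model of the round trip illustrates this. It rests on an assumption that matches the numbers reported in this thread but is not taken from libsndfile's source: writing multiplies by 2**15 and clips to the int16 range, reading divides by 2**15. The names `write_model` and `read_model` are made up for this sketch.

```python
import numpy as np

def write_model(x):
    # Assumed write path: scale by 2**15, round, clip to the int16 range.
    scaled = np.round(np.asarray(x, dtype=np.float64) * 2**15)
    return np.clip(scaled, -2**15, 2**15 - 1).astype(np.int16)

def read_model(q):
    # Assumed read path: divide by 2**15, the same factor for both signs.
    return q.astype(np.float64) / 2**15

full_scale = np.array([-1.0, 1.0])
below_full_scale = np.array([-0.5, 0.5])

# -1.0 survives exactly; +1.0 is clipped to 32767 and comes back as 32767/32768.
print(read_model(write_model(full_scale)))

# Values with magnitude below 1 are never clipped, so both signs behave the same.
print(read_model(write_model(below_full_scale)))
```

In this model the same factor is used in both directions, and the only asymmetry comes from clipping the single value +1.0.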
