By default, return float64, but provide options for reading different formats. #17

mgeier · 2014-03-05T13:01:18Z

I think it's a good idea to return np.float32 by default (although I'm not 100% sure because the default type in NumPy is np.float64).

If the file has 64bit values, however, they shouldn't be truncated by default.

So probably the default argument to read() should be format=None which should be changed appropriately within the function.

BTW, format may not be the best name choice for this parameter, maybe dtype would be rather what people would expect?

The text was updated successfully, but these errors were encountered:

bastibe · 2014-03-05T13:16:05Z

Interesting point. Come to think of it, I don't see many reasons why we shouldn't just switch everything over to np.float64 by default. In Numpy, pretty much everything is np.float64 by default.

The only argument against that would be that there is no microphone able to record more than 24 bits, let alone ears cabable of hearing more, which makes anything more than 32 bits kind of a waste. I can see 64 bit being useful for non-audio data though.

One could also make an argument that if the source data is, say, np.int16, read() should return exactly that. I don't think there is much merit to this, though, since numpy integers violate too many pythonic assumptions about numbers (no infinite precision and overflow). How about providing something like read_native(), which would return the native data format of the file?

mgeier · 2014-03-06T11:01:57Z

The more I think about it, the more I'm convinced that float64 is the way to go.
Anything else would be a surprise to a typical NumPy user (who doesn't have an audio processing background).

It's true that a sound card won't provide more than 24 meaningful bits, but we can't foresee what the user will be doing with the data. There are many DSP algorithms which only make sense when using double precision. And if a user wishes to save memory, they are free to explicitly choose float32 or even less.

I agree that it's not a good idea to return integers by default. In addition to your reasons it's also hard to keep track of the maximum range of the different integer formats.

I don't like the idea of read_native(). I think there is no need for a new method/function, this should be handled by an argument.
If float64 becomes the default this would have the added benefit that dtype=None would become available again and could be used to request the native type:

import pysoundfile as sf
data = sf.read('my16bit.wav')  # data.dtype == float64
data = sf.read('my16bit.wav', dtype='float32')  # data.dtype == float32
data = sf.read('my16bit.wav', dtype='int32')  # data.dtype == int32
data = sf.read('my16bit.wav', dtype=None)  # data.dtype == int16

Also, I'm not sure if there is a unique mapping from each subtype to one of the NumPy types.
But in case of doubt we can leave some mappings undefined and raise an error when dtype=None in these cases.
24-bit PCM files could be "upgraded" to int32 in this case ... or probably it would be safer to raise an error, too.

BTW, I implemented the change from the format argument to dtype in #18 (but float32 is still the default type).

bastibe · 2014-03-07T08:14:06Z

I agree. Let's go with float64 as a default.

Btw, I didn't know you could use 'float64' instead of np.float64. Thank you for showing me that!

For some reason, I have a bad feeling about going all float64 by default. But I can't see a reason for it. Strange. Probably a remnant of having had to work on some very limited DSPs.

mgeier · 2014-03-21T21:37:31Z

I tried to implement the select-dtype-based-on-subtype feature and found out that it's not really that great after all.

many subtypes don't have a meaningful default dtype, so they would have to be upgraded silently or an error would have to be raised, neither being really nice.
Also, the mapping would have to be defined in PySoundFile, because it cannot be obtained from libsndfile (if I didn't miss something).
I found out that dtype(None) creates an object dtype('float64'), so if we use dtype=None for automatic selection, this would be quite unexpected behavior.
If you really want to do this, you might just specify the desired dtype instead of dtype=None.

Long story short, I suggest to just make float64 the default type and drop the whole native-data-format thing.

bastibe · 2014-03-25T08:58:39Z

Agreed.

Closes #17

alexanderkainsky · 2021-08-19T22:16:39Z

What, then, is the proposed way to discover and load the native format? I am moving audio from a file system to a database and would like to store them in their native resolution.

bastibe · 2021-08-20T11:31:17Z

Essentially, check the subtype manually, then read with a matching dtype.

alexanderkainsky · 2021-08-28T00:38:24Z

Is there a function available for this manual check? Or do I need to inspect the header directly? Or use python's built-in io.wave package?

bastibe · 2021-08-29T11:34:41Z

subtype is a property of soundfile.SoundFile and can have any value of soundfile._subtypes.

Whistler7 mentioned this issue Mar 13, 2014

Support interface normalized to [-1,1) range? #20

Closed

bastibe assigned mgeier Mar 29, 2014

mgeier closed this as completed in fc956e9 Apr 20, 2014

mgeier added a commit that referenced this issue Dec 17, 2014

Return float64 by default

a2f67e2

Closes #17

bastibe unassigned mgeier Oct 21, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

By default, return float64, but provide options for reading different formats. #17

By default, return float64, but provide options for reading different formats. #17

mgeier commented Mar 5, 2014

bastibe commented Mar 5, 2014

mgeier commented Mar 6, 2014

bastibe commented Mar 7, 2014

mgeier commented Mar 21, 2014

bastibe commented Mar 25, 2014

alexanderkainsky commented Aug 19, 2021

bastibe commented Aug 20, 2021

alexanderkainsky commented Aug 28, 2021

bastibe commented Aug 29, 2021

By default, return float64, but provide options for reading different formats. #17

By default, return float64, but provide options for reading different formats. #17

Comments

mgeier commented Mar 5, 2014

bastibe commented Mar 5, 2014

mgeier commented Mar 6, 2014

bastibe commented Mar 7, 2014

mgeier commented Mar 21, 2014

bastibe commented Mar 25, 2014

alexanderkainsky commented Aug 19, 2021

bastibe commented Aug 20, 2021

alexanderkainsky commented Aug 28, 2021

bastibe commented Aug 29, 2021