ENH: improve dtype check in wavfile.write #18828

JuanFMontesinos · 2023-07-05T14:57:40Z

What does this implement/fix?

scipy.io.wavfile.write checks the data type incorrectly. It only verifies that the numpy array is "float" or "integer" in a generic way. However, float16 is not supported by any codec, yet the function accepts it, which leads to a silent error.
Likewise, exotic data types such as float128, int128 and so on are not supported either.

Additional information

While it's complicated to find out exactly which dtypes are supported, I rely on the ffmpeg formats, as they are a superset of scipy's.
int64 is also supported despite not appearing.
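
To make the difference between the two checks concrete, here is a minimal sketch comparing the old kind-based test with an explicit allow-list. The helper names `old_check_accepts` and `new_check_accepts` are hypothetical and exist only for this illustration; the allow-list is assumed to be the one from this PR after int8 was dropped during review, and it may not match the final code in scipy/io/wavfile.py exactly.

```python
import numpy as np

# Assumed allow-list: the one proposed in this PR, minus int8
# (removed during review). Not copied from the merged source.
ALLOWED_DTYPES = ('float32', 'float64', 'uint8', 'int16', 'int32', 'int64')


def old_check_accepts(data):
    """Hypothetical helper: the generic kind-based check that was replaced."""
    dkind = data.dtype.kind
    return (dkind == 'i' or dkind == 'f'
            or (dkind == 'u' and data.dtype.itemsize == 1))


def new_check_accepts(data):
    """Hypothetical helper: an explicit allow-list on the dtype name."""
    return data.dtype.name in ALLOWED_DTYPES


# Print both verdicts side by side for a few dtypes.
for name in ('float16', 'float32', 'int8', 'uint8', 'uint16', 'int64'):
    arr = np.zeros(4, dtype=name)
    print(name, old_check_accepts(arr), new_check_accepts(arr))
```

In this sketch float16 and int8 pass the kind-based check but are rejected by the allow-list, which is the behavior change the PR is after.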

From your references

tylerjereddy · 2023-07-05T20:22:18Z

scipy/io/wavfile.py

-                                                 data.dtype.itemsize == 1)):
+        allowed_dtypes = ['float32', 'float64',
+                          'uint8', 'int8', 'int16', 'int32', 'int64']
+        if data.dtype.name not in allowed_dtypes:


In case the team wants a regression test I noticed that this fails on main and passes here:

--- a/scipy/io/tests/test_wavfile.py
+++ b/scipy/io/tests/test_wavfile.py
@@ -414,3 +414,15 @@ def test_write_roundtrip(realfile, mmap, rate, channels, dt_str, tmpdir):
         # in PyPy; since the filename gets reused in this test, clean this up
         break_cycles()
         break_cycles()
+
+
+@pytest.mark.parametrize("dtype", [
+    np.float16,
+    ])
+def test_wavfile_dtype_unsupported(tmpdir, dtype):
+    tmpfile = str(tmpdir.join('temp.wav'))
+    rng = np.random.default_rng(1234)
+    data = rng.random((100, 5)).astype(dtype)
+    rate = 8000
+    with pytest.raises(ValueError, match="Unsupported"):
+        wavfile.write(tmpfile, rate, data)

tylerjereddy · 2023-07-05T20:24:38Z

scipy/io/wavfile.py

-        if not (dkind == 'i' or dkind == 'f' or (dkind == 'u' and
-                                                 data.dtype.itemsize == 1)):
+        allowed_dtypes = ['float32', 'float64',
+                          'uint8', 'int8', 'int16', 'int32', 'int64']


I know int8 was already allowed in the old check, but I'm wondering about it specifically: in scipy/io/tests/test_wavfile.py, above test_write_roundtrip, I see "signed 8-bit integer PCM is not allowed", and the docs have a similar statement, "Note that 8-bit PCM is unsigned."

Well, I don't know the internals, but the docs only show common data types, not all data types.
The second table is extracted from the references provided in the docs:
IBM Corporation and Microsoft Corporation, “Multimedia Programming Interface and Data Specifications 1.0”, section “Data Format of the Samples”, August 1991 http://www.tactilemedia.com/info/MCI_Control_Info.html

I need to recheck but I think I did a quick test and ffprobe was accepting the output for int8.

You are right, according to the ffprobe info it seems it's processed as a uint8. Fixed in 33b8a51
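
For callers who hold signed 8-bit samples, one possible workaround is to shift them into the unsigned range before writing, since 8-bit WAV PCM is unsigned with its midpoint at 128. The sketch below is not part of this PR; `int8_to_uint8_pcm` is a hypothetical helper name used only for illustration.

```python
import numpy as np


def int8_to_uint8_pcm(samples):
    """Hypothetical helper: map signed 8-bit samples onto unsigned 8-bit PCM.

    The offset of 128 moves the signed range [-128, 127] onto [0, 255].
    """
    samples = np.asarray(samples)
    if samples.dtype != np.int8:
        raise ValueError(f"expected int8 samples, got {samples.dtype}")
    # Widen to int16 first so that adding 128 cannot overflow int8.
    return (samples.astype(np.int16) + 128).astype(np.uint8)


signed = np.array([-128, 0, 127], dtype=np.int8)
unsigned = int8_to_uint8_pcm(signed)
print(unsigned.dtype, unsigned)
```

The resulting uint8 array is on the allow-list discussed in this thread, so it can be passed on to wavfile.write.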

j-bowhay · 2023-08-24T20:49:18Z

@JuanFMontesinos could you add the suggested regression test, please?

JuanFMontesinos · 2023-09-04T09:18:13Z

@j-bowhay

Done at 37f0d17! Although tests are failing due to an import error out of my control.

lucascolley · 2024-01-19T12:24:33Z

Hey @JuanFMontesinos , would you mind re-triggering CI here? Pushing an empty commit will do it.

JuanFMontesinos · 2024-01-19T18:09:19Z

Done!

lucascolley · 2024-01-19T19:24:07Z

Okay, the docs build should pass if you merge main into this branch. Then a maintainer can approve the GHA workflows.

[skip cirrus]

lucascolley

CI is happy, the change is simple enough and the regression test is there, so pulling this in. Thanks for your first contribution to SciPy @JuanFMontesinos ! And thanks for the test @tylerjereddy

JuanFMontesinos · 2024-02-29T13:12:25Z

@lucascolley @tylerjereddy Thank you very much, and thanks to all the developers for such nice work on SciPy.

BUG: wrongf dtype check in wavfile.write

998859f

github-actions bot added the scipy.io label Jul 5, 2023

tupui added the enhancement A new feature or improvement label Jul 5, 2023

tupui changed the title ~~BUG: wrong dtype check in wavfile.write~~ ENH: improve dtype check in wavfile.write Jul 5, 2023

tylerjereddy reviewed Jul 5, 2023

Removing int8 from allowed dtypes in wavfile.write

33b8a51

float16 dtype check io.wavfile.write

37f0d17

trigger CICD

2803214

lucascolley self-requested a review January 19, 2024 19:24

lucascolley added 2 commits February 29, 2024 11:01

Merge remote-tracking branch 'upstream/main' into JuanFMontesinos/main

4a99741

[skip cirrus]

poke ci

4096611

[skip cirrus]

lucascolley added this to the 1.13.0 milestone Feb 29, 2024

lucascolley approved these changes Feb 29, 2024

lucascolley merged commit b815895 into scipy:main Feb 29, 2024
27 checks passed
