Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support files with frame rates over 48KHz and don't use temporary filenames #262

Merged
merged 32 commits into from May 24, 2018

Conversation

antlarr
Copy link
Contributor

@antlarr antlarr commented Mar 20, 2018

This allows pydub to open files having more than 48KHz
and/or 32-bit data if scipy is available.

If scipy is not available, it falls back to using the standard
wave module as before.

Also, add a new from_file function that does all the reading on memory with
pipes, not using any temporary file, which is faster and doesn't wear
down disks for heavy usages.

The new from_file function reads the input file and passes it to ffmpeg
using a pipe and then reads ffmpeg output using another pipe directly
to memory.

Since wav files have the file length in the header and ffmpeg can't
write it since it's working on a stream, we modify the resulting raw data
from ffmpeg before reading it using the standard method.

Also, rename from_file to from_file_using_temporary_files just in case there's
any case in which the new from_file function doesn't work (I couldn't find any,
but just in case, I guess it would be nice to keep it maybe as
deprecated).

Fixes #134
Fixes #237
Might also fix #209

@jiaaro
Copy link
Owner

jiaaro commented Mar 22, 2018

This is great!

Would you mind adding some short clips to the test data at these higher sample rates and a few short tests verifying they can be read and exported correctly?

Also, if you pull from master it should fix your test failures here.

@antlarr
Copy link
Contributor Author

antlarr commented Mar 22, 2018

Sure, I'll add some tests for high sample rates in the following days. I haven't worked on export yet since I don't need to export audio on my workflow with pydub, but I'll try to work on that too.

Btw, I just noticed you just dropped python 2.6 support. The python2 travis build is failing because I tested on python2.7 (where struct.unpack_from accepts a bytearray parameter) and I was wondering how to test such an old python version to fix it. So dropping 2.6 support is good news :)

This allows pydub to open files having more than 48KHz
and/or 32-bit data.

If scipy is not available, falls back to using the standard
wave module as before.

Fixes jiaaro#134
…m_file

Rename from_file to from_file_using_temporary_files just in case there's
any case in which the new from_file doesn't work (I couldn't find any,
but just in case, I guess it would be nice to keep it maybe as
deprecated).

Add a new from_file function that does all the reading on memory with
pipes, not using any temporary file, which is faster and doesn't wear
down disks for heavy usages.

The new from_file function reads the input file and passes it to ffmpeg
using a pipe and then reads ffmpeg output using another pipe directly
to memory.

Since wav files have the file length in the header and ffmpeg can't
write it since it's working on a stream, we modify the resulting raw data
from ffmpeg before reading it using the standard method.

Fixes jiaaro#237
Might also fix jiaaro#209
Instead of always using pipes to feed ffmpeg the input file,
if we have a file on disk, just let ffmpeg open and read it by itself.

ffmpeg needs to be able to seek on certain formats to read them, so
we can't use pipes. This commit fixes reading those formats.
scipy only supports wav files with a bit depth in (8, 16, 32, 64, 96, 128)
so in other cases (most notably, 24), it raises a ValueError("Unsupported "
" bit depth: the wav file has {}-bit data.".format(bit_depth)) exception.

Also, When the data doesn't start with the standard RIFF header, scipy
raises a ValueError("File format {}... not understood.".format(repr(data)))
exception.

In those cases, as well as when scipy is not available, fallback to
using the python standard wave module, just in case it does support
reading that data (for some cases of 24 bit depth data, it does).
avconv doesn't seem to set the return code correctly and might return no
data, print an error message and still give a return code of 0, so we
handle having "no data" also as an error
It seems avconv is lazier than ffmpeg when initializing the wav headers.
ffmpeg initialized the data subchunk header and only the RIFF chunk
had to be tampered, but avconv also leaves the data subchunk header
uninitialized, so we also have to set it to the correct value.
Running the tests under python3 shows over 60 warnings like
ResourceWarning: unclosed file <_io.BufferedRandom name='/tmp/tmpwff3iqay.mp3'>

With this commit, the warnings are reduced to 33.
As opposed to unpack, which works on strings in python2
@yelantf
Copy link

yelantf commented Mar 23, 2018

Awesome! So could we have a new release of this version in pip as soon as possible?

@antlarr
Copy link
Contributor Author

antlarr commented Mar 23, 2018

Glad that you like it, but note I still have to add tests and export support :)

@jiaaro
Copy link
Owner

jiaaro commented Mar 23, 2018

I will be happy to issue a release as soon as it's ready :)

The current mediainfo function mixes information from different streams so
mediainfo_json is a replacement which just asks avprobe/ffprobe to
return the information as a json dictionary and parses it, thus giving
separate information for all streams

Additionally, avprobe doesn't return as much information as ffprobe on
the json data, so we extract that missing information from its stderr
output.

Finally, also, remove an unneeded import from the code of mediainfo.
By default, avconv/ffmpeg generates pcm_s16le data when converting
to a wav file, even if the input is using 24/32 bits. For this
reason, we have to find out the bits_per_sample of the input data
and then request the same value in the output file.

Note that the wave module doesn't support a frequency over 44 KHz
and scipy doesn't support 24 bits, so we convert to 32 bits in that
case (sample_fmt would be 's32' while bits_per_sample would be 24)
to not lose any information.
Instead of using sample_fmt, we just always generate
the codec name from the bits_per_sample value (and
hardcode it to be signed and little-endian)
ffmpeg might output text not using utf-8, in that case
we shouldn't be parsing it anyway, so we just set decode
to ignore the wrong encoded characters.
So let's not try to parse the output if there's no output from ffmpeg.
This can happen for example for non-existant files.
For format_test.m4a, ffprobe is returning:
    Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 63 kb/s (default)
So in order to get the stream id, we have to split on '(' and ')' characters too
Test files in 192kHz having 16, 24, 32 and 64 bit depth
I also added a 192kHz/32bit flac file for completion,
since it's a common format for that kind of files.
There are certain file formats which use float samples, so now we use 32
bits for those, and thus, require scipy to use those tests (using m4a, ogg,
webm extensions).
The new 192kHz tests require scipy, but also other tests that worked
before, just because ffmpeg/avconv converted files automatically to
16 bits samples, but now, if the file format uses float samples,
we use 32 bit samples.
3.3.1 is not <= 3.3, so better compare for < 3.4
It seems the output of avprobe is slightly different in ubuntu
than what I'm getting in openSUSE, so let's try if we can parse
it like this.
@Crojav
Copy link

Crojav commented Apr 9, 2018

Would like to help you with testing. I benefit from getting it working. I work with High Rez audio from 24bit 48000 96000 176400.

I also deal with DSF audio files
dsf2flac -i 01_OpeningSong.dsf -r176400 - b24 -o 01_OpeningSong.flac

The Flac 24bit 48000 96000 176400 files converted with Ffmpeg and Sox to Wav give all the Error: unknown format: 65534 of wave.py. As determined by @antlarr: By default, avconv / ffmpeg generates pcm_s16le data when converting to a wav file, even if the input is using 24/32 bits.

The working solution is:

For dsf to flac:
dsf2flac -i 01_OpeningSong.dsf -r176400 - b24 -o 01_OpeningSong.flac
https://code.google.com/archive/p/dsf2flac/wikis/TestingRev3.wiki

For flac to wav
flac -d 01_OpeningSong.flac give you 01_OpeningSong.wav and accepted by wave.py

@antlarr
Copy link
Contributor Author

antlarr commented Apr 9, 2018

Hi @Crojav, I don't have any dsf file to test, but it would be nice if you could test my changes with your files. Please, check that you have scipy installed, as that's required for >48kHz support.

Also, there are file formats that use a float as sample_fmt (like m4a or ogg vorbis). Those formats were supported because ffmpeg automatically converted them to 16 bits, but are actually 32 bits. My patches currently convert them to pcm_s32le (or pcm_s64le in case of samples using double as sample_fmt) which I think can be improved in the future to use actual float/double values, but for now I think it's already an improvement.

The problem is that this means that to open those formats, now we require scipy. @jiaaro, is that ok for you?

@antlarr
Copy link
Contributor Author

antlarr commented Apr 9, 2018

Btw, I don't know why the appveyor tests don't seem to honor the skipIf decorators (or maybe it's check_module_availability which is failing), but I don't have any windows machine to test, so I'd need help with that. Also, I wouldn't know how to install scipy on windows (is just using pip ok?) which would fix it, but still, it would be nice to get the skipIf decorators working (as it does on Linux)

numpy.ndarray.tobytes was included in 1.9.0, so in case it's not
available, just use bytes(array)
This way we don't require any external dependency and still can
read all kind of bit depths and frame rates.

I tested that all format fields and raw_data read using the scipy/wave
modules and this new function are the same for test data.

Since we now don't require scipy for >48kHz files anymore, I removed all
conditions to skip tests, since we can run all of them without any condition.
This way we treat the same all kind of data and recognize
24/32 bit audio either if it's read from a file or if it's
stored in memory.
@Crojav
Copy link

Crojav commented Apr 16, 2018

@antlarr @jiaaro

Test with 24b - 32b / 44100 48000 88000 96000 176400 192000 hz samples

trim_01a_32-44100.wav play
trim_02a_24-48000.wav play
trim_03a_32-48000.wav play

trim_03d_32-176400.wav wave.Error: unknown format: 65534
trim_03c_32-96000.wav wave.Error: unknown format: 65534
trim_03e_32-192000.wav wave.Error: unknown format: 65534
trim_02b_24-88200.wav wave.Error: unknown format: 65534
trim_02d_24-176400.wav wave.Error: unknown format: 65534
trim_02c_24-96000.wav wave.Error: unknown format: 65534
trim_02e_24-192000.wav wave.Error: unknown format: 65534
trim_03b_32-88200.wav wave.Error: unknown format: 65534

@antlarr
Copy link
Contributor Author

antlarr commented Apr 16, 2018

@Crojav: Please, make sure you're using the latest code from my fork or wait until this pull request is accepted to test it.

There are certain ffprobe versions (mostly on windows) that show
that mp3/mp4/aac/webm/ogg files contain fltp samples. But afaik,
those formats always use 16 bit samples, so this commit add a
workaround to force those cases to use 16 bit depth.

If for some reason, those format contain s24 or s32 samples, they'll
be used correctly, but until ffprobe is "fixed", let's just convert fltp
samples to s16le.
@Crojav
Copy link

Crojav commented Apr 16, 2018

@antlarr @jiaaro

O.k now the latest code from @antlarr fork:

01a_32-44100.wav not playing
02a_24-48000.wav distord
02b_24-88200.wav distord
02c_24-96000.wav not playing
02d_24-176400.wav distord
02e_24-192000.wav distord
03a_32-48000.wav distord
03b_32-88200.wav not playing
03d_32-176400.wav distord
03e_32-192000.wav distord
03c_32bit-96000.wav not playing

@antlarr
Copy link
Contributor Author

antlarr commented Apr 16, 2018

Thanks for testing! Exactly how are you processing those files in your tests? Can you send me those files (or a download link) to test them myself? Are you testing on linux or windows? if linux, what distribution? using ffmpeg or avconv?

@Crojav
Copy link

Crojav commented Apr 16, 2018

@antlarr

https://github.com/Crojav/pytori/tree/master/test-wav_24b-32b

https://github.com/Crojav/pytori/blob/master/pytori-playlist-pitch.py

I am not aware of using ffmpeg or avconv for playing the test-wav_24b-32b

ffmpeg is installed

LinuxMint

os.fsdecode was introduced in python 3.2, so I implemented fsdecode in
utils.py to be able to use it on previous versions. fsdecode thus
returns the filename passed as parameter as a unicode string (either
if it's str or bytes), and if a PathLike object is used, it returns
its filename. In other case, it raises a TypeError exception.

Changed code in from_file and mediainfo_json to use it, which simplifies
the code a bit and hopefully fixes the last windows appveyor error.
@antlarr
Copy link
Contributor Author

antlarr commented Apr 17, 2018

@Crojav, I did a quick test opening the files and it seems they're read correctly. Your script doesn't work since you first open them with stdlib's wave module but it raises an "Error: unknown format: 65534" exception, so it doesn't even reach the code that uses pydub.

@Crojav
Copy link

Crojav commented Apr 17, 2018

@antlarr

O.k good to know! I will leave out Wave to get framerate sampwidth and handle this with Scipy

@jiaaro jiaaro merged commit 149a81f into jiaaro:master May 24, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reading audio requires a copy Permission denied Support audio with frame rates over 48k
4 participants