Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate proper test WAV files #55

Merged
merged 3 commits into from
Jul 15, 2014
Merged

Generate proper test WAV files #55

merged 3 commits into from
Jul 15, 2014

Conversation

mgeier
Copy link
Contributor

@mgeier mgeier commented Jul 10, 2014

Originally, I wanted to avoid creating the test files with the system under test, this just didn't seem right.
Therefore, I created the test WAV files with scipy.io.wavfile.write().

Recently, I found out that the 32-bit floating point WAV file is malformed.
See libsndfile/libsndfile#70 and scipy/scipy#3778.

I tried to find an alternative way to generate a floating point WAV file, but it's surprisingly hard to find something that doesn't involve libsndfile.

Sox produces correct files, but I didn't find a way to write specific floating-point values (like 1.0 and 0.75) to a file.

Is there a better way to create our test WAV files?

If not, I guess we'll have to create them with PySoundFile.

Another possibility would be to use only PCM test files (which can be generated correctly with many tools), but then we would have to add dtype='float16' in many places in the test code, which I originally wanted to avoid.

@bastibe
Copy link
Owner

bastibe commented Jul 10, 2014

A simple solution would be to use MAT files instead. The disadvantage is that MAT files do not have a sample rate or a number of channels. This is probably not important for the tests though.

@mgeier
Copy link
Contributor Author

mgeier commented Jul 10, 2014

I managed to create a (hopefully) proper WAV file with sox (see db6e428).

Appending data still doesn't work correctly.
Strangely, the behavior with a file-like object is different from the other two modes.
But probably that's an error in the fixtures ...?

For some inexplicable reason I get these values after appending in test_rw_append_data():

[[  0.00000000e+00   2.27795078e-41]
 [  6.86972559e-41   2.26898247e-41]
 [  6.86075728e-41   2.26001416e-41]
 [  6.85178897e-41   2.24207754e-41]
 [  6.83385235e-41   2.26001416e-41]
 [  6.85178897e-41   2.25104585e-41]
 [  6.84282066e-41   2.24207754e-41]
 [  6.83385235e-41   2.22414092e-41]]

Could you please have a look?

@bastibe
Copy link
Owner

bastibe commented Jul 11, 2014

This file was formatted quite weirdly.

  • It had a non-standard fmt chunk length, thus data was misaligned. This is probably the reason for the weird values you were seeing.
  • It had a non-standard fact chunk before the data chunk (more details here). This chunk was probably no problem for libsndfile, but it is not necessary either.

I removed the fact chunk and truncated the fmt chunk to its default length of 16 bytes (and updated the file length entry accordingly). The file is now read correctly by sndfile and seems to contain correct data. However, the tests still don't pass for some other reason.

It's been a while since I last modified binary data with a hex editor!

@mgeier
Copy link
Contributor Author

mgeier commented Jul 11, 2014

Thanks for shedding some more light on this.

I removed the fact chunk

But according to Erik, the 'fact' chunk is necessary for float WAVs: libsndfile/libsndfile#70

and truncated the fmt chunk to its default length of 16 bytes (and updated the file length entry accordingly).

How did you do that?

I think it's not good to have an undocumented binary blob in the repository.
That's why I added the script tests/generate_soundfiles.py to document how the test files were created.

I never thought it was so hard to get correct floating-point WAV files!
I only want a way to generate a floating-point WAV with the given exact values using only open-source tools except libsndfile.

We can probably create our simple test files with Python and the struct module, but I don't have enough knowledge about the WAV format to do that.

However, the tests still don't pass for some other reason.

This is the original error (because of the missing 'fact' chunk) which I tried to overcome by converting the malformed file with sox.
But obviously sox did also some strange things ...

@bastibe
Copy link
Owner

bastibe commented Jul 11, 2014

How did you do that?

Using a hex editor. RIFF/WAV files are pretty simple. The chunk length of the fmt chunk was set to 0x12 for some reason, where it should be 0x10 normally (the last two bytes were empty). This shifted the whole content by two bytes, and thus the data chunk and its four-byte floats were misaligned.

I can put that chunk back if you want, and only truncate the fmt chunk to its correct length and alignment.

Seeing how difficult this is, I again propose to use MAT files instead.

@mgeier
Copy link
Contributor Author

mgeier commented Jul 12, 2014

I can put that chunk back if you want, and only truncate the fmt chunk to its correct length and alignment.

Yes please, it would be interesting to see if the tests work then.

However, I think it would be much nicer if we could provide a script that generates our test files.

I don't really know the different parts of a WAV file, but if you can tell me what byte (or group of bytes) means what, I could create a script that creates our two WAV files (and one RAW file) using the struct module.
Then we wouldn't even need SciPy or sox or any other external library or program to generate the test files.

Seeing how difficult this is, I again propose to use MAT files instead.

I don't really like MAT files, because they are strongly associated with a certain proprietary software environment. And the whole reason why I'm supporting PySoundFile (and scientific Python in general) is to get away from it. Using MAT files now would feel like a personal defeat.
I know that WAV also has a proprietary background, but nowadays it is so widely used that this is hardly recognizable anymore.

Also, I don't see MAT as a proper sound file format. Of course it can be used like that (and libsndfile supports that), but to me this seems like an abuse of the MAT format.

BTW, I just tried saving a MAT with PySoundFile (I've never done that before), and to my surprise the channels are stored along the rows!

Are you using MAT files like that?

@bastibe
Copy link
Owner

bastibe commented Jul 14, 2014

Wave is a really simple format. At a basic level, it is a RIFF file.

Riff files start with the letters 'RIFF' in ASCII, followed by the remaining file length as a 32 bit integer. Since the RIFF header is 8 bytes long, this is the file length minus 8.

char[4] riff = 'RIFF';
uint32_t file_length = 80;

Next, the file type has to be specified:

char[4] format = 'WAVE';

After that, RIFF files contain one or more named chunks. Each named chunk starts with four characters as chunk identifier, then the chunk length as 32 bit integer (again excluding the chunk header itself), then the chunk content.

For WAVE files, the first chunk is the 'fmt ' chunk (more info here and Wikipedia (DE)):

char[4] chunk_name = 'fmt '; // note the space
uint32_t chunk_len = 16;
uint16_t data_type = 3; // Float Data; 1 would be PCM
uint16_t channels = 2;
uint32_t samplerate = 44100;
uint32_t bytes_per_second = samplerate*channels*4;
uint16_t framesize = 8;
uint16_t bits_per_sample = 32;

As far as I know, the next chunk is optional. At least, I have never seen it before this PM (documentation here). The linked website says that this should contain the number of samples, but sox put a 4 here, which would be number of frames:

char[4] chunk_name = 'fact';
uint32_t chunk_len = 4;
uint32_t number_of_frames = 4; // or 8?

Lastly, the data itself:

char[4] chunk_name = 'data';
uint32_t chunk_len = framesize*4;
float[8] data = { 1.0, -1.0,
                   0.75, -0.75,
                   0.5, -0.5,
                   0.25, -0.25 };

The fact chunk does not seem to contain important information for a float file. I don't think it is necessary. In the original sox file, the fmt chunk length was set to 18, which leaves the rest of the file aligned to 2 bytes. Since the file later contains 4-byte data, this violates alignment rules for C structures (N-byte data must be N-byte aligned).

@mgeier
Copy link
Contributor Author

mgeier commented Jul 14, 2014

Thanks for the detailed listing!
This way I didn't have to read tons of documentation ... I changed the file generation script to write the test files using the struct module, now there are no external dependencies.
See 763a7ae.

I think having this script to generate the test files is better than just uploading un-documented binary files.

The generated WAV-files are bit-wise identical to the ones we had before (with your manual changes) and the tests finally pass.

Whether or not the 'fact' chunk is obligatory for float WAVs seems to be a grey area in the specification.

I guess this is the official "standard" document: http://download.microsoft.com/download/9/8/6/9863C72A-A3AA-4DDB-B1BA-CA8D17EFD2D4/RIFFNEW.pdf

On page 12 it says:

The fact chunk is required for all new WAVE formats. The chunk is not required for the
standard WAVE_FORMAT_PCM files.

The question is whether WAVE_FORMAT_IEEE_FLOAT is considered a "new WAVE format".
I guess it is.

The information stored within the 'fact' chunk is of course completely redundant in the case of an uncompressed file, but the "standard" seems to require this information nevertheless.

The meaning of the first number within the 'fact' chunk is also a little confusing. The first part ...

... represents the length of the data in samples.

... doesn't help to much, but the second part clarifies things:

[The sample rate] is used in conjunction with the [first field in the 'fact' chunk] to determine the length of the data in seconds.

... therefore we must be talking about the number of frames.

If there are no objections, I'd like to combine the last few (temporary) commits into one proper commit and then merge this PR.

@bastibe
Copy link
Owner

bastibe commented Jul 14, 2014

Sounds good!

Generating the files by hand using the struct module has the advantage that we can control their content very precisely without relying on any third-party tool!

Come to think of it, we could use this fact to test the write methods of pysoundfile as well. But this is a discussion for another day/PR.

@mgeier
Copy link
Contributor Author

mgeier commented Jul 15, 2014

I combined some commits to a new commit 37756c8.

Now that we don't have external dependencies, we could also create the test files on-the-fly before running the tests (e.g. with setup_module() or with a module-scope fixture).
But that's probably too much?
I think it may be helpful to have the actual WAV-files in the repository for reference.

Come to think of it, we could use this fact to test the write methods of pysoundfile as well. But this is a discussion for another day/PR.

That's what I wanted to do originally.
I tried to compare the generated file with an existing file using filecmp.cmp(), but the problem was that libsndfile added a 'peak' chunk which wasn't there in the original file (which was generated by SciPy).

Now this should be possible, we just have to add a 'peak' chunk to our float test file.

I created a new issue for that: #56.

If the changes in this PR are OK (and if you think we shouldn't create the test files on-the-fly), please merge.

bastibe added a commit that referenced this pull request Jul 15, 2014
Generate proper test WAV files
@bastibe bastibe merged commit 1828cd0 into master Jul 15, 2014
@mgeier mgeier deleted the rw-problem branch July 15, 2014 11:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants