Downmixing to mono behaves differently depending on whether FFMPEG is used for audio loading #508

Open
fdlm opened this issue Oct 11, 2022 · 0 comments


fdlm commented Oct 11, 2022

Expected behaviour

When loading a stereo audio file and downmixing it to mono, I expect the resulting amplitudes to not depend on the audio file format, but only on the content.

Actual behaviour

Currently, if a wave file has the same sample type as the one requested when loading, madmom uses scipy to load it; then, to downmix the signal to mono, it uses its own madmom.audio.signal.remix function, which computes the arithmetic mean of the channels.

If there is a mismatch in sample types (e.g. the file is stored as float32 but loaded as float, or stored as 16-bit integers and loaded as float), madmom uses ffmpeg to load the file and, in the same step, to downmix it to mono.

Now, the downmixing logic of ffmpeg apparently scales the result by a factor of 2 / sqrt(2) (about 1.414) relative to the arithmetic mean of the channels. This results in different amplitudes depending on which loader was used.
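
To illustrate the size of the mismatch without needing an audio file, here is a minimal NumPy sketch comparing the two downmix rules on synthetic stereo data. The function names `downmix_mean` and `downmix_sqrt2` are hypothetical and are not part of madmom or ffmpeg; the second rule, (L + R) / sqrt(2), is only an assumption inferred from the ratio observed in this issue, not a confirmed description of ffmpeg's internals.

```python
import numpy as np


def downmix_mean(stereo):
    """Arithmetic mean of the channels: (L + R) / 2."""
    return stereo.mean(axis=1)


def downmix_sqrt2(stereo):
    """Assumed ffmpeg-like rule (inferred from this issue): (L + R) / sqrt(2)."""
    return stereo.sum(axis=1) / np.sqrt(2)


# Synthetic stereo signal with shape (num_samples, num_channels)
rng = np.random.default_rng(0)
stereo = rng.uniform(-1.0, 1.0, size=(1000, 2))

mono_mean = downmix_mean(stereo)
mono_scaled = downmix_sqrt2(stereo)

# Same statistic as in the reproduction steps: median of the sample-wise ratio
ratio = np.nanmedian(mono_mean / mono_scaled)
print(ratio)  # approximately 0.7071, i.e. sqrt(2) / 2
```

Under this assumption, multiplying the mean-based signal by 2 / sqrt(2) reproduces the other signal exactly, which matches the second check in the reproduction steps.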

Steps needed to reproduce the behaviour

import madmom
import numpy as np

# chirp.wav is stored as stereo 32-bit float
read_wave = madmom.io.load_audio_file('chirp.wav', num_channels=1, dtype=np.float32)[0]
# np.float was an alias for the builtin float and was removed in NumPy 1.24
read_ffmpeg = madmom.io.load_audio_file('chirp.wav', num_channels=1, dtype=float)[0]

print(np.nanmedian(read_wave / read_ffmpeg))  # 0.7071...
print(np.nanmedian((2 * read_wave / np.sqrt(2)) / read_ffmpeg))  # 1.0

Information about installed software

madmom master branch
ffmpeg version 4.4.2-0ubuntu0.22.04.1