Bug in wave normalization. #49

sweetcocoa · 2020-01-16T01:33:07Z

Hi, thank you for sharing this awesome work.

I've encountered this code below, and I dont' think this is correct.

crepe/core.py

    frames = as_strided(audio, shape=(1024, n_frames),
                        strides=(audio.itemsize, hop_length * audio.itemsize))
    frames = frames.transpose()

    # normalize each frame -- this is expected by the model
    frames -= np.mean(frames, axis=1)[:, np.newaxis]
    frames /= np.std(frames, axis=1)[:, np.newaxis]

This code is meant to normalize wave by each frame, but numpy's as_strided function does not allocate new memory, but does just create a new view of the array. (for example, the memory of first frame and the memory of second frame is not separated.)

See numpy documentation for more details.

So when do "in-place" calculation on that array, unintented result may raise. Here is an example.

np.random.seed(42)
model_srate,step_size = 16000, 10
audio = np.random.randn(16000) # (16000, )

# below code is from capre/core.py#get_activation
hop_length = int(model_srate * step_size / 1000)
n_frames = 1 + int((len(audio) - 1024) / hop_length)
frames = as_strided(audio, shape=(1024, n_frames),
                    strides=(audio.itemsize, hop_length * audio.itemsize))
frames = frames.transpose()

# normalize each frame -- this is expected by the model
frames -= np.mean(frames, axis=1)[:, np.newaxis]
frames /= np.std(frames, axis=1)[:, np.newaxis]

print(frames.sum())

146.7856165123937

Since frame is normalized, its sum should be about zero. but it's not.

To avoid this behavior, target array should be copied or something, to do operation correctly as intended. here is an example. only one line is changed. (.copy())

np.random.seed(42)
model_srate,step_size = 16000, 10
audio = np.random.randn(16000) # (16000, )

# below code is from capre/core.py#get_activation
hop_length = int(model_srate * step_size / 1000)
n_frames = 1 + int((len(audio) - 1024) / hop_length)
frames = as_strided(audio, shape=(1024, n_frames),
                    strides=(audio.itemsize, hop_length * audio.itemsize))
frames = frames.transpose().copy()

# normalize each frame -- this is expected by the model
frames -= np.mean(frames, axis=1)[:, np.newaxis]
frames /= np.std(frames, axis=1)[:, np.newaxis]

print(frames.sum())

-2.384759056894836e-13

This code is tested with numpy version 1.17.4

jongwook · 2020-01-17T23:05:11Z

Thank you for the detailed report! This is a great catch, and it's striking that the code works regardless.

I'll push the fix shortly.

angusturner · 2020-02-20T06:23:35Z

Just a quick question, was the model trained on audio normalized the old / incorrect way?

I am using a PyTorch port of this repo with your weights, and am wondering whether I should change or update the normalization code? Or if the model requires re-training with the fixed normalization?

jongwook · 2020-02-20T18:37:09Z

The pre-processing code I used for training is different from this, which did not have the normalization bug in this issue. So it shouldn't require retraining and you can use the same pre-trained model, unless you want to fine-tune the model with some other dataset.

adarob · 2020-03-25T22:06:18Z

@jongwook will you be releasing an updated pip package with this fix?

justinsalamon · 2020-03-25T22:14:44Z

Yes :) @jongwook let me know if you're slammed right now and I can make a release. Cheers!

jongwook · 2020-03-26T02:40:25Z

@adarob @justinsalamon Thanks for the heads up! Just pushed v0.0.11 :)

Btw you can use:

git+https://github.com/marl/crepe.git

in requirements.txt to pull the latest commit from Github, or:

git+https://github.com/marl/crepe.git@7f0bf0593847f38df47d2760934d585ad2f618c0

for a specific commit.

HudsonHuang · 2020-07-28T17:36:16Z

Just a quick question, was the model trained on audio normalized the old / incorrect way?

I am using a PyTorch port of this repo with your weights, and am wondering whether I should change or update the normalization code? Or if the model requires re-training with the fixed normalization?

@angusturner
Hi, I‘m looking for the pytorch version of CREPE, but found nothing. Does the "PyTorch port" you mentioned is public available? or would you mind about releasing it? Thank you very much in advance.

jongwook closed this as completed in ffb74d8 Jan 17, 2020

dhgrs mentioned this issue Mar 18, 2020

f0 in gansynth_subset is wrong magenta/ddsp#54

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bug in wave normalization. #49

Bug in wave normalization. #49

sweetcocoa commented Jan 16, 2020 •

edited

Loading

jongwook commented Jan 17, 2020

angusturner commented Feb 20, 2020 •

edited

Loading

jongwook commented Feb 20, 2020

adarob commented Mar 25, 2020

justinsalamon commented Mar 25, 2020

jongwook commented Mar 26, 2020 •

edited

Loading

HudsonHuang commented Jul 28, 2020

Bug in wave normalization. #49

Bug in wave normalization. #49

Comments

sweetcocoa commented Jan 16, 2020 • edited Loading

jongwook commented Jan 17, 2020

angusturner commented Feb 20, 2020 • edited Loading

jongwook commented Feb 20, 2020

adarob commented Mar 25, 2020

justinsalamon commented Mar 25, 2020

jongwook commented Mar 26, 2020 • edited Loading

HudsonHuang commented Jul 28, 2020

sweetcocoa commented Jan 16, 2020 •

edited

Loading

angusturner commented Feb 20, 2020 •

edited

Loading

jongwook commented Mar 26, 2020 •

edited

Loading