Conversation

mthrok
Contributor

@mthrok mthrok commented Feb 27, 2020

This PR follows up #366 and adds tests for InverseMelScale (and MelScale) for librosa compatibility.

For the MelScale compatibility test:

  1. Generate a spectrogram
  2. Feed the spectrogram to a torchaudio.transforms.MelScale instance
  3. Feed the spectrogram to the librosa.feature.melspectrogram function.
  4. Compare the results from 2 and 3 element-wise.
    Element-wise numerical comparison is possible because, under the hood, both implementations use the same algorithm.
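    Both implementations build their mel filterbanks from the same HTK mel formula (the test passes htk=True to librosa), which is why element-wise agreement can be expected. A pure-Python sketch of that conversion (the helper names here are illustrative, not the libraries' actual functions):

```python
import math

# HTK mel scale shared by torchaudio and librosa (with htk=True);
# hz_to_mel / mel_to_hz are hypothetical names for illustration only.
def hz_to_mel(freq_hz):
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

# round-trip sanity check: converting back and forth recovers the frequency
assert abs(mel_to_hz(hz_to_mel(1000.0)) - 1000.0) < 1e-6
```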

The InverseMelScale compatibility test is more elaborate than that.

  1. Generate the original spectrogram
  2. Convert the original spectrogram to Mel scale using a torchaudio.transforms.MelScale instance
  3. Reconstruct the spectrogram using the torchaudio implementation
    3.1. Feed the Mel spectrogram to a torchaudio.transforms.InverseMelScale instance and get the reconstructed spectrogram.
    3.2. Compute the element-wise P1 distance between the original spectrogram and that from 3.1.
  4. Reconstruct the spectrogram using librosa
    4.1. Feed the Mel spectrogram to the librosa.feature.inverse.mel_to_stft function and get the reconstructed spectrogram.
    4.2. Compute the element-wise P1 distance between the original spectrogram and that from 4.1. (This is the reference.)
  5. Check that the resulting P1 distances are in roughly the same range.
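The comparison in steps 3–5 can be sketched in plain Python (toy vectors standing in for the spectrograms; the threshold factor of 2 is illustrative, not the value used in the test):

```python
def p1_distance(a, b):
    # sum of element-wise absolute differences,
    # i.e. what torch.dist(a, b, p=1) computes
    return sum(abs(x - y) for x, y in zip(a, b))

# toy stand-ins for the original and the two reconstructed spectrograms
spec_orig = [1.0, 2.0, 3.0, 4.0]
spec_ta = [1.1, 1.8, 3.2, 4.1]   # torchaudio-style reconstruction
spec_lr = [0.9, 2.1, 2.9, 4.2]   # librosa-style reconstruction (reference)

dist_ta = p1_distance(spec_orig, spec_ta)
dist_lr = p1_distance(spec_orig, spec_lr)

# the test only requires the two reconstruction losses
# to be in roughly the same range, not to match element-wise
assert dist_ta < 2.0 * dist_lr
```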

Element-wise numerical comparison is not possible because different algorithms are used to compute the inverse; the reconstructed spectrograms can contain values that differ greatly in magnitude. Therefore, the strategy here is to check that the P1 distance (reconstruction loss) is not far from the value obtained with librosa. For this purpose, the threshold was chosen empirically:

```
print('p1 dist (orig <-> ta):', torch.dist(spec_orig, spec_ta, p=1))
print('p1 dist (orig <-> lr):', torch.dist(spec_orig, spec_lr, p=1))
>>> p1 dist (orig <-> ta): tensor(1482.1917)
>>> p1 dist (orig <-> lr): tensor(1420.7103)
```

This value varies with the length and kind of signal being processed, so it was hand-picked.

Closes #366

@mthrok
Contributor Author

mthrok commented Feb 27, 2020

@vincentqb Once we settle on the test, we can do:

  1. Squash merge InverseMelScale Implementation #366 to master
  2. Rebase this branch onto master
  3. Squash merge this.
    That way, @jaeyeun97's credit is preserved properly.

@mthrok
Contributor Author

mthrok commented Feb 28, 2020

For the record:
the following is what the reconstructed spectrograms look like in the test_InverseMelScale test.
As mentioned in my comment and in the original PR, these values are mostly close, but in some places they can be extremely different.

test_InverseMelScale

Code for the plot:

```python
# based on https://gist.github.com/jaeyeun97/8651dff509d5b084636ac6c3a7547108
# thanks @jaeyeun97

def _plot_melspecs(sample_rate, spec_original, spec_ta, spec_lr):
    def log_mag(spec):
        # convert to dB relative to the peak, floored at -80 dB
        ref = spec.max().clamp(1e-16).log10_()
        spec = spec.clamp(1e-16).log10().sub_(ref).mul_(20).clamp(min=-80)
        return spec.squeeze().numpy()

    spec_original = log_mag(spec_original)
    spec_ta = log_mag(spec_ta)
    spec_lr = log_mag(spec_lr)

    import librosa.display
    import matplotlib
    matplotlib.use('TkAgg')
    import matplotlib.pyplot as plt

    plt.figure(figsize=(20, 10))

    plt.subplot(3, 1, 1)
    librosa.display.specshow(spec_original, sr=sample_rate)
    plt.title('Original')

    plt.subplot(3, 1, 2)
    librosa.display.specshow(spec_lr, sr=sample_rate)
    plt.title('Librosa')

    plt.subplot(3, 1, 3)
    librosa.display.specshow(spec_ta, sr=sample_rate)
    plt.title('Torchaudio')
    plt.savefig('test_InverseMelScale.png')
```

jaeyeun97 and others added 5 commits February 28, 2020 14:01
@vincentqb
Contributor

As you noticed, the test is failing.

```
        # This threshold was choosen empirically, based on the following observations
        #
        # torch.dist(spec_orig, spec_ta, p=1)
        # >>> tensor(1482.1917)
        # torch.dist(spec_orig, spec_lr, p=1)
        # >>> tensor(1420.7103)
        # torch.dist(spec_lr, spec_ta, p=1)
        # >>> tensor(881.7889)
>       assert torch.dist(spec_orig, spec_ta, p=1) < threshold
E       AssertionError: assert tensor(1658.8206) < 1500.0
E        +  where tensor(1658.8206) = <built-in method dist of type object at 0x7fe5e2937880>(tensor([[[1.5051e+00, 7.9520e-01, 3.0737e-01,  ..., 6.0686e-01,\n          8.8888e-01, 2.8056e+00],\n         [7.7557e-0...e-04, 2.0129e-03],\n         [2.0623e-04, 4.6134e-05, 6.3226e-05,  ..., 6.4552e-05,\n          1.0691e-03, 2.0032e-03]]]), tensor([[[0.3844, 0.4208, 0.1461,  ..., 0.9873, 0.7092, 0.8993],\n         [0.7756, 1.1142, 0.9477,  ..., 0.8303, 1.985...0466, 0.0493, 0.0557,  ..., 0.0246, 0.0529, 0.0000],\n         [0.8467, 0.9353, 0.5868,  ..., 0.6988, 0.8108, 0.9290]]]), p=1)
E        +    where <built-in method dist of type object at 0x7fe5e2937880> = torch.dist

test/test_transforms.py:620: AssertionError
```

```
        self.assertTrue(torch.allclose(computed, expected))

    def test_batch_InverseMelScale(self):
        n_fft = 8
```
Contributor

I'm wondering if it makes sense to vary some of these parameters a bit in a few subsequent tests. In particular to also include edge cases to see what the error behavior is. Unless this is verified through other tests.

Contributor Author

I agree with running the tests with multiple parameters.
However, I think that can be accomplished better by reorganizing the whole test suite.

Right now the Tester class contains all kinds of tests (batch tests, TorchScript tests, librosa compatibility tests, etc.), and it was hard to tell what type of test I should add and where to add it.
Creating separate test suites for the different kinds of tests will make this much easier.
So I would add that kind of parameterized test later, and add similar ones to the existing tests.
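As a sketch of what such parameterization could look like using the standard library alone (hypothetical test names; real tests would run the actual transforms):

```python
import unittest

class ParameterizedShapeTest(unittest.TestCase):
    # hypothetical sketch: iterate over several parameter sets with subTest
    # so each combination is reported separately on failure
    def test_param_combinations(self):
        for n_fft, n_mels in [(400, 128), (512, 64), (1024, 80)]:
            with self.subTest(n_fft=n_fft, n_mels=n_mels):
                n_freq = n_fft // 2 + 1
                # sanity: the mel filterbank must not have more bands
                # than frequency bins
                self.assertGreater(n_freq, n_mels)
```

Run with `python -m unittest` (or pytest) as usual; subTest keeps each parameter combination visible in the failure report.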

```
            S=spec_lr, sr=sample_rate, n_fft=n_fft, hop_length=hop_length,
            win_length=n_fft, center=True, window='hann', n_mels=n_mels, htk=True, norm=None)
        # Note: Using relaxed rtol instead of atol
        assert torch.allclose(melspec_ta, torch.from_numpy(melspec_lr[None, ...]), rtol=1e-3)
```
Contributor

using self.assertTrue might yield a nicer message if this fails.

In the future, and in a separate PR, we might want to look into introducing some of the unittest extensions that PyTorch implements, which would enable things such as self.assertAllClose and also do torch.Tensor-specific checks such as dtype, memory layout, etc. allclose might do upcasting, broadcasting, etc., but we actually care that those properties match. cc @vincentqb
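To make the rtol/atol semantics concrete, here is the criterion allclose-style functions use, sketched in plain Python on flat lists (a stand-in for illustration, not the real torch helper):

```python
def all_close(a, b, rtol=1e-5, atol=1e-8):
    # stand-in for torch.allclose on flat lists of floats:
    # element-wise check of |x - y| <= atol + rtol * |y|
    if len(a) != len(b):
        return False  # a stricter helper would also check dtype, layout, etc.
    return all(abs(x - y) <= atol + rtol * abs(y) for x, y in zip(a, b))

# a relaxed rtol tolerates errors that scale with the reference magnitude,
# which suits spectrogram values spanning many orders of magnitude
assert all_close([1000.001], [1000.0], rtol=1e-3)
assert not all_close([1.001], [1.0])
```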

Contributor Author

using self.assertTrue might yield a nicer message if this fails.

I found the opposite. Using assertTrue on torch.allclose only says:

```
>       self.assertTrue(torch.allclose(spec_ta, spec_lr, atol=threshold))
E       AssertionError: False is not true

test/test_transforms.py:618: AssertionError
```

whereas assert says:
(although this is still hard to read due to the combination of multi-line messages and pytest's annotations)

```
>       assert torch.allclose(spec_ta, spec_lr, atol=threshold)
E       AssertionError: assert False
E        +  where False = <built-in method allclose of type object at 0x121552eb0>(tensor([[[0.8752, 0.8655, 0.6858,  ..., 0.7232, 0.3609, 0.2115],\n         [0.7756, 1.1142, 0.9477,  ..., 0.8303, 1.985...0338, 0.0434, 0.0437,  ..., 0.0581, 0.0294, 0.0445],\n         [0.4310, 0.7263, 0.4167,  ..., 0.1131, 0.5628, 0.8183]]]), tensor([[[0.0000e+00, 0.0000e+00, 0.0000e+00,  ..., 0.0000e+00,\n          0.0000e+00, 0.0000e+00],\n         [7.7557e-0...e-04, 2.5366e-04],\n         [3.0709e-11, 5.1357e-12, 3.0357e-12,  ..., 0.0000e+00,\n          4.2634e-11, 1.0426e-10]]]), atol=1.0)
E        +    where <built-in method allclose of type object at 0x121552eb0> = torch.allclose

test/test_transforms.py:618: AssertionError
```

Contributor Author

Combined with your comment on parameterized tests, I think reorganizing the test structure and using PyTorch's helper functions to show a good example of how to write a test will be of great benefit for all developers.

Contributor

@vincentqb vincentqb left a comment

The broader discussion about testing is good to have. I'd say we should definitely reorganize the tests, use standard PyTorch tools for this, and open issues upstream to improve them too.

Given the original scope of this PR, this is ready to merge. Thanks for working on this @jaeyeun97 and @mthrok!

@vincentqb vincentqb merged commit babc24a into pytorch:master Feb 28, 2020
@mthrok mthrok deleted the inv-mel-spec branch March 2, 2020 15:14
mthrok pushed a commit to mthrok/audio that referenced this pull request Feb 26, 2021
Fix typo in word_embeddings_tutorial.py. Thanks Zhiqiang.