Conversation

vincentqb
Contributor

Fixes #287

f_max (float): Maximum frequency (Hz)
n_mels (int): Number of mel filterbanks
sample_rate (int): Sample rate of the audio waveform
norm (Optional[str]): If 'slaney', divide the triangular mel weights by the width of the mel band
Contributor

Do we expect further normalization schemes? Does it make sense to split these normalizations out into their own layer? Are they useful in other contexts (maybe for volume normalization and such)?

Contributor Author

Do we expect further normalization schemes?

Yes, we could; see librosa/librosa#1050. librosa itself is in the process of adding other normalizations, as mentioned in that pull request.

Does it make sense to split these normalizations out into its own layer?

Not according to this comment.

Are they useful in other contexts (maybe for volume normalization and such)?

The normalization is done against f_pts, which is computed within create_fb_matrix. I'm not aware of any other use case.
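For readers following the thread, here is a minimal pure-Python sketch of what "divide the triangular mel weights by the width of the mel band" can look like. It is not the torchaudio implementation: the function name `mel_filterbank`, the HTK-style mel formula, and the list-of-lists output are assumptions made for illustration. Only the parameter names and the role of `f_pts` come from the discussion above.

```python
import math
from typing import List, Optional


def hz_to_mel(freq: float) -> float:
    # Assumed HTK-style mel scale; other mel definitions exist.
    return 2595.0 * math.log10(1.0 + freq / 700.0)


def mel_to_hz(mel: float) -> float:
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(
    n_freqs: int,
    f_min: float,
    f_max: float,
    n_mels: int,
    sample_rate: int,
    norm: Optional[str] = None,
) -> List[List[float]]:
    """Hypothetical helper: returns an n_freqs x n_mels triangular filterbank."""
    if norm is not None and norm != "slaney":
        raise ValueError("norm must be None or 'slaney'")

    # Frequencies of the linear bins, from 0 Hz up to Nyquist.
    all_freqs = [i * (sample_rate // 2) / (n_freqs - 1) for i in range(n_freqs)]

    # n_mels + 2 band edges, equally spaced on the mel scale.
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_mels + 1)
    f_pts = [mel_to_hz(m_min + i * step) for i in range(n_mels + 2)]

    fb = []
    for freq in all_freqs:
        row = []
        for m in range(n_mels):
            lo, center, hi = f_pts[m], f_pts[m + 1], f_pts[m + 2]
            up = (freq - lo) / (center - lo)     # rising slope
            down = (hi - freq) / (hi - center)   # falling slope
            weight = max(0.0, min(up, down))
            if norm == "slaney":
                # Slaney-style area normalization: scale by 2 / band width in Hz.
                weight *= 2.0 / (hi - lo)
            row.append(weight)
        fb.append(row)
    return fb
```

Because the scale factor depends only on the band edges in `f_pts`, it has to be applied where those edges are known, which is consistent with the reply above about keeping the normalization inside the filterbank function.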

@vincentqb vincentqb force-pushed the slaney_normalization branch from 88913b2 to 9a1994f Compare April 28, 2020 21:16
@vincentqb vincentqb force-pushed the slaney_normalization branch from 924c37a to 4150097 Compare May 4, 2020 22:35
@vincentqb vincentqb marked this pull request as ready for review May 4, 2020 22:36
@vincentqb vincentqb requested review from cpuhrsch and mthrok May 5, 2020 18:25
  n_mels = 10
  sample_rate = 16000
- return F.create_fb_matrix(n_stft, f_min, f_max, n_mels, sample_rate)
+ norm = None
Contributor

If you're exercising TorchScript, I'd pass the less trivial type, which is a string, instead of None.

Contributor Author

You mean: default to empty string? We don't use that elsewhere, but sounds good to me.

Contributor Author

done

@codecov

codecov bot commented May 5, 2020

Codecov Report

Merging #589 into master will increase coverage by 0.01%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #589      +/-   ##
==========================================
+ Coverage   88.99%   89.01%   +0.01%     
==========================================
  Files          21       21              
  Lines        2254     2257       +3     
==========================================
+ Hits         2006     2009       +3     
  Misses        248      248              
Impacted Files Coverage Δ
torchaudio/functional.py 95.53% <100.00%> (+0.01%) ⬆️

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b499840...7270e6e. Read the comment docs.

@vincentqb
Contributor Author

@mthrok -- do you have any feedback?

Contributor

@mthrok mthrok left a comment

Sorry for the late reply. Looks good to me. One nit.

  n_mels: int,
- sample_rate: int
+ sample_rate: int,
+ norm: str = "",
Contributor

As a public API signature, I think Optional[str] looks cleaner.

Contributor Author

This change was made in response to an earlier review comment. I'll leave it as it is for now. We can always widen str to Optional[str] later without breaking backward compatibility :)

Contributor

It looks to me like that comment was about the type of value to pass when running the TorchScript test, not about the function signature.

Contributor Author

If we allow None in the signature, then the code should work with and without jit when passing None. It didn't, though. Is that what you meant?

Contributor

@mthrok mthrok May 14, 2020

If we allow None in the signature, then the code should work with/without jit when passing None.

Yes, it works.

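from typing import Optional

import torch
from torch import Tensor


def bar(foo: Optional[str] = None) -> Tensor:
    # Comparing against None first lets TorchScript refine Optional[str] to str.
    if foo is None:
        return torch.zeros(1, 2)
    if foo == "a":
        return torch.ones(1, 1)

    # torch.empty returns uninitialized memory, so the printed value is arbitrary.
    return torch.empty(1, 1)


# Compile the same function with TorchScript.
ts_bar = torch.jit.script(bar)

# Call the eager and the scripted versions with None and with strings.
for v in [None, "a", "b"]:
    print(v)
    print(bar(v))
    print(ts_bar(v))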

produces

None
tensor([[0., 0.]])
tensor([[0., 0.]])
a
tensor([[1.]])
tensor([[1.]])
b
tensor([[-2.8910e+12]])
tensor([[0.]])

Also, dcshift uses Optional[float], and it works fine with TorchScript for both None and float inputs.
#558

Contributor

It's just that when the type is Optional, the code first needs to compare against None, using "if var is None" or "if var is not None".

Contributor

@mthrok mthrok May 14, 2020

https://pytorch.org/docs/master/jit_language_reference.html#optional-type-refinement

TorchScript will refine the type of a variable of type Optional[T] when a comparison to None is made inside the conditional of an if-statement or checked in an assert. The compiler can reason about multiple None checks that are combined with and, or, and not. Refinement will also occur for else blocks of if-statements that are not explicitly written.
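Applied to the norm argument discussed in this thread, the None-check-first pattern might look like the sketch below. The helper `slaney_scale` and its `band_width_hz` parameter are hypothetical and exist only for illustration. The code is plain Python and is not run through torch.jit.script here, so treat TorchScript compatibility as an assumption to verify.

```python
from typing import Optional


def slaney_scale(norm: Optional[str], band_width_hz: float) -> float:
    """Hypothetical helper: scale factor for one mel band."""
    # Check against None first. This is the comparison that allows a
    # compiler or type checker to narrow Optional[str] to str afterwards.
    if norm is None:
        return 1.0

    # From here on, norm can be treated as a plain str.
    if norm == "slaney":
        return 2.0 / band_width_hz

    raise ValueError("norm must be None or 'slaney'")
```

With this shape, None keeps the "no normalization" meaning in the public signature, and an empty-string sentinel is not needed.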

Contributor Author

Alrighty, #641 :)

@vincentqb vincentqb merged commit 995b75f into pytorch:master May 14, 2020
bhargavkathivarapu pushed a commit to bhargavkathivarapu/audio that referenced this pull request May 19, 2020
* add slaney normalization.

* add torchscript.

* convert to string for torchscript compatibility.

* flake8.

* use string as default.
mthrok pushed a commit to mthrok/audio that referenced this pull request Feb 26, 2021
Update seq2seq_translation_tutorial.py
mpc001 pushed a commit to mpc001/audio that referenced this pull request Aug 4, 2023

Development

Successfully merging this pull request may close these issues.

amplitude normalization in create_fb_matrix

3 participants