
add slaney normalization #589

Merged: 5 commits merged into pytorch:master on May 14, 2020

Conversation

vincentqb
Contributor

Fixes #287

@@ -430,6 +431,8 @@ def create_fb_matrix(
         f_max (float): Maximum frequency (Hz)
         n_mels (int): Number of mel filterbanks
         sample_rate (int): Sample rate of the audio waveform
+        norm (Optional[str]): If 'slaney', divide the triangular mel weights by the width of the mel band
Contributor

Do we expect further normalization schemes? Does it make sense to split these normalizations out into their own layer? Are they useful in other contexts (maybe for volume normalization and such)?

Contributor Author

Do we expect further normalization schemes?

Yes, we could; see librosa/librosa#1050. Librosa itself is in the process of adding other normalizations, as mentioned in that pull request.

Does it make sense to split these normalizations out into their own layer?

Not according to this comment.

Are they useful in other contexts (maybe for volume normalization and such)?

The normalization is done against f_pts, which is computed within create_fb_matrix. I'm not aware of other use cases.
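For reference, the 'slaney' scheme this PR adds can be sketched as follows. This is a simplified illustration with hypothetical names, not torchaudio's exact implementation: each triangular mel filter is scaled by 2 divided by the width (in Hz) of its mel band, as derived from the band edge frequencies f_pts.

```python
import torch


def slaney_normalize(fb: torch.Tensor, f_pts: torch.Tensor) -> torch.Tensor:
    # fb: (n_freqs, n_mels) triangular mel filterbank
    # f_pts: (n_mels + 2,) band edge frequencies in Hz
    # Scale each filter by 2 / (width of its band), so filters integrate
    # to roughly constant energy per band.
    enorm = 2.0 / (f_pts[2:] - f_pts[:-2])
    return fb * enorm.unsqueeze(0)
```

With uniformly spaced edges every band has the same width, so all filters get the same scale factor; with mel-spaced edges the higher, wider bands are attenuated relative to the narrow low-frequency bands.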

@vincentqb vincentqb marked this pull request as ready for review May 4, 2020 22:36
@vincentqb vincentqb requested review from cpuhrsch and mthrok May 5, 2020 18:25
@@ -99,7 +99,8 @@ def func(_):
         f_max = 20.0
         n_mels = 10
         sample_rate = 16000
-        return F.create_fb_matrix(n_stft, f_min, f_max, n_mels, sample_rate)
+        norm = None
+        return F.create_fb_matrix(n_stft, f_min, f_max, n_mels, sample_rate, norm)
Contributor

If you're exercising TorchScript, I'd pass the less trivial type, which is a string, instead of None.

Contributor Author

You mean: default to empty string? We don't use that elsewhere, but sounds good to me.

Contributor Author

done
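The empty-string-default pattern settled on above can be sketched like this. The function name and body are hypothetical stand-ins, not torchaudio's code; the point is only that a plain str default scripts cleanly without Optional handling.

```python
import torch


def make_fb(norm: str = "") -> torch.Tensor:
    # Hypothetical sketch: an empty string stands in for "no normalization",
    # avoiding Optional[str] (and the None checks it requires) in TorchScript.
    fb = torch.ones(4, 3)
    if norm == "slaney":
        fb = fb * 0.5  # stand-in for the real per-band scaling
    return fb


# Scripts without any Optional type refinement being needed.
scripted = torch.jit.script(make_fb)
```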

torchaudio/functional.py (outdated review thread, resolved)
@codecov

codecov bot commented May 5, 2020

Codecov Report

Merging #589 into master will increase coverage by 0.01%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master     #589      +/-   ##
==========================================
+ Coverage   88.99%   89.01%   +0.01%     
==========================================
  Files          21       21              
  Lines        2254     2257       +3     
==========================================
+ Hits         2006     2009       +3     
  Misses        248      248              
Impacted Files Coverage Δ
torchaudio/functional.py 95.53% <100.00%> (+0.01%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@vincentqb
Contributor Author

@mthrok -- do you have any feedback?

Collaborator

@mthrok left a comment

Sorry for the late reply. Looks good to me. One nit.

@@ -424,7 +424,8 @@ def create_fb_matrix(
         f_min: float,
         f_max: float,
         n_mels: int,
-        sample_rate: int
+        sample_rate: int,
+        norm: str = "",
Collaborator

As a public API signature, I think Optional[str] looks cleaner.

Contributor Author

This was made due to the comment above. I'll leave it as it is for now. We can always extend the str to Optional[str] later without breaking BC :)

Collaborator

It looks to me that that comment was about the type of variable to pass when running the TorchScript test, not about the function signature.

Contributor Author

If we allow None in the signature, then the code should work with/without jit when passing None. It wasn't though. Is that what you meant?

Collaborator

@mthrok May 14, 2020

If we allow None in the signature, then the code should work with/without jit when passing None.

Yes, it works.

from typing import Optional

import torch
from torch import Tensor


def bar(foo: Optional[str]=None) -> Tensor:
    if foo is None:
        return torch.zeros(1, 2)
    if foo == "a":
        return torch.ones(1, 1)

    return torch.empty(1, 1)


ts_bar = torch.jit.script(bar)

for v in [None, "a", "b"]:
    print(v)
    print(bar(v))
    print(ts_bar(v))

produces

None
tensor([[0., 0.]])
tensor([[0., 0.]])
a
tensor([[1.]])
tensor([[1.]])
b
tensor([[-2.8910e+12]])
tensor([[0.]])

Also, dcshift uses Optional[float], and it works fine with TorchScript for both None and float inputs.
#558

Collaborator

It's just that when the type is Optional, it first needs to be compared against None, using if var is None or if var is not None.

Collaborator

@mthrok May 14, 2020

https://pytorch.org/docs/master/jit_language_reference.html#optional-type-refinement

TorchScript will refine the type of a variable of type Optional[T] when a comparison to None is made inside the conditional of an if-statement or checked in an assert. The compiler can reason about multiple None checks that are combined with and, or, and not. Refinement will also occur for else blocks of if-statements that are not explicitly written.
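The refinement rule quoted above can be shown in a small self-contained example (hypothetical function, not torchaudio code): comparing the Optional[str] argument against None inside an if-statement lets the compiler treat it as str in the other branch, so string comparisons are allowed there.

```python
from typing import Optional

import torch


def pick(norm: Optional[str] = None) -> torch.Tensor:
    # Inside this branch TorchScript knows norm is None.
    if norm is None:
        return torch.zeros(1)
    # Here norm has been refined from Optional[str] to str,
    # so comparing it to a string literal compiles fine.
    if norm == "slaney":
        return torch.ones(1)
    return torch.full((1,), 2.0)


ts_pick = torch.jit.script(pick)
```

Both the eager and scripted versions then accept None as well as strings.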

Contributor Author

Alrighty, #641 :)

@vincentqb vincentqb merged commit 995b75f into pytorch:master May 14, 2020
bhargavkathivarapu pushed a commit to bhargavkathivarapu/audio that referenced this pull request May 19, 2020
* add slaney normalization.

* add torchscript.

* convert to string for torchscript compatibility.

* flake8.

* use string as default.
mthrok pushed a commit to mthrok/audio that referenced this pull request Feb 26, 2021
Update seq2seq_translation_tutorial.py
mpc001 pushed a commit to mpc001/audio that referenced this pull request Aug 4, 2023

Successfully merging this pull request may close these issues.

amplitude normalization in create_fb_matrix
3 participants