-
Notifications
You must be signed in to change notification settings - Fork 759
BarkSpectrogram #2823
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BarkSpectrogram #2823
Conversation
|
I'm unsure if the failing tests are related to the functionalities I'm trying to add. |
carolineechen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @jdariasl, thanks for working on this! I left a couple of comments.
I also wanted to note that as this is a new feature, the TorchAudio team would like to first support it as a prototype feature (feature classifications) available in our nightly builds, and will reevaluate moving it to beta in one of the future releases, pending use cases and API stability (it would be great if you could provide any additional context/existing implementations of this feature we could refer to!). If that sounds good to you, we can help move it to torchaudio.prototype after your PR is reviewed and merged, lmk if you have any questions. Thanks again!
| specgram = torch.randn(3, 2, 201, 256) | ||
|
|
||
| atol = 1e-6 if os.name == "nt" else 1e-8 | ||
| atol = 1e-4 if os.name == "nt" else 1e-6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you revert back this change? these values should be fine for the original melscale tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes of course. When I ran the test locally, MelScale didn't satisfy these values, so I played a little bit with the values and forgot to revert the changes. You can test it on your own.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any feedback about this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, yea it is also erroring out for me with those values locally, but seems to run okay on our GitHub CircleCI tests -- I think we can leave as is for now, and I can also test out to see if bark scale is passing on CircleCI tests with smaller tolerances after this PR is merged.
| def test_batch_BarkScale(self): | ||
| specgram = torch.randn(3, 2, 201, 256) | ||
|
|
||
| atol = 1e-4 if os.name == "nt" else 1e-6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you know why these tolerances are increased from the working melscale ones above?
|
The bark scale is just another auditory scale used in speech and audio processing. Such scale is used to estimate RASTA-PLP coefficients (the acronym stands for Relative Spectral Transform - Perceptual Linear Prediction), which is also an alternative to MFCC used in different contexts, including biomedical applications. Most of the available implementations are developed in Matlab: https://www.ee.columbia.edu/~dpwe/resources/matlab/rastamat/, https://es.mathworks.com/help/audio/ref/designauditoryfilterbank.html, even though there is also an implementation in Praat. In this sense, I think a better integration of the Bark scale into torchaudio would be to develop an On the other hand, I'm unsure whether it is possible to propose a comparison with Matlab implementations. |
carolineechen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a couple of small comments, and could you also add the new functions to docs/source/functional.rst and docs/source/transforms.rst to have it reflected in the pytorch documentation?
Otherwise, the implementation looks good, and should be ready to merge soon! Once merged, I'll create a PR to move this into prototype and investigate the bark spectrogram tolerance, thanks for the contribution :)
| specgram = torch.randn(3, 2, 201, 256) | ||
|
|
||
| atol = 1e-6 if os.name == "nt" else 1e-8 | ||
| atol = 1e-4 if os.name == "nt" else 1e-6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, yea it is also erroring out for me with those values locally, but seems to run okay on our GitHub CircleCI tests -- I think we can leave as is for now, and I can also test out to see if bark scale is passing on CircleCI tests with smaller tolerances after this PR is merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the feature contribution, I think this implementation looks good!
follow ups on my end after merging
- move it to a prototype feature
- experiment with test tolerance
- add assets image
|
@carolineechen has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
|
@carolineechen has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
Summary: follow up to #2823 - move bark spectrogram to prototype - decrease autograd test tolerance (passing on circle ci) - add diagram for bark fbanks cc jdariasl Pull Request resolved: #2843 Reviewed By: nateanl Differential Revision: D41199522 Pulled By: carolineechen fbshipit-source-id: 8e6c2e20fb7b14f39477683b3c6ed8356359a213

I have added BarkScale transform, which can transform a regular Spectrogram into a BarkSpectrograms similar to MelScale. @ahmed-fau opened this requirement in December 2021 with the number (#2103). The new functionality includes three different well-known approximations of the Bark scale.