Issue 764: Switch Pitch Detection Test to use On the Fly Generation instead of file. #783

Merged
merged 17 commits into from Jul 16, 2020

Conversation

engineerchuan
Contributor

  1. Switch the pitch detection test to on-the-fly data generation.
  2. Refactor the integer encoding used by whitenoise and sinusoid into a shared helper.
  3. Refactor test_compliance_kaldi.py. We cannot remove kaldi_file_8000.wav because the test compares its output against the corresponding ark files.
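
As a rough illustration of item 1, on-the-fly generation means the test builds its sine waves in code instead of loading files such as 100Hz_44100Hz_16bit_05sec.wav. The sketch below is not the code from this PR. It uses only the standard library, `generate_sinusoid` and `estimate_frequency` are hypothetical helpers, and a simple zero-crossing estimator stands in for torchaudio's pitch detector.

```python
import math


def generate_sinusoid(frequency, sample_rate, duration):
    """Return a sine wave as a list of floats in [-1.0, 1.0] (hypothetical helper)."""
    num_samples = int(sample_rate * duration)
    return [
        math.sin(2 * math.pi * frequency * i / sample_rate)
        for i in range(num_samples)
    ]


def estimate_frequency(samples, sample_rate):
    """Estimate frequency from upward zero crossings.

    This is a deliberately simple stand-in, not torchaudio's detector.
    """
    # Index of every sample where the signal goes from negative to non-negative.
    crossings = [
        i + 1
        for i, (a, b) in enumerate(zip(samples, samples[1:]))
        if a < 0 <= b
    ]
    if len(crossings) < 2:
        raise ValueError("need at least two upward zero crossings")
    periods = len(crossings) - 1
    span_seconds = (crossings[-1] - crossings[0]) / sample_rate
    return periods / span_seconds


sample_rate = 44100
for freq in (100, 440):
    waveform = generate_sinusoid(freq, sample_rate, duration=0.5)
    print(freq, round(estimate_frequency(waveform, sample_rate), 2))
```

Because the signal is produced from the frequency, sample rate and duration passed in, the test no longer depends on a checked-in asset and can cover other frequencies by changing one argument.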

@engineerchuan
Contributor Author

Maybe there is no point in trying to use on the fly generation if we have to keep kaldi_file_8000.wav around because we are keeping cached ark files around.

@mthrok
Collaborator

mthrok commented Jul 14, 2020

Maybe there is no point in trying to use on the fly generation if we have to keep kaldi_file_8000.wav around because we are keeping cached ark files around.

Yes, the problem is that the code used to generate those ark files is not checked in, so we cannot modify the test. If we can recover that code, we can switch the test completely to on-the-fly data generation, as was done for the other Kaldi-compatible tests: #672, #681, #687, #690 and #699.

This situation is very bad because resample is used in other places too. The resample function has to be tested thoroughly, but the data stored as ark files covers only a very limited number of use cases.

test/functional_cpu_test.py (outdated)
@@ -300,19 +300,23 @@ def test_linearity_of_istft4(self):

class TestDetectPitchFrequency(common_utils.TorchaudioTestCase):
    def test_pitch(self):
        test_filepath_100 = common_utils.get_asset_path("100Hz_44100Hz_16bit_05sec.wav")
        test_filepath_440 = common_utils.get_asset_path("440Hz_44100Hz_16bit_05sec.wav")
        SAMPLE_RATE = 44100
Collaborator

Why is this variable capitalized?

Contributor Author

I was thinking this is a constant per https://www.python.org/dev/peps/pep-0008/#id48? What case would you prefer?

@engineerchuan
Contributor Author

Maybe there is no point in trying to use on the fly generation if we have to keep kaldi_file_8000.wav around because we are keeping cached ark files around.

Yes, the problem is that the code used to generate those ark files is not checked in, so we cannot modify the test. If we can recover that code, we can switch the test completely to on-the-fly data generation, as was done for the other Kaldi-compatible tests: #672, #681, #687, #690 and #699.

This situation is very bad because resample is used in other places too. The resample function has to be tested thoroughly, but the data stored as ark files covers only a very limited number of use cases.

Let's address this in a follow up.

2. Relax rtol from 1e-8 to 1e-7 for compliance kaldi
3. Switch to on the fly generation for batch pitch tests
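
To make the tolerance change concrete, here is a small sketch of what relaxing rtol from 1e-8 to 1e-7 allows. It assumes the usual meaning of a relative tolerance. `math.isclose` stands in for the test suite's own comparison, whose exact formula may differ, and the values are made up for illustration.

```python
import math

expected = 1.0
# Made-up result with a relative error of about 5e-8.
actual = expected * (1 + 5e-8)

# With the stricter tolerance the comparison fails.
passes_strict = math.isclose(actual, expected, rel_tol=1e-8, abs_tol=0.0)
# With the relaxed tolerance the same pair is accepted.
passes_relaxed = math.isclose(actual, expected, rel_tol=1e-7, abs_tol=0.0)

print(passes_strict, passes_relaxed)
```

A relative error between the two thresholds is rejected at 1e-8 and accepted at 1e-7, which is the kind of small numerical difference the relaxed setting tolerates.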
@engineerchuan engineerchuan changed the title from "Issue 764: Switch Pitch Detection to On the Fly Generation" to "Issue 764: Switch Pitch Detection Test to use On the Fly Generation instead of file." on Jul 15, 2020
Collaborator

@mthrok mthrok left a comment

Looks good. Added suggestions for further simplification.

test/test_batch_consistency.py (outdated)
def convert_tensor_encoding(
    tensor: torch.tensor,
    dtype: torch.dtype,
):
Collaborator

Nice!
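
Only the signature of convert_tensor_encoding is visible in this thread, so the following is a sketch of what a shared integer-encoding helper could look like, not the body from the PR. It assumes input samples are floats in [-1.0, 1.0] and works on plain Python lists so that it runs without torch. `encode_samples` and the string encoding names are hypothetical.

```python
def encode_samples(samples, encoding):
    """Map floats in [-1.0, 1.0] to integer PCM codes (hypothetical helper)."""
    # (scale for positive values, scale for negative values, offset)
    scales = {
        "int32": (2147483647, 2147483648, 0),
        "int16": (32767, 32768, 0),
        "uint8": (127, 128, 128),  # unsigned 8-bit audio is centered at 128
    }
    if encoding not in scales:
        raise ValueError(f"unsupported encoding: {encoding}")
    positive_scale, negative_scale, offset = scales[encoding]
    encoded = []
    for value in samples:
        scale = positive_scale if value > 0 else negative_scale
        # int() truncates toward zero.
        encoded.append(int(value * scale) + offset)
    return encoded


print(encode_samples([-1.0, 0.0, 1.0], "int16"))
print(encode_samples([-1.0, 0.0, 1.0], "uint8"))
```

Keeping the scaling in one place means generators such as whitenoise and sinusoid can produce float data once and share the same conversion for every integer format.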

@codecov

codecov bot commented Jul 15, 2020

Codecov Report

Merging #783 into master will decrease coverage by 0.25%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master     #783      +/-   ##
==========================================
- Coverage   89.78%   89.53%   -0.26%     
==========================================
  Files          34       32       -2     
  Lines        2654     2617      -37     
==========================================
- Hits         2383     2343      -40     
- Misses        271      274       +3     
Impacted Files                           Coverage Δ
torchaudio/_internal/module_utils.py     85.18% <0.00%> (-11.12%) ⬇️
torchaudio/sox_effects/sox_effects.py    94.44% <0.00%> (-0.80%) ⬇️
torchaudio/__init__.py                   73.33% <0.00%> (ø)
torchaudio/sox_effects/__init__.py       100.00% <0.00%> (ø)
torchaudio/utils/sox_utils.py
torchaudio/utils/__init__.py

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 8181a83...3482273.

test/common_utils/test_case_utils.py (outdated, resolved)
test/test_sox_effects.py (resolved)
Collaborator

@mthrok mthrok left a comment

Looks almost good. One clean up remaining.

@mthrok mthrok merged commit 02b898f into pytorch:master Jul 16, 2020
@mthrok
Collaborator

mthrok commented Jul 16, 2020

Thanks!

@engineerchuan engineerchuan deleted the issue_764 branch July 16, 2020 22:40