Added test for CMUArctic Dataset #829

subramen · 2020-07-27T17:26:50Z

PR linked to #821
I have used the same dummy utterance for all the emulated samples.

Requested review: @mthrok

mthrok

Thanks, overall looks good. Added some comments for improvement.

mthrok · 2020-07-27T22:04:57Z

test/datasets/cmuarctic_test.py

+    backend = "default"
+
+    root_dir = None
+    URL = "aew"  # default url in CMUARCTIC


This URL is used in only one place, and I do not see a need for it to be a class variable, so you can replace it with a string literal.

mthrok · 2020-07-27T23:33:36Z

test/datasets/cmuarctic_test.py

+        utterance = "This is a test utterance."
+
+        base_dir = os.path.join(cls.root_dir, "ARCTIC", "cmu_us_" + cls.URL + "_arctic")
+        # Contains utterance ID & sentence prompts


Comments like this, which add no information beyond what the code already expresses, are not necessary. You can remove them.

mthrok · 2020-07-27T23:34:52Z

test/datasets/cmuarctic_test.py

+        with open(txt_file, "w") as txt:
+            for i in range(10):
+                # Write audio file
+                utterance_id = f"arctic_a{i:04d}"


Can you also add some utterance IDs that follow the f"arctic_b{i:04d}" pattern?
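One way to cover both ID patterns is sketched below. The helper name `make_utterance_ids` and its parameters are hypothetical and are not taken from the PR; the sketch only shows how the `a` and `b` prefixes could be generated together.

```python
def make_utterance_ids(n_per_set=5, prefixes=("a", "b")):
    """Hypothetical helper: build IDs such as arctic_a0000 and arctic_b0000."""
    return [
        f"arctic_{prefix}{i:04d}"
        for prefix in prefixes
        for i in range(n_per_set)
    ]


# Five IDs with the "a" prefix followed by five with the "b" prefix.
utterance_ids = make_utterance_ids()
```

Keeping the prefixes in a tuple means further patterns can be added without touching the loop.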

mthrok · 2020-07-27T23:39:18Z

test/datasets/cmuarctic_test.py

+            assert utterance == expected_sample[2]
+            assert utterance_id == expected_sample[3]
+            self.assertEqual(expected_sample[0], waveform, atol=5e-5, rtol=1e-8)
+        assert (i + 1) == len(self.samples)


Please do not use a temporary variable outside the loop in which it was defined and meant to be used. This works, but it is easy for developers who work on this code later to miss. Define a dedicated variable for this.
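A minimal sketch of the dedicated-counter approach follows. The sample tuples, the 16000 sample rate, and the name `n_ite` are illustrative assumptions, and a plain list stands in for the dataset under test.

```python
# Hypothetical stand-ins: each sample is
# (waveform, sample_rate, utterance, utterance_id).
expected_samples = [
    ([0.1, -0.2], 16000, "This is a test utterance.", "arctic_a0000"),
    ([0.3, 0.4], 16000, "This is a test utterance.", "arctic_a0001"),
]
dataset = list(expected_samples)  # stand-in for the dataset under test

n_ite = 0  # dedicated counter, defined even if the loop body never runs
for i, (waveform, sample_rate, utterance, utterance_id) in enumerate(dataset):
    expected = expected_samples[i]
    assert waveform == expected[0]
    assert sample_rate == expected[1]
    assert utterance == expected[2]
    assert utterance_id == expected[3]
    n_ite += 1

# The length check no longer depends on the loop variable `i`.
assert n_ite == len(expected_samples)
```

A side benefit: if the dataset yields nothing, `i` is never bound and `(i + 1)` raises a NameError, whereas the counter stays at 0 and the final assertion fails with a clear mismatch.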

mthrok · 2020-07-27T23:39:56Z

test/datasets/datasets_test.py

@@ -32,10 +32,6 @@ def test_speechcommands(self):
        data = SPEECHCOMMANDS(self.path)
        data[0]

-    def test_cmuarctic(self):


Can you remove the corresponding asset and import statement?

mthrok · 2020-07-28T14:12:45Z

test/datasets/cmuarctic_test.py

+                        duration=3,
+                        n_channels=1,
+                        dtype="int16",
+                        seed=seed,


Please use a different seed value for each generated sample. Otherwise all the loaded Tensors have exactly the same shape and values, which defeats the purpose of comparing the loaded Tensor objects.
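The effect of the seed can be illustrated with a small sketch. The function `make_noise` is a hypothetical stand-in built on the standard library, not the noise generator used in the test; it only demonstrates why a shared seed makes every sample identical.

```python
import random


def make_noise(seed, n_samples=8):
    """Hypothetical stand-in for a seeded white-noise generator."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]


# Same seed every time: every generated sample is identical.
same_seed = [make_noise(seed=0) for _ in range(3)]

# One seed per sample: the generated samples differ from each other.
distinct_seed = [make_noise(seed=i) for i in range(3)]
```

With identical samples, a dataset that returned files in the wrong order would still pass the comparison; distinct seeds make such a mix-up detectable.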

mthrok · 2020-07-28T14:13:10Z

Could you rebase onto the latest master?

codecov · 2020-07-28T14:26:13Z

Codecov Report

Merging #829 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master     #829   +/-   ##
=======================================
  Coverage   89.99%   89.99%           
=======================================
  Files          35       35           
  Lines        2719     2719           
=======================================
  Hits         2447     2447           
  Misses        272      272

Powered by Codecov. Last update db38fc8...d340fb5. Read the comment docs.

mthrok

Almost good, but the seed value is not quite right.

test/datasets/cmuarctic_test.py

mthrok

Looks good. Thanks!

subramen added 2 commits July 27, 2020 13:13

Add test for CMUArctic Dataset

d7a2777

Fixed style

da88d47

subramen marked this pull request as draft July 27, 2020 17:37

subramen added 4 commits July 27, 2020 14:07

Fixed failing test + lint

1a1b7ba

Typo

7bad4a6

Updated dtype to int16 for wav

5c276cd

Typo

8e9bab0

subramen marked this pull request as ready for review July 27, 2020 18:41

mthrok reviewed Jul 27, 2020

subramen added 3 commits July 28, 2020 09:56

Post-review improvements

49e3cff

Delete extra comments

f469729

Linter

011331c

mthrok reviewed Jul 28, 2020

subramen and others added 7 commits July 28, 2020 15:54

Merge branch 'master' into cmuarctic_test

a607710

Add test for CMUArctic Dataset

49be88f

Add unit test for LJSpeech dataset (pytorch#826)

920795d

Co-authored-by: lawrencechen <lawrencechen@devvm3189.vll0.facebook.com>

Add test for Speech Commands dataset (pytorch#824)

a00a5e4

Add test for CommonVoice dataset (pytorch#827)

d787e6f

* Add test for CommonVoice dataset
* Migrate the existing tests for `bg_iterator` and `diskcache_iterator` to `test/datasets/utils_test.py`

Co-authored-by: Leon Gao <legao@linkedin.com>

Unequal seed for white noise generation

4f8fbf7

Merge branch 'cmuarctic_test' of https://github.com/suraj813/audio into cmuarctic_test

6c3cad5

mthrok requested changes Jul 28, 2020

test/datasets/cmuarctic_test.py (outdated)

Unique seed values for white noise generation

d340fb5

mthrok approved these changes Jul 28, 2020

mthrok merged commit 33f762f into pytorch:master Jul 28, 2020

subramen deleted the cmuarctic_test branch July 29, 2020 00:01
