
Add Conv2dSubsampling1 module and test it in AphasiaBank ASR recipe #4892

Merged: 5 commits merged into espnet:master on Jan 30, 2023

Conversation

@tjysdsg (Contributor) commented Jan 27, 2023

The Conv2dSubsampling1 module is similar to Conv2dSubsampling, except that its subsampling ratio is 1:1 (i.e., it does not reduce the sequence length).

In AphasiaBank English ASR experiments using E-Branchformer + WavLM, switching from Conv2dSubsampling2 to Conv2dSubsampling1 decreased the CER from 17.7 to 17.0.

To use this module, specify `input_layer: conv2d1` in the encoder configuration.
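For illustration, here is a minimal PyTorch sketch of what a 2-D convolutional input layer with a 1:1 time ratio could look like (an approximation only, not the actual ESPnet implementation; the class name `Conv2dNoSubsampling` and its signature are assumptions):

```python
import torch


class Conv2dNoSubsampling(torch.nn.Module):
    """Illustrative 2-D convolutional input layer that keeps a 1:1 time ratio.

    Sketch only: it follows the spirit of Conv2dSubsampling (two Conv2d + ReLU
    blocks followed by a linear projection), but uses stride 1 with padding so
    that the output sequence length equals the input sequence length.
    """

    def __init__(self, idim: int, odim: int, dropout_rate: float = 0.1):
        super().__init__()
        self.conv = torch.nn.Sequential(
            torch.nn.Conv2d(1, odim, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.Conv2d(odim, odim, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
        )
        # Project the flattened (channel x feature) axis to the model dimension.
        self.out = torch.nn.Linear(odim * idim, odim)
        self.dropout = torch.nn.Dropout(dropout_rate)

    def forward(self, x: torch.Tensor, x_mask: torch.Tensor):
        # x: (batch, time, idim) -> (batch, 1, time, idim)
        x = x.unsqueeze(1)
        x = self.conv(x)  # (batch, odim, time, idim); the time axis is unchanged
        b, c, t, f = x.size()
        x = self.out(x.transpose(1, 2).contiguous().view(b, t, c * f))
        # No downsampling, so the time mask can be returned unchanged.
        return self.dropout(x), x_mask
```

With such a module registered under the `conv2d1` option, selecting it from a training config is just a matter of the `input_layer: conv2d1` line described above.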

@codecov (bot) commented Jan 27, 2023

Codecov Report

Merging #4892 (c1dc65f) into master (3970558) will decrease coverage by 0.04%.
The diff coverage is 80.00%.

@@            Coverage Diff             @@
##           master    #4892      +/-   ##
==========================================
- Coverage   76.61%   76.57%   -0.04%     
==========================================
  Files         603      603              
  Lines       53677    53737      +60     
==========================================
+ Hits        41122    41151      +29     
- Misses      12555    12586      +31     
| Flag | Coverage Δ |
|---|---|
| test_integration_espnet1 | 66.33% <22.22%> (-0.05%) ⬇️ |
| test_integration_espnet2 | 47.59% <16.66%> (-0.08%) ⬇️ |
| test_python | 66.45% <80.00%> (-0.02%) ⬇️ |
| test_utils | 23.35% <22.22%> (-0.01%) ⬇️ |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ |
|---|---|
| espnet2/asr/encoder/transformer_encoder.py | 92.04% <50.00%> (-0.98%) ⬇️ |
| ...pnet2/asr/encoder/transformer_encoder_multispkr.py | 82.60% <50.00%> (-0.98%) ⬇️ |
| ...et/nets/pytorch_backend/transformer/subsampling.py | 86.17% <77.77%> (-1.99%) ⬇️ |
| espnet2/asr/encoder/branchformer_encoder.py | 94.19% <100.00%> (+0.05%) ⬆️ |
| espnet2/asr/encoder/conformer_encoder.py | 96.35% <100.00%> (+0.05%) ⬆️ |
| espnet2/asr/encoder/e_branchformer_encoder.py | 95.15% <100.00%> (+0.05%) ⬆️ |
| espnet2/asr/encoder/longformer_encoder.py | 79.50% <100.00%> (+0.34%) ⬆️ |
| espnet2/asr/espnet_model.py | 77.69% <0.00%> (-3.36%) ⬇️ |
| espnet2/train/trainer.py | 76.40% <0.00%> (-0.94%) ⬇️ |
| espnet2/train/preprocessor.py | 27.14% <0.00%> (-0.26%) ⬇️ |
| ... and 3 more | |


@simpleoier (Collaborator) left a comment

Better results!
Looks good to me in general!
Just one concern about the name of the new module: Conv2dSubsampling1 doesn't actually do any subsampling.

@tjysdsg (Contributor, Author) commented Jan 27, 2023

> Better results! Looks good to me in general! Just one concern about the name of the new module: Conv2dSubsampling1 doesn't actually do any subsampling.

Oh right, what's your suggestion? Should we just call it Conv2d1 or Conv2dInputLayer?

Review thread on the recipe's results file:

@@ -9,6 +9,24 @@
- Git hash: `39c1ec0509904f16ac36d25efc971e2a94ff781f`
- Commit date: `Wed Dec 21 12:50:18 2022 -0500`

## asr_train_asr_ebranchformer_small_wavlm_large1

- [train_asr_ebranchformer_small_wavlm_large.yaml](conf/tuning/train_asr_ebranchformer_small_wavlm_large.yaml)
Collaborator: Can you add a short note here to explain the difference? E.g., this config uses xxx as the input layer, which does not perform downsampling.

@pyf98 (Collaborator) commented Jan 28, 2023: Are the config name and path correct? They look the same as the previous one.

@tjysdsg (Contributor, Author): Thanks! Fixed in 4734d75.

@sw005320 added this to the v.202301 milestone on Jan 29, 2023
@sw005320 (Contributor) commented:

> Better results! Looks good to me in general! Just one concern about the name of the new module: Conv2dSubsampling1 doesn't actually do any subsampling.

> Oh right, what's your suggestion? Should we just call it Conv2d1 or Conv2dInputLayer?

I think Conv2dSubsampling1 is OK, but it would be better to add a comment in appropriate places noting that it does not do any subsampling.
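For example, the clarification could be as short as a docstring note on the class (hypothetical wording, not necessarily the exact text added in the PR):

```python
import torch


class Conv2dSubsampling1(torch.nn.Module):
    """Convolutional 2-D input layer with a 1:1 subsampling ratio.

    Note: despite the name, this module performs no subsampling; the output
    sequence length equals the input sequence length.
    """
```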

@sw005320 added the Enhancement label on Jan 29, 2023
@tjysdsg (Contributor, Author) commented Jan 29, 2023

@sw005320 I made the comment about the functionality clearer in c1dc65f.

@sw005320 merged commit e37ee27 into espnet:master on Jan 30, 2023
@sw005320 (Contributor):

Thanks a lot!

@tjysdsg deleted the conv2d1 branch on January 30, 2023 at 21:44