Dynamic quantization for decoding by xu-gaopeng · Pull Request #3210 · espnet/espnet

xu-gaopeng · 2021-05-12T10:01:28Z

This PR adds dynamic quantization for decoding.

sw005320 · 2021-05-12T11:47:24Z

Thanks!
Did you observe some improvement?

xu-gaopeng · 2021-05-13T03:17:10Z

Thanks!
Did you observe some improvement?
dataset:aishell
conf: train_pytorch_transformer.yaml
api:v2
mode-size: 117m->35M
cer:6.52%->6.59%
decoder-time:667s->465s

sw005320 · 2021-05-13T12:47:07Z

That's very impressive!
Thanks!
We'll definitely include this PR.
Can you fix some CI errors? e.g. https://github.com/espnet/espnet/pull/3210/checks?check_run_id=2565918709#step:8:8

sw005320

Can you also add it to the LM?
Can you also add it to espnet2? I believe you can do it at https://github.com/espnet/espnet/blob/master/espnet2/bin/asr_inference.py
This is a great function, and it would be better to mention it in our document, e.g., by adding "fast inference" section to https://github.com/espnet/espnet/blob/master/doc/tutorial.md. I think I'll do it after this PR is merged and will ask you to check it.

codecov · 2021-05-14T03:36:52Z

Codecov Report

Merging #3210 (fc20535) into master (aaa6ed5) will decrease coverage by 0.36%.
The diff coverage is 96.00%.

@@            Coverage Diff             @@
##           master    #3210      +/-   ##
==========================================
- Coverage   80.96%   80.60%   -0.37%     
==========================================
  Files         356      356              
  Lines       30773    30706      -67     
==========================================
- Hits        24916    24751     -165     
- Misses       5857     5955      +98

Impacted Files	Coverage Δ
espnet/asr/pytorch_backend/asr.py	`58.18% <90.90%> (+0.59%)`	⬆️
espnet/asr/pytorch_backend/recog.py	`85.88% <100.00%> (+2.09%)`	⬆️
espnet/bin/asr_recog.py	`84.07% <100.00%> (+0.28%)`	⬆️
espnet2/asr/ctc.py	`53.84% <0.00%> (-20.00%)`	⬇️
espnet/nets/pytorch_backend/ctc.py	`51.53% <0.00%> (-15.39%)`	⬇️
espnet/nets/pytorch_backend/transformer/mask.py	`87.50% <0.00%> (-12.50%)`	⬇️
espnet2/layers/stft.py	`85.29% <0.00%> (-5.89%)`	⬇️
espnet/nets/pytorch_backend/e2e_asr_maskctc.py	`85.83% <0.00%> (-5.16%)`	⬇️
...ets/pytorch_backend/fastspeech/length_regulator.py	`96.00% <0.00%> (-4.00%)`	⬇️
espnet/transform/transform_interface.py	`66.66% <0.00%> (-3.34%)`	⬇️
... and 64 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update aaa6ed5...fc20535. Read the comment docs.

ShigekiKarita · 2021-05-15T06:13:21Z

Very simple and good job! I think we can merge this without @sw005320 requests but we need tests. Can you add a new test config next to this?

espnet/ci/test_integration.sh

Line 20 in 08feae5

    
           ./run.sh --python "${python}" --stage 5 --lm-config conf/lm_transformer.yaml --decode-config "$(change_yaml.py conf/decode.yaml -a api=v2)"

I think it can be

./run.sh --python "${python}" --stage 5 --decode-config "$(change_yaml.py conf/decode.yaml -a quantize-model=true)"

And if you can submit your existing AISHELL config, it is super helpful.

espnet/bin/asr_recog.py

b-flo · 2021-05-16T08:05:33Z

For information: using quantization with transducer model will currently throw an error because JointNetwork's linear layers inputs during decoding have less than 2 dimensions.

xu-gaopeng · 2021-05-18T09:52:59Z

Can you also add it to the LM?

Can you also add it to espnet2? I believe you can do it at https://github.com/espnet/espnet/blob/master/espnet2/bin/asr_inference.py

This is a great function, and it would be better to mention it in our document, e.g., by adding "fast inference" section to https://github.com/espnet/espnet/blob/master/doc/tutorial.md. I think I'll do it after this PR is merged and will ask you to check it.
lm quantization has been added, espnet2 will support in a new PR.

ShigekiKarita · 2021-05-19T14:31:59Z

espnet/bin/asr_recog.py

+    parser.add_argument(
+        "--quantize-config",
+        type=set,
+        default={torch.nn.Linear},


I guess this does not work when non default value is set. It is OK to remove this and support only Linear at this moment.

nn.Linear, nn.LSTM, nn.GRU, etc.

ShigekiKarita

LGTM. Thanks!

ShigekiKarita · 2021-05-19T14:57:41Z

test/test_asr_quantize.py

+
+from espnet.nets.asr_interface import dynamic_import_asr
+
+torch = pytest.importorskip("torch")


Can you do importorskip if pytorch is too old for the feature? For example

quantization = pytest.importorskip("torch.quantization") def test_asr_quantize(...): ... quantization.quantize_dynamic(...)

ci/test_integration.sh

sw005320

LGTM.
Once we fix the CI issue and reflect @ShigekiKarita's comments, we can merge this PR.

espnet/asr/pytorch_backend/asr.py

ci/test_integration.sh

.github/workflows/ci.yaml

b-flo

As @ShigekiKarita pointed out, I'm not sure quantize-config works as intended outside default value? Or at least, I'm a bit confused how the set should be provided in yaml config to yield a proper set through argparse.
For example, quantize-config: { torch.nn.LSTM } or quantize-config: !!set { torch.nn.LSTM } yield the following set: { 'S', '{', 'h', 'M', 'n', 'L', 'T', '.', 'i', 't', ',', 'c', ' ', 'e', 'r', '}', 'o', 'a' }

I suppose I'm forgetting a conventional notation but if not, the following changes could be used for ASR case. Note that I took some liberty on the parameter usage compared to initial proposition.

espnet/asr/pytorch_backend/asr.py

espnet/bin/asr_recog.py

sw005320 · 2021-06-03T02:38:55Z

@xu-gaopeng, I think I can merge it if you fix the CI error https://github.com/espnet/espnet/pull/3210/checks?check_run_id=2674693218#step:10:268 and @b-flo's comments.
This PR is really valuable, and we want to use it as a default setup for our inference demo!

b-flo

Following-up to last comment/change (See #3210 (comment))

espnet/asr/pytorch_backend/asr.py

espnet/asr/pytorch_backend/recog.py

ShigekiKarita · 2021-06-08T13:56:50Z

@sw005320 Because no one can write codes with tied hands, I suggest @xu-gaopeng to submit a quick PR in advance to unlock the "first time contributer" limitation that restricts CI trials. For example, README.md update on this feature?

sw005320 · 2021-06-09T00:52:37Z

@sw005320 Because no one can write codes with tied hands, I suggest @xu-gaopeng to submit a quick PR in advance to unlock the "first time contributer" limitation that restricts CI trials. For example, README.md update on this feature?

Good idea.
@xu-gaopeng, could you make a separate small PR?
We’ll merge it soon and CI trials in this PR become very smooth.

mergify · 2021-06-09T12:26:21Z

This pull request is now in conflict :(

xu-gaopeng · 2021-06-10T08:25:09Z

@sw005320 The CI error has been fixed

xu-gaopeng · 2021-06-10T08:25:09Z

@sw005320 The CI error has been fixed

ShigekiKarita · 2021-06-10T09:32:15Z

espnet/bin/asr_recog.py

+    parser.add_argument(
+        "--quantize-config",
+        nargs="*",
+        help="Quantize config list. E.g.: --quantize-config=[Linear,LSTM,GRU]",


Can you mention that the values must be attributes of torch.nn?

ShigekiKarita

Thanks for your hard work! LGTM again.

sw005320 · 2021-06-10T10:57:49Z

@b-flo, is it OK for you?

sw005320 · 2021-06-15T19:11:18Z

I just merged this PR.
Many thanks, @xu-gaopeng for your great efforts!
@b-flo, if you are not OK with some parts, please continue the discussion here.

b-flo · 2021-09-16T17:22:07Z

@b-flo, if you are not OK with some parts, please continue the discussion here.

Sorry for the (extremely) late answer, I just saw the mention when re-reading the PR..!
I'm OK with the general scheme! I'm currently extending current work to :

Add support for Transducer
Handle some problematic cases: quantization can't be applied to some parts of our implementation depending on the torch versions.
Add missing doc (such as the one requested by @ShigekiKarita)

Also, I wanted to ask: do you know if the feature is widely used or at least useful for some of ESPnet member projects? I'm wondering if we should also support quantization-aware training and/or post-training quantization to handle the loss of information.

mergify bot added ESPnet1 ASR Automatic speech recogntion labels May 12, 2021

sw005320 requested a review from ShigekiKarita May 13, 2021 12:47

sw005320 reviewed May 13, 2021

View reviewed changes

sw005320 added the New Features label May 13, 2021

sw005320 added this to the v.0.9.10 milestone May 13, 2021

ShigekiKarita reviewed May 15, 2021

View reviewed changes

espnet/bin/asr_recog.py Show resolved Hide resolved

xu-gaopeng force-pushed the quantization branch from d7e999d to bb1e80e Compare May 18, 2021 09:42

mergify bot added the CI Travis, Circle CI, etc label May 18, 2021

ShigekiKarita reviewed May 19, 2021

View reviewed changes

ShigekiKarita approved these changes May 19, 2021

View reviewed changes

ShigekiKarita reviewed May 19, 2021

View reviewed changes

xu-gaopeng requested a review from sw005320 May 20, 2021 01:04

ShigekiKarita reviewed May 20, 2021

View reviewed changes

ci/test_integration.sh Outdated Show resolved Hide resolved

sw005320 approved these changes May 20, 2021

View reviewed changes

b-flo reviewed May 20, 2021

View reviewed changes

espnet/asr/pytorch_backend/asr.py Outdated Show resolved Hide resolved

ShigekiKarita reviewed May 22, 2021

View reviewed changes

ci/test_integration.sh Outdated Show resolved Hide resolved

ShigekiKarita reviewed May 22, 2021

View reviewed changes

.github/workflows/ci.yaml Outdated Show resolved Hide resolved

b-flo reviewed May 27, 2021

View reviewed changes

espnet/asr/pytorch_backend/asr.py Outdated Show resolved Hide resolved

espnet/bin/asr_recog.py Outdated Show resolved Hide resolved

kan-bayashi modified the milestones: v.0.9.10, v.0.9.11 May 29, 2021

b-flo requested changes Jun 3, 2021

View reviewed changes

espnet/asr/pytorch_backend/asr.py Outdated Show resolved Hide resolved

espnet/asr/pytorch_backend/recog.py Outdated Show resolved Hide resolved

espnet/asr/pytorch_backend/recog.py Outdated Show resolved Hide resolved

mergify bot added the conflicts label Jun 9, 2021

xu-gaopeng added 3 commits June 10, 2021 10:10

add asr quantize

d26d446

Quantize asr model

7930848

fix q_config

1761a04

xu-gaopeng force-pushed the quantization branch from f57ac97 to 1761a04 Compare June 10, 2021 02:35

mergify bot removed the conflicts label Jun 10, 2021

xu-gaopeng added 3 commits June 10, 2021 11:02

Update asr_recog.py

3ad0863

Add quantize model test api v2

9da2353

Update asr_recog

fc20535

ShigekiKarita reviewed Jun 10, 2021

View reviewed changes

ShigekiKarita approved these changes Jun 10, 2021

View reviewed changes

sw005320 merged commit bfb979c into espnet:master Jun 15, 2021

This was referenced Jun 16, 2021

Remove travis and add .github/workflows/doc.yml to deploy doc #3294

Merged

Upgrade pytorch version to 1.9.0 and drop old ones (< 1.3) #3300

Closed

sw005320 mentioned this pull request Jul 30, 2021

Improve inference speed of TTS #3366

Closed

sw005320 mentioned this pull request Jan 7, 2022

Quantization option #3945

Closed


		from espnet.nets.asr_interface import dynamic_import_asr

		torch = pytest.importorskip("torch")

Conversation

xu-gaopeng commented May 12, 2021

Uh oh!

sw005320 commented May 12, 2021

Uh oh!

xu-gaopeng commented May 13, 2021

Uh oh!

sw005320 commented May 13, 2021

Uh oh!

sw005320 left a comment

Choose a reason for hiding this comment

Uh oh!

codecov bot commented May 14, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

ShigekiKarita commented May 15, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

b-flo commented May 16, 2021

Uh oh!

xu-gaopeng commented May 18, 2021

Uh oh!

ShigekiKarita May 19, 2021

Choose a reason for hiding this comment

Uh oh!

xu-gaopeng May 20, 2021

Choose a reason for hiding this comment

Uh oh!

ShigekiKarita left a comment

Choose a reason for hiding this comment

Uh oh!

ShigekiKarita May 19, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

sw005320 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

b-flo left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

sw005320 commented Jun 3, 2021

Uh oh!

b-flo left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ShigekiKarita commented Jun 8, 2021

Uh oh!

sw005320 commented Jun 9, 2021

Uh oh!

mergify bot commented Jun 9, 2021

Uh oh!

xu-gaopeng commented Jun 10, 2021

Uh oh!

xu-gaopeng commented Jun 10, 2021

Uh oh!

ShigekiKarita Jun 10, 2021

Choose a reason for hiding this comment

Uh oh!

ShigekiKarita left a comment

Choose a reason for hiding this comment

Uh oh!

sw005320 commented Jun 10, 2021

Uh oh!

sw005320 commented Jun 15, 2021

Uh oh!

b-flo commented Sep 16, 2021

Uh oh!

codecov bot commented May 14, 2021 •

edited

Loading

ShigekiKarita commented May 15, 2021 •

edited

Loading

ShigekiKarita May 19, 2021 •

edited

Loading

b-flo left a comment •

edited

Loading