
Dynamic quantization for decoding #3210

Merged
sw005320 merged 6 commits into espnet:master from xu-gaopeng:quantization on Jun 15, 2021

Conversation

@xu-gaopeng (Contributor)

This PR adds dynamic quantization for decoding.

@mergify (bot) added the ESPnet1 and ASR (Automatic speech recognition) labels May 12, 2021
@sw005320 (Contributor)

Thanks!
Did you observe some improvement?

@xu-gaopeng (Contributor, Author)

> Thanks!
> Did you observe some improvement?

Dataset: AISHELL
Config: train_pytorch_transformer.yaml
API: v2
Model size: 117 MB -> 35 MB
CER: 6.52% -> 6.59%
Decoding time: 667 s -> 465 s
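For reference, the roughly 3x size reduction is what int8 dynamic quantization of Linear weights typically gives. Below is a minimal sketch of the underlying PyTorch call, not the PR's exact code; the toy model and file names are placeholders:

```python
import os

import torch

# Toy stand-in for a trained ASR model (placeholder, not an ESPnet model).
model = torch.nn.Sequential(torch.nn.Linear(512, 512), torch.nn.Linear(512, 512))

# Replace float Linear layers with dynamically quantized int8 versions:
# weights are stored as int8, activations are quantized on the fly.
qmodel = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

# Compare serialized sizes, analogous to the 117 MB -> 35 MB figure above.
torch.save(model.state_dict(), "fp32.pt")
torch.save(qmodel.state_dict(), "int8.pt")
print(os.path.getsize("fp32.pt"), os.path.getsize("int8.pt"))
```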

@sw005320 (Contributor)

That's very impressive!
Thanks!
We'll definitely include this PR.
Can you fix some CI errors? e.g. https://github.com/espnet/espnet/pull/3210/checks?check_run_id=2565918709#step:8:8

@sw005320 requested a review from ShigekiKarita May 13, 2021 12:47

@sw005320 added this to the v.0.9.10 milestone May 13, 2021
@codecov (bot) commented May 14, 2021

Codecov Report

Merging #3210 (fc20535) into master (aaa6ed5) will decrease coverage by 0.36%.
The diff coverage is 96.00%.


@@            Coverage Diff             @@
##           master    #3210      +/-   ##
==========================================
- Coverage   80.96%   80.60%   -0.37%     
==========================================
  Files         356      356              
  Lines       30773    30706      -67     
==========================================
- Hits        24916    24751     -165     
- Misses       5857     5955      +98     
| Impacted Files | Coverage Δ |
|---|---|
| espnet/asr/pytorch_backend/asr.py | 58.18% <90.90%> (+0.59%) ⬆️ |
| espnet/asr/pytorch_backend/recog.py | 85.88% <100.00%> (+2.09%) ⬆️ |
| espnet/bin/asr_recog.py | 84.07% <100.00%> (+0.28%) ⬆️ |
| espnet2/asr/ctc.py | 53.84% <0.00%> (-20.00%) ⬇️ |
| espnet/nets/pytorch_backend/ctc.py | 51.53% <0.00%> (-15.39%) ⬇️ |
| espnet/nets/pytorch_backend/transformer/mask.py | 87.50% <0.00%> (-12.50%) ⬇️ |
| espnet2/layers/stft.py | 85.29% <0.00%> (-5.89%) ⬇️ |
| espnet/nets/pytorch_backend/e2e_asr_maskctc.py | 85.83% <0.00%> (-5.16%) ⬇️ |
| ...ets/pytorch_backend/fastspeech/length_regulator.py | 96.00% <0.00%> (-4.00%) ⬇️ |
| espnet/transform/transform_interface.py | 66.66% <0.00%> (-3.34%) ⬇️ |

... and 64 more


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update aaa6ed5...fc20535.

@ShigekiKarita (Member) commented May 15, 2021

Very simple, and good job! I think we can merge this apart from @sw005320's requests, but we need tests. Can you add a new test config next to this one?

./run.sh --python "${python}" --stage 5 --lm-config conf/lm_transformer.yaml --decode-config "$(change_yaml.py conf/decode.yaml -a api=v2)"

I think it could be:

./run.sh --python "${python}" --stage 5 --decode-config "$(change_yaml.py conf/decode.yaml -a quantize-model=true)"

And if you could submit your existing AISHELL config, it would be super helpful.

@b-flo (Member) commented May 16, 2021

For information: using quantization with a transducer model will currently throw an error, because the inputs to JointNetwork's linear layers during decoding have fewer than 2 dimensions.
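A hedged repro of the issue with a toy layer (not ESPnet's actual JointNetwork); on the torch versions discussed here, the dynamic quantized Linear kernel rejects inputs with fewer than 2 dimensions:

```python
import torch

lin = torch.nn.Sequential(torch.nn.Linear(4, 4))
qlin = torch.quantization.quantize_dynamic(lin, {torch.nn.Linear}, dtype=torch.qint8)

x = torch.randn(4)  # 1-D input, as in step-by-step transducer decoding
# qlin(x) throws here; one possible workaround is to add a batch dimension:
y = qlin(x.unsqueeze(0)).squeeze(0)
print(y.shape)  # torch.Size([4])
```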

@mergify mergify bot added the CI Travis, Circle CI, etc label May 18, 2021
@xu-gaopeng (Contributor, Author)

parser.add_argument(
    "--quantize-config",
    type=set,
    default={torch.nn.Linear},
)

@ShigekiKarita (Member)

I guess this does not work when a non-default value is set. It is OK to remove this and support only Linear for the moment.

@xu-gaopeng (Contributor, Author)

nn.Linear, nn.LSTM, nn.GRU, etc.
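For context, a quick repro of why type=set misbehaves for non-default values (assumed argparse behavior, not code from this PR): set() is applied to the raw option string, which splits it into characters.

```python
import argparse

import torch

parser = argparse.ArgumentParser()
parser.add_argument("--quantize-config", type=set, default={torch.nn.Linear})
args = parser.parse_args(["--quantize-config", "torch.nn.LSTM"])
print(args.quantize_config)
# {'t', 'o', 'r', 'c', 'h', '.', 'n', 'L', 'S', 'T', 'M'} -- a set of
# characters, not {torch.nn.LSTM}
```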

@ShigekiKarita (Member) left a comment

LGTM. Thanks!


import pytest

from espnet.nets.asr_interface import dynamic_import_asr

torch = pytest.importorskip("torch")
@ShigekiKarita (Member) commented May 19, 2021

Can you use importorskip in case PyTorch is too old for this feature? For example:

quantization = pytest.importorskip("torch.quantization")

def test_asr_quantize(...):
  ...
  quantization.quantize_dynamic(...)
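A self-contained version of the suggested pattern (a hypothetical test, not the one the PR actually adds):

```python
import pytest

torch = pytest.importorskip("torch")
quantization = pytest.importorskip("torch.quantization")


def test_quantize_dynamic_linear():
    model = torch.nn.Sequential(torch.nn.Linear(8, 8))
    qmodel = quantization.quantize_dynamic(
        model, {torch.nn.Linear}, dtype=torch.qint8
    )
    assert qmodel(torch.randn(2, 8)).shape == (2, 8)
```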

@xu-gaopeng requested a review from sw005320 May 20, 2021 01:04
@sw005320 (Contributor) left a comment

LGTM.
Once we fix the CI issue and address @ShigekiKarita's comments, we can merge this PR.

@b-flo (Member) left a comment

As @ShigekiKarita pointed out, I'm not sure quantize-config works as intended outside the default value. At the very least, I'm a bit confused about how the set should be provided in a YAML config to yield a proper set through argparse.
For example, quantize-config: { torch.nn.LSTM } or quantize-config: !!set { torch.nn.LSTM } yields the following set: { 'S', '{', 'h', 'M', 'n', 'L', 'T', '.', 'i', 't', ',', 'c', ' ', 'e', 'r', '}', 'o', 'a' }

I suppose I'm forgetting some conventional notation, but if not, the following changes could be used for the ASR case (see the sketch below). Note that I took some liberty with the parameter usage compared to the initial proposal.
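The diff itself is not reproduced here; a sketch of the idea (hypothetical helper, not b-flo's actual change) is to pass layer names as plain strings and resolve them against torch.nn:

```python
import torch


def parse_quantize_config(names):
    """Map names like ["Linear", "LSTM"] to a set of torch.nn classes."""
    return {getattr(torch.nn, name) for name in names}


model = torch.nn.Sequential(torch.nn.Linear(4, 4))
qmodel = torch.quantization.quantize_dynamic(
    model, parse_quantize_config(["Linear", "LSTM"]), dtype=torch.qint8
)
```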

@kan-bayashi modified the milestones: v.0.9.10, v.0.9.11 May 29, 2021
@sw005320 (Contributor) commented Jun 3, 2021

@xu-gaopeng, I think I can merge it if you fix the CI error https://github.com/espnet/espnet/pull/3210/checks?check_run_id=2674693218#step:10:268 and address @b-flo's comments.
This PR is really valuable, and we want to use it as a default setup for our inference demo!

@b-flo (Member) left a comment

Following up on the last comment/change (see #3210 (comment)).

@ShigekiKarita (Member)

@sw005320 Because no one can write code with tied hands, I suggest @xu-gaopeng submit a quick PR in advance to unlock the "first-time contributor" limitation that restricts CI trials. For example, a README.md update on this feature?

@sw005320 (Contributor) commented Jun 9, 2021

> @sw005320 Because no one can write code with tied hands, I suggest @xu-gaopeng submit a quick PR in advance to unlock the "first-time contributor" limitation that restricts CI trials. For example, a README.md update on this feature?

Good idea.
@xu-gaopeng, could you make a separate small PR?
We'll merge it soon, and CI trials in this PR will become very smooth.

@mergify (bot) commented Jun 9, 2021

This pull request is now in conflict :(

@mergify (bot) added the conflicts label Jun 9, 2021
@xu-gaopeng (Contributor, Author)

@sw005320 The CI error has been fixed.


parser.add_argument(
    "--quantize-config",
    nargs="*",
    help="Quantize config list. E.g.: --quantize-config=[Linear,LSTM,GRU]",
)

@ShigekiKarita (Member)

Can you mention that the values must be attributes of torch.nn?
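For instance, the help string could read as follows (hypothetical wording, not what was committed); note that with nargs="*" the values are space-separated on the command line rather than written as a bracketed list:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--quantize-config",
    nargs="*",
    help="List of modules to apply dynamic quantization on; values must be "
    "attributes of torch.nn, e.g.: --quantize-config Linear LSTM GRU",
)
```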

@ShigekiKarita (Member) left a comment

Thanks for your hard work! LGTM again.

@sw005320 (Contributor)

@b-flo, is it OK for you?

@sw005320 merged commit bfb979c into espnet:master Jun 15, 2021
@sw005320 (Contributor)

I just merged this PR.
Many thanks, @xu-gaopeng, for your great efforts!
@b-flo, if you are not OK with some parts, please continue the discussion here.

@b-flo (Member) commented Sep 16, 2021

> @b-flo, if you are not OK with some parts, please continue the discussion here.

Sorry for the (extremely) late answer; I just saw the mention when re-reading the PR!
I'm OK with the general scheme! I'm currently extending the current work to:

  1. Add support for Transducer.
  2. Handle some problematic cases: quantization can't be applied to some parts of our implementation, depending on the torch version.
  3. Add missing docs (such as the one requested by @ShigekiKarita).

Also, I wanted to ask: do you know whether the feature is widely used, or at least useful for some ESPnet member projects? I'm wondering if we should also support quantization-aware training and/or post-training quantization to handle the loss of information.

@sw005320 mentioned this pull request Jan 7, 2022

Labels

ASR (Automatic speech recognition), CI (Travis, Circle CI, etc.), ESPnet1, New Features


5 participants