Dynamic quantization for decoding #3210
Conversation
Thanks!

That's very impressive!
sw005320
left a comment
- Can you also add it to the LM?
- Can you also add it to espnet2? I believe you can do it at https://github.com/espnet/espnet/blob/master/espnet2/bin/asr_inference.py
- This is a great function, and it would be better to mention it in our documentation, e.g., by adding a "fast inference" section to https://github.com/espnet/espnet/blob/master/doc/tutorial.md. I think I'll do it after this PR is merged and will ask you to check it.
Codecov Report

```
@@            Coverage Diff             @@
##           master    #3210      +/-   ##
==========================================
- Coverage   80.96%   80.60%   -0.37%
==========================================
  Files         356      356
  Lines       30773    30706      -67
==========================================
- Hits        24916    24751     -165
- Misses       5857     5955      +98
```

Continue to review the full report at Codecov.
Very simple, good job! I think we can merge this without @sw005320's requests, but we need tests. Can you add a new test config next to this (line 20 in 08feae5)? I think it can be `./run.sh --python "${python}" --stage 5 --decode-config "$(change_yaml.py conf/decode.yaml -a quantize-model=true)"`. And if you can submit your existing AISHELL config, it would be super helpful.
For information: using quantization with a transducer model will currently throw an error, because the inputs to JointNetwork's linear layers during decoding have fewer than 2 dimensions.
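As an illustration of this limitation, here is a hypothetical minimal repro (not espnet code): dynamically quantized Linear kernels expect inputs with at least 2 dimensions, so a 1-D decoding-time input needs an extra leading dimension. The model and workaround below are assumptions for illustration only.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the transducer's JointNetwork.
joint = nn.Sequential(nn.Linear(4, 2))

# Dynamic quantization converts the Linear weights to int8.
qjoint = torch.quantization.quantize_dynamic(joint, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(4)          # 1-D input, as produced step by step during decoding
y = qjoint(x.unsqueeze(0))  # adding a leading dimension is one possible workaround
print(y.shape)              # torch.Size([1, 2])
```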
espnet/bin/asr_recog.py (Outdated)

```python
parser.add_argument(
    "--quantize-config",
    type=set,
    default={torch.nn.Linear},
```
I guess this does not work when a non-default value is set. It is OK to remove this and support only Linear at the moment.
nn.Linear, nn.LSTM, nn.GRU, etc.
```python
from espnet.nets.asr_interface import dynamic_import_asr
```

```python
torch = pytest.importorskip("torch")
```
Can you use importorskip so the test is skipped if PyTorch is too old for the feature? For example:

```python
quantization = pytest.importorskip("torch.quantization")

def test_asr_quantize(...):
    ...
    quantization.quantize_dynamic(...)
```
sw005320
left a comment
LGTM.
Once we fix the CI issue and reflect @ShigekiKarita's comments, we can merge this PR.
As @ShigekiKarita pointed out, I'm not sure quantize-config works as intended with anything other than the default value. Or at least, I'm a bit confused about how the set should be written in the yaml config to yield a proper set through argparse.
For example, `quantize-config: { torch.nn.LSTM }` or `quantize-config: !!set { torch.nn.LSTM }` yields the following set: `{ 'S', '{', 'h', 'M', 'n', 'L', 'T', '.', 'i', 't', ',', 'c', ' ', 'e', 'r', '}', 'o', 'a' }`.
I suppose I'm forgetting a conventional notation, but if not, the following changes could be used for the ASR case. Note that I took some liberty with the parameter usage compared to the initial proposal.
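The character-splitting behavior can be reproduced with plain argparse, independent of espnet or yaml (a minimal sketch; the argument values below are illustrative): `type=set` applied to a string simply builds a set of its characters.

```python
import argparse

parser = argparse.ArgumentParser()
# type=set calls set() on the raw string value, which splits it into
# characters instead of resolving it to a set of torch.nn classes.
parser.add_argument("--quantize-config", type=set, default={"Linear"})

args = parser.parse_args(["--quantize-config", "torch.nn.LSTM"])
print(args.quantize_config)  # an unordered set of single characters
```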
@xu-gaopeng, I think I can merge it if you fix the CI error (https://github.com/espnet/espnet/pull/3210/checks?check_run_id=2674693218#step:10:268) and address @b-flo's comments.
b-flo
left a comment
Following up on the last comment/change (see #3210 (comment)).
@sw005320 Because no one can write code with tied hands, I suggest @xu-gaopeng submit a quick PR in advance to unlock the "first-time contributor" limitation that restricts CI trials. For example, a README.md update on this feature?
Good idea.

This pull request is now in conflict :(
@sw005320 The CI error has been fixed.
```python
parser.add_argument(
    "--quantize-config",
    nargs="*",
    help="Quantize config list. E.g.: --quantize-config=[Linear,LSTM,GRU]",
```
Can you mention that the values must be attributes of torch.nn?
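A sketch of how the `nargs="*"` variant behaves, and how the names could then be resolved against torch.nn (the `getattr` resolution is an assumption about the intent, not the exact merged code):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--quantize-config",
    nargs="*",
    help="List of torch.nn module names to quantize (e.g. Linear LSTM GRU). "
    "The values must be attributes of torch.nn.",
)

args = parser.parse_args(["--quantize-config", "Linear", "LSTM"])
print(args.quantize_config)  # ['Linear', 'LSTM']

# The names would then be resolved to classes, e.g. (assumed, not the exact code):
#   quantize_config = {getattr(torch.nn, name) for name in args.quantize_config}
```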
ShigekiKarita
left a comment
Thanks for your hard work! LGTM again.
@b-flo, is it OK for you?

I just merged this PR.
Sorry for the (extremely) late answer, I only just saw the mention when re-reading the PR!
Also, I wanted to ask: do you know if the feature is widely used, or at least useful for some ESPnet member projects? I'm wondering if we should also support quantization-aware training and/or post-training quantization to handle the loss of information.
This PR adds dynamic quantization for decoding.
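As a summary of the technique, here is a minimal sketch of what decode-time dynamic quantization does, assuming the standard PyTorch API rather than the exact espnet wiring: weights of the selected layer types are stored in int8 and dequantized on the fly, while activations stay in float.

```python
import torch
import torch.nn as nn

# Toy stand-in for a decoder; real ESPnet models are loaded from a snapshot.
model = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 4))

# Quantize only Linear layers, matching the default quantize-config.
qmodel = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

with torch.no_grad():
    out = qmodel(torch.randn(1, 8))
print(out.shape)  # torch.Size([1, 4])
```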