
Efficient Conformer implementation #1636

Merged · 5 commits merged into wenet-e2e:main on Jan 4, 2023

Conversation

zwglory
Contributor

@zwglory zwglory commented Dec 26, 2022

This PR is about our implementation of Efficient Conformer for WeNet encoder structure and runtime.

At 58.com Inc, Efficient Conformer reduces CER by a relative 6% compared to Conformer and speeds up inference by 10% (CPU JIT runtime). Combined with int8 quantization, inference speed improves by 50–70%. More details on our work: https://mp.weixin.qq.com/s/7T1gnNrVmKIDvQ03etltGQ
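As a rough illustration of the int8 quantization step mentioned above (a minimal sketch, not the actual 58.com deployment: the toy `model` below is a stand-in, not the WeNet encoder), PyTorch's dynamic quantization converts `Linear` layers to int8 before export:

```python
import torch
import torch.nn as nn

# Toy stand-in for an encoder stack; the real model would be the
# WeNet/Efficient Conformer encoder before JIT export.
model = nn.Sequential(nn.Linear(80, 256), nn.ReLU(), nn.Linear(256, 256))

# Dynamic int8 quantization of all Linear layers: weights are stored
# as int8, activations are quantized on the fly at inference time.
qmodel = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 10, 80)   # (batch, time, feature)
y = qmodel(x)
print(y.shape)               # torch.Size([1, 10, 256])
```

Dynamic quantization needs no calibration data, which makes it a common choice for CPU runtime deployment of speech models.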

Added features

  • Efficient Conformer Encoder structure
    • StrideConformerEncoderLayer for "Progressive Downsampling to the Conformer encoder"
    • GroupedRelPositionMultiHeadedAttention for "Grouped Attention"
    • Conv2dSubsampling2 for 1/2 Convolution Downsampling
  • Recognize and JIT export
    • forward_chunk and forward_chunk_by_chunk in wenet/efficient_conformer/encoder.py
  • Streaming inference at JIT runtime
    • TorchAsrModelEfficient in runtime/core/decoder for Progressive Downsampling
  • Configuration file of Aishell-1
    • train_u2++_efficonformer_v1.yaml for our online deployment
    • train_u2++_efficonformer_v2.yaml for Original paper
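The 1/2 convolutional downsampling behind Conv2dSubsampling2 can be sketched as below. This is a hypothetical simplification, not the WeNet implementation, which additionally subsamples the padding mask and applies positional encoding:

```python
import torch
import torch.nn as nn

class Conv2dSubsampling2(nn.Module):
    """Sketch: halve the time axis with one strided 2-D convolution."""

    def __init__(self, idim: int, odim: int):
        super().__init__()
        # stride=2 halves both the time and feature axes.
        self.conv = nn.Sequential(
            nn.Conv2d(1, odim, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
        )
        # Flatten (channels x reduced feature dim) back to the model dim.
        self.out = nn.Linear(odim * ((idim + 1) // 2), odim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, feat) -> (batch, 1, time, feat)
        x = self.conv(x.unsqueeze(1))   # (batch, odim, time/2, feat/2)
        b, c, t, f = x.size()
        return self.out(x.transpose(1, 2).contiguous().view(b, t, c * f))

x = torch.randn(2, 100, 80)             # 100 frames of 80-dim features
y = Conv2dSubsampling2(80, 256)(x)
print(y.shape)                          # torch.Size([2, 50, 256])
```

A 1/2 rate (instead of the usual 1/4 of Conv2dSubsampling4) leaves more frames for the later progressive downsampling stages inside the encoder.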

Developers

  • Efficient Conformer Encoder structure: ( Yaru Wang & Wei Zhou )
  • Recognize and JIT export: ( Wei Zhou )
  • Streaming inference at JIT runtime: ( Yongze Li )
  • Configuration file of Aishell-1: ( Wei Zhou )

TODO

  • ONNX export and runtime
  • Aishell-1 experiment

@xingchensong
Member

  1. At a first pass, it looks like a lot of this could reuse existing code. In particular, if the "att_cache_shape" / "cnn_cache_shape" handling were moved inside the model, all of the runtime modifications could be dropped. The squeezeformer PR kept the existing forward_chunk interface, and squeezeformer also has to handle different cache shapes across layers internally, so it may be a useful reference.
  2. In the wenet/efficient_conformer folder, MultiHeadedAttention and RelPositionMultiHeadedAttention in attention.py should not need to be re-implemented (they are nearly identical to wenet.transformer.MultiHeadAttention); the small changes (mainly some padding in forward_attention) can be made by overriding in GroupedRelPositionMultiHeadedAttention.
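The second suggestion can be sketched as follows (a hypothetical, simplified stand-in for the actual WeNet classes): keep the base attention class and override only the padding-sensitive `forward_attention`:

```python
import torch
from torch import nn

# Simplified stand-in for wenet.transformer.attention.MultiHeadedAttention.
class MultiHeadedAttention(nn.Module):
    def forward_attention(self, value, scores, mask=None):
        # Standard softmax weighting over precomputed attention scores.
        if mask is not None:
            scores = scores.masked_fill(mask, float("-inf"))
        attn = torch.softmax(scores, dim=-1)
        return torch.matmul(attn, value)

# Reuse the base class; override only the part that needs the
# grouped-attention-specific padding (sketch only).
class GroupedRelPositionMultiHeadedAttention(MultiHeadedAttention):
    def forward_attention(self, value, scores, mask=None):
        # Group-size padding/trimming of scores would happen here.
        return super().forward_attention(value, scores, mask)

v = torch.randn(2, 4, 10, 16)   # (batch, head, time, d_k)
s = torch.randn(2, 4, 10, 10)   # attention scores
out = GroupedRelPositionMultiHeadedAttention().forward_attention(v, s)
print(out.shape)                # torch.Size([2, 4, 10, 16])
```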

@zwglory
Contributor Author

zwglory commented Dec 26, 2022

> 1. At a first pass, it looks like a lot of this could reuse existing code. In particular, if the "att_cache_shape" / "cnn_cache_shape" handling were moved inside the model, all of the runtime modifications could be dropped. The squeezeformer PR kept the existing forward_chunk interface, and squeezeformer also has to handle different cache shapes across layers internally, so it may be a useful reference.
> 2. In the wenet/efficient_conformer folder, MultiHeadedAttention and RelPositionMultiHeadedAttention in attention.py should not need to be re-implemented (they are nearly identical to wenet.transformer.MultiHeadAttention); the small changes (mainly some padding in forward_attention) can be made by overriding in GroupedRelPositionMultiHeadedAttention.

@xingchensong Thanks for the suggestions. I will take a closer look at the cache shape issue.

zwglory and others added 3 commits December 30, 2022 10:51
…e changes. Completed the causal and non-causal convolution model tests for the Efficient Conformer, as well as JIT runtime tests. Modified the yaml files for Aishell-1.
@xingchensong xingchensong merged commit 7427258 into wenet-e2e:main Jan 4, 2023
@xingchensong
Member

THX!

@KakayaLin

@zwglory Is there a pretrained model available for download and testing? Thank you.

@zwglory
Contributor Author

zwglory commented Mar 14, 2023

> @zwglory Is there a pretrained model available for download and testing? Thank you.

@KakayaLin Yes. We will post an update here once it is uploaded.

@KakayaLin

@zwglory Sorry to bother you, is there any update on the upload? Thank you.

@zwglory
Contributor Author

zwglory commented Mar 20, 2023

> @zwglory Is there a pretrained model available for download and testing? Thank you.
>
> @KakayaLin Yes. We will post an update here once it is uploaded.

@KakayaLin The AISHELL-1 model link is as follows; it will also be added to the relevant README later.

@KakayaLin

> @zwglory Is there a pretrained model available for download and testing? Thank you.
>
> @KakayaLin Yes. We will post an update here once it is uploaded.
>
> @KakayaLin The AISHELL-1 model link is as follows; it will also be added to the relevant README later.

Thank you!!

@dipeshhoncho07

Can we incorporate LM in efficient conformer?

@zwglory
Contributor Author

zwglory commented May 9, 2023

> Can we incorporate LM in efficient conformer?

@dipeshhoncho07 Yes, Efficient Conformer supports an LM in the runtime.

@bourne979

Hi, @zwglory, do you have an update on onnx cpu export? Thanks.

@zwglory
Contributor Author

zwglory commented Oct 26, 2023

> Hi, @zwglory, do you have an update on onnx cpu export? Thanks.

You can refer to this description to try it out: #1918 (comment). We will follow up on this part of the feature when we have time.
