On-device deployment: converting Paraformer to the RK framework; desperately seeking an fp16 model or a fine-tuning recipe #1363

Open
Xsx93 opened this issue Feb 8, 2024 · 4 comments
Labels
question Further information is requested

Comments

@Xsx93

Xsx93 commented Feb 8, 2024

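Background: I am exploring how to deploy Paraformer on edge devices, and I want to run inference on the NPU through the RK framework. The RK framework only supports inference with fp16-precision models.
FP16 can represent values in [-65504, 65504], whereas FP32 covers roughly [-3.4×10^{38}, 3.4×10^{38}], so converting the FP32 model directly to an RK model causes overflow (NaN) during inference.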
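
The range argument above can be checked with a small standard-library sketch. It is only an illustration of IEEE 754 half-precision limits, not of how the RK runtime behaves: `to_fp16` is a hypothetical helper, and mapping out-of-range values to infinity is an assumption about overflow handling.

```python
import math
import struct

# Largest finite IEEE 754 half-precision value: (2 - 2**-10) * 2**15
FP16_MAX = (2 - 2**-10) * 2**15

def to_fp16(x: float) -> float:
    """Round x to the nearest fp16 value (hypothetical helper).

    Values too large for fp16 are mapped to +/-inf, which is an assumption
    about how an fp16 pipeline treats overflow.
    """
    try:
        # The "e" format code packs a 16-bit half-precision float.
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except (OverflowError, struct.error):
        return math.copysign(math.inf, x)

in_range = to_fp16(65504.0)    # still representable
overflowed = to_fp16(70000.0)  # beyond the fp16 range
# Once an infinity appears, later arithmetic such as inf - inf yields NaN.
nan_result = overflowed - overflowed

print(FP16_MAX, in_range, overflowed, nan_result)
```

A single activation above 65504 is therefore enough to produce an infinity, and NaN follows as soon as that infinity is combined with another one.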

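I followed the FunASR tutorial at https://github.com/alibaba-damo-academy/FunASR/blob/v0.8.8/funasr/export/README.md for INT8 quantization. However, that approach is dynamic quantization, so values are still dequantized to fp32 when the computation actually runs.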
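
The behaviour described above, where weights are stored as INT8 but the arithmetic still runs in floating point, can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the FunASR export code or any real runtime kernel: `quantize_int8`, `dequantize`, and `linear` are hypothetical names, the quantization is symmetric and per-tensor, and Python floats stand in for fp32.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: floats -> (int8 values, scale)."""
    peak = max(abs(w) for w in weights)
    scale = peak / 127.0 if peak > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map stored int8 values back to floating point."""
    return [v * scale for v in q]

def linear(x, q_weights, scale):
    """Dot product that dequantizes the int8 weights before computing."""
    w = dequantize(q_weights, scale)  # weights return to floating point here
    return sum(xi * wi for xi, wi in zip(x, w))

weights = [0.5, -1.27, 0.02]
q, scale = quantize_int8(weights)
y = linear([1.0, 2.0, 3.0], q, scale)

print(q, scale, y)
```

Only the storage format is INT8 in this sketch. The multiply-accumulate still sees floating-point values, which is why quantizing this way does not by itself yield a model whose intermediate values fit in fp16.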
Question:

Is there a true fp16 model, or a fine-tuning recipe?

@Xsx93 Xsx93 added the question Further information is requested label Feb 8, 2024
@LauraGPT
Collaborator

Maybe you could refer to this code: 33f2d46

@Xsx93
Author

Xsx93 commented Mar 5, 2024

> Maybe you could refer to this code: 33f2d46

Could you explain exactly how to use these two lines of code? And what value should scale have?

@Text2-m

Text2-m commented Mar 21, 2024

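> > Maybe you could refer to this code: 33f2d46
>
> Could you explain exactly how to use these two lines of code? And what value should scale have?

May I ask what your on-device solution is?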

@Xsx93
Author

Xsx93 commented Mar 25, 2024

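> > > Maybe you could refer to this code: 33f2d46
> >
> > Could you explain exactly how to use these two lines of code? And what value should scale have?
>
> May I ask what your on-device solution is?

The 2pass scheme.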
