fix export turbomind weight bug #1140
Conversation
Hi @lvhan028 @grimoire @AllentDan, could you help review this PR? Thanks.
This part is fine. Since numpy has no bfloat16 type, the only way to save the weights as a bin file is via a view.
tensor.view only reinterprets the underlying bytes; it cannot perform a numeric type conversion from bf16 to half.
(See lmdeploy/lmdeploy/turbomind/deploy/target_model/base.py, lines 189 to 195 at c48d32d.)
See line 191: this code path was intended to save bfloat16 in the first place.
It seems that we had some misunderstandings. For the base model, the conversion in lmdeploy/src/turbomind/utils/memory_utils.cu (lines 588 to 593 at 2831dc2) is not invoked, and that is why the Medusa weights are wrong. So we either need to add a medusa_weight_type field to the ini file, or simply convert and save the Medusa weights as half directly.
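One of the two options mentioned above could be sketched as an extra key in the generated ini file. This is purely illustrative: only the medusa_weight_type name comes from the discussion; the section name and the other keys are assumptions, not the actual TurboMind config layout.

```ini
; hypothetical sketch of the exported config.ini
[llama]
weight_type = bf16
; proposed new field, so the loader knows the Medusa head
; weights were exported with a different dtype than the base model
medusa_weight_type = fp16
```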
Motivation
Fix a bug when exporting TurboMind weights in bf16.
While developing the Medusa parameter conversion, we referred to the implementation of export_weight and found a bug in how parameters are converted when the dtype is bf16. This fix was proposed by baozhiwei, a new grad in my group.
Modification
Use tensor.to rather than tensor.view.
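To illustrate why the change matters, here is a minimal sketch using numpy's float32/float16 pair as a stand-in (numpy has no bfloat16, as noted in the conversation above): view reinterprets the raw bytes, which changes the element count and yields garbage values, while astype (the numpy analogue of tensor.to) performs a real numeric conversion.

```python
import numpy as np

# Two float32 values (8 bytes total).
x = np.array([1.0, 2.5], dtype=np.float32)

# view reinterprets the same 8 bytes as four float16 values;
# the numbers that come out are meaningless as a conversion.
reinterpreted = x.view(np.float16)

# astype converts each value numerically, preserving shape and meaning.
converted = x.astype(np.float16)

print(reinterpreted.shape)      # (4,) -- element count doubled
print(converted.tolist())       # [1.0, 2.5]
```

The bug in the bf16 export path is exactly this class of error: a byte reinterpretation was used where a value conversion was needed.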
Checklist