[Feature] Merge lmdeploy lite calibrate and lmdeploy lite auto_awq #849
Will it affect KV8 quantization?
docs/en/w4a16.md (outdated)
```shell
lmdeploy lite calibrate \
```
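Changing the API or usage arbitrarily will draw complaints from users. If it really has to change, keep the old usage and give it a deprecation date.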
This change does not affect the original usage. calibrate is still kept, and the original lmdeploy calibrate + lmdeploy auto_awq workflow can still be used.
d55e896 to 680e3b0
It will not affect the usage of KV8 or the original W4A16.
LGTM
Before this PR, AWQ quantization required two commands.
With this PR, AWQ quantization needs only one command:
lmdeploy lite auto_awq $HF_MODEL
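
The before/after workflow can be sketched as a dry run that only prints the commands and does not execute them. The subcommand names come from this PR. Passing the model to `calibrate` as a positional argument is an assumption that mirrors the `auto_awq` line above. Further flags (work directory, calibration dataset, bit width) are left out because they are not stated in this conversation. `awq_commands` is a hypothetical helper name, not part of lmdeploy.

```shell
#!/bin/sh
# Dry run: print the commands for each workflow instead of executing them.
# awq_commands is a hypothetical helper, not part of lmdeploy.
awq_commands() {
    mode=$1      # "before" or "after" this PR
    model=$2     # path or name of the Hugging Face model
    case "$mode" in
        before)
            # Two commands: calibrate first, then quantize.
            # Assumption: calibrate takes the model the same way auto_awq does.
            echo "lmdeploy lite calibrate $model"
            echo "lmdeploy lite auto_awq $model"
            ;;
        after)
            # One command, as described in this PR.
            echo "lmdeploy lite auto_awq $model"
            ;;
        *)
            echo "usage: awq_commands before|after MODEL" >&2
            return 1
            ;;
    esac
}

# Fall back to the literal placeholder text when HF_MODEL is not set.
awq_commands before "${HF_MODEL:-\$HF_MODEL}"
awq_commands after "${HF_MODEL:-\$HF_MODEL}"
```

Per the review discussion, the two-command form stays available, so scripts that still use the "before" sequence should keep working.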