
[General Quant Linear] Register quant params of general quant linear for friendly post process. #226

Merged 1 commit into AutoGPTQ:main on Aug 4, 2023

Conversation

LeiWang1999
Contributor

Hi all, I have implemented support for loading auto-gptq pre-quantized checkpoints into MLC-LLM (based on release 0.1.0). However, this flow runs into problems with GeneralQuantLinear: in releases later than 0.2.0, the load flow is broken.

From the compiler side, we load params in three different ways: named_buffers(), named_parameters(), and state_dict(). However, because qweight and qzeros are not registered in GeneralQuantLinear, we cannot retrieve them using any of these methods.
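A minimal sketch of the problem (with hypothetical class names, not the actual GeneralQuantLinear source): a tensor stored as a plain attribute on an nn.Module is invisible to state_dict() and named_buffers(), while one added via register_buffer() shows up in both.

```python
# Sketch of why unregistered quant params cannot be retrieved.
# Class names here are illustrative, not the real GeneralQuantLinear.
import torch
import torch.nn as nn

class PlainAttrLinear(nn.Module):
    def __init__(self):
        super().__init__()
        # Ordinary attribute: nn.Module does NOT track it, so it is
        # missing from state_dict(), named_buffers(), named_parameters().
        self.qweight = torch.zeros(4, 4, dtype=torch.int32)

class RegisteredLinear(nn.Module):
    def __init__(self):
        super().__init__()
        # Registered buffer: appears in state_dict() and named_buffers().
        self.register_buffer("qweight", torch.zeros(4, 4, dtype=torch.int32))

print("qweight" in PlainAttrLinear().state_dict())   # False
print("qweight" in RegisteredLinear().state_dict())  # True
```

This is why a downstream loader that only walks named_buffers()/named_parameters()/state_dict() never sees qweight or qzeros unless they are registered.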

This issue prevents us from fully utilizing pre-quantized auto-gptq checkpoints in MLC-LLM. I believe addressing it will significantly improve the flexibility of model loading and post-processing.

Thanks, and please CC @PanQiWei

@PanQiWei (Collaborator) left a comment


This looks good to me, will merge. ❤️

@PanQiWei PanQiWei merged commit 5d8fa85 into AutoGPTQ:main Aug 4, 2023