[General Quant Linear] Register quant params of general quant linear for friendly post process. #226

LeiWang1999 · 2023-08-03T05:24:12Z

Hi all, I have implemented support for loading auto-gptq pre-quantized checkpoints into MLC-LLM (based on release 0.1.0). However, this load flow runs into problems with the GeneralQuantLinear in releases later than 0.2.0.

On the compiler side, we load params in three different ways: named_buffers(), named_parameters(), and state_dict(). However, qweight and qzeros are not registered in GeneralQuantLinear, so we cannot retrieve them using any of these methods.
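To illustrate the point, here is a minimal sketch of the behavior in plain PyTorch. It is not the actual GeneralQuantLinear code: the two classes, the `visible_names` helper, and the tensor shapes are hypothetical and chosen only for demonstration. It shows that a tensor assigned as a plain attribute is invisible to all three lookups, while one registered with `register_buffer` is returned by `named_buffers()` and `state_dict()`.

```python
import torch
import torch.nn as nn


class PlainAttrLinear(nn.Module):
    """Hypothetical layer that stores quant params as plain attributes."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        # Plain tensor attributes are not tracked by nn.Module.
        # Shapes are illustrative assumptions, not the real packing layout.
        self.qweight = torch.zeros((in_features // 8, out_features), dtype=torch.int32)
        self.qzeros = torch.zeros((1, out_features // 8), dtype=torch.int32)


class RegisteredLinear(nn.Module):
    """Hypothetical layer that registers the same tensors as buffers."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        # register_buffer makes the tensors part of the module state.
        self.register_buffer(
            "qweight", torch.zeros((in_features // 8, out_features), dtype=torch.int32)
        )
        self.register_buffer(
            "qzeros", torch.zeros((1, out_features // 8), dtype=torch.int32)
        )


def visible_names(module: nn.Module) -> list:
    """Collect every tensor name reachable through the three lookups."""
    names = set(name for name, _ in module.named_buffers())
    names |= set(name for name, _ in module.named_parameters())
    names |= set(module.state_dict().keys())
    return sorted(names)


if __name__ == "__main__":
    print(visible_names(PlainAttrLinear(16, 16)))
    print(visible_names(RegisteredLinear(16, 16)))
```

With the plain-attribute version the helper finds nothing, which matches the loading failure described above; with the registered version both qweight and qzeros show up.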

This issue prevents us from fully utilizing the pre-quantized auto-gptq checkpoints in MLC-LLM. I believe addressing this issue will significantly enhance the functionality and flexibility of our model loading or post processing.

Thanks, and please CC @PanQiWei

PanQiWei

This looks good to me, will merge. ❤️

regist buffer of general quant linear

a0de5c2

PanQiWei approved these changes Aug 4, 2023

PanQiWei merged commit 5d8fa85 into AutoGPTQ:main Aug 4, 2023
