A10, quantized 1.8B model: inference fails with "Some weights of the model checkpoint at /Qwen-1.8B-8bit-gptq-1 were not used when initializing Qwen2ForCausalLM" #239

Closed
wellcasa opened this issue Apr 2, 2024 · 2 comments

Comments


wellcasa commented Apr 2, 2024

Versions

auto-gptq                 0.7.1                    pypi_0    pypi
optimum                   1.18.0                   pypi_0    pypi
peft                      0.10.0                   pypi_0    pypi
torch                     2.1.0                    pypi_0    pypi
torchaudio                2.1.1+cu121              pypi_0    pypi
torchvision               0.16.1+cu121             pypi_0    pypi
transformers              4.38.0                   pypi_0    pypi

The quantization code is identical to the official example.

The loading code is identical to the official example; only the model path was changed:

from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the device to move the inputs onto

# Load the GPTQ-quantized model; device_map="auto" dispatches layers to GPU.
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen1.5-7B-Chat-GPTQ-Int8",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-7B-Chat-GPTQ-Int8")

prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt},
]
# Render the chat messages into the model's prompt format.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

generated_ids = model.generate(
    model_inputs.input_ids,
    max_new_tokens=512,
)
# Strip the prompt tokens so only the newly generated tokens remain.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]

Error:

Some weights of the model checkpoint at /home/admin/workspace/aop_lab/app_source/Qwen-1.8B-8bit-gptq-1 were not used when initializing Qwen2ForCausalLM: ['model.layers.0.mlp.down_proj.bias', 'model.layers.0.mlp.down_proj.g_idx', 'model.layers.0.mlp.down_proj.qweight', 'model.layers.0.mlp.down_proj.qzeros', 'model.layers.0.mlp.down_proj.scales', 'model.layers.0.mlp.gate_proj.bias', 'model.layers.0.mlp.gate_proj.g_idx', 'model.layers.0.mlp.gate_proj.qweight', 'model.layers.0.mlp.gate_proj.qzeros', 'model.layers.0.mlp.gate_proj.scales', 'model.layers.0.mlp.up_proj.bias', 'model.layers.0.mlp.up_proj.g_idx', 'model.layers.0.mlp.up_proj.qweight', 'model.layers.0.mlp.up_proj.qzeros', 'model.layers.0.mlp.up_proj.scales', 'model.layers.0.self_attn.k_proj.g_idx', 'model.layers.0.self_attn.k_proj.qweight', 'model.layers.0.self_attn.k_proj.qzeros', 'model.layers.0.self_attn.k_proj.scales', 'model.layers.0.self_attn.o_proj.bias', 'model.layers.0.self_attn.o_proj.g_idx', 'model.layers.0.self_attn.o_proj.qweight', 'model.layers.0.self_attn.o_proj.qzeros', 'model.layers.0.self_attn.o_proj.scales', 'model.layers.0.self_attn.q_proj.g_idx', 'model.layers.0.self_attn.q_proj.qweight', 'model.layers.0.self_attn.q_proj.qzeros', 'model.layers.0.self_attn.q_proj.scales', 'model.layers.0.self_attn.v_proj.g_idx', 'model.layers.0.self_attn.v_proj.qweight', 'model.layers.0.self_attn.v_proj.qzeros', 'mode... (truncated)

Inference then produces garbled output (expected if the quantized weights were never actually loaded):

{"output":{"text":"ழ珊产荷苍霜姗dimsROLROL铢兆物质霜霜脆xmin霜霜痼兆篓碎片沉产寒铢珊ROLdims識霜霜霜霜ROL霜xmin姗产霜震动dims霜霜荷萎 salt霜 heb执抽奖霜dims珊dimsdims葫芦霜霜降荷葫芦滨海兆削减霜瑞丑跚()\r\n\r\n\r\nScenario�珊晰()\r\n\r\n\r\n霜dims_consumer霜 salt震动荷户滨海产软罕 responseType姗雨水产境dims霜寓.handleSubmit霜跚霜霜 Closing荷跚户产hope峻产dims产_lit葫芦降dims产霜觳霜削减跚姗idata霜达不到滨海 salt霜霜荷dims确hope霜霜dimsdimsdims雨水dims产产荷 Closingdims霜产hopeGov滨海 Closing霜产自助邻产降霜霜霜dimshope软荷霜 salt产霜dims霜hopedims软境寓削减 responseType痼霜葫芦Govhopedims削减痼hope坚定不移hope降dimsdimsdims产dims滨海dimsdims荷霜产产霜 hopedims剖产hopedimshopedims ClosingPairs霜霜窕窠Gov产加重寓寓远方dimshope削减hope瞭产_attachments荷痼 Humans霜产 responseType葫芦�产dims产 responseTypeClosing产产产 responseTypedims产.codehaus葫芦窕dims户降dims霜降hope.chk霜产霜dimsGov罕dims葫芦dims霜寓霜hope窕寓寓dimsdims产降车库Gov寓产hope产产自助瞭 responseType自助产自助 Closinghope计较UserData削减再到dims霜_CHOICES自助hope自助产ower削减dims产_lit葫芦瞭_lit坚定不移hope窕寓坚定不移产 debugger Closing产dims产hope自助瞭产hope模境owehope compañero产Goviphertext工具两端葫芦寓_lithope产商用产产-collapse Closing Closing升降Gov产dimsGovhope廉霜Gov产霜_lit产瞭hope霜产产墒寓产 responseType葫芦降ledgehope坚定不移hopedimsGov_lit寓坚定不移Govhope論hope远方hopehopedims霜窕葫芦从严Gov寓hopehopehope霜寓.astype产产 Closinghopehope Closing产hopehopehopehopeGov自助计较�产寓葫芦_lit窕计较寓产自助产hope葫芦hopehope削减产hope坚定不移寓hope霜hope竿邰霜hopehope寓产计较霜hope坚定不移葫芦葫芦瞭墒产hope霜产霜 validationResulthopehope寓_lit削减_lit坚定不移_lithope墒墒神葫芦霜hope产侥 Addresseshopehope两端encode自助hope哩坚定不移 hope窠窠廉hope产hope两端hopehopeemiGovhope自助自助葫芦寓产hope瞭两端hope产计较产","finish_reason":"stop"},"usage":{"output_tokens":512,"input_tokens":27,"cost_time":11.28798532485962,"token_rate":45.35796116534794,"start_time":"2024-04-02 13:52:04","money":0.0,"trace_id":null,"error":null}
wellcasa changed the title from "A10, quantized 14B model: inference fails with Some weights of the model checkpoint at /Qwen-1.8B-8bit-gptq-1 were not used when initializing Qwen2ForCausalLM" to "A10, quantized 1.8B model: inference fails with Some weights of the model checkpoint at /Qwen-1.8B-8bit-gptq-1 were not used when initializing Qwen2ForCausalLM" on Apr 2, 2024

wellcasa (Author) commented Apr 2, 2024

Solved it: I overwrote my checkpoint's config.json with the config.json that ships with the official GPTQ model, and inference works.
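
For context: transformers decides whether to build GPTQ layers (with qweight/qzeros/scales parameters) from the quantization_config block in config.json, and the config shipped with the official GPTQ checkpoints carries that block; without it, the quantized tensors are skipped exactly as the warning above reports. A minimal sketch of inspecting and patching a local config, with a placeholder path and illustrative field values (neither taken from this issue):

import json

# Placeholder path to the locally quantized checkpoint.
ckpt = "/path/to/Qwen-1.8B-8bit-gptq-1"

with open(f"{ckpt}/config.json") as f:
    cfg = json.load(f)

# If this prints None, transformers builds ordinary fp layers and ignores
# every GPTQ tensor, which produces the "weights were not used" warning.
print(cfg.get("quantization_config"))

# Illustrative GPTQ block (field values are assumptions, not from the issue):
cfg.setdefault("quantization_config", {
    "quant_method": "gptq",
    "bits": 8,
    "group_size": 128,
    "desc_act": False,
})

with open(f"{ckpt}/config.json", "w") as f:
    json.dump(cfg, f, indent=2)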


jklj077 (Collaborator) commented Apr 2, 2024

"The loading code is identical to the official example; only the model path was changed"

If you wish to directly use the output checkpoint from auto-gptq, please refer to the script given at https://qwen.readthedocs.io/en/latest/quantization/gptq.html. In short, you will need to use AutoGPTQForCausalLM.from_pretrained.
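
For completeness, a minimal sketch of loading a raw auto-gptq output directory through the auto_gptq API; auto-gptq's loader for already-quantized checkpoints is from_quantized, which reads the quantize_config.json the library writes alongside the weights (the path below is a placeholder; the linked Qwen docs remain the authoritative script):

from auto_gptq import AutoGPTQForCausalLM
from transformers import AutoTokenizer

# Placeholder path to the directory produced by auto-gptq.
ckpt = "/path/to/Qwen-1.8B-8bit-gptq-1"

tokenizer = AutoTokenizer.from_pretrained(ckpt)
# from_quantized rebuilds the quantized layers from quantize_config.json,
# so the transformers-style quantization_config in config.json is not needed.
model = AutoGPTQForCausalLM.from_quantized(ckpt, device="cuda:0")

inputs = tokenizer("Give me a short introduction to large language models.",
                   return_tensors="pt").to("cuda:0")
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))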

jklj077 closed this as completed Apr 2, 2024