I quantized a LLaMA 33B model with decoupleQ, and inference fails with the following error:
```
Traceback (most recent call last):
  File "/mnt/afs/quantization/test/decoupleQ/llama.py", line 476, in <module>
    model_output = model.generate(input_token_ids_tensor, max_length=40, do_sample=False)
  ...
  File "/mnt/afs/quantization/test/decoupleQ/decoupleQ/linear_w2a16.py", line 36, in forward
    output = dQ_asymm_qw2_gemm(input, self.weight, self.scale, self.zp, self.bias, self.group_size)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Unsupported compute type Float
```
Have you run into this issue before?
Thanks!
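
For reference, the error message suggests the fused W2A16 GEMM kernel rejects fp32 activations, so one possible workaround is to force the model into half precision before generation. This is only a sketch under that assumption; `run_inference`, `model`, and `tokenizer` are placeholders, not decoupleQ APIs:

```python
import torch

# Sketch of a possible workaround, assuming "Unsupported compute type Float" means
# the fused W2A16 kernel only accepts half-precision activations. `model` and
# `tokenizer` are placeholders for however the decoupleQ checkpoint is loaded.
def run_inference(model, tokenizer, prompt: str):
    model = model.half().cuda().eval()  # force fp16 activations on GPU
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.cuda()
    with torch.no_grad():
        return model.generate(input_ids, max_length=40, do_sample=False)
```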
@ChuanhongLi Did the inputs at some intermediate step of inference become all NaN?
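
One way to check this is to register forward hooks that flag NaN activations. A rough sketch, not part of the decoupleQ code; adapt the module filtering to your model:

```python
import torch

def install_nan_hooks(model):
    """Print a message whenever a module sees or produces NaN tensors."""
    def hook(module, inputs, output):
        for i, t in enumerate(inputs):
            if torch.is_tensor(t) and t.is_floating_point() and torch.isnan(t).all():
                print(f"[NaN] input {i} of {module.__class__.__name__} is all NaN")
        if torch.is_tensor(output) and output.is_floating_point() and torch.isnan(output).any():
            print(f"[NaN] output of {module.__class__.__name__} contains NaN")
    # Register the hook on every submodule; keep the handles so they can be removed later.
    return [m.register_forward_hook(hook) for m in model.modules()]
```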
I'm not entirely sure; the quantized models all have some issues at the moment. #8