fail to run model when load low bits instead of load original for qwen #10327

aoke79 · 2024-03-05T08:11:24Z

The error message is as follows:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, xpu:0 and cpu!
Thanks a lot,

Oscilloscope98 · 2024-03-06T01:52:42Z

Hi @aoke79 ,

In your code, is cpu_embedding set to True when you call load_low_bit? If so, this may be due to a bug with cpu_embedding, which we have fixed in bigdl-llm >= 2.5.0b20240123. Would you mind letting us know your bigdl-llm version? If it is below 2.5.0b20240123, you could update to the latest bigdl-llm and try again :)

Please let us know if there are any further updates.
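
To check whether an installed build is at or above the 2.5.0b20240123 threshold mentioned above, something like the following may help. This is only a rough sketch using the Python standard library: the helper names (`version_key`, `has_cpu_embedding_fix`, `installed_bigdl_llm_version`) are made up for illustration and are not part of bigdl-llm, and the parser assumes version strings shaped like `X.Y.Z` or `X.Y.ZbYYYYMMDD`.

```python
import re
from importlib import metadata

# Threshold taken from the reply above; builds at or after it should carry the fix.
FIXED_IN = "2.5.0b20240123"


def version_key(version: str):
    """Turn 'X.Y.Z' or 'X.Y.ZbN' into a tuple that sorts in release order.

    Hypothetical helper: it only understands the two shapes named above.
    """
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)(?:b(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch, beta = match.groups()
    # A final release (no 'b' suffix) sorts after every beta of the same X.Y.Z.
    beta_key = int(beta) if beta is not None else float("inf")
    return (int(major), int(minor), int(patch), beta_key)


def has_cpu_embedding_fix(version: str) -> bool:
    """Return True if `version` is at or above the threshold build."""
    return version_key(version) >= version_key(FIXED_IN)


def installed_bigdl_llm_version():
    """Return the installed bigdl-llm version string, or None if not installed."""
    try:
        return metadata.version("bigdl-llm")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    installed = installed_bigdl_llm_version()
    if installed is None:
        print("bigdl-llm is not installed in this environment")
    elif has_cpu_embedding_fix(installed):
        print(f"bigdl-llm {installed} should already include the fix")
    else:
        print(f"bigdl-llm {installed} is older than {FIXED_IN}; consider upgrading")
```

If the reported version is older than the threshold, upgrading bigdl-llm and retrying the load_low_bit call with cpu_embedding set to True is the suggested next step.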

aoke79 closed this as completed Mar 6, 2024

jason-dai added the user issue label Apr 2, 2024
