Fail to run model when loading low-bit weights instead of the original model for Qwen #10327

Closed
aoke79 opened this issue Mar 5, 2024 · 1 comment

Comments


aoke79 commented Mar 5, 2024

The error message is shown below:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, xpu:0 and cpu!
Thanks a lot.

Contributor

Oscilloscope98 commented Mar 6, 2024

Hi @aoke79,

In your code, when using load_low_bit, is cpu_embedding set to True? If so, this may be due to a bug with cpu_embedding, which we have fixed in bigdl-llm >= 2.5.0b20240123. Would you mind letting us know your bigdl-llm version? If it is below 2.5.0b20240123, you could update to the latest bigdl-llm and try again :)

Please let us know if there are any further updates.
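
For anyone hitting the same error, a quick way to check whether the installed build is at or above 2.5.0b20240123 is sketched below. This is only a rough sketch: the helper names (version_key, has_cpu_embedding_fix, installed_bigdl_llm_version) are hypothetical and not part of bigdl-llm, and it assumes version strings look like X.Y.Z or X.Y.ZbYYYYMMDD. It is not a full PEP 440 parser.

```python
import re
from importlib import metadata

# First build reported in this thread to contain the cpu_embedding fix.
FIXED_IN = "2.5.0b20240123"

# Assumption: versions look like "X.Y.Z" or "X.Y.ZbN" (nightly beta builds).
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:b(\d+))?$")


def version_key(version):
    """Turn 'X.Y.Z' or 'X.Y.ZbN' into a tuple that sorts in release order."""
    match = _VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch, beta = match.groups()
    # A final release sorts after every beta of the same X.Y.Z.
    is_final = 1 if beta is None else 0
    return (int(major), int(minor), int(patch), is_final, int(beta or 0))


def has_cpu_embedding_fix(version):
    """True if `version` is at or above the build named in FIXED_IN."""
    return version_key(version) >= version_key(FIXED_IN)


def installed_bigdl_llm_version():
    """Return the installed bigdl-llm version string, or None if absent."""
    try:
        return metadata.version("bigdl-llm")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    current = installed_bigdl_llm_version()
    if current is None:
        print("bigdl-llm is not installed in this environment")
    elif has_cpu_embedding_fix(current):
        print(f"bigdl-llm {current} should already include the fix")
    else:
        print(f"bigdl-llm {current} is older than {FIXED_IN}; consider upgrading")
```

If the check reports an older build, upgrading bigdl-llm and retrying load_low_bit with cpu_embedding=True is the step suggested above.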
