Error when running llama2 inference on Ascend NPU #4159

Closed
1 task done
hrz394943230 opened this issue Jun 8, 2024 · 4 comments
Labels
npu: This problem is related to NPU devices
solved: This problem has been already solved

Comments

@hrz394943230

hrz394943230 commented Jun 8, 2024

Reminder

  • I have read the README and searched the existing issues.

System Info

Dependency versions

llamafactory version ==v0.7.1
torch ==2.1
torch_npu == 2.1.0.post3

OS version

Ubuntu 22.04

Machine information

Huawei 800T A2
8 × Ascend 910B cards
CANN toolkit = Ascend-cann-toolkit_8.0.RC1_linux-aarch64.run
CANN kernels = Ascend-cann-kernels-910b_8.0.RC1_linux.run

Reproduction

Command run:
ASCEND_RT_VISIBLE_DEVICES=0 llamafactory-cli chat llama2.yaml
Contents of llama2.yaml:

model_name_or_path: /mnt/nvme1/models/Llama-2-7b-chat-hf/
template: llama2
do_sample: false

The command runs successfully, but it reports that inference is using the CPU rather than the NPU.
image
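
One way to narrow this down is to check whether PyTorch can see the NPU at all in the same environment. The following is a minimal sketch, not part of LLaMA-Factory: the helper name `npu_status` is hypothetical, and it assumes that importing `torch_npu` registers the `torch.npu` namespace, as in the torch_npu 2.1 series listed above.

```python
import importlib.util


def npu_status():
    """Describe whether torch_npu is importable and an NPU is visible.

    Hypothetical helper for illustration; it only reports, it changes nothing.
    """
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    if importlib.util.find_spec("torch_npu") is None:
        return "torch_npu not installed"

    import torch
    import torch_npu  # noqa: F401  (assumed to register the "npu" backend)

    if not torch.npu.is_available():
        return "torch_npu installed, but no NPU visible"
    return f"NPU visible: {torch.npu.device_count()} device(s)"


if __name__ == "__main__":
    print(npu_status())
```

Running it with the same `ASCEND_RT_VISIBLE_DEVICES=0` prefix as the command above should show whether the device is visible to the process before any model is loaded.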

Expected behavior

llama2 inference should run on a single card and on multiple cards.

Others

No response

@hrz394943230
Author

I also tried the CANN toolkit and kernels listed in the README earlier, and got the same error.

@hiyouga
Owner

hiyouga commented Jun 8, 2024

Is the inference speed normal? If it is, the model is running on the NPU; on the CPU it would be extremely slow.

@hrz394943230
Author

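> Is the inference speed normal? If it is, the model is running on the NPU; on the CPU it would be extremely slow.

No, it is not normal: roughly a few tokens per second. htop shows two CPU cores in use. The model has been loaded onto the NPU and device memory is occupied, but the NPU power draw does not change.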

@hiyouga
Owner

hiyouga commented Jun 10, 2024

A few tokens per second should be the normal speed.
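
To compare against that figure, the rate can be measured rather than estimated. The following is a minimal sketch under stated assumptions: `tokens_per_second` and the `generate` callable are hypothetical names, not part of LLaMA-Factory, and `generate` is assumed to return the list of generated tokens for a prompt.

```python
import time


def tokens_per_second(generate, prompt):
    """Time one call to `generate` and return generated tokens per second.

    `generate` is a hypothetical callable: prompt -> list of tokens.
    """
    start = time.perf_counter()
    tokens = generate(prompt)
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        raise ValueError("elapsed time too small to measure")
    return len(tokens) / elapsed


if __name__ == "__main__":
    # Stand-in generator so the sketch runs without a model or an NPU.
    def fake_generate(prompt):
        time.sleep(0.05)
        return prompt.split()

    rate = tokens_per_second(fake_generate, "a b c d e")
    print(f"{rate:.1f} tokens/s")
```

Wrapping the real generation call in such a function on both the NPU and a CPU-only run would give two numbers to compare directly.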

@hiyouga hiyouga added the solved This problem has been already solved label Jun 10, 2024
@hiyouga hiyouga closed this as completed Jun 10, 2024
@hiyouga hiyouga added the npu This problem is related to NPU devices label Jun 19, 2024