rpc error: code = Unknown desc = failed to find gpu id #10
Comments
OK, let me give it a try. |
I ran into the same problem too
|
@Thor-wl is there any progress on this issue? |
Please check: if you did not do a fresh install of the latest Volcano and only replaced the volcano-scheduler image, you may have missed steps 2) and 3). In my test environment it works well:
|
@wpeng102 It still does not work after following your steps. My Volcano version is 1.0.1 and my volcano-device-plugin version is 1.0.1. |
@jxfruit could you help paste your |
@wpeng102 volcano-scheduler-configmap, the scheduler log and test yaml are here |
I found that the field "volcano.sh/gpu-memory" cannot take a value like "1024Mi". After I changed it from "1024Mi" to 1024, the job runs. But when I deployed 2 jobs on the same node, only one process ran on the GPU node, and the other job's logs showed an OOM error. |
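For anyone landing here later, a minimal sketch of a pod spec using the plain-integer form described above. The pod name, container name, image, and command are placeholders and not taken from this thread; only the `volcano.sh/gpu-memory: 1024` form comes from the comment above, and `schedulerName: volcano` is assumed for a standard Volcano install.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-share-example        # hypothetical name
spec:
  schedulerName: volcano         # assumption: pod is scheduled by Volcano
  containers:
    - name: cuda-container       # hypothetical name
      image: nvidia/cuda:9.0-devel   # placeholder image
      command: ["sleep"]
      args: ["100000"]
      resources:
        limits:
          # Plain integer, not a quantity string such as "1024Mi"
          volcano.sh/gpu-memory: 1024
```

Note that this only covers how the request is written. It does not enforce the limit inside the container, which is consistent with the OOM seen when two jobs share a node.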
Talked with jxfruit: this version does not support hard isolation of GPU memory. Refer to https://github.com/volcano-sh/devices#docs |
Thanks, dalaos. |
/close |
Maybe this is a bug.
The yaml is
And when I use the "volcano.sh/gpu-memory" resource, I get an error:
env:
So is volcano.sh/gpu-memory supported?