KubeRay Component: ray-operator
If the Ray head node is scheduled on a GPU node without requesting any GPU resources, e.g.

    resources:
      limits:
        ephemeral-storage: 10Gi
        memory: 16Gi
      requests:
        cpu: '4'
        ephemeral-storage: 10Gi
        memory: 16Gi

the Ray resource scheduler can still access the host's GPUs and counts all of them as "Logical Resources" during scheduling.
To reproduce: use the RayJob CRD to schedule both the head and the workers on the same physical host with more than one GPU.
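A minimal sketch of how the mismatch could be checked once the job is running. It compares the GPUs that the pod specs request from Kubernetes with the GPUs Ray advertises as logical resources. The helper names (`requested_gpus`, `unrequested_gpus`, `check_live_cluster`) are hypothetical, and the `nvidia.com/gpu` resource name is an assumption about the device plugin in use; only `ray.init` and `ray.cluster_resources` are Ray calls.

```python
from typing import Mapping

# Assumption: GPUs are exposed by the NVIDIA device plugin under this name.
GPU_RESOURCE = "nvidia.com/gpu"


def requested_gpus(container_resources: Mapping[str, Mapping[str, object]]) -> int:
    """Return the GPU count a container asks Kubernetes for (limits first, then requests)."""
    for section in ("limits", "requests"):
        value = container_resources.get(section, {}).get(GPU_RESOURCE)
        if value is not None:
            return int(value)
    return 0


def unrequested_gpus(ray_resources: Mapping[str, float], requested: int) -> int:
    """Return how many GPUs Ray advertises beyond what the pods requested."""
    return max(0, int(ray_resources.get("GPU", 0)) - requested)


def check_live_cluster(requested: int) -> int:
    """Run inside a pod of the Ray cluster; needs Ray installed."""
    import ray  # imported lazily so the helpers above work without Ray

    ray.init(address="auto")
    return unrequested_gpus(ray.cluster_resources(), requested)


# The head container spec from this report: no GPU entry at all.
head_resources = {
    "limits": {"ephemeral-storage": "10Gi", "memory": "16Gi"},
    "requests": {"cpu": "4", "ephemeral-storage": "10Gi", "memory": "16Gi"},
}
```

With the head spec above, `requested_gpus(head_resources)` is 0, so any non-zero result from `check_live_cluster(...)` (passing the total GPUs requested across all pods) would indicate GPUs that Ray picked up without a matching Kubernetes request.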
This is not a KubeRay-specific issue. See https://docs.ray.io/en/latest/cluster/kubernetes/user-guides/gpu.html#gpu-multi-tenancy for more details. The GPU UX on K8s seems to have improved recently. I will take a look at MIG and GPU time-slicing and get back to you.
kevin85421