[Doc] GKE GPU cluster setup #1223

Merged: 4 commits, Jul 7, 2023

@kevin85421 kevin85421 commented Jul 7, 2023

Why are these changes needed?

This document draws heavily on https://github.com/ray-project/aviary/blob/master/docs/kuberay/deploy-on-gke.md.

Related issue number

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(
  1. Follow this document to create a GKE cluster with GPU
  2. Deploy the Stable Diffusion example.
[Attached images: "Screen Shot 2023-07-06 at 11 53 46 PM", "output"]

@kevin85421 kevin85421 marked this pull request as ready for review July 7, 2023 07:20
Contributor

@zcin zcin left a comment

Overall LGTM, just some nits!

docs/guidance/gcp-gke-gpu-cluster.md
# NAME GPU
# ... 1
# ... <none>
```
Contributor

Would it be helpful to show a sample output of kubectl get pods? Also, is there a way to show that some pods (e.g. ray gpu workers) are running on a node with a GPU, while other pods like the operator or ray head are running on a node with no GPU?

Member Author

This document does not cover the installation steps for the KubeRay operator and RayCluster.
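
For reference, the placement check raised in the question above could be sketched roughly as follows once the KubeRay operator and a RayCluster have been installed separately. This is a minimal sketch, not part of the PR: it assumes `kubectl` points at the GKE cluster, that GPUs are exposed under the `nvidia.com/gpu` resource name, and that the `NAME GPU` excerpt above comes from a custom-columns node listing. The function names (`list_node_gpus`, `gpu_node_names`, `list_pod_nodes`) are hypothetical helpers introduced only for illustration.

```shell
# List every node with its allocatable GPU count.
# Nodes without GPUs show <none> in the GPU column.
list_node_gpus() {
  kubectl get nodes "-o=custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu"
}

# Hypothetical helper: read the table above on stdin, skip the header row,
# and print only the names of nodes whose GPU column is not <none>.
gpu_node_names() {
  awk 'NR > 1 && $2 != "<none>" { print $1 }'
}

# Show which node each pod was scheduled on (see the NODE column),
# so pod placement can be compared against the GPU node names.
list_pod_nodes() {
  kubectl get pods -o wide
}
```

Running `list_node_gpus | gpu_node_names` and then `list_pod_nodes` would let a reader compare the NODE column of each pod against the GPU node names by eye.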

kevin85421 and others added 3 commits July 7, 2023 14:42
Co-authored-by: Cindy Zhang <cindyzyx9@gmail.com>
Signed-off-by: Kai-Hsun Chen <kaihsun@apache.org>
@kevin85421 kevin85421 merged commit 1ee5f95 into ray-project:master Jul 7, 2023
19 of 20 checks passed
lowang-bh pushed a commit to lowang-bh/kuberay that referenced this pull request Sep 24, 2023
2 participants