[Doc] [KubeRay] Add tutorial for connecting to google cloud storage bucket from GKE RayCluster (ray-project#38858)

This PR adds a self-contained tutorial for connecting to a Google Cloud Storage bucket. (Mostly self-contained; we do link out to the Google Cloud docs for creating a bucket.)

---------

Signed-off-by: Kai-Hsun Chen <kaihsun@apache.org>
Signed-off-by: Archit Kulkarni <architkulkarni@users.noreply.github.com>
Co-authored-by: Kai-Hsun Chen <kaihsun@anyscale.com>
Co-authored-by: Kai-Hsun Chen <kaihsun@apache.org>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com>
Signed-off-by: Jim Thompson <jimthompson5802@gmail.com>
4 people authored and jimthompson5802 committed Sep 12, 2023
1 parent 916aed2 commit b6c3024
Showing 3 changed files with 143 additions and 0 deletions.
1 change: 1 addition & 0 deletions doc/source/_toc.yml
@@ -299,6 +299,7 @@ parts:
- file: cluster/kubernetes/user-guides/gcp-gke-gpu-cluster.md
- file: cluster/kubernetes/user-guides/config.md
- file: cluster/kubernetes/user-guides/configuring-autoscaling.md
- file: cluster/kubernetes/user-guides/gke-gcs-bucket.md
- file: cluster/kubernetes/user-guides/logging.md
- file: cluster/kubernetes/user-guides/gpu.md
- file: cluster/kubernetes/user-guides/rayserve-dev-doc.md
1 change: 1 addition & 0 deletions doc/source/cluster/kubernetes/user-guides.md
@@ -20,3 +20,4 @@ at the {ref}`introductory guide <kuberay-quickstart>` first.
* {ref}`kuberay-pod-security`
* {ref}`kuberay-tls`
* {ref}`deploy-a-static-ray-cluster-without-kuberay`
* {ref}`kuberay-gke-bucket`
141 changes: 141 additions & 0 deletions doc/source/cluster/kubernetes/user-guides/gke-gcs-bucket.md
@@ -0,0 +1,141 @@
(kuberay-gke-bucket)=
# Configuring KubeRay to use Google Cloud Storage Buckets in GKE

If you are already familiar with Workload Identity in GKE, you can skip this document: the gist is that after linking your Kubernetes service account to your Google Cloud service account, you specify the Kubernetes service account in each of your Ray pods. Otherwise, read on.

This example is an abridged version of the documentation at <https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity>. The full documentation is worth reading if you are interested in the details.

## Create a Kubernetes cluster on GKE

This example creates a minimal Kubernetes cluster on GKE for running KubeRay.

Run this and all following commands on your local machine or on the [Google Cloud Shell](https://cloud.google.com/shell). If running from your local machine, install the [Google Cloud SDK](https://cloud.google.com/sdk/docs/install).

```bash
gcloud container clusters create cloud-bucket-cluster \
    --num-nodes=1 --min-nodes 0 --max-nodes 1 --enable-autoscaling \
    --zone=us-west1-b --machine-type e2-standard-8 \
    --workload-pool=my-project-id.svc.id.goog # Replace my-project-id with your GCP project ID
```


This command creates a Kubernetes cluster named `cloud-bucket-cluster` with one node in the `us-west1-b` zone. This example uses the `e2-standard-8` machine type, which has 8 vCPUs and 32 GB RAM.

For more information on how to find your project ID, see <https://support.google.com/googleapi/answer/7014113?hl=en> or <https://cloud.google.com/resource-manager/docs/creating-managing-projects>.
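
If you are not sure which project you are working in, you can check the project that `gcloud` is currently configured to use:

```bash
# Print the currently configured project ID.
gcloud config get-value project

# Or list all projects your account can access.
gcloud projects list
```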

Now get credentials for the cluster to use with `kubectl`:

```bash
gcloud container clusters get-credentials cloud-bucket-cluster --zone us-west1-b --project my-project-id
```
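
As a quick sanity check, you can confirm that `kubectl` now points at the new cluster and that its node is ready:

```bash
# The current context should reference cloud-bucket-cluster.
kubectl config current-context

# The single node created above should report a Ready status.
kubectl get nodes
```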

## Create an IAM Service Account

```bash
gcloud iam service-accounts create my-iam-sa
```
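
The service account's email address follows the pattern `NAME@PROJECT_ID.iam.gserviceaccount.com`. You can confirm that the account was created with:

```bash
# Replace my-project-id with your GCP project ID.
gcloud iam service-accounts describe my-iam-sa@my-project-id.iam.gserviceaccount.com
```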

## Create a Kubernetes Service Account

```bash
kubectl create serviceaccount my-ksa
```
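
You can verify that the Kubernetes service account exists in the current namespace:

```bash
kubectl get serviceaccount my-ksa
```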

## Link the Kubernetes Service Account to the IAM Service Account and vice versa

In the following two commands, replace `default` with your namespace if you are not using the default namespace.

```bash
gcloud iam service-accounts add-iam-policy-binding my-iam-sa@my-project-id.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:my-project-id.svc.id.goog[default/my-ksa]"
```

```bash
kubectl annotate serviceaccount my-ksa \
  --namespace default \
  iam.gke.io/gcp-service-account=my-iam-sa@my-project-id.iam.gserviceaccount.com
```
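
To double-check the link, describe the Kubernetes service account; the `iam.gke.io/gcp-service-account` annotation added above should appear in the output:

```bash
kubectl describe serviceaccount my-ksa --namespace default
```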

## Create a Google Cloud Storage Bucket and allow the Google Cloud Service Account to access it

Please follow the documentation at <https://cloud.google.com/storage/docs/creating-buckets> to create a bucket using the Google Cloud Console or the `gsutil` command line tool.
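
For example, a bucket can be created from the command line as follows. Note that `my-bucket` is a placeholder; bucket names must be globally unique, so substitute a name of your own:

```bash
# The -l flag sets the bucket location; us-west1 matches the cluster zone used above.
gsutil mb -l us-west1 gs://my-bucket
```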

This example gives the principal `my-iam-sa@my-project-id.iam.gserviceaccount.com` the "Storage Admin" role on the bucket. Grant the role in the Google Cloud Console ("Permissions" tab under "Buckets" > "Bucket Details") or with the following command:

```bash
gsutil iam ch serviceAccount:my-iam-sa@my-project-id.iam.gserviceaccount.com:roles/storage.admin gs://my-bucket
```
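
You can confirm that the binding took effect by inspecting the bucket's IAM policy; the `my-iam-sa` principal should be listed with the `roles/storage.admin` role:

```bash
gsutil iam get gs://my-bucket
```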

## Create a minimal RayCluster YAML manifest

You can download the RayCluster YAML manifest for this tutorial with `curl` as follows:

```bash
curl -LO https://raw.githubusercontent.com/ray-project/kuberay/master/ray-operator/config/samples/ray-cluster.gke-bucket.yaml
```

The key parts are the following lines:

```yaml
spec:
  serviceAccountName: my-ksa
  nodeSelector:
    iam.gke.io/gke-metadata-server-enabled: "true"
```

Include these lines in every pod spec of your Ray cluster. This example uses a single-node cluster (1 head node and 0 worker nodes) for simplicity.
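
One quick way to see where these fields sit in the downloaded manifest is to search for them with some surrounding context:

```bash
# Show the serviceAccountName and nodeSelector settings in the sample manifest.
grep -n -B 2 -A 3 "serviceAccountName" ray-cluster.gke-bucket.yaml
```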

## Create the RayCluster

```bash
kubectl apply -f ray-cluster.gke-bucket.yaml
```
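
It can take a minute or two for the cluster to come up. You can watch the status with:

```bash
# Wait for the head pod to reach the Running state before continuing.
kubectl get raycluster
kubectl get pods
```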

## Test GCS bucket access from the RayCluster

Use `kubectl get pod` to get the name of the Ray head pod. Then run the following command to get a shell in the Ray head pod:

```bash
kubectl exec -it raycluster-mini-head-xxxx -- /bin/bash
```

In the shell, run `pip install google-cloud-storage` to install the Google Cloud Storage Python client library.

(For production use cases, you will need to make sure `google-cloud-storage` is installed on every node of your cluster, or use `ray.init(runtime_env={"pip": ["google-cloud-storage"]})` to have the package installed as needed at runtime -- see <https://docs.ray.io/en/latest/ray-core/handling-dependencies.html#runtime-environments> for more details.)

Then run the following Python code to test access to the bucket:

```python
import ray
import os
from google.cloud import storage

GCP_GCS_BUCKET = "my-bucket"
GCP_GCS_FILE = "test_file.txt"

ray.init(address="auto")

@ray.remote
def check_gcs_read_write():
    client = storage.Client()
    bucket = client.get_bucket(GCP_GCS_BUCKET)
    blob = bucket.blob(GCP_GCS_FILE)

    # Write to the bucket
    blob.upload_from_string("Hello, Ray on GKE!")

    # Read from the bucket
    content = blob.download_as_text()

    return content

result = ray.get(check_gcs_read_write.remote())
print(result)
```

You should see the following output:

```text
Hello, Ray on GKE!
```
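
As an additional check, you can read the object back from outside the cluster (for example, from Cloud Shell), assuming your own account also has access to the bucket:

```bash
# Confirm that the test file was written to the bucket.
gsutil cat gs://my-bucket/test_file.txt
```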
