In [12]:
PROJECT="dgena-demo2"
ZONE="us-east4-c"
CLUSTER='embeddings-cluster'

In [13]:
!gcloud config set project $PROJECT

Updated property [core/project].


### 1. Create a new GKE cluster

In [34]:
# create cluster, ~5 min
!gcloud container clusters create $CLUSTER \
    --zone=$ZONE \
    --num-nodes=2 \
    --machine-type=c2-standard-8

Default change: VPC-native is the default mode during cluster creation for versions greater than 1.21.0-gke.1500. To create advanced routes based clusters, please pass the `--no-enable-ip-alias` flag
Note: The Kubelet readonly port (10255) is now deprecated. Please update your workloads to use the recommended alternatives. See https://cloud.google.com/kubernetes-engine/docs/how-to/disable-kubelet-readonly-port for ways to check usage and for migration instructions.
Note: Your Pod address range (`--cluster-ipv4-cidr`) can accommodate at most 1008 node(s).
Creating cluster embeddings-cluster in us-east4-c... Cluster is being configured...working.
Creating cluster embeddings-cluster in us-east4-c... Cluster is being health-checked (master is healthy)...done.
Created [https://container.googleapis.com/v1/projects/dgena-demo2/zones/us-east4-c/clusters/embeddings-cluster].
To ins

In [35]:
# save the cluster credentials to ~/.kube/config so that the local kubectl command can use them
!gcloud container clusters get-credentials $CLUSTER --project $PROJECT --zone $ZONE

Fetching cluster endpoint and auth data.
kubeconfig entry generated for embeddings-cluster.


### 2. Deploy Dask to our cluster using Helm

In [47]:
# install Dask on our cluster
!helm install --repo https://helm.dask.org \
    --set worker.replicas=16 \
    --set scheduler.serviceType=NodePort --set webUI.serviceType=NodePort --set jupyter.enabled=false \
    my-dask dask > /dev/null

In [105]:
# run this command a few times to see when the pods are ready
!kubectl get pods

NAME                                 READY   STATUS    RESTARTS   AGE
my-dask-scheduler-55444c7c89-jmlc2   1/1     Running   0          30m
my-dask-worker-657c447464-5wgr9      1/1     Running   0          30m
my-dask-worker-657c447464-7fmfh      1/1     Running   0          30m
my-dask-worker-657c447464-9cxn4      1/1     Running   0          30m
my-dask-worker-657c447464-9jhzp      1/1     Running   0          30m
my-dask-worker-657c447464-9tgc2      1/1     Running   0          30m
my-dask-worker-657c447464-dtssj      1/1     Running   0          30m
my-dask-worker-657c447464-hf8nd      1/1     Running   0          30m
my-dask-worker-657c447464-jxf4v      1/1     Running   0          30m
my-dask-worker-657c447464-kqm5b      1/1     Running   0          30m
my-dask-worker-657c447464-l5hf4      1/1     Running   0          30m
my-dask-worker-657c447464-njct8      1/1     Running   0          30m
my-dask-worker-657c447464-ptktt      1/1     Running   0          30m
my-dask-worker-657c4
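
Rather than re-running the cell by hand, the readiness check can be scripted. The sketch below is a hypothetical helper (`all_pods_ready` is not part of kubectl or Dask) that parses the text printed by `kubectl get pods` and reports whether every pod is `Running` with all of its containers ready. It assumes the default column layout shown in the output above.

```python
def all_pods_ready(kubectl_output: str) -> bool:
    """Return True if every pod row is Running with READY equal to n/n."""
    lines = [line for line in kubectl_output.splitlines() if line.strip()]
    rows = lines[1:]  # drop the NAME/READY/STATUS header
    if not rows:
        return False  # no pods listed yet
    for row in rows:
        fields = row.split()
        if len(fields) < 3:
            return False  # unexpected or truncated row
        ready_count, _, wanted_count = fields[1].partition("/")
        if fields[2] != "Running" or ready_count != wanted_count:
            return False
    return True


# Two rows copied from the output above, used as sample input.
sample = """\
NAME                                 READY   STATUS    RESTARTS   AGE
my-dask-scheduler-55444c7c89-jmlc2   1/1     Running   0          30m
my-dask-worker-657c447464-5wgr9      1/1     Running   0          30m
"""
print(all_pods_ready(sample))
```

In the notebook, the input could come from `out = !kubectl get pods` followed by `all_pods_ready("\n".join(out))`, repeated in a loop with a short sleep until it returns `True`.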

### Forward the Dask scheduler and Dask UI ports to run Dask compute jobs from the local machine and to observe Dask status

In [106]:
# open a terminal on the machine where your notebook runs and run the command printed below to forward the Dask scheduler port to localhost:8080

!echo kubectl port-forward $(kubectl get pod --selector="app=dask,component=scheduler,release=my-dask" --output jsonpath='{.items[0].metadata.name}') 8080:8786

kubectl port-forward my-dask-scheduler-55444c7c89-jmlc2 8080:8786
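
Before connecting a Dask client, it can help to confirm that the forwarded port is actually listening. The following is a minimal sketch using only the standard library; `port_is_open` is a hypothetical helper name, and the default of `127.0.0.1:8080` simply mirrors the port-forward command above.

```python
import socket


def port_is_open(host: str = "127.0.0.1", port: int = 8080, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


print(port_is_open())
```

A successful connection only shows that something is listening on the port; it does not prove that the listener is the Dask scheduler.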


In [107]:
# query node name and port to forward for Dask UI
NODE_NAME=!kubectl get pod --selector="app=dask,component=scheduler,release=my-dask" --output jsonpath='{.items[0].spec.nodeName}'
NODE_NAME=NODE_NAME[0]
# assumes the scheduler service is the second service listed and the Dask UI is its second port
NODE_PORT=!kubectl get services --output jsonpath='{.items[1].spec.ports[1].nodePort}'
NODE_PORT=NODE_PORT[0]

# run the commands printed below in your local shell to forward the Dask UI port to localhost:8080; enter your SSH password when asked
!echo gcloud compute firewall-rules create allow-ssh-ingress-from-iap --direction=INGRESS --action=allow --rules=tcp:22 --source-ranges=35.235.240.0/20
!echo gcloud compute ssh --tunnel-through-iap $NODE_NAME -- -NL 8080:localhost:$NODE_PORT

gcloud compute firewall-rules create allow-ssh-ingress-from-iap --direction=INGRESS --action=allow --rules=tcp:22 --source-ranges=35.235.240.0/20
gcloud compute ssh --tunnel-through-iap gke-embeddings-cluster-default-pool-25d9c0f2-nfcd -- -NL 8080:localhost:31467
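
The node port lookup above selects the service and port by position, which can break if the cluster has other services. As a sketch of an alternative, the hypothetical helper below picks the node port by service name and target port from the JSON that `kubectl get services --output json` prints. The service name `my-dask-scheduler` and the dashboard target port `8787` are assumptions about how the Helm chart names and exposes things, and the sample JSON is made up for illustration (only the node port `31467` comes from the output above).

```python
import json


def find_node_port(services_json: str, service_name: str, target_port: int) -> int:
    """Return the nodePort of the port with the given targetPort on the named service."""
    services = json.loads(services_json)["items"]
    for service in services:
        if service["metadata"]["name"] != service_name:
            continue
        for port in service["spec"]["ports"]:
            if port.get("targetPort") == target_port and "nodePort" in port:
                return port["nodePort"]
    raise LookupError(f"no nodePort for {service_name} targetPort {target_port}")


# Hypothetical, trimmed-down service list for illustration only.
sample_json = json.dumps({
    "items": [
        {"metadata": {"name": "kubernetes"},
         "spec": {"ports": [{"port": 443, "targetPort": 443}]}},
        {"metadata": {"name": "my-dask-scheduler"},
         "spec": {"ports": [
             {"port": 8786, "targetPort": 8786, "nodePort": 30786},
             {"port": 80, "targetPort": 8787, "nodePort": 31467},
         ]}},
    ]
})
print(find_node_port(sample_json, "my-dask-scheduler", 8787))
```

In the notebook, the input could be captured with `raw = !kubectl get services --output json` and passed as `"\n".join(raw)`.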


### Test that it works by running some Python calculations

In [108]:
# check if we can connect to cluster and run jobs
from dask.distributed import Client
client = Client("tcp://127.0.0.1:8080")

print(client)

<Client: 'tcp://10.108.1.16:8786' processes=16 threads=128, memory=501.65 GiB>



+---------+--------+-----------+---------+
| Package | Client | Scheduler | Workers |
+---------+--------+-----------+---------+
| msgpack | 1.0.8  | 1.0.7     | 1.0.7   |
| numpy   | 1.26.4 | 1.26.3    | 1.26.3  |
| pandas  | 2.2.2  | 2.1.4     | 2.1.4   |
| toolz   | 0.12.1 | 0.12.0    | 0.12.0  |
| tornado | 6.4.1  | 6.3.3     | 6.3.3   |
+---------+--------+-----------+---------+
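
The table above reports that the local client and the cluster run different package versions. The notebook does not say how to resolve this; one possible approach, assumed here, is to pin the local packages to the scheduler's versions. The sketch below takes the versions from the table and builds the corresponding pip requirement strings. The helper names are hypothetical.

```python
# (client, scheduler, workers) versions copied from the table above.
versions = {
    "msgpack": ("1.0.8", "1.0.7", "1.0.7"),
    "numpy": ("1.26.4", "1.26.3", "1.26.3"),
    "pandas": ("2.2.2", "2.1.4", "2.1.4"),
    "toolz": ("0.12.1", "0.12.0", "0.12.0"),
    "tornado": ("6.4.1", "6.3.3", "6.3.3"),
}


def mismatched(all_versions: dict) -> dict:
    """Keep only the packages whose three versions are not all identical."""
    return {pkg: v for pkg, v in all_versions.items() if len(set(v)) > 1}


def client_pins(all_versions: dict) -> list:
    """Build pip requirement strings that pin the client to the scheduler version."""
    return [f"{pkg}=={scheduler}"
            for pkg, (_client, scheduler, _workers) in mismatched(all_versions).items()]


print(" ".join(client_pins(versions)))
```

The printed string could be passed to `pip install` in the local environment, assuming those versions are compatible with the rest of the local setup.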


In [109]:
import dask.array as da
x = da.random.random((10000, 10000), chunks=(1000, 1000))
x

             Array                         Chunk
Bytes        762.94 MiB                    7.63 MiB
Shape        (10000, 10000)                (1000, 1000)
Dask graph   100 chunks in 1 graph layer
Data type    float64 numpy.ndarray


In [110]:
y = x + x.T
z = y[::2, 5000:].mean(axis=1)

z.compute() # observe parallel tasks in the Dask UI

array([1.00041517, 1.00763299, 1.00277814, ..., 1.00535934, 1.00059734,
       0.99626806])
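
As a sanity check on the result, the values should sit close to 1: each entry of `y` is the sum of two independent uniform(0, 1) samples, and each element of `z` averages 5000 such entries. The sketch below computes the expected spread of a row mean and compares it with the six values printed above. It ignores the single diagonal entry that appears in rows with index 5000 or higher, which has a negligible effect.

```python
import math

n_cols = 5000                   # columns 5000..9999 selected by y[::2, 5000:]
var_uniform = 1.0 / 12.0        # variance of a uniform(0, 1) sample
var_entry = 2.0 * var_uniform   # x[i, j] + x[j, i] with i != j are independent
expected_mean = 2 * 0.5         # mean of the sum of two uniform(0, 1) samples
std_of_row_mean = math.sqrt(var_entry / n_cols)

# The six values visible in the printed array above.
printed = [1.00041517, 1.00763299, 1.00277814, 1.00535934, 1.00059734, 0.99626806]
z_scores = [(value - expected_mean) / std_of_row_mean for value in printed]

print(f"expected std of a row mean: {std_of_row_mean:.5f}")
for value, score in zip(printed, z_scores):
    print(f"{value:.8f} is {score:+.2f} standard deviations from 1")
```

All six printed values fall within about 1.5 standard deviations of 1, which is consistent with the computation having run correctly on the cluster.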

### Clean up resources

In [111]:
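# the Dask client from the test above is still connected; call client.close() first to avoid a reconnect error once the scheduler is gone
!helm uninstall my-dask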

release "my-dask" uninstalled


In [112]:
# ~3-4 min
!gcloud container clusters delete --quiet --zone=$ZONE $CLUSTER

Deleting cluster embeddings-cluster...done.                                    
Deleted [https://container.googleapis.com/v1/projects/dgena-demo2/zones/us-east4-c/clusters/embeddings-cluster].


2024-08-29 20:47:16,841 - distributed.client - ERROR - Failed to reconnect to scheduler after 30.00 seconds, closing client
