# Ray cluster

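Ray is an open-source framework for distributed computing in Python. This notebook simulates a small Ray cluster on one machine using Docker containers: one head node and three workers.

Note that the Ray project provides its own Docker image, `rayproject/ray`. Installing the dependencies manually, as done here, helps build intuition for what each node needs.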

#### Create a Docker image with all the necessary software installed

```Dockerfile
FROM continuumio/miniconda3
RUN apt update && apt install -y iputils-ping iproute2
RUN pip install "ray[all]"
```

In [None]:
%%time
!docker build -t test-ray-image .

#### Create a simulated network

In [None]:
!docker network rm simulated-cluster

In [None]:
!docker network create simulated-cluster

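#### Start the head node container
The scheduler listens on port 6379 and the dashboard UI on port 8265. The first cell below attaches the head node to the `simulated-cluster` network; the second runs it on Docker's default bridge network. Run only one of them, since both use the container name `ray-head`.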

In [None]:
!docker run \
    -dit \
    --network simulated-cluster \
    -p 6379:6379 -p 8265:8265 -p 10001:10001 -p 10002:10002 \
    --name ray-head \
    test-ray-image \
    ray start --head --node-ip-address=0.0.0.0 --dashboard-host=0.0.0.0 --disable-usage-stats --block

In [1]:
!docker run \
    -dit \
    -p 6379:6379 -p 8265:8265 -p 10001:10001 -p 10002:10002 \
    --name ray-head \
    test-ray-image \
    ray start --head --node-ip-address=0.0.0.0 --dashboard-host=0.0.0.0 --disable-usage-stats --block

dff21f978052157439b5ac01738321542d31bb012a7d5a365b973536b04abfcf
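The published ports each serve a different role, which matters later when choosing the address to connect to. The sketch below records those roles and rebuilds the `docker run` command from them. The helper names `publish_flags` and `head_run_command` are hypothetical, and the descriptions for ports 10001 and 10002 reflect Ray's defaults as I understand them rather than anything stated in this notebook.

```python
# Ports published by the head container and the role each one plays.
# 6379 and 8265 come from the notebook; 10001 and 10002 are Ray defaults
# (assumption: the image runs Ray with its default port settings).
HEAD_PORTS = {
    6379: "GCS / scheduler, the address workers join with",
    8265: "dashboard UI and Ray Jobs API",
    10001: "Ray Client server, used by ray:// connections",
    10002: "first port of the default worker port range",
}


def publish_flags(ports):
    """Return `-p host:container` flags for every port, in ascending order."""
    flags = []
    for port in sorted(ports):
        flags += ["-p", f"{port}:{port}"]
    return flags


def head_run_command(image="test-ray-image", name="ray-head", network=None):
    """Build the argument list for starting the head node container."""
    cmd = ["docker", "run", "-dit"]
    if network:
        # Only attach to a user-defined network when one is requested.
        cmd += ["--network", network]
    cmd += publish_flags(HEAD_PORTS) + ["--name", name, image]
    cmd += [
        "ray", "start", "--head",
        "--node-ip-address=0.0.0.0",
        "--dashboard-host=0.0.0.0",
        "--disable-usage-stats",
        "--block",
    ]
    return cmd
```

Passing the result to `subprocess.run(head_run_command(network="simulated-cluster"), check=True)` should be equivalent to the shell cell above, provided Docker is installed and the image has been built.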


In [3]:
!docker logs ray-head

Usage stats collection is disabled.

Local node IP: 0.0.0.0

--------------------
Ray runtime started.
--------------------

Next steps
  To add another node to this Ray cluster, run
    ray start --address='0.0.0.0:6379'
  
  To connect to this Ray cluster:
    import ray
    ray.init(_node_ip_address='0.0.0.0')
  
  To submit a Ray job using the Ray Jobs CLI:
    RAY_ADDRESS='http://0.0.0.0:8265' ray job submit --working-dir . -- python my_script.py
  
  See https://docs.ray.io/en/latest/cluster/running-applications/job-submission/index.html 
  for more information on submitting Ray jobs to the Ray cluster.
  
  To terminate the Ray runtime, run
    ray stop
  
  To view the status of the cluster, use
    ray status
  
  To monitor and debug Ray, view the dashboard at 
    0.0.0.0:8265
  
  If connection to t

In [5]:
!docker exec ray-head bash -c "ip addr|grep inet"

    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host 
    inet 172.17.0.2/16 brd 172.17.255.255 scope global eth0


In [None]:
SCHEDULER_IP = "172.17.0.2"  # eth0 address reported by `ip addr` above
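Copying the address by hand is error-prone, because it changes with the Docker network the head node is attached to. As a minimal sketch, the address could be parsed from the `ip addr` output instead. The function name `global_ipv4` is hypothetical, and the sketch assumes the container has a single globally scoped IPv4 interface, as in the output above.

```python
import re


def global_ipv4(ip_addr_output):
    """Return the first IPv4 address marked `scope global`, or None."""
    pattern = re.compile(r"\s*inet (\d+\.\d+\.\d+\.\d+)/\d+ .*scope global")
    for line in ip_addr_output.splitlines():
        match = pattern.match(line)
        if match:
            return match.group(1)
    return None


# Sample text copied from the `ip addr|grep inet` output above.
sample_output = """\
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
    inet 172.17.0.2/16 brd 172.17.255.255 scope global eth0
"""

scheduler_ip = global_ipv4(sample_output)
```

In a notebook cell, the command output can be captured with `out = !docker exec ray-head bash -c "ip addr|grep inet"` and passed in as `global_ipv4("\n".join(out))`.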

#### Start workers

In [None]:
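# Container names such as ray-head resolve only on a user-defined network,
# not on Docker's default bridge, so this variant needs --network as well.
!docker run -dit --network simulated-cluster --name ray-worker1 test-ray-image ray start --address=ray-head:6379 --block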

In [None]:
!docker run -dit --network simulated-cluster --name ray-worker1 test-ray-image ray start --address=ray-head:6379 --block
!docker run -dit --network simulated-cluster --name ray-worker2 test-ray-image ray start --address=ray-head:6379 --block
!docker run -dit --network simulated-cluster --name ray-worker3 test-ray-image ray start --address=ray-head:6379 --block

In [17]:
!docker run -dit --name ray-worker1 test-ray-image ray start --address=172.17.0.2:6379 --block
!docker run -dit --name ray-worker2 test-ray-image ray start --address=172.17.0.2:6379 --block
!docker run -dit --name ray-worker3 test-ray-image ray start --address=172.17.0.2:6379 --block

892da8a30e20cc2d74e67f08e859820c2b206b720b274df6d2dd49b24f7e53ef
b53d65ad9bff25966cfb467be902adab6a8c716e0b42f9c4fba36a24e9089607
dc4170ae84b655aac05bcb1781d0b9d3222e784521478c2e81c5b1d1b5fbf1b2
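The three worker commands differ only in the container name, so they can be generated in a loop. The following is a sketch under the assumption that the image name and naming scheme from the cells above are kept. `worker_run_command` and `start_workers` are hypothetical helpers, not part of Ray or Docker.

```python
import subprocess


def worker_run_command(index, head_address, image="test-ray-image", network=None):
    """Build the `docker run` argument list for worker number `index`."""
    cmd = ["docker", "run", "-dit"]
    if network:
        # Needed when head_address uses a container name such as ray-head.
        cmd += ["--network", network]
    cmd += [
        "--name", f"ray-worker{index}",
        image,
        "ray", "start", f"--address={head_address}", "--block",
    ]
    return cmd


def start_workers(count, head_address, network=None):
    """Start `count` workers and return the container IDs Docker prints."""
    container_ids = []
    for index in range(1, count + 1):
        result = subprocess.run(
            worker_run_command(index, head_address, network=network),
            check=True,
            capture_output=True,
            text=True,
        )
        container_ids.append(result.stdout.strip())
    return container_ids
```

For example, `start_workers(3, "172.17.0.2:6379")` would mirror the executed cell above, while `start_workers(3, "ray-head:6379", network="simulated-cluster")` would mirror the variant that uses the simulated network.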


In [21]:
!docker logs ray-worker1

Local node IP: 172.17.0.3
[2025-03-04 01:36:03,872 W 1 1] global_state_accessor.cc:429: Retrying to get node with node ID 93182829376f69e3a7301cba65f37d976b6cdfa271f45371a5601077
[2025-03-04 01:36:04,873 W 1 1] global_state_accessor.cc:429: Retrying to get node with node ID 93182829376f69e3a7301cba65f37d976b6cdfa271f45371a5601077

--------------------
Ray runtime started.
--------------------

To terminate the Ray runtime, run
  ray stop

--block
  This command will now block forever until terminated by a signal.
  Running subprocesses are monitored and a message will be printed if any of them terminate unexpectedly. Subprocesses exit with SIGTERM will be treated as graceful, thus NOT reported.


### Now test the network

In [None]:
#!pip install "ray[all]"

In [1]:
import ray

In [9]:
ray.init("ray://localhost:6379")

@ray.remote
def test():
    return "Hello from Ray!"

2025-03-03 19:53:17,423	INFO client_builder.py:244 -- Passing the following kwargs to ray.init() on the server: log_to_driver


ConnectionError: ray client connection timeout
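The timeout is consistent with the address pointing at port 6379, which is the scheduler (GCS) port that workers join with, rather than the Ray Client server port. By default the Ray Client server listens on port 10001, which the head container publishes. Note also that `test` is defined above but never invoked. A minimal sketch of a corrected connection follows. The helper names are hypothetical, and it assumes the head container is running and that the local Python and Ray versions are compatible with those inside the image.

```python
def ray_client_address(host="localhost", port=10001):
    """Build a Ray Client URL; 10001 is the default Ray Client server port."""
    return f"ray://{host}:{port}"


def run_hello(address):
    """Connect through Ray Client, run one remote task, and return its result."""
    import ray  # imported here so the address helper works without ray installed

    ray.init(address)

    @ray.remote
    def test():
        return "Hello from Ray!"

    try:
        # .remote() schedules the task on the cluster; ray.get() waits for it.
        return ray.get(test.remote())
    finally:
        ray.shutdown()


# Example call (requires the running head container):
# print(run_hello(ray_client_address()))
```

If the connection still times out, `docker logs ray-head` and the dashboard on port 8265 are reasonable places to check whether the head node is healthy.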