# A Guided Tour of Ray Core: Ray Actor Tree Pattern

© 2019-2022, Anyscale. All Rights Reserved

📖 [Back to Table of Contents](./ex_00_tutorial_overview.ipynb)<br>
➡ [Next notebook](./ex_05_multiprocess_pool.ipynb) <br>
⬅️ [Previous notebook](./ex_03_remote_classes.ipynb) <br>


### Learning objectives
In this tutorial, we revisit Ray Actors and learn about:
 * Common Ray Actor patterns used in Ray native libraries for writing distributed Actors
 * How to pass Ray Actors to remote tasks for distributed computing

Let's implement a simple example to illustrate this pattern.

# Tree of Actors Pattern

The tree of actors is a common pattern used in Ray libraries such as [Ray Tune](https://docs.ray.io/en/latest/tune/index.html), [Ray Train](https://docs.ray.io/en/latest/train/train.html), and [RLlib](https://docs.ray.io/en/latest/rllib/index.html) to train models in parallel or to conduct distributed hyperparameter optimization (HPO).

In this pattern, a collection of workers, implemented as Ray actors, is managed by a supervisor actor. For example, you may want to train multiple models at the same time while being able to checkpoint or inspect the state of each one.

<img src="https://docs.ray.io/en/latest/_images/tree-of-actors.svg" width="50%" height="30%">


---

First, let's start Ray…

In [1]:
import logging
import time
from pprint import pprint
import ray
import random
from random import randint
import numpy as np

In [2]:
if ray.is_initialized():
    ray.shutdown()
ray.init(logging_level=logging.ERROR)

Python version: 3.8.13
Ray version: 3.0.0.dev0
Dashboard: http://127.0.0.1:8268


### Generic model factory class

This factory generates a few specific types of (fake 😏) models: linear regression, classification, or neural network. Each model has its own training function and is in a particular state during training. The final state is `DONE`.

In [4]:
STATES = ["RUNNING", "PENDING", "DONE"]

class Model:

    def __init__(self, m:str, func: object):
        self._model = m
        self._func = func

    def train(self):
        # do some training work here for the respective model type
        self._func()

# Factory function to return an instance of a model type
def model_factory(m: str, func: object):
    return Model(m, func)
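The `Model` class above only wraps a training function; it does not record a state itself, and the worker defined later reports a randomly chosen state for demo purposes. As a minimal sketch, assuming we wanted the state to reflect real progress, a model could move from `PENDING` to `RUNNING` to `DONE` around its training call. The class name `StatefulModel` and its `state` attribute are hypothetical and not part of this tutorial's code.

```python
from typing import Callable

STATES = ["RUNNING", "PENDING", "DONE"]


class StatefulModel:
    """Hypothetical variant of Model that records its own training state."""

    def __init__(self, m: str, func: Callable[[], object]):
        self._model = m
        self._func = func
        self.state = "PENDING"  # training has not started yet

    def train(self):
        self.state = "RUNNING"
        result = self._func()  # run the training function for this model type
        self.state = "DONE"    # final state once training returns
        return result


model = StatefulModel("lr", lambda: 0)
state_before = model.state
result = model.train()
state_after = model.state
print(state_before, state_after)
```

A worker built on such a class could return `model.state` from its `state()` method instead of a random choice.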

### Create a Worker Actor
This worker actor trains one model. When the model's state reaches `DONE`, we stop training.

In [8]:
@ray.remote
class Worker(object):
    def __init__(self, m:str, func: object):
        # type of a model: lr, cl, or nn
        self._model = m  
        self._func = func
        
    def state(self) -> str:
        # For demo purposes, report a randomly chosen state
        return random.choice(STATES)

    # Create the model for this worker and do the training
    # by invoking its objective function for this model
    def work(self) -> None:
        model_factory(self._model, self._func).train()

### Create Supervisor Actor 
The supervisor creates three worker actors, each with its own model type and its respective training (objective) function.

In [9]:
# Define respective model training functions

def lf_func():
    # do some training work for linear regression
    time.sleep(1)
    return 0

def cl_func():
    # do some training work for classification
    time.sleep(1)
    return 0

def nn_func():
    # do some training work for neural networks
    time.sleep(1)
    return 0

@ray.remote
class Supervisor:
    def __init__(self):
        # Create three Actor Workers, each by its unique model type and their respective training function
        self.workers = [Worker.remote(name, func) for (name, func) in [("lr", lf_func), ("cl",cl_func), ("nn", nn_func)]]
                        
    def work(self):
        # do the work 
        [w.work.remote() for w in self.workers]
        
    def terminate(self):
        [ray.kill(w) for w in self.workers]
        
    def state(self):
        return ray.get([w.state.remote() for w in self.workers])

### Create an Actor instance for the supervisor and launch its workers

In [10]:
sup = Supervisor.remote()

# Launch remote actors as workers
sup.work.remote()

ObjectRef(32d950ec0ccf9d2a4e123b09b6596da5c80ce9e30100000001000000)

### Look at the Ray Dashboard

You should see Actors running as processes on the worker nodes:
 * Supervisor
 * Workers
 
Also, click on the `Logical View` to view more metrics and data on individual Ray Actors

In [8]:
# check their status
while True:
    # Fetch the states of all its workers
    states = ray.get(sup.state.remote())
    print(states)
    # check if all are DONE
    result = all('DONE' == e for e in states)
    if result:
        # Note: Actor processes will be terminated automatically when the initial actor handle goes out of scope in Python. 
        # If we create an actor with actor_handle = ActorClass.remote(), then when actor_handle goes out of scope and is destructed, 
        # the actor process will be terminated. Note that this only applies to the original actor handle created for the actor 
        # and not to subsequent actor handles created by passing the actor handle to other tasks.
        
        # kill all of the supervisor's workers manually, only for illustration and demo;
        # ray.get() waits for terminate() to finish before the supervisor itself is killed
        ray.get(sup.terminate.remote())

        # kill the supervisor manually, only for illustration and demo
        ray.kill(sup)
        break

['RUNNING', 'PENDING', 'RUNNING']
['DONE', 'PENDING', 'DONE']
['DONE', 'RUNNING', 'RUNNING']
['RUNNING', 'RUNNING', 'PENDING']
['RUNNING', 'RUNNING', 'RUNNING']
['RUNNING', 'RUNNING', 'DONE']
['PENDING', 'RUNNING', 'RUNNING']
['RUNNING', 'PENDING', 'PENDING']
['PENDING', 'PENDING', 'RUNNING']
['DONE', 'PENDING', 'DONE']
['RUNNING', 'PENDING', 'DONE']
['RUNNING', 'DONE', 'RUNNING']
['PENDING', 'DONE', 'PENDING']
['RUNNING', 'PENDING', 'RUNNING']
['DONE', 'DONE', 'RUNNING']
['RUNNING', 'RUNNING', 'RUNNING']
['DONE', 'PENDING', 'RUNNING']
['DONE', 'RUNNING', 'DONE']
['RUNNING', 'PENDING', 'PENDING']
['PENDING', 'DONE', 'PENDING']
['RUNNING', 'PENDING', 'RUNNING']
['PENDING', 'PENDING', 'DONE']
['RUNNING', 'DONE', 'DONE']
['RUNNING', 'RUNNING', 'RUNNING']
['PENDING', 'DONE', 'PENDING']
['DONE', 'PENDING', 'DONE']
['PENDING', 'RUNNING', 'PENDING']
['PENDING', 'PENDING', 'RUNNING']
['RUNNING', 'RUNNING', 'RUNNING']
['DONE', 'RUNNING', 'RUNNING']
['DONE', 'PENDING', 'RUNNING']
['RUNNING', 'DO
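Because `state()` picks uniformly at random from the three entries in `STATES`, the loop above stops only when all three workers happen to report `DONE` in the same poll. As a back-of-the-envelope check, assuming the three choices are independent, the chance of that per poll is (1/3)^3 = 1/27, so roughly 27 polls are expected on average. This explains the long output. The variable names below are illustrative.

```python
from fractions import Fraction

STATES = ["RUNNING", "PENDING", "DONE"]
NUM_WORKERS = 3

# Probability that a single worker reports DONE in one poll
p_done = Fraction(1, len(STATES))

# Probability that all workers report DONE in the same poll (independence assumed)
p_all_done = p_done ** NUM_WORKERS

# Mean number of polls until the first success (geometric distribution)
expected_polls = 1 / p_all_done

print(p_all_done, expected_polls)
```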

### Passing Actor handles to Ray Tasks

Consider writing a distributed messaging service, where workers or other entities may post messages
and update the state of the messaging service. This could be a logging or monitoring service.

You can pass actor handles to remote Ray tasks, which can then change the actor's
state. The `MessageActor` below stores or clears messages, depending on which of its methods
is invoked.

In [14]:
@ray.remote
class MessageActor(object):
    def __init__(self):
        # Keep the state of the messages
        self.messages = []
    
    def add_message(self, message):
        self.messages.append(message)
    
    # return all stored messages and reset the list
    def get_and_clear_messages(self):
        messages = self.messages
        self.messages = []
        return messages
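The get-and-clear step above hands back the current list and rebinds `self.messages` to a fresh one. By default a Ray actor executes its method calls one at a time, so this swap needs no lock. The sketch below shows the same logic in plain Python, without the `@ray.remote` decorator, so it can be run locally; the class name `LocalMessageStore` is a hypothetical stand-in for `MessageActor`.

```python
class LocalMessageStore:
    """Plain-Python stand-in for MessageActor (hypothetical name, no Ray needed)."""

    def __init__(self):
        self.messages = []

    def add_message(self, message):
        self.messages.append(message)

    def get_and_clear_messages(self):
        # Hand back the current list and rebind the attribute to a fresh one,
        # so later add_message() calls never touch the list already returned.
        messages = self.messages
        self.messages = []
        return messages


store = LocalMessageStore()
store.add_message("Message 0 from worker 0.")
store.add_message("Message 0 from worker 1.")

first_batch = store.get_and_clear_messages()
store.add_message("Message 1 from worker 0.")
second_batch = store.get_and_clear_messages()
third_batch = store.get_and_clear_messages()

print(first_batch, second_batch, third_batch)
```

Each message is returned exactly once, and a call made when nothing new has arrived returns an empty list.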

Create a worker task and a message actor.

In [15]:
@ray.remote
def worker(message_actor, j):
    for i in range(10):
        time.sleep(1)
        message_actor.add_message.remote(
            f"Message {i} from worker {j}.")

In [16]:
message_actor = MessageActor.remote()

Start workers that update the `MessageActor` service

In [20]:
[worker.remote(message_actor, j) for j in range(3)]

[ObjectRef(347cc60e0bb3da74ffffffffffffffffffffffff0100000001000000),
 ObjectRef(a02c24b8b7fc0a31ffffffffffffffffffffffff0100000001000000),
 ObjectRef(a631fe8d231813bfffffffffffffffffffffffff0100000001000000)]

In [21]:
for _ in range(10):
    new_messages = ray.get(message_actor.get_and_clear_messages.remote())
    print("New messages\n:", new_messages)
    time.sleep(1)

New messages
: []
New messages
: ['Message 0 from worker 2.', 'Message 0 from worker 0.', 'Message 0 from worker 1.']
New messages
: ['Message 1 from worker 0.', 'Message 1 from worker 1.', 'Message 1 from worker 2.']
New messages
: ['Message 2 from worker 0.', 'Message 2 from worker 1.']
New messages
: ['Message 2 from worker 2.', 'Message 3 from worker 0.', 'Message 3 from worker 1.', 'Message 3 from worker 2.']
New messages
: ['Message 4 from worker 1.', 'Message 4 from worker 2.', 'Message 4 from worker 0.']
New messages
: ['Message 5 from worker 1.', 'Message 5 from worker 2.', 'Message 5 from worker 0.']
New messages
: ['Message 6 from worker 1.', 'Message 6 from worker 2.', 'Message 6 from worker 0.']
New messages
: ['Message 7 from worker 1.', 'Message 7 from worker 2.', 'Message 7 from worker 0.']
New messages
: ['Message 8 from worker 1.', 'Message 8 from worker 2.', 'Message 8 from worker 0.']


### Exercises

1. Add a remote class, such as a logging actor, that keeps state by logging info (perhaps only in memory)
2. Implement methods that alter the state
3. Instantiate it and call its methods

### Solution hints

This solution is just a structural hint. A few bits are missing:
 * Instantiation of `LoggingActor`
 * Use of `ray.get()` to fetch the values from the object store

In [None]:
from collections import defaultdict
@ray.remote
class LoggingActor(object):
    def __init__(self):
        self.logs = defaultdict(list)
    
    def log(self, index, message):
        self.logs[index].append(message)
    
    def get_logs(self):
        return dict(self.logs)
    
@ray.remote
def run_experiment(experiment_index, logging_actor):
    for i in range(60):
        time.sleep(1)
        # Push a logging message to the actor.
        logging_actor.log.remote(experiment_index, 'On iteration {}'.format(i))    

In [None]:
# logging_actor = # TODO Instantiate Actor here
experiment_ids = []
for i in range(3):
    experiment_ids.append(run_experiment.remote(i, logging_actor))

In [None]:
logs = logging_actor.get_logs.remote()
# TODO use ray.get() to fetch the logs

### Next Step
We are going to switch focus a little and learn how you can use Ray's replacement
for Python's multiprocessing pool.

Let's move on to the [Multiprocessing pool with Ray](ex_05_multiprocess_pool.ipynb)

---
## References

 * [Writing your First Distributed Python Application with Ray](https://www.anyscale.com/blog/writing-your-first-distributed-python-application-with-ray)
 * [Using and Programming with Actors](https://docs.ray.io/en/latest/actors.html)
 * [Advanced Patterns and Anti-Patterns in Ray](https://docs.ray.io/en/latest/ray-design-patterns/index.html)
