# A Guided Tour of Ray Core: Remote Stateful Classes

© 2019-2022, Anyscale. All Rights Reserved

### Learning objectives
In this tutorial, we'll discuss Ray Actors and learn about:
 * How Ray Actors work
 * How to write a stateful Ray Actor
 * How Ray Actors can be written as a stateful distributed service

[*Remote Classes*](https://docs.ray.io/en/latest/walkthrough.html#remote-classes-actors)
involve using a `@ray.remote` decorator on a class. 

This implements the [*actor*](https://patterns.eecs.berkeley.edu/?page_id=258) pattern, whose key properties are *statefulness* and *message-passing semantics*.

Actors are extremely powerful. They allow you to take a Python class and instantiate it as a stateful microservice that can be queried from other actors and tasks and even other Python applications. Actors can be passed as arguments to other tasks and actors. 

When you instantiate a remote Actor, a separate worker process is started on a worker node and becomes the Actor process, dedicated to running methods called on that actor. Other Ray tasks and actors can invoke its methods on that process, mutating its internal state. Actors can also be terminated manually if needed. The example code below illustrates these ideas.
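
To make "stateful, message-passing" concrete, here is a plain-Python sketch of the idea using only the standard library. It is a toy illustration, not how Ray implements actors: the class name `MailboxCounter` and its `send` method are hypothetical. One thread owns the state and handles one message at a time, so callers never touch the state directly.

```python
import queue
import threading


class MailboxCounter:
    """Toy actor: a single thread owns the state and processes messages in order."""

    def __init__(self):
        self._count = 0
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            message, reply = self._mailbox.get()
            if message == "stop":
                reply.put(None)
                break
            if message == "increment":
                self._count += 1
            # Every message other than "stop" is answered with the current count.
            reply.put(self._count)

    def send(self, message):
        """Enqueue a message and return a queue that will receive the reply."""
        reply = queue.Queue(maxsize=1)
        self._mailbox.put((message, reply))
        return reply


actor = MailboxCounter()
replies = [actor.send("increment") for _ in range(3)]  # returns immediately
results = [r.get() for r in replies]                   # block for the replies
print(results)  # [1, 2, 3]
actor.send("stop").get()
```

The reply queues play a role loosely similar to the object references that Ray returns from `.remote()` calls: the caller gets a placeholder right away and blocks only when it asks for the value.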

<img src="images/ray_worker_actor_1.png" height="40%" width="70%">
<img src="images/ray_worker_actor_2.png" height="40%" width="70%">

---

First, let's start Ray…

In [1]:
import logging
import time
from pprint import pprint
import ray
import random
from random import randint
import numpy as np

In [2]:
if ray.is_initialized():
    ray.shutdown()
context = ray.init(logging_level=logging.ERROR)
pprint(context)

RayContext(dashboard_url='127.0.0.1:8266', python_version='3.8.13', ray_version='1.13.0', ray_commit='e4ce38d001dbbe09cd21c497fedd03d692b2be3e', address_info={'node_ip_address': '127.0.0.1', 'raylet_ip_address': '127.0.0.1', 'redis_address': None, 'object_store_address': '/tmp/ray/session_2022-07-19_16-35-52_476810_13122/sockets/plasma_store', 'raylet_socket_name': '/tmp/ray/session_2022-07-19_16-35-52_476810_13122/sockets/raylet', 'webui_url': '127.0.0.1:8266', 'session_dir': '/tmp/ray/session_2022-07-19_16-35-52_476810_13122', 'metrics_export_port': 58141, 'gcs_address': '127.0.0.1:55584', 'address': '127.0.0.1:55584', 'node_id': '79fe60feec93db52aa75b319d8617f6ea386e35fbf5396a196e43a64'})


In [3]:
print(f"Dashboard url: http://{context.address_info['webui_url']}")

Dashboard url: http://127.0.0.1:8266


## 3. Remote class as a stateful actor pattern

To start, we'll define a class and use the decorator: `@ray.remote`

#### Example 1: Method tracking 

Let's use this pattern to track how often an actor's method is invoked. Each instance records
who invoked it and how many times.

In [4]:
CALLERS = ["A", "B", "C"]

@ray.remote
class MethodStateCounter:
    def __init__(self):
        self.invokers = {"A": 0, "B": 0, "C": 0}
    
    def invoke(self, name):
        # pretend to do some work here
        time.sleep(0.5)
        # update times invoked
        self.invokers[name] += 1
        # return the state of that invoker
        return self.invokers[name]
        
    def get_invoker_state(self, name):
        # return the state of the named invoker
        return self.invokers[name]
    
    def get_all_invoker_state(self):
        # return the state of all invokers
        return self.invokers

In [5]:
# Create an instance of our Actor; we will randomly invoke
# its methods by caller name
worker_invoker = MethodStateCounter.remote()
worker_invoker

Actor(MethodStateCounter, 6bce47937d4e725775a53eec01000000)

Iterate and invoke the method with randomly chosen callers, keeping track of who called.

In [6]:
for _ in range(10):
    name = random.choice(CALLERS)
    worker_invoker.invoke.remote(name)

Invoke the method for a random caller and fetch that caller's invocation count.

In [16]:
for _ in range(5): 
    random_name_invoker = random.choice(CALLERS)
    times_invoked = ray.get(worker_invoker.invoke.remote(random_name_invoker))
    print(f"Named caller: {random_name_invoker} called {times_invoked}")

Named caller: B called 6
Named caller: A called 6
Named caller: C called 6
Named caller: C called 7
Named caller: C called 8


Fetch the count of all callers

In [17]:
print(ray.get(worker_invoker.get_all_invoker_state.remote()))

{'A': 6, 'B': 6, 'C': 8}


Note that we did not have to reason about where and how the actors are scheduled. We did not worry about the socket connection or IP addresses where these actors reside. All that's abstracted away from us. 

All we do is write Python code and convert our classes into distributed stateful services!

#### Example 2: Parameter Server distributed application with Ray Actors 


Let's take a Python class and convert it into a remote Actor that serves as a Parameter Server. This is a common pattern in machine learning: a central parameter server holds the model parameters and updates them with gradients computed by separate worker processes.

<img src="https://terrytangyuan.github.io/img/inblog/mpi-operator-1.png" width="60%" height="30%">

In [18]:
@ray.remote
class ParameterSever:
    def __init__(self):
        # Initialize the parameters to zero
        self.params = np.zeros(10)

    def get_params(self):
        # Return the current parameters
        return self.params

    def update_params(self, grad):
        # Apply the gradient to the parameters
        self.params -= grad

Define a worker as a remote function (a Ray task). This could be a machine learning objective function that computes gradients and sends them to the parameter server.

In [19]:
@ray.remote
def worker(ps):
    # Iterate over a fixed number of epochs
    for i in range(25):
        time.sleep(1.5)  # this could be your loss function computing gradients
        grad = np.ones(10)
        # send the gradient to the parameter server; the call is
        # asynchronous, so the worker does not wait for the update
        ps.update_params.remote(grad)

Start our Parameter Server actor. Ray schedules it as a dedicated worker process on a node in the cluster. You call `ActorClass.remote(...)` to instantiate an Actor instance of that type.

In [20]:
param_server = ParameterSever.remote()
param_server

Actor(ParameterSever, d70144b0cae53540eda968b601000000)

Let's get the initial values of the parameter server

In [21]:
print(f"Initial params: {ray.get(param_server.get_params.remote())}")

Initial params: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]


### Create Workers Computing Gradients
Let's create three separate workers as our machine learning tasks that compute gradients.
These will be scheduled as tasks on a Ray cluster.

You can use list comprehension. Quite Pythonic!

If we need more workers to scale, we can always bump them up.

**Note**: We are passing the `param_server` actor handle as an argument to the remote
worker task. Ray will resolve this.

In [22]:
[worker.remote(param_server) for _ in range(3)]

[ObjectRef(c5db14a0419b947bffffffffffffffffffffffff0100000001000000),
 ObjectRef(91581beb08e6c9deffffffffffffffffffffffff0100000001000000),
 ObjectRef(ae46b8beecd25f3affffffffffffffffffffffff0100000001000000)]

Now, let's loop and query the Parameter Server
while the workers run independently and update the parameters.

In [23]:
for _i in range(20):
    print(f"Updated params: {ray.get(param_server.get_params.remote())}")
    time.sleep(1)

Updated params: [-30. -30. -30. -30. -30. -30. -30. -30. -30. -30.]
Updated params: [-30. -30. -30. -30. -30. -30. -30. -30. -30. -30.]
Updated params: [-33. -33. -33. -33. -33. -33. -33. -33. -33. -33.]
Updated params: [-36. -36. -36. -36. -36. -36. -36. -36. -36. -36.]
Updated params: [-36. -36. -36. -36. -36. -36. -36. -36. -36. -36.]
Updated params: [-39. -39. -39. -39. -39. -39. -39. -39. -39. -39.]
Updated params: [-42. -42. -42. -42. -42. -42. -42. -42. -42. -42.]
Updated params: [-42. -42. -42. -42. -42. -42. -42. -42. -42. -42.]
Updated params: [-45. -45. -45. -45. -45. -45. -45. -45. -45. -45.]
Updated params: [-48. -48. -48. -48. -48. -48. -48. -48. -48. -48.]
Updated params: [-48. -48. -48. -48. -48. -48. -48. -48. -48. -48.]
Updated params: [-51. -51. -51. -51. -51. -51. -51. -51. -51. -51.]
Updated params: [-54. -54. -54. -54. -54. -54. -54. -54. -54. -54.]
Updated params: [-54. -54. -54. -54. -54. -54. -54. -54. -54. -54.]
Updated params: [-57. -57. -57. -57. -57. -57. -
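
The staircase in the output above (a repeated reading followed by two drops of 3) follows from the timing: three workers each subtract 1 every 1.5 seconds, while the driver polls every second. The sketch below models this in plain Python. It is an idealized model that assumes the workers stay in lockstep and ignores scheduling and `ray.get` latency. The helper `expected_drop` is hypothetical, and the starting offset of -30 seen in the output is not modeled.

```python
import math

NUM_WORKERS = 3       # workers launched in the tutorial
UPDATE_PERIOD = 1.5   # seconds each worker sleeps between updates
POLL_PERIOD = 1.0     # seconds the driver sleeps between polls


def expected_drop(elapsed_seconds):
    """Change in each parameter after the given time, in the idealized model."""
    completed_rounds = math.floor(elapsed_seconds / UPDATE_PERIOD)
    return -NUM_WORKERS * completed_rounds


drops = [expected_drop(i * POLL_PERIOD) for i in range(8)]
print(drops)  # [0, 0, -3, -6, -6, -9, -12, -12]
```

Adding -30 to each entry gives -30, -30, -33, -36, -36, -39, -42, -42, which matches the first eight readings printed above.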

### Look at the Ray Dashboard

You should see Actors running as processes on the worker nodes:
 * Parameter Server
 
Also, click on the `Logical View` to view more metrics and data on individual Ray Actors

Finally, shut down Ray.

In [24]:
ray.shutdown()

### Exercises

1. Modify the Actor class `MethodStateCounter` and add/modify methods that return the following:
 * Get number of times an invoker `name` was called
 * Get a list of values computed by invoker `name` 
 * Get state of all invokers
 
2. Modify method `invoke` to return a random int value between [5, 25]

## Homework

Read these references as a cure for your insomnia :-)

 * [Writing your First Distributed Python Application with Ray](https://www.anyscale.com/blog/writing-your-first-distributed-python-application-with-ray)
 * [Using and Programming with Actors](https://docs.ray.io/en/latest/actors.html)
 * [Advanced Patterns and Anti-Patterns in Ray](https://docs.ray.io/en/latest/ray-design-patterns/index.html)