# Install Dependencies

If you are running on Google Colab, you need to install the necessary dependencies before beginning the exercise.

In [None]:
print('NOTE: Intentionally crashing session to use the newly installed library.\n')

!pip uninstall -y pyarrow
!pip install ray[debug]==0.7.5

# A hack to force the runtime to restart, needed to include the above dependencies.
import os
os._exit(0)

# Exercise 4 - Introducing Actors

**GOAL:** The goal of this exercise is to demonstrate how to create stateful actors and call their methods.

For more details, please see the documentation on actors: http://ray.readthedocs.io/en/latest/actors.html

Although remote functions are useful for parallelizing stateless computations, sometimes your workload requires maintaining state across invocations. Examples include a simple counter, a neural network during training, or a simulator environment. With remote functions alone, you would have to pass this state into each function invocation and return the updated state when it finishes.
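
To make the cost of the stateless approach concrete, here is a minimal plain-Python sketch (no Ray involved). The `increment` function and the `state` dictionary are hypothetical names chosen for illustration. Every call has to receive the current state and hand the updated state back to the caller:

```python
def increment(state):
    """Stateless step: take the current state, return (new_state, result)."""
    new_state = {"counter": state["counter"] + 1}
    return new_state, new_state["counter"]


state = {"counter": 0}
results = []
for _ in range(3):
    # The caller has to thread the state through every invocation.
    state, value = increment(state)
    results.append(value)

print(results)
```

With a remote function, that state would also be shipped to and from the worker on every call, which is the overhead actors are meant to avoid.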

However, Ray comes with a stateful abstraction for these situations: remote actors. An actor is a lot like a Python object: it is initialized with an `__init__` method (which has the same features as remote tasks), and it can contain internal state that is accessed and mutated by remote method calls. Remote method calls are executed one at a time on each actor, so there's no need to worry about race conditions on the actor's state. To achieve more parallelism, create multiple actors.
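
The "one call at a time" guarantee can be illustrated with a plain-Python analogy. The sketch below is not how Ray implements actors; it assumes a hypothetical `SerialWorker` wrapper that funnels every method call through a single-threaded `concurrent.futures.ThreadPoolExecutor`. Because calls are queued and run one after another, the unsynchronized read-modify-write in `Counter.increment` never races:

```python
from concurrent.futures import ThreadPoolExecutor


class Counter:
    def __init__(self):
        self.value = 0

    def increment(self):
        # Unsynchronized read-modify-write; safe only because calls are serialized.
        current = self.value
        self.value = current + 1
        return self.value


class SerialWorker:
    """Hypothetical stand-in for an actor: one thread, one call at a time."""

    def __init__(self, obj):
        self._obj = obj
        self._executor = ThreadPoolExecutor(max_workers=1)

    def submit(self, method_name, *args):
        # Returns a future immediately; the call itself runs later, in order.
        return self._executor.submit(getattr(self._obj, method_name), *args)

    def shutdown(self):
        self._executor.shutdown(wait=True)


worker = SerialWorker(Counter())
futures = [worker.submit("increment") for _ in range(100)]
values = [f.result() for f in futures]
worker.shutdown()

print(values[:5], values[-1])
```

Each queued call sees the state left behind by the previous one, which is the property the paragraph above attributes to actor methods.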

### Remote Actors

To create an actor, decorate a Python class with the `@ray.remote` decorator.

```python
@ray.remote
class Example(object):
    def __init__(self, x):
        self.x = x
    
    def set(self, x):
        self.x = x
    
    def get(self):
        return self.x
```

Like regular Python classes, **actors encapsulate state that is shared across actor method invocations**.

Actor classes differ from regular Python classes in the following ways.
1. **Instantiation:** A regular class would be instantiated via `e = Example(1)`. Actors are instantiated via
    ```python
    e = Example.remote(1)
    ```
    When an actor is instantiated, a **new process** is created somewhere in the cluster and the actor's `__init__` method is run in that process.
2. **Method Invocation:** Methods of a regular class would be invoked via `e.set(2)` or `e.get()`. Actor methods are invoked using remote task syntax.
    ```python
    >>> e.set.remote(2)
    ObjectID(d966aa9b6486331dc2257522734a69ff603e5a1c)
    
    >>> e.get.remote()
    ObjectID(7c432c085864ed4c7c18cf112377a608676afbc3)
    ```
3. **Return Values:** Actor methods are non-blocking. They immediately return an object ID and **create a task that is scheduled on the actor**. The result can be retrieved with `ray.get`.
    ```python
    >>> ray.get(e.set.remote(2))
    None
    
    >>> ray.get(e.get.remote())
    2
    ```
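
The practical consequence of non-blocking calls is that you can submit work to several actors first and only then wait for the results. The following is a plain-Python analogy using `concurrent.futures` rather than Ray: the two single-threaded executors are hypothetical stand-ins for two actors, and `slow_increment` is an illustrative helper, not part of any Ray API.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_increment(state):
    time.sleep(0.2)
    state["counter"] += 1
    return state["counter"]


# One single-threaded executor per "actor" (hypothetical stand-ins).
actor_a = ThreadPoolExecutor(max_workers=1)
actor_b = ThreadPoolExecutor(max_workers=1)
state_a = {"counter": 0}
state_b = {"counter": 0}

start = time.time()

# Submit everything first; each submit returns a future without waiting.
futures = []
for _ in range(3):
    futures.append(actor_a.submit(slow_increment, state_a))
    futures.append(actor_b.submit(slow_increment, state_b))
submit_time = time.time() - start

# Only now block for the results.
results = [f.result() for f in futures]
total_time = time.time() - start

actor_a.shutdown()
actor_b.shutdown()

print(results)
print('submit: {:.3f}s, total: {:.3f}s'.format(submit_time, total_time))
```

Submitting is nearly instantaneous, and the total time is roughly that of three sleeps rather than six, because the two workers overlap while each one still handles its own calls in order.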

In [None]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from collections import defaultdict
import numpy as np
import time

import ray

print('Successfully imported ray!')

In [None]:
ray.init(num_cpus=4, ignore_reinit_error=True)

**EXERCISE:** Make the `Foo` class an actor class using the `@ray.remote` decorator.

In [None]:
class Foo(object):
    def __init__(self):
        self.counter = 0

    def reset(self):
        self.counter = 0

    def increment(self):
        time.sleep(0.5)
        self.counter += 1
        return self.counter

assert hasattr(Foo, 'remote'), 'You need to turn "Foo" into an actor with @ray.remote.'

**EXERCISE:** Change the instantiations below to create two actors by calling `Foo.remote()`.

In [None]:
# Create two Foo actors.
f1 = Foo()
f2 = Foo()

**EXERCISE:** Parallelize the code below. The two actors can execute methods in parallel (though each actor can only execute one method at a time).

In [None]:
start_time = time.time()

# Reset the actor state so that we can run this cell multiple times without
# changing the results.
f1.reset()
f2.reset()

# We want to parallelize this code. However, it is not straightforward to
# make "increment" a remote function, because state is shared (the value of
# "self.counter") between subsequent calls to "increment". In this case, it
# makes sense to use actors.
results = []
for _ in range(5):
    results.append(f1.increment())
    results.append(f2.increment())

duration = time.time() - start_time
assert not any([isinstance(result, ray.ObjectID) for result in results]), 'Looks like "results" is {}. You may have forgotten to call ray.get.'.format(results)

**VERIFY:** Run some checks to verify that the changes you made to the code were correct. Some of the checks should fail when you initially run the cells. After completing the exercises, the checks should pass.

In [None]:
assert results == [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]

assert duration < 3, ('The experiments ran in {:.3f} seconds. This is too '
                      'slow.'.format(duration))
assert duration > 2.5, ('The experiments ran in {:.3f} seconds. This is too '
                        'fast.'.format(duration))

print('Success! The example took {:.3f} seconds.'.format(duration))

# Exercise 5 - Sharing References to an Actor

**GOAL:** The goal of this exercise is to show how to pass references to actors to remote functions and methods.

Sometimes, we may want to have multiple remote tasks that invoke methods on the same actor. For example, we may have a single actor that records logging information for a group of tasks and allows other tasks to query the logs. We can achieve this by passing a handle to the actor (the object returned from calling `Actor.remote()`) as an argument to the tasks.

### Actor Handles

First, we instantiate an actor:

```python
@ray.remote
class Actor(object):
    def method(self):
        pass

# Create the actor
actor = Actor.remote()
```

We can define a remote function (or another actor) that takes an actor handle as an argument:

```python
@ray.remote
def f(actor):
    # We can invoke a method on the actor and wait for its result.
    return ray.get(actor.method.remote())
```

This remote function can be invoked multiple times. Each invocation will have a reference to the
same actor.

```python
# Each of the three tasks created below will invoke methods on the same actor.
f.remote(actor)
f.remote(actor)
f.remote(actor)
```
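
As a rough plain-Python analogy (not Ray code), the sketch below shows several concurrent tasks sharing one handle to a single serialized worker. The `Recorder` class, the `task` function, and the use of a single-threaded `ThreadPoolExecutor` as the "handle" are all assumptions made for illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


class Recorder:
    def __init__(self):
        self.calls = defaultdict(list)

    def record(self, task_id, message):
        self.calls[task_id].append(message)

    def snapshot(self):
        return {key: list(value) for key, value in self.calls.items()}


recorder = Recorder()
# A single-threaded executor serializes all access to the recorder.
recorder_handle = ThreadPoolExecutor(max_workers=1)


def task(task_id, handle):
    # Every task talks to the same recorder through the shared handle.
    for step in range(3):
        handle.submit(recorder.record, task_id, 'step {}'.format(step)).result()
    return task_id


# Run three tasks concurrently, each holding the same handle.
with ThreadPoolExecutor(max_workers=3) as tasks:
    finished = list(tasks.map(lambda i: task(i, recorder_handle), range(3)))

snapshot = recorder_handle.submit(recorder.snapshot).result()
recorder_handle.shutdown()

print(snapshot)
```

All three tasks end up in the same `snapshot`, which mirrors the idea that each invocation of `f` above holds a reference to the same actor.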

In this exercise, we're going to write some code that runs several "experiments" in parallel and has each experiment log its results to a shared actor. The main driver script can then periodically pull the results from the logging actor.

**EXERCISE:** Turn this `LoggingActor` class into an actor class.

In [None]:
class LoggingActor(object):
    def __init__(self):
        self.logs = defaultdict(list)
    
    def log(self, index, message):
        self.logs[index].append(message)
    
    def get_logs(self):
        return dict(self.logs)


assert hasattr(LoggingActor, 'remote'), ('You need to turn LoggingActor into an '
                                         'actor (by using the @ray.remote decorator).')

**EXERCISE:** Instantiate the actor.

In [None]:
logging_actor = LoggingActor()

Now we define a remote function that runs and pushes its logs to the `LoggingActor`.

**EXERCISE:** Modify this function so that it invokes methods correctly on `logging_actor` (you need to change the way you call the `log` method).

In [None]:
@ray.remote
def run_experiment(experiment_index, logging_actor):
    for i in range(60):
        time.sleep(1)
        # Push a logging message to the actor.
        logging_actor.log(experiment_index, 'On iteration {}'.format(i))

Now we create several tasks that use the logging actor.

In [None]:
experiment_ids = []
for i in range(3):
    experiment_ids.append(run_experiment.remote(i, logging_actor))

While the experiments are running in the background, the driver process (that is, this Jupyter notebook) can query the actor to read the logs.

---



**EXERCISE:** Modify the code below to fetch logs from the `LoggingActor` using a remote method call.

In [None]:
logs = logging_actor.get_logs()

assert isinstance(logs, dict), ("Make sure that you dispatch tasks to the "
                                "actor using the .remote keyword and get the results using ray.get.")
logs

**EXERCISE:** Try running the cell above multiple times and see how the results change (while the experiments are still running in the background). You can also try running more of the experiment tasks and see what happens.