## Ray

#### What is Ray?

This whole course we've been using the Ray package:

```python
import ray.rllib
```

![](img/ray-logo.png)

What is Ray? From the [docs](https://docs.ray.io/en/latest/):

> Ray is a general-purpose and universal distributed compute framework.

Ray is also:

- An [active open source project](https://github.com/ray-project/ray) with over 20k stars on GitHub 🤩
- Backed by the unicorn startup [Anyscale](https://www.anyscale.com/), which produced this course 🦄

Notes:

But, back to distributed computing.

#### What is distributed computing?

_Distributed computing_ is computing that involves multiple machines (nodes) distributed across a network.

![](img/supercomputer.png)

Pros:

- Far more compute and memory than any single machine can offer

Cons/challenges:

- Synchronization
- Failure
- ...

#### Ray makes distributed computing easy

- The goal of Ray is to make distributed computing easy and accessible.
- Ray handles most of the challenges for users.
- RLlib, Tune, and the other sub-packages are built on top of Ray.
- This means _RLlib and Tune automatically have distributed capabilities._

Notes:

Surprise! RLlib is easy to use and conveniently implements many state-of-the-art RL algorithms, but it has another benefit that we didn't mention until now: built-in distributed computing capabilities. This makes it much easier to distribute the computation than with most competing packages.

#### RLlib, distributed

- In this course we set up algorithm configs many times.
- But there are some parameters we haven't used before:

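```python
from ray.rllib.algorithms.ppo import PPOConfig

ppo_config = (
    PPOConfig()
    .framework("torch")
    # 4 rollout workers, each stepping 2 copies of the environment
    .rollouts(num_rollout_workers=4, num_envs_per_worker=2)
    # no GPUs requested for the training process
    .resources(num_gpus=0)
)
```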

You can read more in the RLlib docs about [specifying resources](https://docs.ray.io/en/master/rllib/rllib-training.html#specifying-resources) and about [scaling](https://docs.ray.io/en/master/rllib/rllib-training.html#scaling-guide).

But... what is a "rollout worker"?

#### Rollout workers

- Rollout workers collect data from the environment (simulator) in parallel.
- Most simulator environments can be replicated, with one or more copies running on each worker in a cluster.
- As a result, you can collect data much faster, so data collection is less likely to bottleneck training.
- Whatever cluster Ray is connected to on the backend, `num_rollout_workers=4` works seamlessly.
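The pattern behind rollout workers can be sketched without Ray at all. The example below is only an illustration: it uses a thread pool from Python's standard library instead of Ray, and `ToyEnv` and `collect_rollout` are hypothetical names, not RLlib APIs. Threads keep the sketch self-contained; they do not give the true multi-process or multi-node parallelism that Ray provides.

```python
import random
from concurrent.futures import ThreadPoolExecutor


class ToyEnv:
    """Hypothetical stand-in for a simulator: a 1-D random walk."""

    def __init__(self, seed):
        self.rng = random.Random(seed)
        self.position = 0

    def step(self):
        # Move one step left or right; the new position is the "observation".
        self.position += self.rng.choice([-1, 1])
        return self.position


def collect_rollout(worker_id, num_steps=5):
    # Each worker builds its own environment copy, so no state is shared.
    env = ToyEnv(seed=worker_id)
    return [env.step() for _ in range(num_steps)]


num_rollout_workers = 4  # same value as in the PPO config
with ThreadPoolExecutor(max_workers=num_rollout_workers) as pool:
    # map() preserves input order, so batches[i] comes from worker i.
    batches = list(pool.map(collect_rollout, range(num_rollout_workers)))

# The training process gathers all workers' samples into one batch.
train_batch = [obs for batch in batches for obs in batch]
print(len(batches), len(train_batch))
```

In RLlib the same roles exist, but the workers are Ray processes that can live on different machines, and the gathered batch feeds the model update.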

Notes:

In supervised learning, when you're waiting you know you're probably waiting for the model to train. In RL, the bottleneck could be the data collection or the model updates. Being able to parallelize rollouts alleviates the data collection bottleneck. 

#### Driver

In all our configs we've had

```python
create_env_on_driver = True
```

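This means a copy of the environment is also created on the "driver" process, the one that runs the training. When rollout workers do the sampling, the driver would not otherwise need an environment of its own.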

#### Let's apply what we learned!

## MCQ
<!-- multiple choice -->


## Coding
<!-- coding exercise -->

Have them do something simple with Ray core, or with RLlib where we can see the processes somehow - or just see that it's faster, though it might not be depending on how the cluster is set up...
