# Exploring Ray API Calls

© 2019-2022, Anyscale. All Rights Reserved

📖 [Back to Table of Contents](./ex_00_tutorial_overview.ipynb)<br>
➡ [Next notebook](./ex_07_ray_data.ipynb) <br>
⬅️ [Previous notebook](./ex_05_multiprocess_pool.ipynb) <br>


### Learning objectives
In this quick tour of the API, you will learn about:

 * Common Ray Core APIs
 * Some useful arguments to these APIs 
 * Tips and Tricks for first-time users


This lesson explores a few other API calls you might find useful, as well as options for the API calls we've already used in the previous lessons. We will also walk through some tips and tricks for first-time users.

> **Tip:** The [Ray Package Reference](https://docs.ray.io/en/latest/package-ref.html) in the [Ray Docs](https://docs.ray.io/en/latest/) is useful for exploring the API features we'll learn.

In [1]:
import ray, time, sys, logging
import numpy as np 
import json

In [2]:
if ray.is_initialized():  # call the function; the bare name is always truthy
    ray.shutdown()
ray.init(logging_level=logging.ERROR)

| | |
| :--- | :--- |
| Python version: | 3.8.13 |
| Ray version: | 2.0.0rc0 |
| Dashboard: | http://127.0.0.1:8267 |


## ray.init()

So far we have called [`ray.init()`](https://ray.readthedocs.io/en/latest/package-ref.html#ray.init) to start Ray on our local machine. When the optional `address=...` argument is specified, the driver connects to the corresponding Ray cluster instead.

There are a lot of optional keyword arguments you can pass to `ray.init()`. Here are some of them. All options are described in the [documentation](https://ray.readthedocs.io/en/latest/package-ref.html#ray.init). 

| Name | Type | Example | Description |
| :--- | :--- | :------ | :---------- |
| `address` | `str` | `address='auto'` | The address of the Ray cluster to connect to. If this address is not provided, then this command will start Redis, a raylet, a plasma store, a plasma manager, and some workers. It will also kill these processes when Python exits. If the driver is running on a node in a Ray cluster, using `auto` as the value tells the driver to detect the cluster, removing the need to specify a particular node address. |
| `num_cpus` | `int` | `num_cpus=4` | Number of CPUs the user wishes to assign to each _raylet_. |
| `num_gpus` | `int` | `num_gpus=1` | Number of GPUs the user wishes to assign to each _raylet_. |
| `resources` | `dictionary` | `resources={'resource1': 4, 'resource2': 16}` | Maps the names of custom resources to the quantities of those resources available. |
| `memory` | `int` | `memory=1000000000` | The amount of memory (in bytes) that is available for use by workers requesting memory resources. By default, this is automatically set based on the available system memory. |
| `object_store_memory` | `int` | `object_store_memory=1000000000` | The amount of memory (in bytes) for the object store. By default, this is automatically set based on available system memory, subject to a 20GB cap. |
| `log_to_driver` | `bool` | `log_to_driver=True` | If true, then the output from all of the worker processes on all nodes will be directed to the driver program. |
| `local_mode` | `bool` | `local_mode=True` | If true, the code will be executed serially. This is useful for debugging. |
| `ignore_reinit_error` | `bool` | `ignore_reinit_error=True` | If true, Ray suppresses errors from calling `ray.init()` a second time (as we've done in these notebooks). Ray won't be restarted. |
| `configure_logging` | `bool` | `configure_logging=True` | If true (default), configuration of logging is allowed here. Otherwise, the user may want to configure it separately. |
| `logging_level` | _Flag_ | `logging_level=logging.INFO` | The logging level, defaults to `logging.INFO`. Ignored unless "configure_logging" is true. |
| `logging_format` | `str` | `logging_format='...'` | The logging format to use, defaults to a string containing a timestamp, filename, line number, and message. See the Ray source code `ray_constants.py` for details. Ignored unless "configure_logging" is true. |
| `runtime_env` | `dict` | `runtime_env={"working_dir": "/path/to/files"}` | Your Ray application might depend on source files or data files. During development these might live on your local machine, but to run at scale you need to get them to your remote cluster. Use this option to send those files to all nodes in the cluster so that your distributed tasks and actors can access them; see [runtime environments](https://docs.ray.io/en/latest/ray-core/handling-dependencies.html#runtime-environments) for an example. |

See also the documentation for [ray.shutdown()](https://ray.readthedocs.io/en/latest/package-ref.html#ray.shutdown), which is needed in some contexts.

## ray.is_initialized()

Is Ray [initialized](https://ray.readthedocs.io/en/latest/package-ref.html#ray.is_initialized)?

In [4]:
if not ray.is_initialized():
    ray.init()

## @ray.remote()

We've used [@ray.remote](https://ray.readthedocs.io/en/latest/package-ref.html#ray.remote) a lot. You can pass arguments when using it. Here are some of them.

| Name | Type | Example | Description |
| :--- | :--- | :------ | :---------- |
| `num_cpus` | `int` | `num_cpus=4` | The number of CPU cores to reserve for this task or for the lifetime of the actor. |
| `num_gpus` | `int` | `num_gpus=1` | The number of GPUs to reserve for this task or for the lifetime of the actor. |
| `num_returns` | `int` | `num_returns=2` | (Only for tasks, not actors.) The number of object refs returned by the remote function invocation. |
| `runtime_env` | `dict` | `runtime_env={"working_dir": ".", "pip": ["requests"]}` | The runtime environment to use for this task or actor (see [Runtime environments](https://docs.ray.io/en/latest/ray-core/handling-dependencies.html#runtime-environments) for details). |
| `max_calls` | `int` | `max_calls=5` | Only for *remote tasks*. This specifies the maximum of times that a given worker can execute the given remote function before it must exit (this can be used to address memory leaks in third-party libraries or to reclaim resources that cannot easily be released, e.g., GPU memory that was acquired by TensorFlow). By default this is infinite. |
| `max_restarts` | `int` | `max_restarts=-1` | Only for *actors*. This specifies the maximum number of times that the actor should be restarted when it dies unexpectedly. The minimum valid value is 0 (default), which indicates that the actor doesn't need to be restarted. A value of -1 indicates that an actor should be restarted indefinitely. |
| `max_task_retries` | `int` | `max_task_retries=-1` | Only for *actors*. How many times to retry an actor task if the task fails due to a system error, e.g., the actor has died. If set to -1, the system will retry the failed task until the task succeeds, or the actor has reached its max_restarts limit. If set to n > 0, the system will retry the failed task up to n times, after which the task will throw a `RayActorError` exception upon `ray.get`. Note that Python exceptions are not considered system errors and will not trigger retries. |
| `max_retries` | `int` | `max_retries=-1` | Only for *remote functions*. This specifies the maximum number of times that the remote function should be rerun when the worker process executing it crashes unexpectedly. The minimum valid value is 0, the default is 4, and a value of -1 indicates infinite retries. |

Here's an example with and without `num_returns`:

In [3]:
@ray.remote(num_returns=3)
def tuple3(one, two, three):
    return (one, two, three)

x_ref, y_ref, z_ref = tuple3.remote("a", 1, 2.2)
x, y, z = ray.get([x_ref, y_ref, z_ref])
print(f'({x}, {y}, {z})')

@ray.remote
def tuple3(one, two, three):
    return (one, two, three)

xyz_ref = tuple3.remote("a", 1, 2.2)
x, y, z = ray.get(xyz_ref)
print(f'({x}, {y}, {z})')

(a, 1, 2.2)
(a, 1, 2.2)


### @ray.method()

Related to `@ray.remote()`, [@ray.method()](https://ray.readthedocs.io/en/latest/package-ref.html#ray.method) allows you to specify the number of return values for a method in an actor, by passing the `num_returns` keyword argument. None of the other `@ray.remote()` keyword arguments are allowed. Here is an example:

In [6]:
@ray.remote
class Tupleator:
    @ray.method(num_returns=3)
    def tuple3(self, one, two, three):
        return (one, two, three)
    
tupleator = Tupleator.remote()
x_ref, y_ref, z_ref = tupleator.tuple3.remote("a", 1, 2.2)
x, y, z = ray.get([x_ref, y_ref, z_ref])
print(f'({x}, {y}, {z})')   

(a, 1, 2.2)


## ray.put()

We used [`ray.get`](https://ray.readthedocs.io/en/latest/package-ref.html#ray.get) a lot to retrieve objects, and we used actor methods to retrieve state from an actor. You can also put objects into the object store explicitly with [`ray.put`](https://ray.readthedocs.io/en/latest/package-ref.html#ray.put), as shown in the following example:

In [7]:
ref = ray.put("Hello World!")
print(f'Object returned: {ray.get(ref)}')

Object returned: Hello World!


In [8]:
ref = ray.put(np.random.rand(2_000, 5_000))
print(f'Object returned: {ray.get(ref)}')

Object returned: [[9.28789424e-01 5.88450727e-01 3.60375269e-01 ... 1.42004987e-01
  4.13856262e-01 7.08533567e-01]
 [9.20752545e-01 3.97516260e-01 3.13765856e-01 ... 3.75670449e-01
  7.19817426e-01 8.38361670e-04]
 [3.69238638e-01 7.03181046e-01 9.62826671e-02 ... 6.90639598e-02
  5.01079671e-01 3.32691043e-01]
 ...
 [9.90635593e-01 5.08797602e-01 1.43848551e-01 ... 1.06937177e-01
  1.29106509e-01 4.10313220e-01]
 [8.17311500e-01 6.75583305e-01 4.22965416e-01 ... 6.96575718e-01
  1.66150844e-01 9.60151494e-01]
 [6.30828009e-01 5.97115921e-01 5.48616023e-01 ... 7.69663111e-01
  6.75029754e-01 8.35391625e-02]]


Older Ray releases accepted an optional flag, `weakref=True` (default `False`), which allowed Ray to evict the object while a reference to the returned ref still existed. That was useful when putting a lot of objects into the object store, many of which might not be needed later, because it let Ray reclaim memory aggressively. The flag does not appear in the `ray.put` signature of Ray 2.x, so check the API reference for the version you have installed before relying on it.

## Fetching Cluster Information

Several methods return information about the cluster:

| Method | Brief Description |
| :----- | :---------------- |
| [`ray.get_gpu_ids()`](https://ray.readthedocs.io/en/latest/package-ref.html#ray.get_gpu_ids) | GPUs |
| [`ray.nodes()`](https://ray.readthedocs.io/en/latest/package-ref.html#ray.nodes) | Cluster nodes |
| [`ray.cluster_resources()`](https://ray.readthedocs.io/en/latest/package-ref.html#ray.cluster_resources) | All the available resources, used or not |
| [`ray.available_resources()`](https://ray.readthedocs.io/en/latest/package-ref.html#ray.available_resources) | Resources not in use |

In [4]:
print(f"""
ray.get_gpu_ids():          {ray.get_gpu_ids()}
ray.nodes():                {ray.nodes()}
ray.cluster_resources():    {ray.cluster_resources()}
ray.available_resources():  {ray.available_resources()}
""")


ray.get_gpu_ids():          []
ray.nodes():                [{'NodeID': '0a65dccbbfdcf3e7dc344e0ba727130460db214b47f139cbc17ce970', 'Alive': True, 'NodeManagerAddress': '127.0.0.1', 'NodeManagerHostname': 'Juless-MacBook-Pro-16', 'NodeManagerPort': 53155, 'ObjectManagerPort': 53154, 'ObjectStoreSocketName': '/tmp/ray/session_2022-08-04_15-14-27_868170_21716/sockets/plasma_store', 'RayletSocketName': '/tmp/ray/session_2022-08-04_15-14-27_868170_21716/sockets/raylet', 'MetricsExportPort': 59760, 'NodeName': '127.0.0.1', 'alive': True, 'Resources': {'memory': 41505980416.0, 'node:127.0.0.1': 1.0, 'object_store_memory': 2147483648.0, 'CPU': 10.0}}]
ray.cluster_resources():    {'memory': 41505980416.0, 'CPU': 10.0, 'object_store_memory': 2147483648.0, 'node:127.0.0.1': 1.0}
ray.available_resources():  {'memory': 41505980416.0, 'object_store_memory': 2147483648.0, 'node:127.0.0.1': 1.0, 'CPU': 10.0}



Recall that we used `ray.nodes()[0]['Resources']['CPU']` in the second lesson to determine the number of CPU cores on our machines:

In [5]:
ray.nodes()[0]['Resources']['CPU']

10.0
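
The two resource dictionaries can be combined to see what is currently in use. This is a small sketch with a hypothetical helper, `resources_in_use`; it works on literal dictionaries shaped like the output above, and the "4 CPUs busy" scenario is made up for the example.

```python
def resources_in_use(total, available):
    """Return total minus available for every resource key in `total`."""
    # .get() guards against a key being absent from `available`.
    return {name: amount - available.get(name, 0.0) for name, amount in total.items()}

# In a live session these would come from ray.cluster_resources()
# and ray.available_resources().
total = {"memory": 41505980416.0, "CPU": 10.0, "object_store_memory": 2147483648.0}
available = {"memory": 41505980416.0, "CPU": 6.0, "object_store_memory": 2147483648.0}  # hypothetical: 4 CPUs busy

in_use = resources_in_use(total, available)
print(in_use["CPU"])
```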

# Tips and Tricks for first-time users
First-time users can trip over certain API usage patterns. These short tips and tricks will help you avoid unexpected results. Below we briefly explore a handful of API calls and their best practices.

### Tip 1: Delay ray.get()

With Ray, every `.remote()` call is asynchronous: it returns immediately with a promise/future in the form of an object reference. This is key to achieving massive parallelism, because it allows a developer to launch many remote tasks, each returning its own object reference. The value behind a reference is fetched with `ray.get` when it is needed. Because `ray.get` is a blocking call, where and how often you use it affects performance.


In [6]:
@ray.remote
def do_some_work(x):
    time.sleep(1)
    return x * x

#### Bad usage
Here `ray.get` is called inside the loop, so it blocks after each `.remote()` call and the tasks run one after another.

In [7]:
%%time
results = [ray.get(do_some_work.remote(x)) for x in range(10)]
results

CPU times: user 53 ms, sys: 28.8 ms, total: 81.7 ms
Wall time: 10.1 s


[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

#### Good usage
We delay `ray.get` until all the tasks have been invoked and their references have been returned.


In [8]:
%%time
results = ray.get([do_some_work.remote(x) for x in range(10)])
results

CPU times: user 10.8 ms, sys: 6.34 ms, total: 17.2 ms
Wall time: 1.06 s


[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

#### Takeaway tip 1:
Since `ray.get` is a blocking call, postpone it until you actually need the value behind an object reference. If called eagerly, it can
reduce the parallelism you were hoping for.
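
The two measurements above (about 10.1 s versus 1.06 s) match a simple back-of-the-envelope model. The function below is a hypothetical sketch that ignores scheduling overhead and assumes every task takes the same time.

```python
import math

def estimated_wall_time(num_tasks, seconds_per_task, num_cpus, get_inside_loop):
    """Rough model: ignores scheduling overhead and assumes equal task durations."""
    if get_inside_loop:
        # Each ray.get blocks before the next task is even submitted.
        return num_tasks * seconds_per_task
    # All tasks are submitted first, so they run in waves of `num_cpus`.
    return math.ceil(num_tasks / num_cpus) * seconds_per_task

eager = estimated_wall_time(10, 1.0, 10, get_inside_loop=True)
delayed = estimated_wall_time(10, 1.0, 10, get_inside_loop=False)
print(eager, delayed)
```

With 10 one-second tasks on 10 CPUs, the model gives 10 seconds for the eager pattern and 1 second for the delayed pattern, close to the measured wall times.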

### Tip 2: Avoid tiny remote tasks
Ray APIs are general and simple to use. As a result, a newcomer's natural instinct is to parallelize every task, including tiny ones, and the per-task overhead adds up over time.
In short, if the Ray remote tasks are tiny, they may take longer to execute than their serial Python equivalents.

In [9]:
def tiny_task(x):
    time.sleep(0.0001)
    return x

In [10]:
%%time
results = [tiny_task(x) for x in range(100000)]
results[:10]

CPU times: user 113 ms, sys: 165 ms, total: 278 ms
Wall time: 12.9 s


[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Now convert this into a Ray remote task:

In [11]:
@ray.remote
def remote_tiny_task(x):
    time.sleep(0.0001)
    return x

In [12]:
%%time
result_ids = [remote_tiny_task.remote(x) for x in range(100000)]
results = ray.get(result_ids)
results[:10]

CPU times: user 6.97 s, sys: 3.15 s, total: 10.1 s
Wall time: 6.75 s


[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

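Surprisingly, Ray cut the wall time only from 12.9 s to 6.75 s, far short of the roughly tenfold speedup we might expect on 10 cores, and CPU time rose from 278 ms to 10.1 s. On a machine with fewer cores, the Ray version can even be slower than the sequential program. What's going on, and what can we do to remedy it?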

Well, the issue here is that every task invocation has a non-trivial overhead (e.g., scheduling, inter-process communication, updating the system state) and this overhead dominates the actual time it takes to execute the task.
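
The timings reported above can be turned into a rough per-task estimate. This is plain arithmetic on the numbers printed by the cells in this lesson; the variable names are introduced here for illustration only.

```python
num_tasks = 100_000
sleep_per_task = 0.0001   # seconds, from tiny_task
ray_wall = 6.75           # seconds, wall time of the Ray cell above
ray_cpu_total = 10.1      # seconds, total CPU time reported by %%time for the Ray cell
num_cpus = 10             # from ray.nodes()[0]['Resources']['CPU']

useful_work = num_tasks * sleep_per_task          # total time spent sleeping
ideal_parallel_wall = useful_work / num_cpus      # wall time if overhead were zero
cpu_per_task = ray_cpu_total / num_tasks          # CPU seconds spent per submitted task

print(f"ideal parallel wall time: {ideal_parallel_wall:.2f} s")
print(f"measured / ideal: {ray_wall / ideal_parallel_wall:.1f}x")
print(f"CPU time per task: {cpu_per_task * 1e6:.0f} us")
```

The CPU cost per submission comes out at about the same size as the 0.1 ms of work each task performs, which is why the overhead dominates.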

One way to mitigate this is to make the remote tasks larger in order to amortize the invocation overhead. Here we aggregate the tiny tasks into chunks of 1,000, so 100 remote tasks cover all 100,000 items.


In [13]:
@ray.remote
def mega_work(start, end):
    return [tiny_task(x) for x in range(start, end)]

In [14]:
%%time
# Build the list of refs directly instead of appending inside a comprehension.
result_ids = [mega_work.remote(x*1000, (x+1)*1000) for x in range(100)]
results = ray.get(result_ids)

CPU times: user 208 ms, sys: 19 ms, total: 227 ms
Wall time: 1.52 s


A huge difference in execution time!
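
The chunk boundaries above are hard-coded for a total that divides evenly by 1,000. A small helper can compute them for any total and chunk size. The name `chunk_bounds` is hypothetical and the helper is only a sketch.

```python
def chunk_bounds(total, chunk_size):
    """Yield (start, end) pairs covering range(total); the last chunk may be shorter."""
    for start in range(0, total, chunk_size):
        yield start, min(start + chunk_size, total)

bounds = list(chunk_bounds(100_000, 1000))
print(len(bounds), bounds[0], bounds[-1])

# With the mega_work task defined above, the submission would then read:
# result_ids = [mega_work.remote(start, end) for start, end in bounds]
```

Note that each `mega_work` call returns a list, so the combined result is a list of lists that may need flattening before further use.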

### Tip 3: Using ray.wait() with ray.get()

As we noted above, an idiomatic way of using `ray.get()` is to delay fetching objects until you need them. Another way is to combine it with `ray.wait()` and fetch only the values that are already available.

Let's look at a simple example.

In [15]:
import numpy as np
@ray.remote
def make_array(n):
    time.sleep(n/10.0)
    return np.random.standard_normal(n)

Now define a task that can add two NumPy arrays together. The arrays need to be the same size, but we'll ignore any checking for this requirement.

In [16]:
@ray.remote
def add_arrays(a1, a2):
    time.sleep(a1.size/10.0)
    return np.add(a1, a2)

Now let's use `ray.wait` and `ray.get`

In [17]:
%%time

array_refs = [make_array.remote(n*10) for n in range(6)]
added_array_refs = [add_arrays.remote(ref, ref) for ref in array_refs]

arrays = []
waiting_refs = list(added_array_refs)  # Assign a working list to the full list of refs
while len(waiting_refs) > 0:           # Loop until all tasks have completed
    # Call ray.wait with:
    #   1. the list of refs we're still waiting to complete,
    #   2. num_returns: return as soon as two of them complete (or fewer if fewer remain),
    #   3. timeout: wait up to 10 seconds before returning whatever is ready.
    ready_refs, remaining_refs = ray.wait(
        waiting_refs, num_returns=min(2, len(waiting_refs)), timeout=10.0)
    new_arrays = ray.get(ready_refs)
    arrays.extend(new_arrays)
    for array in new_arrays:
        print(f'{array.size}: {array}')
    waiting_refs = remaining_refs  # Reset this list; don't include the completed refs in the list again!
    
# print(f"\nall arrays: {arrays}")

0: []
10: [ 0.67924325  0.72046521  1.81187227 -5.96841037 -1.52087523  1.17441729
  0.73717767 -1.16755827 -5.49778349 -2.39335352]
20: [ 1.21720454  1.09355635 -2.32726367 -0.81798154 -1.54310347  2.80597244
 -0.80182366  0.31358211  1.12971558  2.49173343 -1.23055309  0.6234161
  0.38255905 -1.36429114 -2.25437702 -0.32642152  0.22794833 -1.94587712
 -0.44481705 -2.50265168]
30: [-0.35749696  1.41205417 -0.43934293  0.31231693  1.43702512  1.33706947
 -2.62163193 -2.69845543  1.96752567 -2.92100035 -0.52141962 -1.15096703
  0.49679219 -0.33984233 -1.19709676  4.04466364  0.8439049   0.14126709
 -1.67057183  0.05481192 -1.11272189 -3.58555992  0.31233282 -1.90485382
 -2.05219375  0.91991279 -0.59231694 -0.04535516 -1.64007465 -2.35838909]
40: [-2.80149339  0.01118768 -1.458583    1.31211394 -0.33444827 -1.63405632
 -0.49802598 -1.55075499  3.98152556 -0.76316437 -0.54348778 -1.09790428
  0.86483016  1.85707778 -2.26064266  0.02658009  0.72300167 -1.6333332
  0.23418264  0.85920503 -3

In [11]:
ray.shutdown()

### Next Step

Let's move on to Module 3, our final module, starting with the [Ray Datasets lesson](ex_07_ray_data.ipynb).

### Homework 

Read some more [tips and tricks](https://docs.ray.io/en/latest/ray-core/tips-for-first-time.html) in the documentation.
