# Exploring Ray API Calls

To finish our tutorial on the core features of Ray, this lesson explores a few of the other API calls you might find useful, as well as options that can be used with the API calls we've already learned.

> **Tip:** The [Ray Package Reference](https://ray.readthedocs.io/en/latest/package-ref.html) in the [Ray Docs](https://ray.readthedocs.io/en/latest/) is useful for exploring the API features we'll learn.

In [1]:
import ray, time, sys
sys.path.append('..')  
from util.printing import pnd, pd

In [2]:
ray.init(ignore_reinit_error=True)

2020-04-10 12:23:48,634	INFO resource_spec.py:204 -- Starting Ray with 4.25 GiB memory available for workers and up to 2.13 GiB for objects. You can adjust these settings with ray.init(memory=<bytes>, object_store_memory=<bytes>).
2020-04-10 12:23:48,984	INFO services.py:1146 -- View the Ray dashboard at localhost:8265


{'node_ip_address': '192.168.1.149',
 'redis_address': '192.168.1.149:49277',
 'object_store_address': '/tmp/ray/session_2020-04-10_12-23-48_623844_55305/sockets/plasma_store',
 'raylet_socket_name': '/tmp/ray/session_2020-04-10_12-23-48_623844_55305/sockets/raylet',
 'webui_url': 'localhost:8265',
 'session_dir': '/tmp/ray/session_2020-04-10_12-23-48_623844_55305'}

## ray.init()

[`ray.init()`](https://ray.readthedocs.io/en/latest/package-ref.html#ray.init) has a lot of keyword arguments you can pass. Here are some of them.

| Name | Type | Example | Description |
| :--- | :--- | :------ | :---------- |
| `address` | `str` | `address='auto'` | The address of the Ray cluster to connect to. If this address is not provided, then this command will start Redis, a raylet, a plasma store, a plasma manager, and some workers. It will also kill these processes when Python exits. Using `auto` on a node in a Ray cluster connects to the cluster. |
| `num_cpus` | `int` | `num_cpus=4` | Each _raylet_ should be configured with this number of CPU cores. |
| `num_gpus` | `int` | `num_gpus=1` | Each _raylet_ should be configured with this number of GPUs. |
| `memory` | `int` | `memory=1000000000` | The amount of memory (in bytes) that is available for use by workers requesting memory resources. By default, this is automatically set based on the available system memory. |
| `object_store_memory` | `int` | `object_store_memory=1000000000` | The amount of memory (in bytes) for the object store. By default, this is automatically set based on available system memory, subject to a 20GB cap. |
| `log_to_driver` | `bool` | `log_to_driver=True` | If true, then output from all of the worker processes on all nodes will be directed to the driver program. |
| `local_mode` | `bool` | `local_mode=True` | Use `True` if the code should be executed serially. This is useful for debugging. |
| `ignore_reinit_error` | `bool` | `ignore_reinit_error=True` | `True` if we should suppress errors from calling `ray.init()` a second time, as we've done in these notebooks. |
| `include_webui` | `bool` | `include_webui=False` | Boolean flag indicating whether to start the web UI, which displays the status of the Ray cluster. By default, or if this argument is `None`, the UI is started if the relevant dependencies are present. |
| `configure_logging` | `bool` | `configure_logging=True` | `True` if Ray should configure logging here. Otherwise, users can configure logging themselves. |
| `logging_level` | _Flag_ | `logging_level=logging.INFO` | Logging level, defaults to `logging.INFO`. |
| `logging_format` | `str` | `logging_format='...'` | The logging format to use. Defaults to a string containing a timestamp, filename, line number, and message. See `ray_constants.py` in the Ray source code for details. |



## ray.is_initialized()

Is Ray [initialized](https://ray.readthedocs.io/en/latest/package-ref.html#ray.is_initialized)?

In [3]:
ray.is_initialized()

True

## @ray.remote()

We've used [@ray.remote](https://ray.readthedocs.io/en/latest/package-ref.html#ray.remote) a lot. You can pass arguments when using it. Here are some of them.

| Name | Type | Example | Description |
| :--- | :--- | :------ | :---------- |
| `num_cpus` | `int` | `num_cpus=4` | The number of CPU cores to reserve for this task or for the lifetime of the actor. |
| `num_gpus` | `int` | `num_gpus=1` | The number of GPUs to reserve for this task or for the lifetime of the actor. |
| `num_return_vals` | `int` | `num_return_vals=2` | (only for remote functions/tasks, not actors) The number of object IDs returned by the remote function invocation. |
| `max_calls` | `int` | `max_calls=5` | (only for remote functions) The maximum number of times that a given worker can execute the given remote function before it must exit (this can be used to address memory leaks in third-party libraries or to reclaim resources that cannot easily be released, e.g., GPU memory that was acquired by TensorFlow). By default this is infinite. |
| `max_reconstructions` | `int` | `max_reconstructions=4` | (only for actors) The maximum number of times that the actor should be reconstructed when it dies unexpectedly. The minimum valid value is 0 (the default), which indicates that the actor doesn’t need to be reconstructed. The maximum valid value is `ray.ray_constants.INFINITE_RECONSTRUCTION`. |
| `max_retries` | `int` | `max_retries=5` | (only for remote functions) The maximum number of times that the remote function should be rerun when the worker process executing it crashes unexpectedly. The minimum valid value is 0, the default is 4, and the maximum valid value is `ray.ray_constants.INFINITE_RECONSTRUCTION`. |

What is `ray.ray_constants.INFINITE_RECONSTRUCTION`?

In [5]:
ray.ray_constants.INFINITE_RECONSTRUCTION

1073741824

## @ray.method()

There is a similar [`ray.method()`](https://ray.readthedocs.io/en/latest/package-ref.html#ray.method) used to annotate methods in actors. Its only purpose is to provide the same optional arguments available for `ray.remote()`, when needed. Otherwise, methods in an actor are called just as we described in lesson 4.

## ray.get()

We used [`ray.get`](https://ray.readthedocs.io/en/latest/package-ref.html#ray.get) a lot to retrieve objects, and we used actor methods to retrieve state from an actor. You can also put objects into the object store explicitly with [`ray.put`](https://ray.readthedocs.io/en/latest/package-ref.html#ray.put), as shown in the following example:

In [8]:
# Use a name other than `id` to avoid shadowing the Python built-in.
obj_id = ray.put("Hello World!")
print(f'Object returned: {ray.get(obj_id)}')

Object returned: Hello World!


`ray.put` also accepts an optional flag, `weakref=True` (the default is `False`). If set, Ray is allowed to evict the object while a reference to the returned ID still exists. This is useful if you are putting a lot of objects into the object store and many of them might not be needed in the future, because it allows Ray to reclaim memory more aggressively.

## ray.kill(actor)

Sometimes it might be necessary to terminate an actor. Use [`ray.kill(actor)`](https://ray.readthedocs.io/en/latest/package-ref.html#ray.kill) for this purpose.
It will interrupt any running tasks on the actor, causing them to fail immediately. Any [`atexit`](https://docs.python.org/3/library/atexit.html) handlers installed in the actor process will still be run.

If you want to kill the actor but let pending tasks finish, you can call `actor.__ray_terminate__.remote()` instead to queue a termination task.

If this actor is reconstructable, reconstruction will be attempted.

## Fetching Information

Many methods return information:

| Method | Brief Description |
| :----- | :---------------- |
| [`ray.get_gpu_ids()`](https://ray.readthedocs.io/en/latest/package-ref.html#ray.get_gpu_ids) | IDs of the GPUs available to the _worker_ |
| [`ray.get_resource_ids()`](https://ray.readthedocs.io/en/latest/package-ref.html#ray.get_resource_ids) | Resources available to the _worker_ |
| [`ray.get_webui_url()`](https://ray.readthedocs.io/en/latest/package-ref.html#ray.get_webui_url) | Ray Dashboard URL |
| [`ray.nodes()`](https://ray.readthedocs.io/en/latest/package-ref.html#ray.nodes) | Cluster nodes |
| [`ray.objects()`](https://ray.readthedocs.io/en/latest/package-ref.html#ray.objects) | Objects currently in the Object Store |
| [`ray.cluster_resources()`](https://ray.readthedocs.io/en/latest/package-ref.html#ray.cluster_resources) | All the available resources, used or not |
| [`ray.available_resources()`](https://ray.readthedocs.io/en/latest/package-ref.html#ray.available_resources) | Resources not in use |
| [`ray.errors()`](https://ray.readthedocs.io/en/latest/package-ref.html#ray.errors) | What errors have occurred for this job (use `all_jobs=True` for all) |

In [23]:
print(f"""
ray.get_gpu_ids():          {ray.get_gpu_ids()}
ray.get_resource_ids():     {ray.get_resource_ids()}
ray.get_webui_url():        {ray.get_webui_url()}
ray.nodes():                {ray.nodes()}
ray.objects():              {ray.objects()}
ray.cluster_resources():    {ray.cluster_resources()}
ray.available_resources():  {ray.available_resources()}
ray.errors():               {ray.errors(all_jobs=True)}
""")


ray.get_gpu_ids():          []
ray.get_resource_ids():     {}
ray.get_webui_url():        localhost:8265
ray.nodes():                [{'NodeID': 'eaff91c6b430b8ae6d1c92703c4b0095f0523b78', 'Alive': True, 'NodeManagerAddress': '192.168.1.149', 'NodeManagerHostname': 'DWAnyscaleMBP.local', 'NodeManagerPort': 53569, 'ObjectManagerPort': 53600, 'ObjectStoreSocketName': '/tmp/ray/session_2020-04-10_12-23-48_623844_55305/sockets/plasma_store', 'RayletSocketName': '/tmp/ray/session_2020-04-10_12-23-48_623844_55305/sockets/raylet', 'Resources': {'object_store_memory': 30.0, 'CPU': 8.0, 'node:192.168.1.149': 1.0, 'memory': 87.0}, 'alive': True}]
ray.objects():              {ObjectID(ffffffffffffffffffffffff0100008801000000): {'DataSize': 0, 'Manager': b'\xea\xff\x91\xc6\xb40\xb8\xaem\x1c\x92p<K\x00\x95\xf0R;x'}}
ray.cluster_resources():    {'object_store_memory': 30.0, 'CPU': 8.0, 'node:192.168.1.149': 1.0, 'memory': 87.0}
ray.available_resources():  {'memory': 87.0, 'node:192.168.1.149': 1.0,

Recall that we used `ray.nodes()[0]['Resources']['CPU']` in the second lesson to determine the number of CPU cores on our machines:

In [16]:
import json
ray.nodes()[0]['Resources']['CPU']

8.0

## ray.shutdown()

You don't always need to call it explicitly, but [`ray.shutdown()`](https://ray.readthedocs.io/en/latest/package-ref.html#ray.shutdown) will disconnect the worker, and terminate processes started by `ray.init()`.

It runs automatically when a Python process that uses Ray exits. It is okay to run this twice in a row. The primary use case for this function is to clean up state between tests.

Note that this will clear any remote function definitions, actor definitions, and existing actors, so if you wish to use any previously defined remote functions or actors after calling `ray.shutdown()`, then you need to redefine them. If they were defined in an imported module, then you will need to reload the module.

There is one optional parameter, `exiting_interpreter=True|False`. `True` is passed when `ray.shutdown()` is called by the `atexit` hook and `False` otherwise. If Ray is exiting the interpreter, it waits a little while to print any extra error messages.

## ray.timeline()

Sometimes you need to find where the bottlenecks are. [`ray.timeline()`](https://ray.readthedocs.io/en/latest/package-ref.html#ray.timeline) helps. It returns a list of profiling events that can be viewed as a timeline. The easiest way to use the results is to dump the data to a JSON file by passing in a `filename=...` argument. Alternatively, you can write the returned object to a file yourself with `json.dump()`. In either case, open chrome://tracing in a Chrome browser window (only Chrome works) and load the dumped file. Try examining the following file (the output will be boring at the moment):

In [21]:
ray.timeline(filename='../tmp/timeline-example.json')

## ray.object_transfer_timeline()

The related [`ray.object_transfer_timeline()`](https://ray.readthedocs.io/en/latest/package-ref.html#ray.object_transfer_timeline) returns events for objects moved between nodes.