# Ray Crash Course - Ray Clusters and the Ray CLI

We used the `../tools/start-ray.sh` script to run a Ray "cluster" on our local machines or to detect that one is running already, such as when you go through these lessons using the Anyscale hosted platform. This script uses the Ray CLI command `ray` to perform these tasks. 

Let's briefly explore a few of the Ray CLI subcommands. For full details, see the Ray documentation for the [Ray CLI](https://docs.ray.io/en/latest/package-ref.html#the-ray-command-line-api). See also the [Cluster Setup](https://docs.ray.io/en/latest/cluster-index.html) section.

> **Tip:** If any of the CLI commands used here print a lot of output, right-click on the output and select _Enable Scrolling for Outputs_.

> **Note:** The Anyscale hosted platform has its own CLI command, `anyscale`, which integrates the `ray` CLI and provides other capabilities for managing and running Ray projects and sessions, including automated cluster integration, synchronization of code to your local development environment, etc. Further information on this service will be available soon. [Contact us](mailto:academy@anyscale.com) for details.

## ray --help

The typical `help` information is available with `--help` or with no arguments:

In [7]:
!ray --help

Usage: ray [OPTIONS] COMMAND [ARGS]...

Options:
  --logging-level TEXT   The logging level threshold, choices=['debug',
                         default='info'

  --logging-format TEXT  The logging format. default='%(asctime)s
                         %(levelname)s %(filename)s:%(lineno)s -- %(message)s'

  --help                 Show this message and exit.

Commands:
  attach
  clusterbenchmark
  create-or-update  Create or update a Ray cluster.
  dashboard
  down              Tear down the Ray cluster.
  exec
  exec-cmd
  get-head-ip
  get-worker-ips
  get_head_ip
  globalgc
  kill-random-node  Kills a random Ray node.
  memory
  microbenchmark
  monitor           Runs `tail -n [lines] -f...
  project           [Experimental] Commands working with ray project
  rsync-down
  rsync-up
  rsync_down
  rsync_up
  session           [Experimental] Commands working with sessions, which
                    are...

  stack
  start
  stat
  stop
  submit            Uploads and runs a script on

Some of these commands are aliases, e.g., `down` and `teardown`, `get-head-ip` and `get_head_ip`, etc. `kill-random-node` looks strange, but it is useful for [Chaos Engineering](https://en.wikipedia.org/wiki/Chaos_engineering) purposes. 

For more details on a particular command, use `ray <command> --help`:

In [9]:
!ray start --help

Usage: ray start [OPTIONS]

Options:
  --node-ip-address TEXT          the IP address of this node
  --redis-address TEXT            same as --address
  --address TEXT                  the address to use for Ray
  --redis-port TEXT               the port to use for starting Redis
  --num-redis-shards INTEGER      the number of additional Redis shards to use
                                  in addition to the primary Redis shard

  --redis-max-clients INTEGER     If provided, attempt to configure Redis with
                                  this maximum number of clients.

  --redis-password TEXT           If provided, secure Redis ports with this
                                  password

  --redis-shard-ports TEXT        the port to use for the Redis shards other
                                  than the primary Redis shard

  --object-manager-port INTEGER   the port to use for starting the object
                                  manager

  --node-manager-port INTEGER     the port

## ray stat

Your first question might be: is Ray already running on this node as part of a cluster? This is what `start-ray.sh` has to determine, and the `ray stat` command answers it.

In [1]:
!ray stat

2020-05-23 07:22:42,785	INFO scripts.py:947 -- Connecting to Ray instance at 192.168.1.149:45926.
2020-05-23 07:22:42,848	INFO scripts.py:957 -- Querying raylet 192.168.1.149:57239
workers_stats {
  pid: 12113
  core_worker_stats {
    current_task_func_desc: "{type=EmptyFunctionDescriptor}"
    ip_address: "192.168.1.149"
    port: 57447
    actor_id: "\377\377\377\377\377\377"
    num_in_plasma: 1
  }
}
workers_stats {
  pid: 12115
  core_worker_stats {
    current_task_func_desc: "{type=EmptyFunctionDescriptor}"
    ip_address: "192.168.1.149"
    port: 57448
    actor_id: "\377\377\377\377\377\377"
    num_in_plasma: 1
  }
}
workers_stats {
  pid: 12114
  core_worker_stats {
    current_task_func_desc: "{type=EmptyFunctionDescriptor}"
    ip_address: "192.168.1.149"
    port: 57446
    actor_id: "\377\377\377\377\377\377"
    num_in_plasma: 1
  }
}
workers_stats {
  pid: 8863
  is_driver: true
  core_worker_stats {
    current_task_func_desc: "{type=EmptyFunctionDescriptor}"
    ip

If Ray is running on this node, the output can be very long. It shows the status of the nodes, the running worker processes and various other Python processes, and the [Redis](https://redis.io/) processes, which are used as part of the distributed object store for Ray. We discuss these services in greater detail in the [Advanced Ray tutorial](../advanced-ray/00-Advanced-Ray-Overview.ipynb).
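Because the output is long, it can help to summarize it programmatically. The following is a minimal sketch, not part of Ray, that assumes the text layout shown in the excerpt above (one `workers_stats { ... }` block per process, with an `is_driver: true` line for drivers). The helper name `summarize_workers` is hypothetical, and the sample string is an abbreviated copy of the excerpt.

```python
import re

def summarize_workers(stat_output):
    """Return (worker_pids, driver_pids) found in `ray stat` text output."""
    workers, drivers = [], []
    # Assumption: each process is reported in its own `workers_stats { ... }`
    # block, so splitting on the block header yields one chunk per process.
    for chunk in stat_output.split("workers_stats {")[1:]:
        pid_match = re.search(r"\bpid:\s*(\d+)", chunk)
        if pid_match is None:
            continue  # Truncated or unexpected block; skip it.
        pid = int(pid_match.group(1))
        if re.search(r"\bis_driver:\s*true", chunk):
            drivers.append(pid)
        else:
            workers.append(pid)
    return workers, drivers

# Abbreviated sample taken from the excerpt above.
sample = """workers_stats {
  pid: 12113
  core_worker_stats {
    ip_address: "192.168.1.149"
    port: 57447
  }
}
workers_stats {
  pid: 8863
  is_driver: true
  core_worker_stats {
    current_task_func_desc: "{type=EmptyFunctionDescriptor}"
  }
}
"""

print(summarize_workers(sample))  # ([12113], [8863])
```

Other Ray versions may format this output differently, so treat the regular expressions as a starting point rather than a stable interface.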

If there are multiple Ray instances running on this node, you'll have to specify the correct address. Run `ray stat` to see a list of those addresses, then pick the correct one:

```shell
ray stat --address IP:PORT
```

`ray stat` returns the exit code `0` if Ray is running locally or a nonzero value if it isn't. `start-ray.sh` exploits this feature to decide when to start Ray:

```shell
ray stat > /dev/null 2>&1 || ray start --head
```

All output of `ray stat` is sent to `/dev/null`, which throws it away. If the exit code is nonzero, the command after the `||`, `ray start --head`, is executed.
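The same check-then-start logic can be sketched in Python with the standard `subprocess` module. This is only an illustration of the exit-code idiom, not how `start-ray.sh` is implemented. The helper name `ensure_running` is hypothetical, and the usage shown in the final comment assumes the `ray` command is on your `PATH` and that `ray stat` behaves as described above.

```python
import subprocess

def ensure_running(check_cmd, start_cmd):
    """Run start_cmd only if check_cmd exits with a nonzero status.

    Returns True if start_cmd was run, False if the check succeeded.
    """
    # Discard stdout and stderr, like `> /dev/null 2>&1` in the shell.
    check = subprocess.run(
        check_cmd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    if check.returncode == 0:
        return False  # The check passed, so there is nothing to start.
    # Plays the role of the command after `||`.
    # check=True raises CalledProcessError if the start command itself fails.
    subprocess.run(start_cmd, check=True)
    return True

# Hypothetical usage, mirroring the shell one-liner:
#   ensure_running(["ray", "stat"], ["ray", "start", "--head"])
```

Passing the commands as lists avoids shell quoting issues and makes the helper easy to try out with harmless commands before pointing it at `ray`.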

## ray start and ray stop

As shown in the previous cell, `ray start` is used to start the Ray processes on a node. The `--head` flag marks this node as the head (master) node, which is used to bootstrap the cluster.

When you want to stop Ray running on a particular node, use `ray stop`.

> **WARNING:** Running `ray stop` will impact any Ray applications currently running on this node, including all other lesson notebooks currently running Ray. If you intend to stop Ray, first save your work, close those notebooks, and stop their processes using the _Running_ tab on the left of the Jupyter Lab UI. The tab might be labelled with a white square surrounded by a dark circle instead of _Running_.

We won't actually run `ray start` or `ray stop` in what follows, to avoid causing problems for other lessons. We'll just describe what they do and the output they print.

When you run `ray start --head` you see output like the following (unless an error occurs):

```shell
$ ray start --head
2020-05-23 07:47:47,469	INFO scripts.py:357 -- Using IP address 192.168.1.149 for this node.
2020-05-23 07:47:47,489	INFO resource_spec.py:212 -- Starting Ray with 4.3 GiB memory available for workers and up to 2.17 GiB for objects. You can adjust these settings with ray.init(memory=<bytes>, object_store_memory=<bytes>).
2020-05-23 07:47:47,865	INFO services.py:1170 -- View the Ray dashboard at localhost:8265
2020-05-23 07:47:47,912	INFO scripts.py:387 -- 
Started Ray on this node. You can add additional nodes to the cluster by calling

    ray start --address='192.168.1.149:10552' --redis-password='5241590000000000'

from the node you wish to add. You can connect a driver to the cluster from Python by running

    import ray
    ray.init(address='auto', redis_password='5241590000000000')

If you have trouble connecting from a different machine, check that your firewall is configured properly. If you wish to terminate the processes that have been started, run

    ray stop
```

(You'll see a different IP address.)

The output includes a line like this:

```shell
ray start --address='192.168.1.149:10552' --redis-password='5241590000000000'
```

This is the `ray start` command you would use on the other machines where you want to start Ray and have them join the same cluster.
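If you automate cluster setup, you might want to pull the address and password out of the head node's output rather than copy them by hand. The following is a rough sketch that assumes the join command is printed exactly in the form shown above; the pattern and the helper name `parse_join_info` are illustrative, not part of Ray.

```python
import re

# Pattern for the join command as it appears in the sample output above.
# Assumption: other Ray versions may word or format this line differently.
JOIN_RE = re.compile(
    r"ray start --address='(?P<address>[^']+)'"
    r" --redis-password='(?P<password>[^']+)'"
)

def parse_join_info(head_output):
    """Return (address, redis_password) from `ray start --head` output, or None."""
    match = JOIN_RE.search(head_output)
    if match is None:
        return None
    return match.group("address"), match.group("password")

# Excerpt of the sample output shown earlier.
sample = """Started Ray on this node. You can add additional nodes to the cluster by calling

    ray start --address='192.168.1.149:10552' --redis-password='5241590000000000'

from the node you wish to add."""

print(parse_join_info(sample))  # ('192.168.1.149:10552', '5241590000000000')
```

Returning `None` when the pattern is not found lets the caller decide how to handle unexpected output, for example when `ray start --head` reported an error instead.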

Note also the code to add to your application so that it connects to this cluster:

```python
import ray
ray.init(address='auto', redis_password='5241590000000000')
```

The `redis_password` shown is the default value, which is why we didn't specify this argument when we called `ray.init()` in other notebooks.

You can actually call `ray start --head` multiple times on the same node to create separate clusters. This may appear at first to be a bug, but it is actually useful for testing purposes.

The `ray stop` command usually prints no output. Add the `--verbose` flag for details. 

> **Warning:** `ray stop` stops all running Ray processes on this node. There is no command line option to specify which one to stop.