# Getting Started With Anyscale
This notebook takes you from a Ray script to an Anyscale cluster.  It is a very fast but complete tour from development to production jobs.

To keep the material short, this notebook contains neither machine learning code nor serious workloads.  It does, however, show how Ray works, how Anyscale scales, and how to work in the cloud.

Find this notebook on [GitHub](https://github.com/anyscale/getting-started-webinar).

In [None]:
!pip install "ray[default]"==1.10
!pip install anyscale

In [None]:
import ray
import time
import anyscale

You can shut down Ray or disconnect from it with this command:

In [None]:
ray.shutdown()

Let's start Ray and connect to it.
We're going to start by running Ray on the same machine as this notebook.

In [None]:
ray.init()

# Task, Actor and Entry Point
The following is a complete Ray program that we'll use to take us from Ray to Anyscale.

In [None]:
def local_func(i):
    time.sleep(0.1)
    return i*i

@ray.remote
def my_remote_task(i):
    return f"The square of {i} is {local_func(i)}"

@ray.remote
class Squarer:
    def squareme(self, i):
        return local_func(i)
    def labelme(self, i):
        ref = my_remote_task.remote(i)
        return ray.get(ref)


The next couple of cells play with the above code and show how to interact with code running in Ray.

In [None]:
# local function (not remote)
print(local_func(1001))

In [None]:
# what will this do?
ref = my_remote_task.remote(123)
ref

In [None]:
ray.get(ref)
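
For readers who have not met futures before, the two cells above follow a familiar pattern: `.remote()` returns immediately with a reference, and `ray.get` blocks until the value is ready. The sketch below reproduces that shape with the standard library's `concurrent.futures`. It is only an analogy and not how Ray is implemented; `local_func` is copied from the earlier cell, `my_task` is an undecorated stand-in for `my_remote_task`, and the thread pool stands in for Ray workers.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def local_func(i):
    time.sleep(0.1)
    return i * i


def my_task(i):
    # Same body as my_remote_task, minus the @ray.remote decorator.
    return f"The square of {i} is {local_func(i)}"


with ThreadPoolExecutor(max_workers=4) as pool:
    # submit() returns a Future right away, much like .remote() returns a reference.
    future = pool.submit(my_task, 123)
    # result() blocks until the work finishes, much like ray.get(ref).
    label = future.result()

print(label)
```

The main difference is that a Ray reference can point at a result computed on another machine, while the thread pool stays inside one process.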

In [None]:
# this code is the entry point.  If I were to run
# python this_script.py, then this would be executed
if __name__ == "__main__":
    #argument = int(sys.argv[1])
    argument = 15
    n = local_func(argument)
    actor = Squarer.remote()
    ref = actor.labelme.remote(n)
    print(ray.get(ref))
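
The commented-out `sys.argv` line suggests reading the input from the command line. One way to do that is sketched below with `argparse`. The `parse_argument` helper and its default of 15 are illustrative assumptions, and the Ray calls are left out so that the parsing can be tried on its own.

```python
import argparse


def parse_argument(argv=None):
    # Hypothetical helper: read one optional integer, defaulting to 15
    # to match the hard-coded value in the cell above.
    parser = argparse.ArgumentParser(description="Square a number with Ray")
    parser.add_argument("argument", type=int, nargs="?", default=15)
    return parser.parse_args(argv).argument


# Passing an explicit list avoids picking up the notebook kernel's own argv.
print(parse_argument([]))
print(parse_argument(["7"]))
```

In a script, `argument = parse_argument()` would take the place of the hard-coded `argument = 15`.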

# Ray Resources
How can we take a look at what Ray is doing?  The first time through this section, we'll be using the local Ray instance.  These cells are repeated later on after connecting to Anyscale.

In [None]:
ray.cluster_resources()

In [None]:
ray.shutdown()

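**Ray does not care about reality.** `num_cpus` is a logical count used for scheduling, so Ray accepts a value far larger than the number of cores on this machine.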

In [None]:
ray.init(num_cpus=1000)

In [None]:
ray.cluster_resources()

Use **num_cpus** to control how many logical CPUs each actor or task reserves.
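
The bookkeeping behind fractional requests is plain arithmetic: each actor reserves its `num_cpus` share of the logical CPU count. The sketch below works through the numbers used in this section. `cpus_reserved` and `max_actors` are hypothetical helpers, not Ray functions, and `Decimal` is used only to avoid floating-point rounding.

```python
from decimal import Decimal


def cpus_reserved(num_actors, cpus_per_actor):
    # Total logical CPUs held by num_actors actors of equal size.
    return Decimal(str(cpus_per_actor)) * num_actors


def max_actors(total_cpus, cpus_per_actor):
    # How many equally sized actors fit into the logical CPU count.
    return int(Decimal(str(total_cpus)) // Decimal(str(cpus_per_actor)))


print(cpus_reserved(100, 0.01))  # 100 actors at 0.01 CPU each
print(max_actors(1000, 0.01))    # capacity of a 1000-CPU logical cluster
```

By this count, 100 actors at `num_cpus=0.01` reserve a single logical CPU between them.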

In [None]:
x = [Squarer.options(num_cpus=0.01).remote() for _ in range(100)]

In [None]:
y = [z.squareme.remote(i) for i,z in enumerate(x)]
ray.get(y)

# Connecting to Anyscale

In [None]:
import ray
# disconnect from the local Ray instance before connecting to Anyscale
ray.shutdown()

You need credentials to connect to Anyscale.  Since this notebook is running on my laptop, I'm using the credentials I fetched from the [Anyscale UI](https://console.anyscale.com).

In [None]:
# Ray Client
import ray
#ray.init("anyscale://")
#ray.init("anyscale://my_project/")
#ray.init("anyscale://my_cluster")
ctx = ray.init("anyscale://getting_started/my_cluster",
              runtime_env={"working_dir" : "."},
              cluster_env="demo-with-aws:3",
              cluster_compute="demos-s3-access",
              )
ctx

In [None]:
# ask the autoscaler to scale the cluster up to 100 CPUs
import ray.autoscaler.sdk
ray.autoscaler.sdk.request_resources(num_cpus=100)

In [None]:
ray.cluster_resources()
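
A rough way to reason about what `request_resources(num_cpus=100)` asks for is to divide the request by the size of one worker node. The sketch below does that arithmetic. The node sizes are made-up examples rather than values from this cluster's compute configuration, `nodes_needed` is a hypothetical helper, and the estimate ignores the head node and any resources already in use.

```python
import math


def nodes_needed(requested_cpus, cpus_per_node):
    # Round up: a partially used node still has to be launched.
    return math.ceil(requested_cpus / cpus_per_node)


# Hypothetical node sizes, in CPUs per worker node.
for cpus_per_node in (8, 16, 64):
    print(cpus_per_node, nodes_needed(100, cpus_per_node))
```

Comparing such an estimate with the output of `ray.cluster_resources()` is a quick sanity check that the cluster has grown as expected.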

The following cell is repeated for convenience:

In [None]:
def local_func(i):
    import time
    time.sleep(0.1)
    return i*i

@ray.remote
def my_remote_task(i):
    return f"The square of {i} is {local_func(i)}"

@ray.remote
class Squarer:
    def squareme(self, i):
        return local_func(i)
    def labelme(self, i):
        ref = my_remote_task.remote(i)
        return ray.get(ref)


In [None]:
x = [Squarer.options(num_cpus=0.5).remote() for _ in range(10)]


In [None]:
y = [z.squareme.remote(i) for i,z in enumerate(x)]
ray.get(y)

# Long Term Storage
Your first experience with Anyscale will be the fully-managed version.  To use long-term storage, you must allow the Ray cluster running in Anyscale's cloud to access that storage.

The docs website has [detailed instructions](https://docs.anyscale.com/user-guide/configure/access-resources-from-cloud/overview) for setting up a role that grants this access and for using it.

```
ctx = ray.init("anyscale://getting_started/my_cluster2",
              runtime_env={"working_dir" : "."},
              cluster_env="demo-with-aws:3",
              cluster_compute="demos-s3-access",
              )
```

The code below uses Ray Datasets to write data to an S3 bucket and read it back.



In [None]:
import ray
@ray.remote
def write_generated_data(path):
    ds = ray.data.range(100000)

    ds.write_parquet(path)
    return "Done"

@ray.remote
def read_data(path):
    # Read the Parquet files back and count the rows to confirm the round trip.
    ds = ray.data.read_parquet(path)
    return f"Read {ds.count()} rows"

ref = write_generated_data.remote("s3://dir-temp/notebook-data.parquet")
ray.get(ref)

In [None]:
read = read_data.remote("s3://dir-temp/notebook-data.parquet")

ray.get(read)