# Ray - a Python framework for distributed computation

Definition of a distributed system: 

_"A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable."_ -- Leslie Lamport

Distributed systems are necessary for many reasons: avoiding a single point of failure, distributing work to the departments responsible for it, and improving scalability.

Rather than a theoretical explanation, we will install and work with a real-world distributed framework called Ray. Ray is a Python-specific framework with many high-level components that make it easier to serve APIs, train models, share data, etc. We will look at one of these components before digging a bit deeper into a lower-level component.

### Ray Serve
Ray Serve is a component that makes it easy to spread the serving of an API across several machines. Let's jump into code.

In [None]:
%%writefile simple_api.py

from fastapi import FastAPI
from typing import Dict

app = FastAPI()

@app.get("/status")
def status() -> Dict[str, str]:
    """Simple health check endpoint."""
    return {"status": "ok"}


@app.get("/compute")
def fibonacci(n: int):
    """Compute Fibonacci sequence up to n (inclusive)."""
    if n <= 0:
        return []
    fib = [0, 1]
    while fib[-1] + fib[-2] <= n:
        fib.append(fib[-1] + fib[-2])
    return fib

# fastapi run simple_api.py
# http://localhost:8000/compute?n=10

Normally you run the code above as:

```bash
fastapi run simple_api.py
```

This will run the API on a single machine. 

However, as your startup grows, how do you make sure you can continue to serve clients?
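Before load testing, it can help to confirm what `/compute` should return, independent of any server. The sketch below copies the handler's loop into a plain function so the expected payloads can be checked directly. The name `fib_up_to` is introduced here for illustration and is not part of `simple_api.py`.

```python
from typing import List


def fib_up_to(n: int) -> List[int]:
    """Return the Fibonacci numbers <= n, mirroring the /compute handler's loop."""
    if n <= 0:
        return []
    fib = [0, 1]
    # Keep appending while the next Fibonacci number does not exceed n.
    while fib[-1] + fib[-2] <= n:
        fib.append(fib[-1] + fib[-2])
    return fib


if __name__ == "__main__":
    print(fib_up_to(10))   # [0, 1, 1, 2, 3, 5, 8]
    print(fib_up_to(100))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
```

The work per request is tiny (ten additions for `n=100`), so a load test of this endpoint mostly measures framework and network overhead rather than computation.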

### Is your current setup going to scale when you go viral? 
Test it with https://locust.io/

In [None]:
!pip install locust

Create virtual users who will hit your API:

In [None]:
%%writefile locustfile.py

from locust import HttpUser, TaskSet, task, between

class APIUser(HttpUser):
    wait_time = between(1, 3)
    host = "http://127.0.0.1:8000"

    @task
    class UserTasks(TaskSet):
        @task
        def get_status(self):
            self.client.get("/status")  # the route is defined without a trailing slash

        @task
        def do_compute(self):
            self.client.get("/compute?n=100") 

Run it with `locust` at the command line, from the directory that contains `locustfile.py`.
Locust prints the address of a web page that lets you control the test.

#### Things to note
1. Any failures on the Locust dashboard?
2. Monitor the logs of your application
3. What is the median execution time?
4. **What is the tail execution time** (for example, the 95th or 99th percentile)?
5. What is the RPS (requests per second)?
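To make the last three questions concrete, here is a minimal sketch of how median, tail percentiles, and RPS can be computed from raw response times. The helper names `percentile` and `summarize` and the sample numbers are made up for illustration; Locust reports its own statistics, and this does not claim to reproduce its exact method.

```python
import math
from statistics import median
from typing import Dict, List


def percentile(sorted_values: List[float], pct: int) -> float:
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = math.ceil(pct * len(sorted_values) / 100)
    return sorted_values[max(rank, 1) - 1]


def summarize(latencies_ms: List[float], duration_s: float) -> Dict[str, float]:
    """Median, tail percentiles, and requests per second for one test run."""
    ordered = sorted(latencies_ms)
    return {
        "median_ms": median(ordered),
        "p95_ms": percentile(ordered, 95),
        "p99_ms": percentile(ordered, 99),
        "rps": len(ordered) / duration_s,
    }


if __name__ == "__main__":
    # Made-up sample: 18 fast requests and 2 slow ones, collected over 4 seconds.
    sample = [12.0] * 18 + [250.0, 900.0]
    print(summarize(sample, duration_s=4.0))
```

In this sample the median is 12 ms, which looks healthy, while the 99th percentile is 900 ms. That gap is why the tail deserves more attention than the median: a small fraction of users still waits almost a second.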

### Let's try to scale this across several machines

If we are not on the same network, use Tailscale to hop on the same VPN.

#### Install Ray

In [None]:
!pip install "ray[all]"

#### Deploy FastAPI on a cluster (via Ray)

In [None]:
%%writefile simple_api_ray.py

from fastapi import FastAPI
from typing import Dict
from ray import serve
#import ray

#ray.init(address="192.168.12.239:10001") 

app = FastAPI()

@app.get("/status")
def status() -> Dict[str, str]:
    """Simple health check endpoint."""
    return {"status": "ok"}


@app.get("/compute")
def fibonacci(n: int):
    """Compute Fibonacci sequence up to n (inclusive)."""
    if n <= 0:
        return []
    fib = [0, 1]
    while fib[-1] + fib[-2] <= n:
        fib.append(fib[-1] + fib[-2])
    return fib

@serve.deployment
@serve.ingress(app)
class FastAPIWrapper:
    pass

serve.run(FastAPIWrapper.bind(), route_prefix="/")

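# Assumption: a Ray cluster is already running (e.g. started with `ray start --head`);
# otherwise Serve may shut down as soon as this script exits.
# python simple_api_ray.py
# http://localhost:8000/compute?n=10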

In [None]:
%%writefile simple_api_ray2.py

import ray
from ray import serve
from fastapi import FastAPI

ray.init(address="auto")  # Connect to the existing Ray cluster

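# Keep Serve running in the background and listen on all interfaces,
# so that other machines can reach the API via the head node's IP.
serve.start(detached=True, http_options={"host": "0.0.0.0"})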

app = FastAPI()

@app.get("/")
def read_root():
    return {"message": "Hello from Ray Serve"}

# Create a deployment with multiple replicas
@serve.deployment(num_replicas=2)
@serve.ingress(app)
class FastAPIDeployment:
    pass

# `.deploy()` belongs to the older Serve API; Ray 2.x deploys with serve.run().
serve.run(FastAPIDeployment.bind())

print("FastAPI app is deployed. Try accessing it at the head node's IP.")
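`num_replicas=2` asks Serve to run two copies of the deployment and spread incoming requests across them. The toy below illustrates only that idea in plain Python, using a round-robin dispatcher. It is not how Ray Serve routes requests internally, and the names `Replica` and `RoundRobinRouter` are hypothetical.

```python
from itertools import cycle
from typing import Dict, List


class Replica:
    """Stand-in for one copy of a deployment; counts the requests it handles."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.handled = 0

    def handle(self, path: str) -> Dict[str, str]:
        self.handled += 1
        return {"replica": self.name, "path": path}


class RoundRobinRouter:
    """Hands each request to the next replica in turn."""

    def __init__(self, replicas: List[Replica]) -> None:
        self._next = cycle(replicas)

    def route(self, path: str) -> Dict[str, str]:
        return next(self._next).handle(path)


if __name__ == "__main__":
    replicas = [Replica("replica-0"), Replica("replica-1")]
    router = RoundRobinRouter(replicas)
    for _ in range(6):
        print(router.route("/"))
    print({r.name: r.handled for r in replicas})  # {'replica-0': 3, 'replica-1': 3}
```

With two replicas each one handles half of the requests, which is the effect to look for when the Locust test is repeated against the Ray deployment.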
