# Load test
Simulate a large number of clients accessing your web site or service

### A sample web service
Ray Serve is a library that makes it easy to spread the serving of an API across several machines. Let's jump into the code, starting with a plain FastAPI service.

In [3]:
%%writefile simple_api.py

from fastapi import FastAPI
from typing import Dict

app = FastAPI()

@app.get("/status")
def status() -> Dict[str, str]:
    """Simple health check endpoint."""
    return {"status": "ok"}


@app.get("/compute")
def fibonacci(n: int):
    """Compute Fibonacci sequence up to n (inclusive)."""
    if n < 0:
        return []
    if n == 0:
        # 0 is itself a Fibonacci number, so "inclusive" should return it.
        return [0]
    fib = [0, 1]
    while fib[-1] + fib[-2] <= n:
        fib.append(fib[-1] + fib[-2])
    return fib

# fastapi run simple_api.py
# http://localhost:8000/compute?n=10

Overwriting simple_api.py
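
As a quick sanity check of what `/compute` should return, the sketch below mirrors the loop from the handler as a plain function, so it can be run without starting the server. The name `fib_up_to` is hypothetical and only used here; the sketch covers positive `n` only.

```python
def fib_up_to(n: int) -> list:
    """Mirror of the loop in the /compute handler, for n >= 1 only."""
    if n < 1:
        raise ValueError("this sketch only covers n >= 1")
    fib = [0, 1]
    # Keep appending while the next Fibonacci number does not exceed n.
    while fib[-1] + fib[-2] <= n:
        fib.append(fib[-1] + fib[-2])
    return fib


# Same input as http://localhost:8000/compute?n=10
print(fib_up_to(10))  # [0, 1, 1, 2, 3, 5, 8]
```

The response for `n=10` stops at 8 because the next Fibonacci number, 13, exceeds the limit.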


Normally you run the code above as:

```bash
fastapi run simple_api.py
```

This will run the API on a single machine. 

However, as your startup grows, how do you make sure you can continue to serve clients?

### Is your current setup going to scale when you go viral? 
Test it with https://locust.io/

In [None]:
!pip install locust

Create virtual users who will hit your API

In [2]:
%%writefile locustfile.py

from locust import HttpUser, TaskSet, task, between

class APIUser(HttpUser):
    wait_time = between(1, 3)
    host = "http://127.0.0.1:8000"

    @task
    class UserTasks(TaskSet):
        @task
        def get_status(self):
            # No trailing slash, to match the "/status" route exactly.
            self.client.get("/status")

        @task
        def do_compute(self):
            self.client.get("/compute?n=100") 

Writing locustfile.py
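
To make the idea of a "virtual user" concrete, here is a minimal standard-library sketch of the same pattern: several threads, each issuing requests in sequence and recording the status code and latency. This is not how Locust is implemented, and the names `StatusHandler`, `virtual_user`, and `run_load` are hypothetical. A tiny stand-in server replaces the real API so the sketch runs on its own.

```python
import json
import threading
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class StatusHandler(BaseHTTPRequestHandler):
    """Stand-in server so the sketch runs without the real API."""

    def do_GET(self):
        body = json.dumps({"status": "ok"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the output quiet


def virtual_user(base_url, n_requests):
    """One simulated client: sequential requests, returns (status, seconds) pairs."""
    results = []
    for _ in range(n_requests):
        start = time.perf_counter()
        with urllib.request.urlopen(base_url + "/status", timeout=5) as resp:
            resp.read()
            status = resp.status
        results.append((status, time.perf_counter() - start))
    return results


def run_load(base_url, users=5, requests_per_user=4):
    """Run several virtual users concurrently and collect all their results."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        futures = [
            pool.submit(virtual_user, base_url, requests_per_user)
            for _ in range(users)
        ]
        return [r for f in futures for r in f.result()]


# Port 0 asks the OS for any free port, so this does not clash with the API.
server = ThreadingHTTPServer(("127.0.0.1", 0), StatusHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    results = run_load(f"http://127.0.0.1:{server.server_port}")
finally:
    server.shutdown()
    server.server_close()

failures = sum(status != 200 for status, _ in results)
print(len(results), "requests,", failures, "failures")
```

Locust adds what this sketch lacks: randomized wait times between tasks, weighted task selection, ramp-up control, and live statistics.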


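Run it with `locust` at the command line, from the directory that contains `locustfile.py`.
Locust prints the address of its web UI (http://localhost:8089 by default), where you set the number of users and control the test.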

#### Things to note
1. Any failures on the Locust dashboard?
2. Monitor the logs of your application
3. What is the median execution time?
4. **What is the tail execution time**?
5. What is the RPS (requests per second)?
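
To see why the tail matters more than the median, the following sketch computes both from a list of response times using only the standard library. The function name `summarize_latencies` and the sample data are hypothetical; Locust reports these figures for you, so this is only meant to illustrate what the numbers mean.

```python
import statistics


def summarize_latencies(latencies_ms, duration_s):
    """Return median, p95, p99, and requests per second for a test run."""
    # quantiles(n=100) returns 99 cut points: index 94 is p95, index 98 is p99.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return {
        "median_ms": statistics.median(latencies_ms),
        "p95_ms": cuts[94],
        "p99_ms": cuts[98],
        "rps": len(latencies_ms) / duration_s,
    }


# Hypothetical run: 98 fast requests and 2 slow ones over 10 seconds.
samples = [20.0] * 98 + [900.0, 1500.0]
summary = summarize_latencies(samples, duration_s=10.0)
print(summary)
```

In this made-up sample the median and the 95th percentile both look healthy, while the 99th percentile exposes the slow requests that some users actually experienced.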