# 1. Set up environment

## 1.1 Install a Virtual env with all dependencies

### 1.1.1 UV Based Environment Creation
- The cell below uses uv to create the environment; its first line installs uv with pip.
- uv documentation: https://docs.astral.sh/uv/pip/environments/
- If you don't want to use uv, use the pip-based install in section 1.1.2 instead.

In [None]:
%%bash
python -m pip install uv
uv venv ray_jup_env
source ray_jup_env/bin/activate

uv pip install "ray[serve]"  # installs Ray together with the Ray Serve extras
uv pip install ipykernel nbconvert ipywidgets  # needed to register the environment as a notebook kernel
python -m ipykernel install --user --name=ray_jup_env


Collecting uv
  Downloading uv-0.6.14-py3-none-manylinux_2_28_aarch64.whl.metadata (11 kB)
Downloading uv-0.6.14-py3-none-manylinux_2_28_aarch64.whl (15.8 MB)
Installing collected packages: uv
Successfully installed uv-0.6.14


Using CPython 3.12.10 interpreter at: /usr/local/bin/python
Creating virtual environment at: ray_jup_env
Using Python 3.12.10 environment at: ray_jup_env
Resolved 61 packages in 519ms


### 1.1.2 PIP Based Environment Creation
 - Uncomment and run the cell below if you prefer not to use the uv-based install above.

In [None]:
# %%bash
# python -m pip install --user virtualenv
# python -m virtualenv ray_jup_env
# source ray_jup_env/bin/activate
# python -m pip install ray[serve] #this is how you install ray_serve python package
# python -m pip install nest-asyncio #this is required to run a FastAPI app in non-blocking mode from a jupyter notebook
# python -m pip install ipykernel nbconvert ipywidgets #these are required to attach created environment in notebook
# python -m ipykernel install --user --name=ray_jup_env

## 1.2 Activate the Kernel
- Refresh the browser so the newly registered kernel shows up.
- Select the _ray_jup_env_ kernel for this notebook.
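
A quick way to confirm that the notebook is really running on the new kernel is to inspect the interpreter from a cell. This is a minimal sketch using only the standard library; the helper name `active_env_name` is made up for illustration and is not part of the notebook.

```python
import sys
from pathlib import Path


def active_env_name():
    """Return the directory name of the active virtual environment, or None."""
    # Inside a virtual environment sys.prefix points at the environment,
    # while sys.base_prefix still points at the base interpreter.
    if sys.prefix == sys.base_prefix:
        return None
    return Path(sys.prefix).name


print(sys.executable)     # path of the interpreter backing this kernel
print(active_env_name())  # expected to be "ray_jup_env" once the kernel is selected
```

If the second line prints `None` or a different name, the notebook is most likely still attached to the default kernel.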

# 2. Simple FastAPI Endpoint

## 2.1 Create a simple FastAPI endpoint

In [1]:
from fastapi import FastAPI
import os
from datetime import datetime


app = FastAPI()


@app.get("/hello")
def hello():
    # Print the request time to the server log and include it in the response
    formatted_timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    print(formatted_timestamp)
    return {"message": "Hello, World at :: " + formatted_timestamp}
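
To see what the handler returns without starting a server, the payload logic can be reproduced as a plain function. This is only an illustrative sketch: `build_message` is a hypothetical helper that mirrors the body of `hello()` but takes the timestamp as an argument so the result is reproducible.

```python
from datetime import datetime


def build_message(now):
    """Mirror the payload built in hello(), for a caller-supplied time."""
    formatted_timestamp = now.strftime("%Y-%m-%d %H:%M:%S")
    return {"message": "Hello, World at :: " + formatted_timestamp}


sample = build_message(datetime(2025, 4, 10, 5, 32, 41))
print(sample)  # {'message': 'Hello, World at :: 2025-04-10 05:32:41'}
```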

## 2.2 Serve the FastAPI endpoint
  - The server is scheduled as a task on the notebook's already-running asyncio event loop, so the cell returns immediately instead of blocking.
  - If you are running this inside a container, the interactive API docs should now be available at http://localhost:8002/docs
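
The non-blocking pattern can be illustrated without uvicorn. In this sketch, `fake_server` is a made-up stand-in for `server.serve()`; the point is only that `loop.create_task(...)` returns at once and the coroutine runs later, when the loop gets control. Outside a notebook the loop is started with `asyncio.run`; inside a notebook, where a loop is already running, `await main()` would be used instead.

```python
import asyncio


async def fake_server(ticks, log):
    # Stand-in for server.serve(): periodic work that yields to the loop.
    for i in range(ticks):
        log.append(f"tick {i}")
        await asyncio.sleep(0.01)


async def main():
    log = []
    loop = asyncio.get_running_loop()
    task = loop.create_task(fake_server(3, log))  # scheduled, not yet started
    log.append("cell finished")  # runs before the task does any work
    await task  # only here so the sketch can collect the full log
    return log


result = asyncio.run(main())
print(result)  # ['cell finished', 'tick 0', 'tick 1', 'tick 2']
```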

In [2]:
import asyncio
import uvicorn

if __name__ == "__main__":
    config = uvicorn.Config(app, host="0.0.0.0", port=8002)
    server = uvicorn.Server(config)
    loop = asyncio.get_running_loop()
    loop.create_task(server.serve())

print("If you are running this inside a container, the endpoint should now be available at http://localhost:8002/hello")

If you are running this inside a container, the endpoint should now be available at http://localhost:8002/hello


INFO:     Started server process [323]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8002 (Press CTRL+C to quit)


2025-04-10 05:32:41
INFO:     172.18.0.1:64032 - "GET /hello HTTP/1.1" 200 OK
INFO:     172.18.0.1:64032 - "GET /favicon.ico HTTP/1.1" 404 Not Found
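
The endpoint can also be checked from another cell or script rather than a browser. The sketch below uses only the standard library; `fetch_json` is a hypothetical helper, and the URL in the usage comment assumes the server from the cell above is still running on port 8002.

```python
import json
import urllib.request


def fetch_json(url, timeout=5.0):
    """GET `url` and return (HTTP status, decoded JSON body)."""
    # An empty ProxyHandler bypasses any configured HTTP proxy, which is
    # usually what you want when the target is on localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=timeout) as resp:
        return resp.status, json.loads(resp.read().decode("utf-8"))


# Usage, assuming the uvicorn server above is running:
#   status, body = fetch_json("http://localhost:8002/hello")
#   print(status, body["message"])
```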


# 3. Deploy the FastAPI app using Ray Serve

## 3.1 Start a Ray Cluster

In [3]:
import ray

# Start a local Ray instance with 8 logical CPUs; expose the dashboard on all interfaces
ray.init(num_cpus=8, dashboard_host="0.0.0.0")

2025-04-10 05:33:05,286	INFO worker.py:1843 -- Started a local Ray instance. View the dashboard at http://172.18.0.2:8265


| | |
|---|---|
| Python version: | 3.12.10 |
| Ray version: | 2.44.1 |
| Dashboard: | http://172.18.0.2:8265 |


(ProxyActor pid=563) INFO 2025-04-10 05:37:03,219 proxy 172.18.0.2 -- Proxy starting on node e21ad4726b84b916deb8d47a321d5062df24f23a708e9970a9473eaa (HTTP port: 8000).
(ProxyActor pid=563) INFO 2025-04-10 05:37:03,264 proxy 172.18.0.2 -- Got updated endpoints: {}.
(ServeController pid=566) INFO 2025-04-10 05:37:33,188 controller 566 -- Deploying new version of Deployment(name='RayApp', app='fastapiappponray') (initial target replicas: 1).
(ProxyActor pid=563) INFO 2025-04-10 05:37:33,192 proxy 172.18.0.2 -- Got updated endpoints: {Deployment(name='RayApp', app='fastapiappponray'): EndpointInfo(route='/', app_is_cross_language=False)}.
(ProxyActor pid=563) INFO 2025-04-10 05:37:33,201 proxy 172.18.0.2 -- Started <ray.serve._private.router.SharedRouterLongPollClient object at 0xffff7427bfb0>.
(ServeController pid=566) INFO 2025-04-10 05:37:33,293 controller 566 -- Adding 1 replica to Deployment(name='RayApp', app='fastapiappponray').

(ServeReplica:fastapiappponray:RayApp pid=564) 2025-04-10 05:37:42


(ServeReplica:fastapiappponray:RayApp pid=564) INFO 2025-04-10 05:37:42,618 fastapiappponray_RayApp 8y16dmar 403d5582-fcf7-4d91-a46a-a196b2654e24 -- GET /hello 200 4.1ms
(ServeReplica:fastapiappponray:RayApp pid=564) INFO 2025-04-10 05:37:42,636 fastapiappponray_RayApp 8y16dmar 79464036-d43a-4aed-ab6a-5674922e1269 -- GET / 404 0.9ms
(ServeController pid=566) INFO 2025-04-10 05:37:50,726 controller 566 -- Deploying new version of Deployment(name='RayAppParamaeterized', app='fastapiappponrayparameterized') (initial target replicas: 2).
(ProxyActor pid=563) INFO 2025-04-10 05:37:50,730 proxy 172.18.0.2 -- Got updated endpoints: {Deployment(name='RayApp', app='fastapiappponray'): EndpointInfo(route='/', app_is_cross_language=False), Deployment(name='RayAppParamaeterized', app='fastapiappponrayparameterized'): EndpointInfo(route='/parameterized_app', app_is_cross_language=False)}.
(ServeController pid=566) INFO 2025-04-10 05:37:50,831 controller 

(ServeReplica:fastapiappponrayparameterized:RayAppParamaeterized pid=562) 2025-04-10 05:37:54


(ServeReplica:fastapiappponrayparameterized:RayAppParamaeterized pid=562) INFO 2025-04-10 05:37:54,409 fastapiappponrayparameterized_RayAppParamaeterized j0h4zv4x 30a084e0-71e9-430d-9fef-9a76ab023982 -- GET /parameterized_app/hello 200 4.0ms
(ServeController pid=566) INFO 2025-04-10 05:38:07,904 controller 566 -- Deploying new version of Deployment(name='RayAppParamaeterizedFail', app='rayappparameterizedfail') (initial target replicas: 6).
(ProxyActor pid=563) INFO 2025-04-10 05:38:07,908 proxy 172.18.0.2 -- Got updated endpoints: {Deployment(name='RayApp', app='fastapiappponray'): EndpointInfo(route='/', app_is_cross_language=False), Deployment(name='RayAppParamaeterized', app='fastapiappponrayparameterized'): EndpointInfo(route='/parameterized_app', app_is_cross_language=False), Deployment(name='RayAppParamaeterizedFail', app='rayappparameterizedfail'): EndpointInfo(route='/parameterized_app_fail', app_is_cross_language=False)}.
(ServeController pid=5

(ServeReplica:fastapiappponrayparameterized:RayAppParamaeterized pid=561) 2025-04-10 05:38:16
(ServeReplica:rayappparameterizedfail:RayAppParamaeterizedFail pid=1101) 2025-04-10 05:38:24


(ServeReplica:rayappparameterizedfail:RayAppParamaeterizedFail pid=1101) INFO 2025-04-10 05:38:24,974 rayappparameterizedfail_RayAppParamaeterizedFail d1lp910t 0c4ad58d-08b2-4e4f-b6c7-81fc0f9e92b5 -- GET /parameterized_app_fail/hello 200 5.8ms


(autoscaler +5m26s) Tip: use `ray status` to view detailed cluster status. To disable these messages, set RAY_SCHEDULER_EVENTS=0.


(ServeController pid=566) INFO 2025-04-10 05:38:58,036 controller 566 -- Removing 1 replica from Deployment(name='RayApp', app='fastapiappponray').
(ServeController pid=566) INFO 2025-04-10 05:38:58,037 controller 566 -- Removing 2 replicas from Deployment(name='RayAppParamaeterized', app='fastapiappponrayparameterized').
(ServeController pid=566) INFO 2025-04-10 05:38:58,037 controller 566 -- Removing 6 replicas from Deployment(name='RayAppParamaeterizedFail', app='rayappparameterizedfail').
(ServeController pid=566) INFO 2025-04-10 05:39:00,070 controller 566 -- Replica(id='8y16dmar', deployment='RayApp', app='fastapiappponray') is stopped.
(ServeController pid=566) INFO 2025-04-10 05:39:00,071 controller 566 -- Replica(id='az36cnh2', deployment='RayAppParamaeterized', app='fastapiappponrayparameterized') is stopped.
(ServeController pid=566) INFO 2025-04-10 05:39:00,071 controller 566 -- Replica(id='j0h4zv4x', deployment='RayAppP

## 3.2 Convert the FastAPI app to a Ray Serve deployment

In [4]:
# Wrap the FastAPI app in a Ray Serve deployment so that Serve replicas can host it

from ray import serve

@serve.deployment
@serve.ingress(app)
class RayApp:
    pass


rayapp = RayApp.bind()

## 3.3 Another Ray app with parameterized deployment options

In [5]:
from ray import serve


@serve.deployment(num_replicas=2, ray_actor_options={"num_cpus": 1})
@serve.ingress(app)
class RayAppParamaeterized:
    pass


rayappparameterized = RayAppParamaeterized.bind()

## 3.4 Another Ray app, parameterized to request more resources than are available

In [6]:
# Negative example: together with the apps above, these 6 replicas request more CPUs than the 8-CPU cluster has
from ray import serve

@serve.deployment(num_replicas=6, ray_actor_options={"num_cpus": 1})
@serve.ingress(app)
class RayAppParamaeterizedFail:
    pass


rayappparameterizedfail = RayAppParamaeterizedFail.bind()
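
The reason this deployment cannot be fully scheduled is simple arithmetic over the CPU budget. The sketch below adds up the requests from the three deployments defined so far. It assumes that `RayApp`, which sets no options, gets one replica with one CPU, and that Serve's own controller and proxy actors reserve no CPUs; both are assumptions about defaults, not something this notebook verifies.

```python
def cpu_demand(deployments):
    """Total CPUs requested, given {name: (num_replicas, cpus_per_replica)}."""
    return sum(replicas * cpus for replicas, cpus in deployments.values())


cluster_cpus = 8  # from ray.init(num_cpus=8)

deployments = {
    "RayApp": (1, 1),                    # assumed defaults: 1 replica, 1 CPU
    "RayAppParamaeterized": (2, 1),      # num_replicas=2, num_cpus=1
    "RayAppParamaeterizedFail": (6, 1),  # num_replicas=6, num_cpus=1
}

requested = cpu_demand(deployments)
shortfall = max(0, requested - cluster_cpus)
print(requested, shortfall)  # 9 1
```

Under these assumptions the three apps ask for 9 CPUs on an 8-CPU cluster, so at least one replica of the last deployment has nowhere to run.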

## 3.5 Deploy on a Ray Serve cluster

If you are running this inside a container, the Ray dashboard should now be available at http://localhost:8265

### 3.5.1 Start Serve Instance on Ray Cluster

In [7]:
serve.start(http_options={"host": "0.0.0.0"})

INFO 2025-04-10 05:37:03,266 serve 323 -- Started Serve in namespace "serve".


### 3.5.2 Deploy the Ray-wrapped FastAPI apps on Ray Serve

In [8]:
serve.run(rayapp, name="fastapiappponray")
print("Served app should be visible at http://localhost:8000/hello")

INFO 2025-04-10 05:37:33,159 serve 323 -- Connecting to existing Serve app in namespace "serve". New http options will not be applied.
INFO 2025-04-10 05:37:34,278 serve 323 -- Application 'fastapiappponray' is ready at http://0.0.0.0:8000/.


Served app should be visible at http://localhost:8000/hello


In [9]:
serve.run(rayappparameterized, 
          name="fastapiappponrayparameterized", 
          route_prefix="/parameterized_app")
print("Served app should be visible at http://localhost:8000/parameterized_app/hello")

INFO 2025-04-10 05:37:50,711 serve 323 -- Connecting to existing Serve app in namespace "serve". New http options will not be applied.
INFO 2025-04-10 05:37:51,832 serve 323 -- Application 'fastapiappponrayparameterized' is ready at http://0.0.0.0:8000/parameterized_app.


Served app should be visible at http://localhost:8000/parameterized_app/hello
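
The URLs printed above follow one pattern: the Serve HTTP port (8000 in the logs), then the application's `route_prefix`, then the FastAPI route. A small sketch of that composition follows; `served_url` is a hypothetical helper written for illustration, not a Ray Serve function.

```python
def served_url(route_prefix, path, host="localhost", port=8000):
    """Join a Serve route prefix and a FastAPI route into a full URL."""
    prefix = route_prefix.strip("/")
    prefix = f"/{prefix}" if prefix else ""  # the default prefix "/" adds nothing
    return f"http://{host}:{port}{prefix}/{path.lstrip('/')}"


print(served_url("/", "/hello"))
# http://localhost:8000/hello
print(served_url("/parameterized_app", "/hello"))
# http://localhost:8000/parameterized_app/hello
```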


In [None]:
serve.run(rayappparameterizedfail, 
          name="rayappparameterizedfail", 
          route_prefix="/parameterized_app_fail")
print("Served app should be visible at http://localhost:8000/parameterized_app_fail/hello; however, not all replicas will come up")

In [12]:
serve.shutdown()

### 3.5.3 Shut down the Ray cluster

In [13]:
ray.shutdown()

In [None]:
# %%bash
# Caution: this permanently deletes the ray_jup_env virtual environment
# rm -rf ray_jup_env