# Deploying AI into production with FastAPI

## Chapter 1 - Introduction to FastAPI for Model Deployment

### Section 1.1 - GET and POST requests for AI

#### GET endpoint for model information

You're part of a machine learning team that has developed several machine learning models, each designed for different tasks such as sentiment analysis, product categorization, and customer churn prediction. You're working on deploying these models, and you need to create an endpoint that provides basic information about each model.

Your task is to implement a GET endpoint at route `/model-info/{model_id}` that retrieves and returns this essential model information.

In [9]:
%%writefile main.py
from fastapi import FastAPI, HTTPException

app = FastAPI()

# Add model_id as a path parameter in the route
@app.get("/model-info/{model_id}")
# Pass on the model id as an argument
async def get_model_info(model_id: int):
    # Check if the passed model id is 0
    if model_id == 0:
        # Raise the right status code for not found
        raise HTTPException(status_code=404, detail="Model not found")
    # get_model_details is supplied by the exercise environment; it is not defined in this file
    model_info = get_model_details(model_id)
    # Return the model id and info in the dict
    return {"model_id": model_id, "model_name": model_info}

Overwriting main.py
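
The endpoint above calls `get_model_details`, which the exercise environment supplies; the notebook itself never defines it. A minimal stand-in, assuming a hypothetical in-memory registry whose ids and names are invented for illustration (the task names come from the scenario above), could look like this:

```python
# Hypothetical in-memory registry; ids and names are illustrative assumptions.
MODEL_REGISTRY = {
    1: "sentiment-analysis",
    2: "product-categorization",
    3: "customer-churn-prediction",
}


def get_model_details(model_id: int) -> str:
    """Return the model name for a known id, or raise KeyError for an unknown one."""
    try:
        return MODEL_REGISTRY[model_id]
    except KeyError:
        raise KeyError(f"No model registered with id {model_id}") from None


print(get_model_details(1))
```

With a lookup like this, any id missing from the registry raises `KeyError`, so the endpoint would probably want to translate that into the same 404 response it already uses for id 0.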


#### POST endpoint for model registration

While the GET endpoint you created earlier allows users to retrieve information about existing models, you now need a way for authorized team members to register new models or update information about existing ones.

To do this, you'll create a POST endpoint that stores the submitted model information on the server.

In [10]:
%%writefile main.py
from pydantic import BaseModel
from fastapi import FastAPI

app = FastAPI()

model_db = {}

class ModelInfo(BaseModel):
    model_id: int
    model_name: str
    description: str

# Specify the status code for successful POST request
@app.post("/register-model", status_code=201)
# Pass the model info from the request as function parameter 
def register_model(model_info: ModelInfo):
    # Add new model's information dictionary to the model database
    model_db[model_info.model_id] = model_info.model_dump()
    # Return the registered model info; the 201 status code is already set in the decorator
    return {"message": "Model registered successfully", "model": model_info}

Overwriting main.py


In [11]:
import requests

data = {
    "model_id": 1,
    "model_name": "cnn",
    "description": "convolutional nn"}

url = "http://localhost:8000/register-model"
# json= serializes the payload and sets the Content-Type: application/json header
response = requests.post(url, json=data)
print(response.json())

{'message': 'Model registered successfully', 'model': {'model_id': 1, 'model_name': 'cnn', 'description': 'convolutional nn'}}
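
In FastAPI the status code comes from the decorator (`status_code=201`), and whatever the handler returns becomes the response body. A Flask-style `(body, 201)` return is therefore not read as a status: the tuple is encoded as a two-element JSON array. The standard library's `json` module treats tuples the same way, so it can serve as a rough illustration (FastAPI's own encoder is more elaborate, but it also turns tuples into arrays):

```python
import json

body = {"message": "Model registered successfully"}

# A tuple has no JSON equivalent, so it is written out as an array.
as_tuple = json.dumps((body, 201))
# Returning only the dict keeps the body a JSON object.
as_dict = json.dumps(body)

print(as_tuple)
print(as_dict)
```

Clients that expect a JSON object would need to index into the array in the first case, which is why returning only the dict is usually the safer choice.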


### Section 1.2 - FastAPI prediction with a pre-trained model

#### Preparation

The model from the exercises does not work, so here we quickly train and save our own model that can be used for the exercises.

In [12]:
from datasets import load_dataset

ds = load_dataset("SIH/palmer-penguins")

In [13]:
df = ds['train'].to_pandas()
df.head()

| Unnamed: 0 | species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex | year |
|---|---|---|---|---|---|---|---|---|
| 0 | Adelie | Torgersen | 39.1 | 18.7 | 181.0 | 3750.0 | male | 2007 |
| 1 | Adelie | Torgersen | 39.5 | 17.4 | 186.0 | 3800.0 | female | 2007 |
| 2 | Adelie | Torgersen | 40.3 | 18.0 | 195.0 | 3250.0 | female | 2007 |
| 3 | Adelie | Torgersen |  |  |  |  |  | 2007 |
| 4 | Adelie | Torgersen | 36.7 | 19.3 | 193.0 | 3450.0 | female | 2007 |


In [15]:
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
import numpy as np
import joblib

# Only include numerical features + target
features = ['species', 'bill_length_mm', 'bill_depth_mm',
            'flipper_length_mm', 'body_mass_g']

# 0. Prepare the dataset
df_dataset = df[features].dropna()
X = df_dataset.drop("species", axis=1)
y = df_dataset["species"]

# 1. Split first
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# 2. Define pipeline (scaler + classifier)
pipe = make_pipeline(
    StandardScaler(),
    LogisticRegression(max_iter=1000)
)

# 3. Perform cross-validation only on the training set
cv_scores = cross_val_score(pipe, X_train, y_train, cv=5)

# 4. Report CV accuracy
print("Cross-validation accuracy scores:", cv_scores)
print("Mean CV accuracy:", np.mean(cv_scores))

# 5. Fit the pipeline on the full training set
pipe.fit(X_train, y_train)

# 6. Evaluate on the test set
y_pred = pipe.predict(X_test)
print("Test set accuracy:", accuracy_score(y_test, y_pred))

# 7. Save the pipeline (includes scaler + model)
joblib.dump(pipe, "penguin_classifier.pkl")

# 8. Load the pipeline
model = joblib.load("penguin_classifier.pkl")
print("Loaded model type:", type(model))

# 9. Use the loaded model for prediction
# (Optional — uncomment to test)
# y_pred = model.predict(X_test)


Cross-validation accuracy scores: [0.98181818 1.         1.         0.98148148 1.        ]
Mean CV accuracy: 0.9926599326599327
Test set accuracy: 0.9855072463768116
Loaded model type: <class 'sklearn.pipeline.Pipeline'>


#### Load the pre-trained model

You're a data scientist at an animal conservation company. You've been given a pre-trained machine learning model that predicts penguin species.

Your task is to load this model so it can be used in an API. The model has been saved using `joblib`.

A pre-trained ML model is stored in the pickle file: `penguin_classifier.pkl`

Write a script to load the pickle file as a model. Test your script by running `python3 solution.py` in the terminal.

In [16]:
# Import the necessary module
import joblib

# Load the pre-trained model
model = joblib.load('penguin_classifier.pkl')

# Print the type of the loaded model
print(f"Loaded model type: {type(model)}")

Loaded model type: <class 'sklearn.pipeline.Pipeline'>


#### Create the prediction endpoint

In this exercise, you'll create a prediction endpoint that uses a pre-trained model to estimate diabetes progression.

The original exercise model was trained on three features (`age`, `bmi`, and `blood_pressure`) and predicts a diabetes progression score, which helps assess how the condition may develop over time. Because that model is not available here, the code below uses the penguin classifier trained earlier in this section instead: it takes the four penguin measurements and returns the predicted species, while the endpoint keeps the exercise's `predict_progression` naming.

You'll use FastAPI to create a POST endpoint that accepts the feature data and returns the model's prediction.

In [23]:
%%writefile main.py
from fastapi import FastAPI
from pydantic import BaseModel
import pandas as pd
import joblib

# Load the pre-trained model
model = joblib.load('penguin_classifier.pkl')

# Print the type of the loaded model
print(f"Loaded model type: {type(model)}")

# Request schema: the four numeric features the classifier was trained on
class PengiunFeatures(BaseModel):
    bill_length_mm: float
    bill_depth_mm: float
    flipper_length_mm: float
    body_mass_g: float
    
# Create FastAPI instance
app = FastAPI()

# Create a POST request endpoint at the route "/predict"
@app.post("/predict")
async def predict_progression(features: PengiunFeatures):
    input_data = pd.DataFrame([features.model_dump()])
    
    prediction = model.predict(input_data)
    return {"predicted_progression": prediction[0]}

Overwriting main.py


In [24]:
from pydantic import BaseModel
import requests

class PengiunFeatures(BaseModel):
    bill_length_mm: float
    bill_depth_mm: float
    flipper_length_mm: float
    body_mass_g: float


pengiun = PengiunFeatures(bill_length_mm=39.1,
                          bill_depth_mm=18.7,
                          flipper_length_mm=181.0,
                          body_mass_g=3750.0)

url = "http://localhost:8000/predict"
data = pengiun.model_dump()
response = requests.post(url, json=data)
print(response.json())

{'predicted_progression': 'Adelie'}


#### Running the FastAPI app

Your FastAPI app has been saved in a Python file called `main.py`. You would like to run the app from a Python script using uvicorn.

To serve your FastAPI app directly via the Python script in `solution.py`, you need to finish adding the code block that sets up the host and port of the server where the API will run.

In [25]:
%%writefile solution.py
# Import the server module
import uvicorn
from main import app

if __name__ == "__main__":
    # Start the uvicorn server
    uvicorn.run(
        app,
        # Configure the host
        host="0.0.0.0",
        # Configure the port
        port=8080,
    )

Overwriting solution.py


With the snippet below we can start the server using Python. It is best to do this in a terminal, because the command blocks until the server is stopped.

In [None]:
!python3 solution.py

Now post to the server on port 8080.

In [26]:
pengiun = PengiunFeatures(bill_length_mm=39.1,
                          bill_depth_mm=18.7,
                          flipper_length_mm=181.0,
                          body_mass_g=3750.0)

url = "http://localhost:8080/predict"
data = pengiun.model_dump()
response = requests.post(url, json=data)
print(response.json())

{'predicted_progression': 'Adelie'}
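
Because the server runs in a separate terminal, a request sent before uvicorn has finished starting fails with a connection error. One way to guard against that is to poll the port first. The helper below is a sketch using only the standard library; `wait_for_port` is a hypothetical name, not part of uvicorn or requests.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 10.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False after timeout seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.2)  # Not up yet; retry shortly.
    return False
```

In the notebook this could be used as `if wait_for_port("localhost", 8080): ...` before calling `requests.post`. It only confirms that the port accepts connections, not that the app behind it is healthy.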


### Section 1.3 - Create a Pydantic model for ML input

#### Create a Pydantic model for ML input

You're developing a FastAPI application to deploy a machine learning model that predicts the quality score of coffee based on attributes including aroma, flavor, and altitude.

The first step is to create a Pydantic model to validate the input request data for your ML model and ensure that only valid data flows through the model for successful model prediction.

In [27]:
# Import the base class from pydantic
from pydantic import BaseModel

class CoffeeQualityInput(BaseModel):
    # Use an appropriate data type for each coffee attribute
    aroma: float  
    flavor: float  
    altitude: int
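
To see what "only valid data flows through" means in practice, the model can be exercised directly, outside FastAPI. The sketch below assumes Pydantic v2 (the notebook already uses `model_dump`); the sample values are made up. When the same failure happens inside a FastAPI endpoint, the client receives a 422 response instead of an exception.

```python
from pydantic import BaseModel, ValidationError


class CoffeeQualityInput(BaseModel):
    aroma: float
    flavor: float
    altitude: int


# Valid payload: numeric strings are coerced in Pydantic's default (lax) mode.
ok = CoffeeQualityInput(aroma=7.5, flavor="8.0", altitude=1500)
print(ok.model_dump())

# Invalid payload: "high" cannot be parsed as an integer.
try:
    CoffeeQualityInput(aroma=7.5, flavor=8.0, altitude="high")
except ValidationError as exc:
    bad_fields = [err["loc"][0] for err in exc.errors()]
    print("Rejected fields:", bad_fields)
```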

#### Validate request and response for ML prediction

Building on your work as a data scientist at the coffee company, you now need to create a FastAPI endpoint that validates the incoming request with the `CoffeeQualityInput` model and the outgoing response with a `QualityPrediction` model.

This endpoint will accept coffee data and return a quality prediction along with the confidence score.

The model is already loaded into a function called `predict_quality` for this exercise.

In [28]:
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class CoffeeQualityInput(BaseModel):
    aroma: float
    flavor: float
    altitude: int

class QualityPrediction(BaseModel):
    quality_score: float
    confidence: float

# Specify the data model to validate response
@app.post("/predict", response_model=QualityPrediction)
# Specify the data model to validate input request
def predict(coffee_data: CoffeeQualityInput):
    # predict_quality is provided by the exercise environment, not defined in this notebook
    prediction = predict_quality(coffee_data)
    return prediction
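
`response_model=QualityPrediction` makes FastAPI validate whatever `predict` returns against that model and drop fields the model does not declare. The same behaviour can be approximated with Pydantic alone. In this sketch `predict_quality` is a hypothetical stub with made-up numbers standing in for the exercise's real function, and `debug_info` is an invented extra field.

```python
from pydantic import BaseModel


class QualityPrediction(BaseModel):
    quality_score: float
    confidence: float


def predict_quality(coffee_data: dict) -> dict:
    """Hypothetical stub: returns fixed, made-up values plus an extra field."""
    return {"quality_score": 82.5, "confidence": 0.9, "debug_info": "not for clients"}


raw = predict_quality({"aroma": 7.5, "flavor": 8.0, "altitude": 1500})
# Validation keeps only declared fields (Pydantic ignores extras by default).
response = QualityPrediction.model_validate(raw).model_dump()
print(response)
```

Declaring the response model this way keeps internal fields from leaking to clients, even if the prediction function later starts returning more data.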

## Chapter 2 - Integrating AI Model

#### Handle textual request data

Another requirement in the content moderation system is to take into account user comments' sentiment. The system needs to identify specific problematic phrases to help moderators review potentially inappropriate content.

You'll create an endpoint that analyzes text coming from users and extracts standardized moderation flags.

In [31]:
%%writefile main.py
from fastapi import FastAPI

app = FastAPI()

@app.post("/analyze_comment")
def analyze_comment(text: str):
    problem_keywords = ["spam", "hate", "offensive", "abuse"]
    
    # Convert the input text to lowercase
    text_lower = text.lower()
    # Extract matching flags using list comprehension
    found_issues = [keyword for keyword in problem_keywords if keyword in text_lower]
    # Return the dictionary with required keys
    return {
        "issues": found_issues,
        "issue_count": len(found_issues),
        "original_text": text
    }

Overwriting main.py


In [37]:
response = requests.post(
    "http://localhost:8000/analyze_comment",
    # Pass the text as a query parameter so requests URL-encodes the spaces
    params={"text": "This is a spam with spam which I hate and abusive message"},
)
print(response.json())

{'issues': ['spam', 'hate'], 'issue_count': 2, 'original_text': 'This is a spam with spam which I hate and abusive message'}
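
The response lists `spam` once even though it occurs twice, and `abuse` is missing even though the comment contains `abusive`. Both follow from how the list comprehension works: it iterates over the keywords (not the words of the text) and uses a plain substring test. The snippet reproduces that logic outside FastAPI:

```python
problem_keywords = ["spam", "hate", "offensive", "abuse"]
text = "This is a spam with spam which I hate and abusive message"
text_lower = text.lower()

# One pass over the keywords: each keyword is reported at most once.
found_issues = [keyword for keyword in problem_keywords if keyword in text_lower]
print(found_issues)

# "abusive" does not contain the exact substring "abuse" (it ends in "-ive").
print("abuse" in "abusive")

# Substring matching also produces false positives inside unrelated words.
print("hate" in "whatever")
```

A real moderation system would likely need tokenization or stemming to handle both cases; the substring test is only as precise as the exercise requires.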


#### Handle numerical request data

You're building a content moderation system. The system needs to calculate a trust score for each user comment based on numerical features: `length`, `user_reputation`, and `report_count`. You'll create an endpoint that converts these features into a format compatible with the moderation model.

Note that in the original exercise the ML model and the `CommentMetrics` Pydantic model with `length` (int), `user_reputation` (int), and `report_count` (int) are already created and loaded for you; this notebook defines stand-ins for both in `scorer.py` below.

In [38]:
%%writefile scorer.py
import numpy as np
from pydantic import BaseModel

class CommentMetrics(BaseModel):
    length: int
    user_reputation: int
    report_count: int

class CommentScorer:
    def predict(self, features: np.ndarray) -> float:
        """
        Predict trust score based on comment metrics
        features: [[length, user_reputation, report_count]]
        """
        # Unpack features
        length, reputation, reports = features[0]
        
        # Calculate trust score
        score = (0.3 * (length/500) +        # Normalize length
                 0.5 * (reputation/100) +    # Normalize reputation
                 -0.2 * reports)             # Reports reduce score
        
        return float(max(min(score * 100, 100), 0))  # Scale to 0-100

Writing scorer.py


In [39]:
%%writefile main.py
import numpy as np
from scorer import CommentMetrics, CommentScorer
from fastapi import FastAPI

app = FastAPI()
model = CommentScorer()

@app.post("/predict_trust")
def predict_trust(comment: CommentMetrics):
    # Convert input and extract comment metrics
    features = np.array([[
        comment.length,
        comment.user_reputation,
        comment.report_count
    ]])
    # Get prediction from model 
    score = model.predict(features)
    return {
        "trust_score": round(score, 2),
        # model_dump() is the Pydantic v2 replacement for the deprecated dict()
        "comment_metrics": comment.model_dump()
    }
    
if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8080)

Overwriting main.py


In [46]:
import requests

url = "http://localhost:8080/predict_trust"
data = {
    "length": 150,
    "user_reputation": 100,
    "report_count": 0}

headers = {"Content-Type": "application/json"}

response = requests.post(url, json=data, headers=headers)
print(response.json())

{'trust_score': 59.0, 'comment_metrics': {'length': 150, 'user_reputation': 100, 'report_count': 0}}
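
The returned `trust_score` of 59.0 can be checked by hand against the formula in `scorer.py`: 0.3 × (150/500) + 0.5 × (100/100) − 0.2 × 0 = 0.09 + 0.5 = 0.59, which is scaled by 100 and clamped to the range 0 to 100. The same arithmetic in plain Python, with a hypothetical helper `trust_score` that mirrors `CommentScorer.predict`:

```python
def trust_score(length: int, reputation: int, reports: int) -> float:
    """Mirror of the CommentScorer formula: weighted sum, scaled and clamped to 0-100."""
    score = 0.3 * (length / 500) + 0.5 * (reputation / 100) - 0.2 * reports
    return float(max(min(score * 100, 100), 0))


print(round(trust_score(150, 100, 0), 2))   # the request from the example
print(round(trust_score(150, 100, 5), 2))   # many reports push the score to the floor
print(round(trust_score(5000, 100, 0), 2))  # very long comments hit the ceiling
```

Each report subtracts 20 points after scaling, so with this weighting a handful of reports outweighs any combination of length and reputation.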


### Section 2.2 - Input validation in FastAPI