# MLOps Project

## Step 0: Establish connection with Azure

In [1]:
from azureml.core import Workspace

ws = Workspace.from_config()  # Retrieves the config.json file with your workspace details

In [2]:
print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep="\n")

mlops-project-workspace
mlops-project
westeurope
ad9c8da8-b8c5-4ed7-b082-04c4397b8318
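
`Workspace.from_config()` expects a `config.json` file that holds the subscription ID, resource group, and workspace name. As a rough sketch of what that file contains, the snippet below builds it in Python. The helper name `build_workspace_config` and the placeholder subscription ID are illustrative assumptions, not part of this project.

```python
import json


def build_workspace_config(subscription_id: str, resource_group: str, workspace_name: str) -> dict:
    """Return the three fields that Workspace.from_config() reads from config.json."""
    return {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }


# The resource group and workspace name match the output printed above;
# the subscription ID is a placeholder to be replaced with your own.
config = build_workspace_config(
    subscription_id="<your-subscription-id>",
    resource_group="mlops-project",
    workspace_name="mlops-project-workspace",
)

config_text = json.dumps(config, indent=4)
print(config_text)
```

Saving the printed JSON as `config.json` next to the notebook should let `Workspace.from_config()` find it. The same file can usually be downloaded from the workspace overview page in the Azure portal.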


## Step 1: Data Preparation and Model Training

A. Load and prepare the data

In [3]:
import pandas as pd

# pre-cleaned dataset from previous project
df = pd.read_csv("./data/titanic_cleaned.csv")
df.head()

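| Unnamed: 0 | Survived | Pclass | Age | SibSp | Parch | Fare | FamilySize | IsAlone | Cabins_booked | Embarked_in_Cherbourg | ... | Age_Group_Child | Age_Group_Young Adult | Age_Group_Adult | Age_Group_Senior | Fare_Group_Low | Fare_Group_Mediun | Fare_Group_High | Fare_Group_Very High | Age_Class | Fare_per_person |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 3 | 22.0 | 1 | 0 | 7.25 | 2 | 0 | 0 | 0 | ... | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 66.0 | 3.625 |
| 1 | 1 | 1 | 38.0 | 1 | 0 | 71.2833 | 2 | 0 | 1 | 1 | ... | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 38.0 | 35.64165 |
| 2 | 1 | 3 | 26.0 | 0 | 0 | 7.925 | 1 | 1 | 0 | 0 | ... | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 78.0 | 7.925 |
| 3 | 1 | 1 | 35.0 | 1 | 0 | 53.1 | 2 | 0 | 1 | 0 | ... | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 35.0 | 26.55 |
| 4 | 0 | 3 | 35.0 | 0 | 0 | 8.05 | 1 | 1 | 0 | 0 | ... | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 105.0 | 8.05 |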


In [4]:
X = df.drop('Survived', axis=1)
y = df['Survived']

B. Train the model

In [5]:
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = RandomForestClassifier(n_estimators=100)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
print(f'Accuracy: {accuracy_score(y_test, y_pred)}')

Accuracy: 0.8212290502793296


C. Register the model in Azure ML

In [6]:
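import os

import joblib
from azureml.core.model import Model

# Save the model locally (create the target directory if it does not exist yet)
os.makedirs("./model", exist_ok=True)
joblib.dump(model, "./model/model.pkl")

# Register the saved model file in the Azure ML workspace
Model.register(workspace=ws, model_path='./model/model.pkl', model_name='titanic_model')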

Registering model titanic_model


Model(workspace=Workspace.create(name='mlops-project-workspace', subscription_id='ad9c8da8-b8c5-4ed7-b082-04c4397b8318', resource_group='mlops-project'), name=titanic_model, id=titanic_model:6, version=6, tags={}, properties={})

In [7]:
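# List registered models (the loop variable is named so it does not overwrite the trained `model`)
for registered_model in Model.list(ws):
    print(registered_model.name, 'version:', registered_model.version)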

titanic_model version: 6
titanic_model version: 5
titanic_model version: 4
titanic_model version: 3
titanic_model version: 2
titanic_model version: 1


## Step 2: Create and Deploy an API with FastAPI


1. Set up FastAPI
2. Create the FastAPI application (the code is in `main.py`)

## Step 3: Dockerize the FastAPI Application

1. Create a Dockerfile (see `Dockerfile`)

2. Build and Run the Docker Image

    - `docker build -t fastapi-titanic .`

    - `docker run -d --name fastapi-titanic -p 80:80 fastapi-titanic`

## Step 4: Deploy with Kubernetes

1. Install Kubernetes Tools
   - Install `kubectl`
   - Install `Azure CLI`
2. Set up an Azure Kubernetes Service (AKS) Cluster
   - `az aks create --resource-group your_resource_group --name your_aks_cluster --node-count 1 --enable-addons monitoring --generate-ssh-keys`
   - `az aks get-credentials --resource-group your_resource_group --name your_aks_cluster`


3. Deploy the Docker Image to AKS
   - Create a Kubernetes deployment file `deployment.yaml`
   - Apply the deployment: `kubectl apply -f deployment.yaml`
   - Expose the deployment
4. Get the External IP: `kubectl get services`

    <span style="color:#FFAAFF">**Output:**</span>

    |NAME              |TYPE           |CLUSTER-IP       |EXTERNAL-IP   |PORT(S)        |AGE  |
    |------------------|---------------|-----------------|--------------|---------------|-----|
    |fastapi-service   |LoadBalancer   |10.0.38.55       |57.152.65.255 |80:32000/TCP   |17m  |
    |kubernetes        |ClusterIP      |10.0.0.1         |`<none>`      |443/TCP        |43m  |
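
For step 3 above, the deployment and its load-balanced service could be described roughly as follows. This is a minimal sketch, not the project's actual `deployment.yaml`: the image reference, the deployment name, and the `app` label are hypothetical, while the service name `fastapi-service`, the `LoadBalancer` type, and port 80 are taken from the output and the `docker run` command above. The manifest is emitted as JSON, which `kubectl apply -f` accepts alongside YAML.

```python
import json


def build_manifests(image: str, replicas: int = 1) -> dict:
    """Build a Deployment plus a LoadBalancer Service as one Kubernetes List."""
    labels = {"app": "fastapi-titanic"}  # hypothetical label linking pods and service

    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "fastapi-deployment"},  # hypothetical name
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": "fastapi-titanic",
                            "image": image,
                            # The container listens on port 80, as in `docker run -p 80:80`
                            "ports": [{"containerPort": 80}],
                        }
                    ]
                },
            },
        },
    }

    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": "fastapi-service"},
        "spec": {
            "type": "LoadBalancer",  # requests an external IP from Azure
            "selector": labels,
            "ports": [{"port": 80, "targetPort": 80}],
        },
    }

    return {"apiVersion": "v1", "kind": "List", "items": [deployment, service]}


# The registry path is a placeholder; AKS needs an image it can pull.
manifest = build_manifests("<your-registry>/fastapi-titanic:latest")
print(json.dumps(manifest, indent=2))
```

Redirecting the output to a file such as `deployment.json` and running `kubectl apply -f deployment.json` should create both resources, after which `kubectl get services` lists the external IP once Azure has assigned it.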

## Step 5: Test the Deployed API

Send requests to the prediction endpoint: http://57.152.65.255:80/predict
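
A request against this endpoint might look like the sketch below. It assumes that `/predict` accepts a POST request with a JSON body. The real request schema is defined in `main.py`, so the field names here, borrowed from the first row of the training data, are only placeholders. The helper `build_predict_request` is a hypothetical name.

```python
import json
import urllib.request


def build_predict_request(base_url: str, features: dict) -> urllib.request.Request:
    """Prepare a JSON POST request for the /predict endpoint without sending it."""
    body = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder payload: a few columns from the first row of the dataset.
# The deployed model may expect the full feature set.
sample = {"Pclass": 3, "Age": 22.0, "SibSp": 1, "Parch": 0, "Fare": 7.25}

predict_request = build_predict_request("http://57.152.65.255:80", sample)
print(predict_request.get_method(), predict_request.full_url)

# Uncomment to actually call the API (requires the cluster to be running):
# with urllib.request.urlopen(predict_request, timeout=10) as response:
#     print(json.loads(response.read()))
```

Building the request separately from sending it keeps the payload easy to inspect before it reaches the cluster.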

<span style="color:#FFAAFF; font-size:3rem">**By the way**</span>

You can find the registered models in [Azure Machine Learning studio](https://ml.azure.com/model/list?wsid=/subscriptions/ad9c8da8-b8c5-4ed7-b082-04c4397b8318/resourcegroups/mlops-project/providers/Microsoft.MachineLearningServices/workspaces/mlops-project-workspace&tid=4ded4bb1-6bff-42b3-aed7-6a36a503bf7a). You can also deploy them directly from there.