# Course Name: **AI Mastery Bootcamp: AI Algorithms, DeepSeek AI, AI Agents**

# Section 28: **Introduction and Hands-on MLOps**

## Introduction to MLOps
* Overview of MLOps and its Importance
* Evolution of Machine Learning Operations
* Key Concepts in MLOps: Versioning, Automation, and Monitoring
* MLOps vs. DevOps: Similarities and Differences
* Hands-on:
  * Set up a basic MLOps project structure using Git for version control, Docker for containerization, and create a simple model pipeline.

## Overview of MLOps and its Importance
* What is MLOps?
* Why is MLOps Important?

## Evolution of Machine Learning Operations
* Traditional ML Development
* Introduction of DevOps Practices to ML
* The Shift to Modern MLOps

## Key Concepts in MLOps: Versioning, Automation, and Monitoring
* Versioning
  * Data Versioning
  * Model Versioning
  * Code Versioning
* Automation
  * Automating Model Training
  * Automating Deployment
* Monitoring
  * Model Performance
  * Concept Drift Detection
  * Logging

## MLOps vs. DevOps: Similarities and Differences
* Similarities
  * Automation
  * Collaboration
  * Continuous Integration/Continuous Delivery (CI/CD)
* Differences
  * Complexity of Artifacts
  * Experimentation in MLOps
  * Monitoring Needs

In [5]:
# Train a model and log its accuracy with MLflow
import mlflow
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
import logging

logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s",
                    handlers=[
                        logging.FileHandler("mlops.log"),
                        logging.StreamHandler()
                    ])

logging.info("Starting model training process...")

logging.info("Loading Data....")
iris= load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2, random_state=42)
logging.info("Data loaded and split into training and test sets.")


with mlflow.start_run():
  logging.info("Training the RandomForest Model...")
  model = RandomForestClassifier()
  model.fit(X_train, y_train)
  logging.info("Model Training Completed..")

  predictions = model.predict(X_test)
  accuracy = accuracy_score(y_test, predictions)

  mlflow.log_metric("accuracy", accuracy)

## Data Science to Production Pipeline
* Overview of the ML Workflow: Data Preparation to Deployment
* Experimentation vs. Production
* Challenges in Deploying ML Models
* Hands-on: End-to-End Pipeline for an ML Model

### Overview of the ML Workflow: Data Preparation to Deployment
* Data Preprocessing
* Model Training
* Model Evaluation
* Model Deployment

### Overview of a Basic ML Pipeline
* Data Ingestion
* Data Preprocessing
* Feature Engineering
* Model Training
* Model Evaluation
* Model Tuning
* Deployment
* Monitoring

## Experimentation vs. Production
* Differences in Tooling and Processes
  * Experimentation phase
  * Production phase
* Transitioning Models from Experimentation to Production
  * Code Refactoring
  * Scalability
  * Monitoring
  * Automation

## Challenges in Deploying ML Models
* Scalability
* Reproducibility
* Reliability

## **Hands-on: Build an end-to-end pipeline for an ML model**

In [9]:
import pandas as pd
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
# import mlflow

housing= fetch_california_housing()
data= pd.DataFrame(housing.data, columns=housing.feature_names)
data['PRICE']= housing.target
print(data.shape)
data.head(2)

(20640, 9)


|   | MedInc | HouseAge | AveRooms | AveBedrms | Population | AveOccup | Latitude | Longitude | PRICE |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 8.3252 | 41.0 | 6.984127 | 1.02381 | 322.0 | 2.555556 | 37.88 | -122.23 | 4.526 |
| 1 | 8.3014 | 21.0 | 6.238137 | 0.97188 | 2401.0 | 2.109842 | 37.86 | -122.22 | 3.585 |


In [10]:
X= data.drop('PRICE', axis=1)
y= data['PRICE']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

scaler= StandardScaler()
X_train_scaled= scaler.fit_transform(X_train)
X_test_scaled= scaler.transform(X_test)

model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train_scaled, y_train)

In [11]:
predictions= model.predict(X_test_scaled)

mse= mean_squared_error(y_test, predictions)
r2= r2_score(y_test, predictions)

print(f"Mean Squared Error: {mse}")
print(f"R-squared: {r2}")

Mean Squared Error: 0.255169737347244
R-squared: 0.8052747336256919


In [12]:
import joblib
joblib.dump(model, 'california_housing_model.pkl')

['california_housing_model.pkl']

In [13]:
loaded_model= joblib.load('california_housing_model.pkl')

load_model_predictions= loaded_model.predict(X_test_scaled)

print(f"Predictions from the Loaded Model: {load_model_predictions[:5]}")

Predictions from the Loaded Model: [0.5095    0.74161   4.9232571 2.52961   2.27369  ]
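
The cells above fit the `StandardScaler` separately and persist only the model, so anything that loads the `.pkl` file has to reproduce the scaling itself. One possible alternative is to bundle both steps in a scikit-learn `Pipeline` and save that single object. The sketch below uses synthetic data from `make_regression` so that it runs offline, a smaller forest for speed, and a hypothetical file name; it is not the course's own code.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data keeps the sketch offline; it is not the housing dataset.
X, y = make_regression(n_samples=200, n_features=8, noise=0.1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Scaling and the model travel together as one estimator.
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("model", RandomForestRegressor(n_estimators=10, random_state=42)),
])
pipeline.fit(X_train, y_train)

# Hypothetical file name, written to a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "housing_pipeline.pkl")
joblib.dump(pipeline, path)

# The loaded pipeline accepts raw (unscaled) features directly.
loaded = joblib.load(path)
same = np.allclose(pipeline.predict(X_test), loaded.predict(X_test))
print(f"Loaded pipeline reproduces predictions: {same}")
```

With this layout, a serving process only needs to call `loaded.predict(raw_features)`, which removes the risk of sending unscaled inputs to a model trained on scaled ones.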


## Infrastructure for MLOps
* Introduction to Cloud Platforms for MLOps (AWS, GCP, Azure)
* Containerization with Docker
* Kubernetes for Orchestrating ML Workloads
* Setting Up Local MLOps Environments
* Hands-On: Set Up Docker and Kubernetes Environments

### Introduction to Cloud Platforms for MLOps (AWS, GCP, Azure)
* **Comparing Key Features of Cloud Platforms**
  * **AWS (Amazon Web Services)**
    * **Services:** SageMaker | Lambda | Elastic Kubernetes Service (EKS) | S3
    * **Key Features**
      * Highly scalable with global data centers
      * Strong integration with enterprise tools and security.
      * Extensive automation with services like SageMaker pipelines.
    * **Pricing:** Pay-as-you-go with some free-tier options for light use

  * **GCP (Google Cloud Platform)**
    * **Services:** AI Platform | BigQuery ML | Kubernetes Engine (GKE) | TensorFlow Enterprise
    * **Key Features**
      * Deep integration with TensorFlow
      * Strong data analytics with BigQuery and Dataflow
      * AutoML and pre-built AI services for faster deployment
    * **Pricing:** Pay-per-use model with strong free-tier offerings for AI Platform

  * **Azure**
    * **Services:** Azure Machine Learning | Azure Kubernetes Service (AKS) | Azure Functions | Blob Storage
    * **Key Features**
      * Integrated with Microsoft services (Office 365, Azure DevOps)
      * Strong tooling for MLOps with automated pipelines
      * Good support for enterprise customers
    * **Pricing:** Competitive pricing with free-tier services for ML

* **Setting Up Cloud Infrastructure for MLOps**
  * Steps to Set Up Infrastructure on AWS
    * Set up an AWS account and create an IAM role for access control.
    * Use S3 for storing training datasets.
    * Train models on SageMaker and deploy endpoints.
    * Set up EKS (Elastic Kubernetes Service) for orchestrating ML workloads.
  * Steps to Set Up Infrastructure on GCP
    * Create a Google Cloud project and enable AI Platform.
    * Store data in Cloud Storage and train models using AI Platform.
    * Use GKE (Google Kubernetes Engine) for large-scale orchestration
  * Steps to Set Up Infrastructure on Azure
    * Create an Azure ML Workspace for tracking models.
    * Use Azure Kubernetes Service (AKS) for managing models at scale.
    * Use Azure DevOps to automate MLOps pipelines.

## Containerization with Docker
* Benefits of Using Docker for Reproducibility
  * Consistency Across Environments
  * Dependency Management
  * Scalability and Portability
* Creating Docker Images for ML Models
  * Dockerfile for ML Model
  * Build and Run the Docker Image

## Kubernetes for Orchestrating ML Workloads
* Basic Concepts of Kubernetes
  * Pod
  * Node
  * Deployment
  * Service
* Setting Up Kubernetes for Distributed ML Workloads
  * Install Kubernetes
  * Deploying ML Workloads on Kubernetes
  * Deploying on Kubernetes

## Setting Up Local MLOps Environments
* Tools and Practices for Local Development
  * Python Virtual Environments
  * Local Databases and Storage
  * Docker for Local Development
* Best Practices for Local MLOps
  * Unit Tests and Integration Tests
  * Version Control
  * Automated Builds

In [None]:
import pandas as pd
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
import joblib

housing= fetch_california_housing()


X_train, X_test, y_train, y_test = train_test_split(housing.data, housing.target, test_size=0.2, random_state=42)

scaler= StandardScaler()
X_train_scaled= scaler.fit_transform(X_train)
X_test_scaled= scaler.transform(X_test)

model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train_scaled, y_train)

joblib.dump(model, 'model.pkl')
print("Model Trained and Saved to model.pkl")

In [None]:
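from flask import Flask, request, jsonify
import joblib
import numpy as np

# Load the trained model once at startup
model= joblib.load('model.pkl')

app= Flask(__name__)

@app.route('/predict', methods=['POST'])
def predict():
  data= request.get_json()
  # Note: model.pkl was fit on standardized features, so inputs must be scaled
  # the same way (the scaler is not saved by the training cell above)
  features= np.array(data['features']).reshape(1, -1)
  # Cast the NumPy scalar to a plain float for JSON serialization
  prediction= float(model.predict(features)[0])
  return jsonify({'prediction': prediction})

if __name__ == '__main__':
  app.run(host='0.0.0.0', port=5001, debug=True)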


# http://localhost:5001
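
The `/predict` route above reads `data['features']` directly, so a malformed request surfaces as a server error. A small validation helper is one way to turn that into a clear client error. The sketch below is an assumption-laden illustration: `parse_features` and `FEATURE_COUNT` are hypothetical names, and the count of 8 comes from the housing data shown earlier (9 columns including `PRICE`).

```python
FEATURE_COUNT = 8  # the California housing data has 8 feature columns


def parse_features(payload):
    """Return the features as a list of floats, or raise ValueError."""
    if not isinstance(payload, dict) or "features" not in payload:
        raise ValueError("body must be a JSON object with a 'features' key")
    features = payload["features"]
    if not isinstance(features, list) or len(features) != FEATURE_COUNT:
        raise ValueError(f"'features' must be a list of {FEATURE_COUNT} numbers")
    for value in features:
        # bool is a subclass of int, so exclude it explicitly
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError("every feature must be a number")
    return [float(value) for value in features]


example = {"features": [8.3252, 41, 6.984, 1.023, 322, 2.555, 37.88, -122.231]}
print(parse_features(example))
```

Inside the Flask route, the helper could be called on `request.get_json(silent=True)`, with a caught `ValueError` returned as a JSON error message and a 400 status code.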

#### `requirements.txt`
```text
Flask==3.0.3
joblib==1.4.2
scikit-learn==1.5.2
```

#### `Dockerfile`
```docker
FROM python:3.10-slim
WORKDIR /app

COPY requirements.txt requirements.txt

RUN pip install --upgrade pip
RUN pip install -r requirements.txt

COPY . .

EXPOSE 5001

CMD ["python", "app.py"]
```

#### Terminal Command
```bash
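# Build the image from the Dockerfile in the current directory
docker build -t ml_model_app .

# Run the container, then call the API from a second terminal
docker run -p 5001:5001 ml_model_app
curl -X POST http://127.0.0.1:5001/predict -H "Content-Type: application/json" -d '{"features": [8.3252, 41, 6.984, 1.023, 322, 2.555, 37.88, -122.231]}'

# Install kubectl (macOS, Homebrew) and start a local Minikube cluster
brew install kubectl
minikube start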
```

#### `deployment.yaml`

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ml-model-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: ml-model
  template:
    metadata:
      labels:
        app: ml-model
    spec:
      containers:
      - name: ml-model-container
        image: ml_model_app:latest
        imagePullPolicy: Never
        ports:
        - containerPort: 5001
```

#### `service.yaml`
```yaml
apiVersion: v1
kind: Service
metadata:
  name: ml-model-service
spec:
  type: LoadBalancer
  selector:
    app: ml-model
  ports:
  - protocol: TCP
    port: 8080
    targetPort: 5001
```

```bash
eval $(minikube docker-env)
docker build -t ml_model_app:latest .
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
minikube service ml-model-service --url
# Use the URL printed by the previous command; the port below is from one run
curl -X POST http://127.0.0.1:62316/predict -H "Content-Type: application/json" -d '{"features": [8.3252, 41, 6.984, 1.023, 322, 2.555, 37.88, -122.231]}'
```
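
The `curl` calls above can also be issued from Python. The sketch below uses only the standard library; `build_predict_request` and `predict` are hypothetical helper names, and the base URL is an assumption that should be replaced with the address of the running container or the URL printed by `minikube service`.

```python
import json
import urllib.request


def build_predict_request(base_url, features):
    """Build a POST request for the /predict endpoint without sending it."""
    body = json.dumps({"features": features}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(base_url, features, timeout=5.0):
    """Send the request and return the 'prediction' field of the JSON reply."""
    req = build_predict_request(base_url, features)
    with urllib.request.urlopen(req, timeout=timeout) as response:
        return json.loads(response.read())["prediction"]


features = [8.3252, 41, 6.984, 1.023, 322, 2.555, 37.88, -122.231]
request = build_predict_request("http://127.0.0.1:5001", features)
print(request.get_method(), request.full_url)
```

Calling `predict("http://127.0.0.1:5001", features)` performs the actual network call and therefore requires the service to be running.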