# Deploying models with Flask

## Table of contents

1. [Understanding model deployment with Flask](#understanding-model-deployment-with-flask)
2. [Setting up the environment](#setting-up-the-environment)
3. [Loading the pre-trained model](#loading-the-pre-trained-model)
4. [Creating a Flask web application](#creating-a-flask-web-application)
5. [Building RESTful APIs for model inference](#building-restful-apis-for-model-inference)
6. [Handling input data for predictions](#handling-input-data-for-predictions)
7. [Returning model predictions through Flask](#returning-model-predictions-through-flask)
8. [Testing the Flask app locally](#testing-the-flask-app-locally)
9. [Deploying the Flask app to the cloud](#deploying-the-flask-app-to-the-cloud)

## Understanding model deployment with Flask

### **Key concepts**
Deploying models with Flask involves creating a web application that serves machine learning models, enabling users to interact with them via HTTP requests. Flask, a lightweight Python web framework, provides a simple and flexible approach to building RESTful APIs for serving predictions from trained models. This approach is widely used for deploying models to production environments, where they can process real-time or batch requests from external systems.

Key components of deploying models with Flask include:
- **Flask application**: Defines routes to handle requests and responses.
- **Model loading**: Loads the trained model (e.g., PyTorch, TensorFlow) into memory for inference.
- **Prediction endpoints**: Provides RESTful APIs to accept input data, perform inference, and return predictions.
- **Scalability**: Can integrate with tools like Gunicorn and Docker for handling production-scale traffic.

Because the model sits behind a plain HTTP interface, any client that can send an HTTP request can use it, regardless of the language or platform that client is written in.

### **Applications**
Deploying models with Flask is widely used in:
- **Web applications**: Serving predictions directly to web interfaces for real-time interaction.
- **Mobile and IoT**: Providing APIs that mobile apps or IoT devices can call for predictions.
- **Enterprise systems**: Integrating models into business workflows or decision-making systems.
- **Data pipelines**: Embedding models as RESTful endpoints in larger data processing architectures.

### **Advantages**
- **Simplicity**: Flask’s minimalistic framework makes it easy to set up and deploy a model-serving API.
- **Flexibility**: Supports custom routing, middleware, and extensions to tailor the deployment environment.
- **Integration**: Easily integrates with databases, caching systems, and containerization tools like Docker.
- **Portability**: The application can be deployed on local machines, cloud platforms, or edge devices.

### **Challenges**
- **Concurrency**: Flask’s built-in development server is not designed for production traffic; a WSGI server such as Gunicorn is needed to run multiple workers.
- **Latency**: Real-time inference may face delays, especially for large models or complex computations.
- **Scalability**: Serving many concurrent requests typically means running several worker processes or containers behind a load balancer, each holding its own copy of the model in memory.
- **Security**: Ensuring secure data transmission and API access is critical in production environments.
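
One way to approach the API access part of the security point is to require a shared key on every request. The sketch below is framework-independent and only illustrates the idea; the header name `X-API-Key` and the helper `is_authorized` are assumptions made for this example, not part of Flask.

```python
import hmac


def is_authorized(headers, expected_key, header_name="X-API-Key"):
    """Return True when the request headers carry the expected API key.

    `headers` is any mapping of header names to values. The header name
    is a hypothetical choice for this sketch.
    """
    supplied = headers.get(header_name)
    if not supplied or not expected_key:
        return False
    # compare_digest runs in time that does not depend on where the two
    # values differ, which avoids leaking key prefixes through timing.
    return hmac.compare_digest(supplied.encode("utf-8"), expected_key.encode("utf-8"))
```

Inside a route, the check could be called with `request.headers` and a key read from the environment, returning a 401 response when it fails.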

## Setting up the environment


##### **Q1: How do you install the necessary libraries for Flask and machine learning model deployment using `pip`?**


In [1]:
# !pip install flask torch torchvision requests

##### **Q2: How do you import the required modules, such as Flask, PyTorch (or TensorFlow), and `requests` in Python?**


In [2]:
from flask import Flask, request, jsonify
import torch
import torch.nn as nn
import torchvision.transforms as transforms
import requests

##### **Q3: How do you set up the project directory structure for a Flask-based deployment?**


In [3]:
# use the following structure:

# project/
# ├── app.py                    # main Flask app
# ├── model.pth                 # saved PyTorch model
# ├── requirements.txt          # dependencies
# ├── Procfile                  # required for Heroku deployment
# └── utils/                    # optional: for helper modules
#     └── preprocess.py         # input preprocessing logic

In [4]:
# you can create it using Python, e.g.:
import os

os.makedirs('project/utils', exist_ok=True)
open('project/app.py', 'a').close()
open('project/model.pth', 'a').close()
open('project/requirements.txt', 'a').close()
open('project/Procfile', 'a').close()
open('project/utils/preprocess.py', 'a').close()

##### **Q4: How do you configure the environment to enable debug mode for the Flask application?**

In [6]:
# in app.py, pass debug=True when running the app locally
# app = Flask(__name__)

# if __name__ == '__main__':
#     app.run(debug=True)  # enables auto-reload and the interactive debugger; never enable this in production
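
A hard-coded debug flag is easy to ship by accident. A minimal sketch of deriving it from the environment instead is shown here; the variable name `APP_DEBUG` and the helper `debug_enabled` are hypothetical names chosen for illustration.

```python
import os


def debug_enabled(environ=None, var_name="APP_DEBUG"):
    """Return True only when the variable is set to a truthy string."""
    environ = os.environ if environ is None else environ
    value = environ.get(var_name, "")
    # Anything other than these values, including an unset variable,
    # leaves debug mode off.
    return value.strip().lower() in {"1", "true", "yes", "on"}
```

With this helper, the entry point could call `app.run(debug=debug_enabled())`, so debug mode stays off unless it is explicitly switched on.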

## Loading the pre-trained model


##### **Q5: How do you load a pre-trained model in PyTorch (or TensorFlow) for use in a Flask application?**


In [9]:
class SimpleModel(nn.Module):
    def __init__(self):
        super(SimpleModel, self).__init__()
        self.fc = nn.Linear(4, 3)  # example architecture for input size 4 and output size 3

    def forward(self, x):
        return self.fc(x)

In [10]:
model = SimpleModel()
torch.save(model.state_dict(), 'project/model.pth')

In [11]:
model.load_state_dict(torch.load('project/model.pth', map_location='cpu'))  # load to CPU by default
model.eval()  # set model to evaluation mode

SimpleModel(
  (fc): Linear(in_features=4, out_features=3, bias=True)
)

##### **Q6: How do you verify that the model is working correctly by testing it on sample input data before deploying it?**


In [12]:
sample_input = torch.randn(1, 4)  # single sample with 4 features
output = model(sample_input)
print('Sample output:', output)  # confirm output is tensor with shape [1, 3]

Sample output: tensor([[-0.0655,  0.4912, -0.7613]], grad_fn=<AddmmBackward0>)


##### **Q7: How do you handle the model’s device allocation (CPU/GPU) when loading it for deployment in a Flask app?**

In [13]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(device)

cuda


In [14]:
model = SimpleModel()
model.load_state_dict(torch.load('project/model.pth', map_location=device))
model.to(device)
model.eval()

SimpleModel(
  (fc): Linear(in_features=4, out_features=3, bias=True)
)

In [15]:
print('Model loaded on:', next(model.parameters()).device)  # confirm model device

Model loaded on: cuda:0


## Creating a Flask web application


##### **Q8: How do you initialize a basic Flask app in Python and set up the main app file?**


In [None]:
# app = Flask(__name__)  # initialize the Flask app

# if __name__ == '__main__':
#     app.run(debug=True)  # run app in debug mode for development

##### **Q9: How do you define a simple home route (`/`) that serves a basic welcome message in Flask?**


In [None]:
# @app.route('/')  # define the route for the home page
# def home():
#     return 'Welcome to the model inference API!'  # basic welcome message

##### **Q10: How do you set up route handling for API endpoints in Flask?**

In [None]:
# @app.route('/api/health', methods=['GET'])  # define an example API endpoint
# def health_check():
#     return jsonify({'status': 'ok'}), 200  # return JSON response with status code

## Building RESTful APIs for model inference


##### **Q11: How do you define a `/predict` route in Flask to handle POST requests for model inference?**


In [None]:
# @app.route('/predict', methods=['POST'])  # define the prediction endpoint
# def predict():
#     if not request.is_json:  # check if request has JSON
#         return jsonify({'error': 'Request must be in JSON format'}), 400

#     data = request.get_json()
#     # dummy placeholder: real input processing and model inference go here
#     return jsonify({'message': 'Prediction endpoint hit', 'received': data}), 200

##### **Q12: How do you set up the Flask route to accept input data in JSON format for the model prediction?**


In [None]:
# # the logic is included inside /predict route from Q11 using request.get_json()
# # here is the relevant part again in context:
# @app.route('/predict', methods=['POST'])
# def predict():
#     if not request.is_json:
#         return jsonify({'error': 'Request must be in JSON format'}), 400

#     input_data = request.get_json()  # parse JSON body from the request
#     # continue with validation and inference
#     return jsonify({'input_received': input_data}), 200

##### **Q13: How do you configure the Flask app to return appropriate status codes in response to the API requests?**

In [None]:
# @app.route('/predict', methods=['POST'])
# def predict():
#     if not request.is_json:
#         return jsonify({'error': 'Expected JSON data'}), 400  # return 400 if not JSON

#     input_data = request.get_json()

#     if 'features' not in input_data:  # check for expected key
#         return jsonify({'error': 'Missing "features" in input'}), 400  # return 400 if missing

#     return jsonify({'prediction': [0.1, 0.7, 0.2]}), 200  # dummy success response with 200

## Handling input data for predictions


##### **Q14: How do you parse input data from a JSON request in Flask using `request.get_json()`?**


In [None]:
# @app.route('/predict', methods=['POST'])
# def predict():
#     data = request.get_json()  # parse JSON input
#     features = data.get('features')  # retrieve the 'features' key
#     if features is None:
#         return jsonify({'error': 'Missing "features" in input'}), 400

#     return jsonify({'received_features': features}), 200

##### **Q15: How do you preprocess the input data before passing it to the model for prediction?**


In [None]:
# @app.route('/predict', methods=['POST'])
# def predict():
#     data = request.get_json()
#     features = data.get('features')
#     if not features or not isinstance(features, list):
#         return jsonify({'error': '"features" must be a non-empty list'}), 400

#     try:
#         input_tensor = torch.tensor(features, dtype=torch.float32).unsqueeze(0)  # shape [1, N]
#     except Exception as e:
#         return jsonify({'error': f'Invalid input format: {str(e)}'}), 400

#     return jsonify({'preprocessed_shape': list(input_tensor.shape)}), 200

##### **Q16: How do you validate the input data format in Flask to ensure it matches the model’s expected input shape?**

In [None]:
# @app.route('/predict', methods=['POST'])
# def predict():
#     data = request.get_json()
#     features = data.get('features')
#     if not features or not isinstance(features, list):
#         return jsonify({'error': '"features" must be a list'}), 400

#     input_tensor = torch.tensor(features, dtype=torch.float32).unsqueeze(0)  # shape [1, N]
#     if input_tensor.shape[1] != 4:  # expected input size is 4 for SimpleModel
#         return jsonify({'error': f'Expected 4 input features, got {input_tensor.shape[1]}'}), 400

#     return jsonify({'valid_input': True}), 200
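
The shape check can also be done on the raw JSON payload before any tensor is built, which catches non-numeric values early. The following is a sketch under the assumption of the 4-feature `SimpleModel`; the helper name `validate_features` is hypothetical.

```python
from numbers import Real


def validate_features(payload, expected_len=4):
    """Return (features, error); exactly one of the two is None."""
    if not isinstance(payload, dict):
        return None, "Request body must be a JSON object"
    features = payload.get("features")
    if not isinstance(features, list) or not features:
        return None, '"features" must be a non-empty list'
    if len(features) != expected_len:
        return None, f"Expected {expected_len} input features, got {len(features)}"
    for value in features:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, Real):
            return None, '"features" must contain only numbers'
    return [float(v) for v in features], None
```

A route could call this on the parsed JSON body and return the error string with a 400 status whenever the second element is not `None`.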

## Returning model predictions through Flask


##### **Q17: How do you run the model’s inference on the preprocessed input data in Flask?**


In [None]:
# @app.route('/predict', methods=['POST'])
# def predict():
#     data = request.get_json()
#     features = data.get('features')
#     if not features or not isinstance(features, list):
#         return jsonify({'error': '"features" must be a list'}), 400

#     input_tensor = torch.tensor(features, dtype=torch.float32).unsqueeze(0)  # shape [1, 4]
#     if input_tensor.shape[1] != 4:
#         return jsonify({'error': f'Expected 4 input features, got {input_tensor.shape[1]}'}), 400

#     input_tensor = input_tensor.to(device)
#     with torch.no_grad():
#         outputs = model(input_tensor)  # run model inference

#     return jsonify({'raw_output': outputs.tolist()}), 200

##### **Q18: How do you format the model’s output into a JSON response?**


In [None]:
# @app.route('/predict', methods=['POST'])
# def predict():
#     data = request.get_json()
#     features = data.get('features')
#     if not features or not isinstance(features, list):
#         return jsonify({'error': '"features" must be a list'}), 400

#     input_tensor = torch.tensor(features, dtype=torch.float32).unsqueeze(0)
#     if input_tensor.shape[1] != 4:
#         return jsonify({'error': f'Expected 4 input features, got {input_tensor.shape[1]}'}), 400

#     input_tensor = input_tensor.to(device)
#     with torch.no_grad():
#         logits = model(input_tensor)
#         probs = torch.softmax(logits, dim=1)  # convert to probabilities
#         predicted_class = torch.argmax(probs, dim=1).item()

#     return jsonify({
#         'predicted_class': predicted_class,
#         'probabilities': probs.squeeze().tolist()
#     }), 200
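
To make the softmax and argmax step concrete without PyTorch, here is a pure-Python stand-in that turns a flat list of logits into the same response structure. It is an illustrative sketch, not the code the route runs; `format_prediction` is a hypothetical helper, and rounding to four decimals is an arbitrary choice.

```python
import math


def format_prediction(logits):
    """Turn a flat list of logits into a JSON-serializable prediction dict."""
    # Subtracting the maximum keeps exp() from overflowing on large logits.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Index of the largest probability, i.e. the argmax.
    predicted_class = max(range(len(probs)), key=probs.__getitem__)
    return {
        "predicted_class": predicted_class,
        "probabilities": [round(p, 4) for p in probs],
    }
```

Every value in the returned dict is a plain `int` or `float`, which is what `jsonify` needs; tensors themselves are not JSON-serializable.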

##### **Q19: How do you return the JSON response with the prediction results to the client in Flask?**

In [None]:
# already handled in Q18 with `jsonify(...)` and status code
# here's a minimal final structure:

# @app.route('/predict', methods=['POST'])
# def predict():
#     data = request.get_json()
#     features = data.get('features')
#     if not features or not isinstance(features, list):
#         return jsonify({'error': '"features" must be a list'}), 400

#     input_tensor = torch.tensor(features, dtype=torch.float32).unsqueeze(0)
#     if input_tensor.shape[1] != 4:
#         return jsonify({'error': f'Expected 4 input features, got {input_tensor.shape[1]}'}), 400

#     input_tensor = input_tensor.to(device)
#     with torch.no_grad():
#         logits = model(input_tensor)
#         probs = torch.softmax(logits, dim=1)
#         predicted_class = torch.argmax(probs, dim=1).item()

#     return jsonify({
#         'prediction': {
#             'class': predicted_class,
#             'probabilities': probs.squeeze().tolist()
#         }
#     }), 200

## Testing the Flask app locally


##### **Q20: How do you use `curl` to send POST requests with input data to the Flask app for testing?**


In [None]:
# run this in terminal (not in Python):
# curl -X POST http://127.0.0.1:5000/predict \
#      -H "Content-Type: application/json" \
#      -d "{\"features\": [0.1, 0.2, 0.3, 0.4]}"

In [None]:
# expected JSON response (example):
{
  "prediction": {
    "class": 1,
    "probabilities": [0.12, 0.78, 0.10]
  }
}
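
The same request can be sent from Python. The sketch below uses only the standard library; the function names are hypothetical, and the URL is assumed to be the local development address used in the `curl` command.

```python
import json
import urllib.request


def build_predict_request(url, features):
    """Build (but do not send) a JSON POST request for the /predict route."""
    body = json.dumps({"features": features}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_predict_request(url, features, timeout=5.0):
    """Send the request and return the decoded JSON response body.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    request = build_predict_request(url, features)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the app running, `send_predict_request('http://127.0.0.1:5000/predict', [0.1, 0.2, 0.3, 0.4])` would return the parsed response. With the `requests` library imported earlier, `requests.post(url, json={'features': [...]})` is the shorter equivalent.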

##### **Q21: How do you use Postman to test the Flask API by sending input data and receiving predictions?**


In [None]:
# 1. Open Postman.
# 2. Set method to POST.
# 3. Enter URL: http://127.0.0.1:5000/predict
# 4. Go to 'Body' tab → choose 'raw' → select 'JSON' from the dropdown.
# 5. Enter input:
{
  "features": [0.1, 0.2, 0.3, 0.4]
}
# 6. Hit 'Send' and inspect the response.
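
For repeatable checks that need neither Postman nor a running server, Flask's built-in test client can call the route in-process. The sketch below is self-contained and uses `fake_model`, a hypothetical stand-in that returns fixed probabilities, in place of the real PyTorch model.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def fake_model(features):
    """Hypothetical stand-in for the real model: returns fixed probabilities."""
    return [0.1, 0.7, 0.2]


@app.route("/predict", methods=["POST"])
def predict():
    data = request.get_json(silent=True)
    features = data.get("features") if isinstance(data, dict) else None
    if not isinstance(features, list) or len(features) != 4:
        return jsonify({"error": '"features" must be a list of 4 numbers'}), 400
    probs = fake_model(features)
    body = {"prediction": {"class": probs.index(max(probs)), "probabilities": probs}}
    return jsonify(body), 200


def run_checks():
    """Exercise the route in-process and return the two responses."""
    client = app.test_client()
    ok = client.post("/predict", json={"features": [0.1, 0.2, 0.3, 0.4]})
    bad = client.post("/predict", json={"features": [0.1]})
    return ok, bad
```

Calling `run_checks()` returns one successful and one rejected response, whose `status_code` and `get_json()` values can be asserted in a test suite.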

##### **Q22: How do you debug common issues such as incorrect input formats or missing model files in Flask?**

In [None]:
# use structured error handling and logging:

# @app.route('/predict', methods=['POST'])
# def predict():
#     try:
#         data = request.get_json(force=True)
#         features = data.get('features')
#         if not features or not isinstance(features, list):
#             return jsonify({'error': '"features" must be a list'}), 400

#         input_tensor = torch.tensor(features, dtype=torch.float32).unsqueeze(0)
#         if input_tensor.shape[1] != 4:
#             return jsonify({'error': f'Expected 4 features, got {input_tensor.shape[1]}'}), 400

#         input_tensor = input_tensor.to(device)
#         with torch.no_grad():
#             logits = model(input_tensor)
#             probs = torch.softmax(logits, dim=1)
#             predicted_class = torch.argmax(probs, dim=1).item()

#         return jsonify({
#             'prediction': {
#                 'class': predicted_class,
#                 'probabilities': probs.squeeze().tolist()
#             }
#         }), 200

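#     except FileNotFoundError:
#         # only reachable if the model is loaded inside the route; when it is loaded
#         # at startup, a missing model.pth fails before the server starts
#         return jsonify({'error': 'Model file not found'}), 500
#     except Exception as e:
#         app.logger.exception('Prediction failed')  # log the full traceback
#         return jsonify({'error': str(e)}), 500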
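
The logging side of this pattern can be sketched independently of Flask. In the example below, `safe_predict` and the logger name `model_api` are hypothetical; the idea is to separate client errors from server failures and to keep internal details in the log rather than in the response.

```python
import logging

logger = logging.getLogger("model_api")


def safe_predict(payload, predict_fn):
    """Call predict_fn(payload) and map failures to (body, status) pairs."""
    try:
        return {"prediction": predict_fn(payload)}, 200
    except (KeyError, TypeError, ValueError) as exc:
        # Client-side problems: log at warning level and report a 400.
        logger.warning("Rejected request: %s", exc)
        return {"error": "Invalid input"}, 400
    except Exception:
        # logger.exception records the full traceback for later inspection.
        logger.exception("Unexpected failure during inference")
        return {"error": "Internal server error"}, 500
```

Returning a generic message for the 500 case is a design choice: the traceback stays in the server log, where it is useful for debugging, and is not exposed to the client.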

## Deploying the Flask app to the cloud


##### **Q23: How do you set up a `Procfile` for deploying the Flask app to Heroku?**


In [16]:
with open('project/Procfile', 'w') as f:
    # defines the command Heroku will run; 'app:app' is the `app` object in app.py,
    # and gunicorn must be listed in requirements.txt
    f.write('web: gunicorn app:app\n')
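
Heroku, like most container platforms, passes the listening port to the process through the `PORT` environment variable. When the app is started directly with `app.run()` rather than through a WSGI server, it has to read that variable itself. A minimal sketch, with `resolve_bind` as a hypothetical helper and 5000 as an assumed local default:

```python
import os


def resolve_bind(environ=None, default_port=5000):
    """Return the (host, port) pair the server should listen on."""
    environ = os.environ if environ is None else environ
    port = int(environ.get("PORT", default_port))
    # 0.0.0.0 accepts connections from outside the container or dyno;
    # 127.0.0.1 would only be reachable from the same machine.
    return "0.0.0.0", port
```

The result can be passed straight to the entry point, for example `host, port = resolve_bind()` followed by `app.run(host=host, port=port)`.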

##### **Q24: How do you deploy the Flask app to Heroku and test the live API?**


In [None]:
# # 1. Write requirements.txt before committing so Heroku can install the dependencies
# pip freeze > requirements.txt

# # 2. Initialize git (if not already done) and commit
# git init
# git add .
# git commit -m "Initial commit"

# # 3. Login to Heroku CLI
# heroku login

# # 4. Create a new Heroku app
# heroku create flask-model-api

# # 5. Deploy to Heroku (push 'master' instead if that is your branch name)
# git push heroku main

# # 6. Open the app or test the endpoint
# heroku open
# # or test via (use the app URL printed by 'heroku create'):
# curl -X POST https://flask-model-api.herokuapp.com/predict \
#      -H "Content-Type: application/json" \
#      -d "{\"features\": [0.1, 0.2, 0.3, 0.4]}"

##### **Q25: How do you deploy the Flask app to AWS or Google Cloud for real-time model serving?**


In [None]:
# # Option A: AWS Elastic Beanstalk
# 1. Install EB CLI and configure:
#    eb init -p python-3.9 flask-model-api
# 2. Create environment:
#    eb create flask-env
# 3. Deploy:
#    eb deploy

# # Option B: Google Cloud Run
# 1. Containerize the app using Docker:
#    - Create a Dockerfile in project root:
#      ------------------
#      FROM python:3.9
#      WORKDIR /app
#      COPY . .
#      RUN pip install -r requirements.txt
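#      # Cloud Run routes traffic to $PORT, so bind gunicorn to it (add gunicorn to requirements.txt)
#      CMD exec gunicorn --bind :$PORT app:app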
#      ------------------
# 2. Deploy:
#    gcloud builds submit --tag gcr.io/YOUR_PROJECT_ID/flask-model-api
#    gcloud run deploy flask-model-api --image gcr.io/YOUR_PROJECT_ID/flask-model-api --platform managed

# # After deployment, the service URL is returned by the CLI for testing

In [17]:
dockerfile_content = '''\
FROM python:3.9
WORKDIR /app
COPY . .
RUN pip install --no-cache-dir -r requirements.txt
# Cloud Run routes traffic to $PORT; gunicorn must be listed in requirements.txt
CMD exec gunicorn --bind :$PORT app:app
'''

with open('project/Dockerfile', 'w') as f:
    f.write(dockerfile_content)

##### **Q26: How do you test the deployed Flask API by sending remote requests to the live application?**

In [None]:
# use curl or any HTTP client with your live app URL (replace <your-app-url>)

# curl -X POST https://<your-app-url>/predict \
#      -H "Content-Type: application/json" \
#      -d "{\"features\": [0.1, 0.2, 0.3, 0.4]}"

# you should receive a JSON prediction response with class and probabilities.
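
Once a response comes back from the live service, its structure can be checked programmatically. The sketch below assumes the response format used in this guide (a `prediction` object with `class` and `probabilities`) and the three-class `SimpleModel`; `is_valid_prediction` is a hypothetical helper.

```python
def is_valid_prediction(body, num_classes=3, tolerance=1e-3):
    """Return True when `body` matches the response shape used in this guide."""
    prediction = body.get("prediction") if isinstance(body, dict) else None
    if not isinstance(prediction, dict):
        return False
    predicted_class = prediction.get("class")
    probs = prediction.get("probabilities")
    # bool is a subclass of int, so exclude it explicitly.
    if not isinstance(predicted_class, int) or isinstance(predicted_class, bool):
        return False
    if not isinstance(probs, list) or len(probs) != num_classes:
        return False
    if not all(isinstance(p, (int, float)) and 0.0 <= p <= 1.0 for p in probs):
        return False
    # The class index must be in range and the probabilities must sum to ~1.
    return 0 <= predicted_class < num_classes and abs(sum(probs) - 1.0) <= tolerance
```

Running this check against the parsed JSON of each remote response gives a quick smoke test after every deployment.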

In [18]:
import shutil

shutil.rmtree('project')  # deletes the entire 'project' directory and its contents