# Machine Learning Operations (MLOps) - Part 1

> MLOps is a set of practices that aims to deploy and maintain machine learning models in production reliably and efficiently. The word is a compound of "machine learning" and the continuous development practice of DevOps in the software field.

Source: [Wikipedia](https://en.wikipedia.org/wiki/MLOps)

- MLOps is a broad topic with several components, such as:
    1. Experiment tracking
    1. Model packaging and versioning
    1. Model deployment for real time predictions
    1. Model deployment for batch predictions
    1. Deployment monitoring
    1. End-to-end automation
    1. etc
- In general, most modern data science organizations have at least a small MLOps team of 2-4 people. Bigger organizations can have entire teams of MLOps engineers.
- MLOps, or ML Engineering, has become a field in its own right, much like Data Science, Data Engineering or Data Analytics.
- Despite this, *"No knowledge is wasted!"* Even if you want to be a Data Scientist or Data Analyst, having basic MLOps knowledge will make your skills more well-rounded and make you more employable!
- Luckily for us, several frameworks are available nowadays that abstract away the complex and repetitive parts of MLOps.
- In this series of lessons, we'll focus on taking an ML model to production (model deployment).
    - We'll do it using a technology that we have already covered in this course: [Flask](https://flask.palletsprojects.com/en/2.0.x/)
    - You can simplify a lot of the development by using higher-level libraries such as [Bento ML](https://www.bentoml.com/). I'll leave this up to you to experiment with!

# Flask (Recap)
- Microservices architecture [Link](https://docs.google.com/presentation/d/1oT4tqCMQjpkF5ZchsnBEsqbUP6E3PcfrB-KGjRDjyHE)
![](../images/microservice.png)

# Local Serving
To use Flask to package and deploy our model locally, we need to do the following:
1. Train an ML model as per normal
1. Save the model as per normal
1. Test the saved model
1. [NEW] Create a `serve.py` file to serve the saved model as a Flask API
1. [NEW] Test the `serve.py` file

## 1. Train an ML model as per normal
- Download the dataset from here: [Link](https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip), then unzip it into the `data` directory so that the images sit under `../data/cats_and_dogs_filtered/`
- The code below is copied directly from [9.04-lesson-cnn](../../9.04-lesson-cnn/solution-code/solution-code.ipynb)
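The download-and-unzip step can also be scripted. The following is a minimal sketch using only the standard library; the helper names `download_file` and `extract_zip` are hypothetical and not part of the original lesson code.

```python
import urllib.request
import zipfile
from pathlib import Path

# URL taken from the lesson text above
DATASET_URL = "https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip"


def download_file(url, dest_path):
    """Download url to dest_path unless the file is already there."""
    dest_path = Path(dest_path)
    if not dest_path.exists():
        dest_path.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, str(dest_path))
    return dest_path


def extract_zip(zip_path, dest_dir):
    """Unzip zip_path into dest_dir and return the sorted top-level names."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
    return sorted(p.name for p in dest_dir.iterdir())
```

Assuming the notebook sits one level below the `data` directory (as the paths in the next cells suggest), calling `extract_zip(download_file(DATASET_URL, '../data/cats_and_dogs_filtered.zip'), '../data')` should leave the images under `../data/cats_and_dogs_filtered/`.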

In [1]:
# Imports
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.applications.mobilenet_v2 import MobileNetV2, preprocess_input

In [2]:
# Load the Train and Validation data
train_data = ImageDataGenerator(preprocessing_function=preprocess_input).flow_from_directory('../data/cats_and_dogs_filtered/train/', 
                                                                                             target_size=(224,224), 
                                                                                             class_mode='binary'
                                                                                            )

val_data = ImageDataGenerator(preprocessing_function=preprocess_input).flow_from_directory('../data/cats_and_dogs_filtered/validation/', 
                                                                                           target_size=(224,224), 
                                                                                           class_mode='binary'
                                                                                          )


In [None]:
# Import the desired pre-trained model
# List of pre-trained models: https://www.tensorflow.org/api_docs/python/tf/keras/applications
pre_trained_model = MobileNetV2(include_top=False, pooling='avg')

# Freeze the model so we don't accidentally change the pre-trained model
pre_trained_model.trainable = False

In [None]:
# Create our model architecture
trf_model = Sequential()

# Then add the pre-trained model to use Transfer Learning
trf_model.add(pre_trained_model)

# Finally add our custom modifications
trf_model.add(Dense(1, activation='sigmoid'))

trf_model.summary()

In [None]:
# Compile the model
opt = Adam(learning_rate=0.001)
trf_model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])

In [None]:
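# Fit model on training data
# The batch size is set by the data generator (flow_from_directory defaults to 32),
# so it is not passed to fit() here
history = trf_model.fit(train_data, 
                        validation_data=val_data,
                        epochs=5,
                       )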

## 2. Save the model as per normal

In [None]:
# Save the model
trf_model.save('cats_vs_dogs.h5')

## 3. Test the saved model as per normal

In [None]:
# Import necessary libraries
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image
import numpy as np

In [None]:
# Load the images into Python
images = ['../images/test.jpeg', '../images/test_1.jpeg']
test_images = [image.load_img(img, target_size=(224, 224)) for img in images]

# Convert the images to a matrix of numbers
test_images = [image.img_to_array(img).tolist() for img in test_images]

----
The code below this line is what we will reuse inside the Flask API

In [None]:
# Load the saved model
trf_model = load_model('cats_vs_dogs.h5')

In [None]:
# Preprocess
test_images = preprocess_input(np.array(test_images))

In [None]:
# Make predictions
result = trf_model.predict(test_images)

# Convert the probability to actual predictions
['Dog' if pred[0]>0.5 else 'Cat' for pred in result]
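The thresholding step above can be wrapped in a small helper so that the notebook and the API share it. This is only a sketch: `to_labels` is a hypothetical name, and it assumes the training folders were named `cats` and `dogs`, which `flow_from_directory` indexes alphabetically (cats as 0, dogs as 1). You can confirm the mapping with `train_data.class_indices`.

```python
def to_labels(probabilities, threshold=0.5, positive="Dog", negative="Cat"):
    """Map each sigmoid output (one value per image) to a class name."""
    return [positive if row[0] > threshold else negative for row in probabilities]


# Same structure as the output of trf_model.predict: one row with one probability per image
example = [[0.92], [0.08], [0.5]]
print(to_labels(example))  # ['Dog', 'Cat', 'Cat']
```

A probability of exactly 0.5 is labelled `Cat` because the comparison is a strict `>`, matching the list comprehension in the cell above.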

## 4. [NEW] Create a `serve.py` file to serve the saved model as a Flask API

Copy this code into a new `serve.py` file

```
from flask import Flask, request
import numpy as np
import os
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input
from tensorflow.keras.models import load_model

app = Flask('myApp')

# Load in the model outside any route so that we load the model only once
# Loading the model takes time (especially deep learning models)
# So if you happen to load the model inside a route, it'll load every single time a request is received which is very inefficient
trf_model = load_model('cats_vs_dogs.h5')

# route 1: Return Success status
@app.route('/')
def home():
    # Return a small JSON body with HTTP status 200 to signal that the server is up
    return {"success": True}, 200

# route 2: accept input data
# The POST method is used when we want to receive some data from the user
@app.route('/predict', methods = ['POST'])
def make_predictions():
    # Get the data sent over the API
    user_input = request.get_json(force=True)
    data = user_input['X']
    
    # Preprocess
    data = preprocess_input(np.array(data))

    # Make predictions
    result = trf_model.predict(data)

    # Convert the probability to actual predictions
    predictions = ['Dog' if pred[0]>0.5 else 'Cat' for pred in result]

    # return the results with our predictions
    return {'response': predictions}


if __name__ == '__main__':
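    # Run the App!
    # debug=True is convenient for local development; turn it off on any machine reachable from the internet
    app.run(host='0.0.0.0', debug=True, port=int(os.environ.get("PORT", 8080)))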
```
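The `/predict` route above trusts whatever the client sends. A minimal sketch of a payload check is shown below; `nested_shape` and `validate_payload` are hypothetical helpers, and the check only follows the first element at each nesting level, so it is a sanity check rather than full validation.

```python
EXPECTED_SHAPE = (224, 224, 3)  # height, width, RGB channels used during training


def nested_shape(value):
    """Return the shape of a regular nested list by following first elements."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        if not value:
            break
        value = value[0]
    return tuple(shape)


def validate_payload(payload):
    """Return an error message for a bad request body, or None if it looks usable."""
    if not isinstance(payload, dict) or "X" not in payload:
        return "Request body must be a JSON object with an 'X' key"
    shape = nested_shape(payload["X"])
    if len(shape) != 4 or shape[0] == 0:
        return "'X' must be a non-empty list of images"
    if shape[1:] != EXPECTED_SHAPE:
        return f"Each image must have shape {EXPECTED_SHAPE}, got {shape[1:]}"
    return None
```

Inside `make_predictions`, this could be called right after `request.get_json(force=True)`, returning `{'error': message}, 400` whenever a message comes back, in the same dictionary-plus-status style the home route already uses.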

## 5. [NEW] Test the `serve.py` file
- Open a new terminal and run the commands below to start the Flask server on your machine.
```
conda activate dsi-sg
cd 12.01-mlops/solution-code
python serve.py
```

In [13]:
import requests

In [14]:
# Load the images into Python
images = ['../images/test.jpeg', '../images/test_1.jpeg']
test_images = [image.load_img(img, target_size=(224, 224)) for img in images]

# Convert the images to a matrix of numbers
test_images = [image.img_to_array(img).tolist() for img in test_images]

In [15]:
user_input = {'X': test_images}

# IP address 127.0.0.1 is your local machine
response = requests.post('http://127.0.0.1:8080/predict', json=user_input)
response.json()

{'response': ['Dog', 'Cat']}
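`requests` turns the `user_input` dictionary into JSON text, which is why the images are converted with `.tolist()` first (NumPy arrays are not JSON serializable). The sketch below estimates how large that JSON body gets for one image. It uses a blank stand-in image rather than a real photo, and `json_payload_bytes` is a hypothetical helper.

```python
import json


def json_payload_bytes(images):
    """Size in bytes of the JSON body sent to /predict for the given images."""
    return len(json.dumps({"X": images}).encode("utf-8"))


# One blank 224x224 RGB image as nested lists, standing in for a real photo
blank_image = [[[0.0, 0.0, 0.0] for _ in range(224)] for _ in range(224)]

values = 224 * 224 * 3
size = json_payload_bytes([blank_image])
print(f"{values} numbers per image, about {size / 1e6:.2f} MB of JSON")
```

Every pixel value travels as text, so the request grows quickly with the number of images. This is fine for a couple of test images, but worth keeping in mind before sending large batches to the endpoint.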

# Cloud Serving
- Once we have the model file and `serve.py`, we can run the same `serve.py` file from a machine in the cloud as well!
- Running on a cloud virtual machine involves a few manual steps. They are not very user friendly, so we would not deploy this way in production. The steps are shown here only to demonstrate that the same code can run on any machine as long as its dependencies are met

## One-time setup on your GCP project
- Create a new firewall rule to allow internet traffic into your VM, as described here: [Link](https://serverfault.com/questions/831273/unable-to-reach-a-python-flask-enabled-web-server-on-gce).
- Beware! This is not secure, and you should not do this with your company's data!
- Instead, ask your company's IT team to set up the required firewall rules for you.

## One-time setup when creating the Virtual Machine for the first time
1. Create a server on the cloud. Let's use a Google Cloud Compute Engine VM to deploy this model
    - Change `Boot Disk` from `Debian` to `Ubuntu 20.04 LTS`
    - Allow HTTPS and HTTP traffic
1. Once the machine is created, connect to it by clicking on the SSH button.
1. Install Anaconda by running these commands in the SSH terminal
    ```
    # download anaconda for linux from the official website
    wget https://repo.anaconda.com/archive/Anaconda3-2021.11-Linux-x86_64.sh

    # install the downloaded anaconda (the installer is a bash script)
    bash Anaconda3-2021.11-Linux-x86_64.sh

    # source the bashrc to be able to access the newly installed conda
    source ~/.bashrc
    ```

## One-time setup for each new project
1. Create an Anaconda environment for the project
    ```
    # Create an anaconda environment for our model to run and activate it
    conda create --name=cats_v_dogs python=3.8 -y
    conda activate cats_v_dogs

    # Install any libraries that are needed in this environment
    pip install numpy flask tensorflow
    ```
1. Upload our model and serving files to the VM
1. Create a directory to store our files and move the uploaded files into it 
    ```
    mkdir cats_v_dogs 
    mv cats_vs_dogs.h5 serve.py cats_v_dogs/
    cd cats_v_dogs
    ls
    ```
1. Use tmux (terminal multiplexer) to start a terminal session in the background that won't get terminated when we close the SSH terminal window: `tmux new -s cats_v_dogs`
1. Activate the anaconda environment: `conda activate cats_v_dogs`
1. Run the serve.py file: `python serve.py`
1. Press `Ctrl+b`, then `d`, to detach from the tmux session. You can close the SSH terminal now if you want to
1. Copy the `External IP` of the VM from the GCP console webpage
1. As long as this VM is running, you can post requests to it from anywhere on the internet!
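Before sending images to the VM, it can help to confirm that the server is reachable through the firewall. The sketch below calls the home route (`/`) of `serve.py` using only the standard library; `endpoint_url` and `is_server_up` are hypothetical helpers, and `203.0.113.10` is a placeholder address, not a real VM.

```python
import json
import urllib.request


def endpoint_url(host, route="/", port=8080):
    """Build the URL for a route on the Flask server (8080 is the default port in serve.py)."""
    return f"http://{host}:{port}/{route.lstrip('/')}"


def is_server_up(host, port=8080, timeout=5):
    """Return True if the home route answers with {"success": true}."""
    try:
        with urllib.request.urlopen(endpoint_url(host, "/", port), timeout=timeout) as resp:
            body = json.loads(resp.read().decode("utf-8"))
            return isinstance(body, dict) and body.get("success") is True
    except (OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or a non-JSON reply
        return False


# Placeholder address: replace it with the External IP copied from the GCP console
print(endpoint_url("203.0.113.10", "/predict"))  # http://203.0.113.10:8080/predict
```

If `is_server_up` returns `False` for your VM's External IP, the usual suspects are the firewall rule, a stopped Flask process inside the tmux session, or a different port.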

## Every time you update your model or your serving code
1. Replace the files in the `cats_v_dogs` directory
1. Enter the tmux session: `tmux attach-session -t cats_v_dogs`
1. Stop the running Flask server: `Ctrl+C`
1. Start the Flask server again: `python serve.py`

In [22]:
# Test the server running on GCP
# Same code as testing locally, only the IP address changed!
import requests

# Load the images into Python
images = ['../images/test.jpeg', '../images/test_1.jpeg']
test_images = [image.load_img(img, target_size=(224, 224)) for img in images]

# Convert the images to a matrix of numbers
test_images = [image.img_to_array(img).tolist() for img in test_images]

In [24]:
user_input = {'X': test_images}
response = requests.post('http://35.232.69.180:8080/predict', json=user_input)
response.json()

{'response': ['Dog', 'Cat']}

# Conclusion
- You now have the model deployed as a real time endpoint on your local machine and on a GCP VM!
- This is one of the many ways to "deploy" a model
- Pros:
    - Simple and easy to implement
- Cons:
    - Not very robust. If the same machine is serving many models, each model's dependencies WILL start to clash with each other eventually!
    - Very susceptible to the "It works on my machine" cliché
    - Using Anaconda environments partly solves the issue of conflicting libraries, but doesn't solve the issue of conflicting OS dependencies.
    - There's an even better way, called "Docker"

![](../images/it-works-on-my-machine.jpeg)
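One small step that reduces the "it works on my machine" risk described above is to record the exact library versions of the environment where the model was trained, then install those same versions on the VM. The sketch below uses the standard library's `importlib.metadata` (Python 3.8+); `pinned_requirements` is a hypothetical helper, not part of the lesson code.

```python
from importlib import metadata


def pinned_requirements(packages):
    """Return 'name==version' lines for the installed packages, flagging missing ones."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            lines.append(f"# {name} is not installed in this environment")
    return lines


# The packages installed for serve.py in this lesson
print("\n".join(pinned_requirements(["numpy", "flask", "tensorflow"])))
```

Saving the printed lines to a `requirements.txt` file and running `pip install -r requirements.txt` on the VM keeps the Python libraries in sync. As noted in the list above, it does nothing for OS-level dependencies.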

# Test your knowledge
1. Try to package the ham vs spam TFIDF + SklearnNLP classifier model from [5.06-lesson-nlp-ii](../../5.06-lesson-nlp-ii/solution-code/solution-code.ipynb) as a Flask API.
1. Use [FastAPI](https://fastapi.tiangolo.com/tutorial/first-steps/) instead of Flask to get auto documentation and an even faster API