# Model Deployment

### Introduction

Model deployment has always been an overly complex part of the MLOps process. There is a lot of content online on how to build a model, but most tutorials stop there. Content that does cover deployment often has you spin up Kubernetes clusters and a load balancer to test even the most basic of models. Deploying models is definitely getting easier, but we think it still has some way to go, and deployment is just one small part of the puzzle.

When we think about deploying a model there are a few questions we like to ask:
* How quickly do I want to get this model up and running?
* Is the model going into production or is it in a testing/beta phase?
* How will I be serving data to this model? Streaming, batch or API inference?
* What scale do we expect this model to reach?

There are many other questions you should ask, depending on your existing architecture, but in this tutorial we will show you how to use [Gradient](https://gradient.run/) to deploy your ML models. Gradient is a low-code model deployment platform that abstracts away a lot of the complexity, such as setting up infrastructure, scaling, and frameworks. To answer our own questions, this is when we like to use Gradient:

* Quickly (10 minutes)
* Production or just quickly testing with a subset of users.
* API inference
* We would be comfortable with Gradient handling a few thousand transactions per second.

### What you are going to learn

In this tutorial, we will be building a transaction fraud detection model, and deploying it as a REST API using Gradient.

By the end of this tutorial you will be able to:
- Add a dataset in Gradient
- Create a FastAPI application for your ML Service
- Deploy your machine learning service to Gradient


## Tutorial


Prerequisites:
- Install Python3.8+
- Install JupyterLab

You should download the data required for this tutorial from [here](https://drive.google.com/file/d/1MidRYkLdAV-i0qytvsflIcKitK4atiAd/view?usp=sharing). This is originally from a [Kaggle dataset](https://www.kaggle.com/competitions/ieee-fraud-detection/data) for Fraud Detection. Place this dataset in a `data` directory in the root of your project. You can run this notebook either in VS Code, Jupyter Notebooks or Colab.

### Build a model

First, let's build a quick model to detect fraudulent transactions. Model building is out of scope for this tutorial, so we will only build an extremely basic model.

We will need a number of libraries, so let's install them.

```bash
pip install scikit-learn==1.0.2 pandas==1.4.3 numpy==1.23.2 xgboost==1.5.1
```


```python
import joblib
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder
from xgboost import XGBClassifier

# Load the data from the `data` directory, then downsample the legitimate
# transactions so that both target classes are the same size
df = pd.read_csv("data/train_transaction.csv")
fraud = df[df.isFraud == 1]
legit = df[df.isFraud == 0].sample(n=len(fraud))
df = pd.concat([legit, fraud], axis=0)

# Select the features and target
X = df[["ProductCD", "P_emaildomain", "R_emaildomain", "card4", "M1", "M2", "M3"]]
y = df.isFraud

# Use one-hot encoding to encode the categorical features
enc = OneHotEncoder(handle_unknown="ignore")
enc.fit(X)

X = pd.DataFrame(enc.transform(X).toarray(), columns=enc.get_feature_names_out().reshape(-1))
# Append the numeric amount; to_numpy() avoids index alignment with the new frame
X["TransactionAmt"] = df[["TransactionAmt"]].to_numpy()

# Split the dataset and train the model
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
xgb = XGBClassifier(
    n_estimators=100,
    learning_rate=0.1,
    max_depth=3,
    min_child_weight=1,
    gamma=0,
    subsample=0.8,
    colsample_bytree=0.8,
    objective="binary:logistic",
    nthread=4,
    scale_pos_weight=1,
    seed=27,
)
model = xgb.fit(X_train, y_train)

# Save the encoder and model to the project root
joblib.dump(enc, "ohe_fraud_encoder.joblib")
joblib.dump(model, "xgb_fraud_model.joblib")
```

The cell output should be `['xgb_fraud_model.joblib']`.
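
The training code holds out a test set but never scores the model on it. As a rough sanity check before deploying, precision and recall for the fraud class can be computed with a small helper. This is a minimal sketch in plain Python; `precision_recall` is a hypothetical helper name, not part of scikit-learn or XGBoost.

```python
from typing import Sequence, Tuple


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (precision, recall) for the positive class (1 = fraud)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    false_pos = sum(1 for t, p in pairs if t == 0 and p == 1)
    false_neg = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Guard against division by zero when a class is never predicted or never present
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall
```

In the notebook, it could be called as `precision_recall(list(y_test), list(model.predict(X_test)))`.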

When you're done, create a `requirements.txt` file in the root of your project and fill it with the following lines. These are the versions we ran our model with; if yours differ, update the file to match.

```
xgboost==1.5.1
pandas==1.4.3
numpy==1.23.2
fastapi
uvicorn
joblib
scikit-learn==1.0.2
```
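
Because a model saved with `joblib` is sensitive to library versions, it can help to compare `requirements.txt` with what is actually installed. The sketch below uses only the standard library (`importlib.metadata`, Python 3.8+). The helper names are hypothetical, and it only understands plain `name` or `name==version` lines.

```python
from importlib import metadata
from typing import Callable, Dict, List, Optional


def parse_requirements(text: str) -> Dict[str, Optional[str]]:
    """Map each package name to its pinned version, or None if it is unpinned."""
    pins: Dict[str, Optional[str]] = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        name, separator, version = line.partition("==")
        pins[name.strip()] = version.strip() if separator else None
    return pins


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(
    pins: Dict[str, Optional[str]],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> List[str]:
    """Describe every package that is missing or differs from its pin."""
    problems = []
    for name, wanted in pins.items():
        found = lookup(name)
        if found is None:
            problems.append(f"{name}: not installed")
        elif wanted is not None and found != wanted:
            problems.append(f"{name}: pinned {wanted}, installed {found}")
    return problems
```

For example, `find_mismatches(parse_requirements(open("requirements.txt").read()))` returns an empty list when the environment matches the file.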

### Create a FastAPI Application

Now, we need to create an application that will serve our model. We are going to use the [FastAPI](https://fastapi.tiangolo.com/) framework, which is a lightweight, fast, and flexible framework for building REST APIs in Python.

Save the following in a file called `app.py` in the root of your project.

```python
import pandas as pd
import joblib

from fastapi import FastAPI, Request
import uvicorn

# The training step saved both files to the project root
ENCODER_PATH = "ohe_fraud_encoder.joblib"
MODEL_PATH = "xgb_fraud_model.joblib"

app = FastAPI()


@app.get("/")
def health_check():
    return "Healthy!"


@app.post("/fraud-classfier")
async def fraud_prediction(request: Request):
    request_data = await request.json()
    df = pd.DataFrame(request_data)

    # Preprocessing
    categorical_cols = [
        "ProductCD",
        "P_emaildomain",
        "R_emaildomain",
        "card4",
        "M1",
        "M2",
        "M3",
    ]
    X = df[categorical_cols]
    enc = joblib.load(ENCODER_PATH)
    X = pd.DataFrame(
        enc.transform(X).toarray(), columns=enc.get_feature_names_out().reshape(-1)
    )
    X["TransactionAmt"] = df[["TransactionAmt"]].to_numpy()

    # XGBoost Classifier
    model = joblib.load(MODEL_PATH)
    pred = model.predict(X)

    response_map = {0: "Legitimate", 1: "Fraud"}
    return [response_map[prediction] for prediction in pred]


if __name__ == "__main__":
    uvicorn.run("app:app", host="0.0.0.0", port=8000)
```

As you can see, our code hasn't changed much from our original model. What we have done is:

* Loaded the encoder and model files that we trained
* Created an API endpoint at `/` to check the health status of our API, as well as `/fraud-classfier` (spelled as in the code), which does all the work of returning a prediction to the user
* Transformed the variables from the API request, run the model's predict function, and returned whether each transaction was fraudulent or not
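
The endpoint passes the request body straight to `pd.DataFrame` and then selects the seven categorical columns plus `TransactionAmt`, so it expects a JSON array of objects that each carry those fields. A minimal sketch of a pre-check is shown here. `validate_records` is a hypothetical helper, not part of FastAPI or of the app above; it only checks shape and field names, not values.

```python
from typing import Any, List

# Columns that app.py reads from the request body
REQUIRED_FIELDS = (
    "ProductCD",
    "P_emaildomain",
    "R_emaildomain",
    "card4",
    "M1",
    "M2",
    "M3",
    "TransactionAmt",
)


def validate_records(payload: Any) -> List[str]:
    """Return a list of problems; an empty list means the payload looks usable."""
    if not isinstance(payload, list) or not payload:
        return ["payload must be a non-empty JSON array of transaction objects"]

    problems = []
    for index, record in enumerate(payload):
        if not isinstance(record, dict):
            problems.append(f"record {index}: expected an object")
            continue
        missing = [field for field in REQUIRED_FIELDS if field not in record]
        if missing:
            problems.append(f"record {index}: missing {', '.join(missing)}")
    return problems
```

A check like this could run at the top of `fraud_prediction` so that malformed requests get a clear message instead of a pandas `KeyError`.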

You can then run the API with the command:
```bash
python app.py
```
Test the application by sending a POST request to http://localhost:8000/fraud-classfier with the following data:

```json
[{
    "TransactionID":3366167,
    "isFraud":0,
    "TransactionAmt":495.0,
    "ProductCD":"W",
    "card4":"visa",
    "P_emaildomain":"live.com",
    "R_emaildomain":null,
    "M1":"T",
    "M2":"T",
    "M3":"T"
}]
```

The output should be `["Fraud"]`, indicating that the transaction is fraudulent. The endpoint maps the model's `1` to the label `Fraud` and its `0` to `Legitimate`.
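
One way to send this request without leaving Python is sketched below, using only the standard library. The function names are hypothetical; the URL and the list-of-records body follow the example above.

```python
import json
import urllib.request
from typing import Any, Dict, List


def build_prediction_request(url: str, records: List[Dict[str, Any]]) -> urllib.request.Request:
    """Build a JSON POST request carrying a list of transaction records."""
    body = json.dumps(records).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(url: str, records: List[Dict[str, Any]], timeout: float = 10.0) -> Any:
    """Send the records to the API and return the decoded JSON response."""
    request = build_prediction_request(url, records)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the app running locally, `predict("http://localhost:8000/fraud-classfier", records)` sends the data, where `records` is the list shown above with `null` written as `None`.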

### Deployment A: Gradient

Great! Now we have a model and an application. We need to deploy it.

Navigate to https://gradient.run/ and hit the *Sign up* button.

![Gradient Landing Page](media/grad_landing.png)

Once you have signed up, you should now be greeted with a page that prompts you to create a new project. However, in order to use Gradient, you will need to enroll in the **Pro Plan**, so let's do that first. Hit the upgrade button at the top of the page and follow the prompts to upgrade.

![Gradient Create Project](media/grad_new_project.png)

Once you're done, hit the *Back To Console* button to head back to the project creation screen, and then hit the *Create Project* button. Name the project *fraud-classifier*.

In order to deploy, we need to create a **Gradient Dataset**. Gradient will use this as local storage during application building. Navigate to the *Data* tab and hit the *Create A Dataset* button. Name the dataset *fraud-dataset*. 

![Gradient Create Dataset](media/grad_data.png)

You can then hit the back button to return to the *Data* tab and you should see the dataset.

![Gradient Dataset](media/grad_datasets.png)

Time to create a deployment! Navigate to the *Deployment* tab and hit the *Create Deployment* button. You should be greeted with a screen that looks like this:

![Gradient Create Deployment](media/grad_deployment.png)

Have your console at the ready! We are going to use the terminal to deploy our application. The first thing to do is install the `gradient` package.

```bash
pip install -U gradient
```

Side note: I was getting a `TypeError` when running any Gradient command. I use Conda as my package manager, so instead of `pip install -U gradient` I ran `conda install gradient`, and everything worked after that.

Once the package is installed, we need to authenticate. We will use the *gradient* CLI to do this. Use the `apiKey` command to authenticate, pasting in your API key as the argument.

```bash
gradient apiKey <your-api-key>
```

To continue, you will need to push your code to a GitHub repository. Create a new **public** repository, push your code to the `master` branch, and note the URL of the repository. You will need the following files as part of your repo:
- app.py
- requirements.txt
- xgb_fraud_model.joblib
- ohe_fraud_encoder.joblib

Instead of creating a new repository, you can also fork our repository and edit your deployment.yaml as specified below.

Gradient uses a simple deployment file to deploy your application.
In the root of your project, create a file called `deployment.yaml` and save the following contents (fill in the ID of the `fraud-dataset` and the URL of your GitHub repository). If you are forking the repository, just update the values in `deployment.yaml`.

```yaml
image: python:3.8-bullseye
port: 8000
command:
  - /bin/sh
  - '-c'
  - |
    cd /opt/repos/repo
    pip install -r requirements.txt
    python app.py
repositories:
  dataset: <fraud-dataset-id>
  mountPath: /opt/repos
  repositories:
    - url: <your-github-repo-url>
      name: repo
      ref: master
resources:
  replicas: 1
  instanceType: C5
```



Key concepts to mention in this deployment file:
- `image`: The image to use for the container.
- `port`: The port to expose the application on.
- `command`: The series of commands to run in the container. In this case, we install our needed packages and then run the application.
- `repositories`: The repositories to mount in the container. We specify the dataset to use for local storage, the mount path, and the GitHub repository to pull the code from. The dataset stays up to date with the Git repository.
- `resources`: The resources to allocate to the container. We specify the number of replicas and the instance type (use the C5 instance to avoid any hiccups). 
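
An easy mistake at this step is deploying with a placeholder such as `<fraud-dataset-id>` still in the file. A minimal sketch of a check is shown below. It treats any `<...>` token without spaces as an unfilled placeholder, which is an assumption that holds for the file above but not for YAML in general; `unfilled_placeholders` is a hypothetical helper name.

```python
import re
from typing import List, Tuple

# Matches tokens like <fraud-dataset-id>: angle brackets with no whitespace inside
PLACEHOLDER = re.compile(r"<[^<>\s]+>")


def unfilled_placeholders(spec_text: str) -> List[Tuple[int, str]]:
    """Return (line number, placeholder) pairs still present in the spec text."""
    found = []
    for line_number, line in enumerate(spec_text.splitlines(), start=1):
        for match in PLACEHOLDER.findall(line):
            found.append((line_number, match))
    return found
```

Running `unfilled_placeholders(open("deployment.yaml").read())` before deploying should return an empty list once both values are filled in.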

Finally, we can deploy. Use the `deployments` command to do this.

```bash
gradient deployments create --name <your-service-name> --projectId <your-project-id> --spec ./deployment.yaml
```

We're done! You can watch the service build in the *Workflows* tab. 

![Gradient Workflows](media/grad_workflow.png)


Once it's built, navigate to the *Deployments* tab and select your service. Click the endpoint URL to see the application running (it should say *Healthy!*). Give it about 5 minutes once it has finished building before you check if it's working.

![Gradient Final Deployment](media/grad_endpoint.png)
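
Rather than refreshing the endpoint by hand, the health check at `/` can be polled until it answers. The sketch below uses only the standard library. The function names and the defaults (30 attempts, 10 seconds apart, roughly five minutes in total) are assumptions chosen to match the wait suggested above.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def fetch_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code of a GET request, or 0 if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # the server answered, but with an error status
    except OSError:
        return 0  # connection refused, DNS failure, timeout, and similar


def wait_until_healthy(
    url: str,
    attempts: int = 30,
    delay: float = 10.0,
    fetch: Callable[[str], int] = fetch_status,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the URL until it returns 200 or the attempts run out."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)  # no need to wait after the final attempt
    return False
```

Calling `wait_until_healthy("<YOUR_ENDPOINT>/")` with your real endpoint URL returns `True` as soon as the service responds.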

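You can now send a POST request to the endpoint as before to check that inference is running correctly. Send the following JSON body to `<YOUR_ENDPOINT>/fraud-classfier`. As in the local test, the body must be a list of transactions, and each one needs a `TransactionAmt` because `app.py` reads that column. The amount below is only an illustrative value.

```json
[{
  "TransactionAmt": 50.0,
  "ProductCD": "H",
  "P_emaildomain": "gmail.com",
  "R_emaildomain": "",
  "card4": "visa",
  "M1": "",
  "M2": "",
  "M3": ""
}]
```

The output should be a one-element list containing the predicted label, for example `["Legitimate"]`.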

## Conclusion

That is it for our tutorial on how to deploy your ML model to production quickly! This is just one method of deployment among a sea of others, but it should work well for most API-based use cases.