# Deploying a model using Google Colab and ngrok


## Overview

This lab is a continuation of the guided labs in Module 3.

In this lab, you will deploy a trained model and run predictions against it. To make the model available, you will use ngrok to expose a single endpoint for online predictions.

## Introduction to the business scenario

You work for a healthcare provider and want to improve the detection of abnormalities in orthopedic patients.

You are tasked with solving this problem by using machine learning (ML). You have access to a dataset that contains six biomechanical features and a target of _normal_ or _abnormal_. You can use this dataset to train an ML model to predict whether a patient has an abnormality.

## About this dataset

This biomedical dataset was built by Dr. Henrique da Mota during a medical residence period in the Group of Applied Research in Orthopaedics (GARO) of the Centre Médico-Chirurgical de Réadaptation des Massues, Lyon, France. The data has been organized in two different, but related, classification tasks.

The first task consists of classifying patients as belonging to one of three categories:

- _Normal_ (100 patients)
- _Disk Hernia_ (60 patients)
- _Spondylolisthesis_ (150 patients)

For the second task, the categories _Disk Hernia_ and _Spondylolisthesis_ were merged into a single category that is labeled _Abnormal_. Thus, the second task consists of classifying patients as belonging to one of two categories: _Normal_ (100 patients) or _Abnormal_ (210 patients).
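The merge described above can be sketched in a few lines. This is only an illustration: the `to_binary` mapping and the variable names are hypothetical and are not part of the dataset files, while the counts come from the lists above.

```python
# Patient counts for the three-class task, as listed above.
three_class_counts = {"Normal": 100, "Disk Hernia": 60, "Spondylolisthesis": 150}

# Hypothetical mapping for illustration: both pathologies collapse into "Abnormal".
to_binary = {
    "Normal": "Normal",
    "Disk Hernia": "Abnormal",
    "Spondylolisthesis": "Abnormal",
}

# Sum the counts of every original class that maps to the same merged class.
binary_counts = {}
for label, count in three_class_counts.items():
    merged_label = to_binary[label]
    binary_counts[merged_label] = binary_counts.get(merged_label, 0) + count

print(binary_counts)  # {'Normal': 100, 'Abnormal': 210}
```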

## Attribute information

Each patient is represented in the dataset by six biomechanical attributes that are derived from the shape and orientation of the pelvis and lumbar spine (in this order):

- Pelvic incidence
- Pelvic tilt
- Lumbar lordosis angle
- Sacral slope
- Pelvic radius
- Grade of spondylolisthesis

The following convention is used for the class labels:

- Disk Hernia (DH)
- Spondylolisthesis (SL)
- Normal (NO)
- Abnormal (AB)

For more information about this dataset, see the [Vertebral Column dataset webpage](http://archive.ics.uci.edu/ml/datasets/Vertebral+Column).

## Dataset attributions

This dataset was obtained from:
Dua, D. and Graff, C. (2019). UCI Machine Learning Repository (http://archive.ics.uci.edu/ml). Irvine, CA: University of California, School of Information and Computer Science.


# Lab setup

Because this solution is split across several labs in the module, run the following cells to load the data and train the model that you will deploy.

**Note:** The setup can take up to 5 minutes to complete.


## Imports

Run the following cells to import the required libraries and get the data ready for use.

**Note:** The following cells represent the key steps in the previous labs.


In [None]:
import warnings
import requests
import zipfile
import io
import xgboost as xgb
import os
from scipy.io import arff
from sklearn.model_selection import train_test_split
import pandas as pd
import pickle as pkl
import numpy as np

warnings.simplefilter("ignore")

# Prepare datasets


In [None]:
f_zip = "http://archive.ics.uci.edu/ml/machine-learning-databases/00212/vertebral_column_data.zip"
r = requests.get(f_zip, stream=True)
Vertebral_zip = zipfile.ZipFile(io.BytesIO(r.content))
Vertebral_zip.extractall()

data = arff.loadarff("column_2C_weka.arff")
df = pd.DataFrame(data[0])

class_mapper = {b"Abnormal": 1, b"Normal": 0}
df["class"] = df["class"].replace(class_mapper)

cols = df.columns.tolist()
cols = cols[-1:] + cols[:-1]
df = df[cols]

train, test_and_validate = train_test_split(df, test_size=0.2, random_state=42, stratify=df["class"])
test, validate = train_test_split(test_and_validate, test_size=0.5, random_state=42, stratify=test_and_validate["class"])

x = {
    "train": train.drop(columns=["class"]).to_numpy(),
    "test": test.drop(columns=["class"]).to_numpy(),
    "validate": validate.drop(columns=["class"]).to_numpy(),
}
y = {"train": train["class"].to_numpy(), "test": test["class"].to_numpy(), "validate": validate["class"].to_numpy()}
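The two-stage split above produces roughly an 80/10/10 division into training, test, and validation sets. The following sketch works out the row counts for the 310 patients in the dataset. It assumes that scikit-learn rounds the held-out portion up and gives the remainder to the other portion; `split_sizes` is a hypothetical helper used only for this illustration.

```python
import math


def split_sizes(n_rows, holdout_fraction=0.2, validate_fraction=0.5):
    """Return (train, test, validate) row counts for the two-stage split."""
    # Stage 1: hold out a fraction of all rows (assumed to be rounded up).
    n_holdout = math.ceil(n_rows * holdout_fraction)
    n_train = n_rows - n_holdout
    # Stage 2: split the held-out rows again; the second part is rounded up
    # and becomes the validation set, the remainder becomes the test set.
    n_validate = math.ceil(n_holdout * validate_fraction)
    n_test = n_holdout - n_validate
    return n_train, n_test, n_validate


sizes = split_sizes(310)
print(sizes)  # (248, 31, 31)
```

You can compare these numbers with `x["train"].shape`, `x["test"].shape`, and `x["validate"].shape` after running the cell above.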

# Build the ML Model


In [None]:
train.drop(columns=["class"]).to_numpy()

In [None]:
xgb_model = xgb.XGBClassifier(objective="binary:logistic", random_state=42, eval_metric="auc", verbosity=2)

xgb_model.fit(x["train"], y["train"])

print("ready for hosting!")

# Save the ML Model


In [None]:
model_name = "detect_abnormalities_in_orthopedic_patients.pkl"
# Use a context manager so that the file is flushed and closed after writing.
with open(model_name, "wb") as f:
    pkl.dump(xgb_model, f)
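To reuse a saved model in a later session, you can load it back with `pickle`. The following minimal sketch shows the round trip. So that it runs on its own, it uses a plain dictionary as a stand-in for `xgb_model` and writes to a temporary directory; the file name `stand_in_model.pkl` is hypothetical.

```python
import os
import pickle as pkl
import tempfile

# Stand-in for the trained model so that this sketch is self-contained.
# In the lab, this object would be xgb_model.
stand_in_model = {"objective": "binary:logistic", "random_state": 42}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "stand_in_model.pkl")

    # Save the object in binary mode.
    with open(path, "wb") as f:
        pkl.dump(stand_in_model, f)

    # Load it back, also in binary mode.
    with open(path, "rb") as f:
        restored_model = pkl.load(f)

print(restored_model == stand_in_model)  # True
```

Only unpickle files that you created yourself or that come from a source you trust, because loading a pickle can execute arbitrary code.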

# Step 1: Hosting the model

Now that you have a trained model, you can host it by serving it from this notebook with FastAPI and exposing it through an ngrok tunnel.

The first step is to install and import the serving dependencies. Because you have a model object, _xgb_model_, in memory, the API can call it directly. For this lab, the server runs inside the Colab runtime on port 8000.


In [None]:
!pip install fastapi
!pip install pydantic
!pip install pyngrok
!pip install nest-asyncio
!pip install uvloop
!pip install httptools
!pip install uvicorn

In [None]:
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
import json
from pyngrok import ngrok
import nest_asyncio
import uvicorn

In [None]:
app = FastAPI()
origins, methods, headers = ["*"], ["*"], ["*"]

app.add_middleware(CORSMiddleware, allow_origins=origins, allow_credentials=True, allow_methods=methods, allow_headers=headers)

In [None]:
class BiomechanicalAttributesInputRequest(BaseModel):
    pelvic_incidence: float
    pelvic_tilt: float
    lumbar_lordosis_angle: float
    sacral_slope: float
    pelvic_radius: float
    degree_spondylolisthesis: float

    def to_numpy(self):
        # Field declaration order must match the column order used for training.
        return np.array(list(self.__dict__.values())).reshape(1, -1)


class BatchPredictionRequest(BaseModel):
    data: list[BiomechanicalAttributesInputRequest]
    threshold: float

    def to_numpy(self):
        return np.vstack([item.to_numpy() for item in self.data])


class PredictionsResponse(BaseModel):
    raw_predictions: list[float]
    predictions: list[float]

In [None]:
# sample = json.loads(df.head(1).to_json(orient="records"))[0]
# sample = BiomechanicalAttributesInputRequest(**sample)
# sample.to_numpy().shape

In [None]:
@app.get("/ping")
async def ping() -> str:
    return "pong"

In [None]:
NGROK_AUTH_TOKEN = "YOUR NGROK TOKEN GOES HERE"

In [None]:
@app.post("/predict")
async def predict(samples: BatchPredictionRequest) -> PredictionsResponse:
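    # Probability of the positive class (1 = abnormal) for each sample.
    results = xgb_model.predict_proba(samples.to_numpy())[:, 1]

    print(results)

    # Label a sample as abnormal when its probability reaches the threshold.
    predictions_using_threshold = np.where(results >= samples.threshold, 1, 0)

    # Convert the NumPy arrays to plain Python lists so that pydantic can validate them.
    return PredictionsResponse(
        raw_predictions=results.tolist(),
        predictions=predictions_using_threshold.tolist(),
    )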
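The `threshold` field in the request controls how the probabilities are turned into labels. The following sketch shows the effect with made-up probabilities; the values and the `apply_threshold` helper are hypothetical and stand in for what the endpoint computes from `predict_proba`.

```python
import numpy as np

# Hypothetical probabilities of the abnormal class, standing in for
# xgb_model.predict_proba(...)[:, 1].
probabilities = np.array([0.12, 0.50, 0.73, 0.95])


def apply_threshold(probs, threshold):
    """Return 1 where the probability reaches the threshold, otherwise 0."""
    return np.where(probs >= threshold, 1, 0).tolist()


labels_at_050 = apply_threshold(probabilities, 0.5)
labels_at_080 = apply_threshold(probabilities, 0.8)

print(labels_at_050)  # [0, 1, 1, 1]
print(labels_at_080)  # [0, 0, 0, 1]
```

Raising the threshold labels fewer patients as abnormal, which trades missed abnormalities for fewer false alarms.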

In [None]:
ngrok.set_auth_token(NGROK_AUTH_TOKEN)
ngrok_tunnel = ngrok.connect(8000)
print("Public URL:", ngrok_tunnel.public_url)
nest_asyncio.apply()
uvicorn.run(app, port=8000)

# A test case to hit the API

```
{
  "data": [
    {
      "pelvic_incidence": 63.0278175,
      "pelvic_tilt": 22.55258597,
      "lumbar_lordosis_angle": 39.60911701,
      "sacral_slope": 40.47523153,
      "pelvic_radius": 98.67291675,
      "degree_spondylolisthesis": -0.25439999
    },
    {
      "pelvic_incidence": 39.05695098,
      "pelvic_tilt": 10.06099147,
      "lumbar_lordosis_angle": 25.01537822,
      "sacral_slope": 28.99595951,
      "pelvic_radius": 114.4054254,
      "degree_spondylolisthesis": 4.564258645
    }
  ],
  "threshold": 0.5
}

```
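One way to send this payload from Python is sketched below. It uses only the standard library. The base URL `https://example.invalid` is a placeholder for the public URL that ngrok prints, and `build_request` and `predict_remote` are hypothetical helper names. The sketch builds the request without sending it, so it runs even when no server is up.

```python
import json
import urllib.request


def build_request(base_url, samples, threshold=0.5):
    """Build a POST request for the /predict endpoint."""
    body = json.dumps({"data": samples, "threshold": threshold}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict_remote(base_url, samples, threshold=0.5, timeout=10):
    """Send the request and return the decoded JSON response."""
    request = build_request(base_url, samples, threshold)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# First sample from the test case above.
sample = {
    "pelvic_incidence": 63.0278175,
    "pelvic_tilt": 22.55258597,
    "lumbar_lordosis_angle": 39.60911701,
    "sacral_slope": 40.47523153,
    "pelvic_radius": 98.67291675,
    "degree_spondylolisthesis": -0.25439999,
}

# Placeholder URL: replace it with the public URL printed by ngrok.
request = build_request("https://example.invalid", [sample])
print(request.get_method(), request.full_url)
```

With the server running, calling `predict_remote(public_url, [sample])` should return a dictionary with the `raw_predictions` and `predictions` keys defined by `PredictionsResponse`.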


# Congratulations!

You have completed this lab, and you can now end the lab by following the lab guide instructions.
