# Model Deployment using Numpy Linear Classifier

## 1. Introduction
In this workbook, we look at the basics of deploying a model. To keep things simple, we use a linear classifier implemented in numpy: $$ \mathbf{Y} = \mathbf{W} \mathbf{X} + \mathbf{b}$$

We take $\mathbf{X}$ to be 6-dimensional ($\mathbb{R}^6$), i.e. one data point $x \in \mathbf{X}$ is a numpy array of shape $(1,6)$. The output $\mathbf{Y}$ is 3-dimensional ($\mathbb{R}^3$). The weights $\mathbf{W}$ are then a numpy array of shape $(3,6)$ and the bias $\mathbf{b}$ is a numpy array of shape $(3,)$. Because each data point is stored as a row, the code below computes the product as $x \mathbf{W}^T + \mathbf{b}$.
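
As a quick sanity check of these shapes, the following sketch builds random weights and a bias and applies them to one data point. The variable names and the fixed seed are illustrative assumptions, not part of the deployment code.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only so the sketch is repeatable

W = rng.random((3, 6))  # weights: 3 outputs, 6 input features
b = rng.random(3)       # bias: one value per output
x = rng.random((1, 6))  # a single data point stored as a row

# Row-vector convention: (1, 6) @ (6, 3) -> (1, 3); b broadcasts over the row
y = x @ W.T + b
print(y.shape)
```

The same expression works unchanged for a batch of shape $(n,6)$, which then yields an output of shape $(n,3)$.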

In this workbook, we will demonstrate how to deploy this numpy linear classifier as a server and how to query it.

## 2. Imports and Dependencies.
The packages we need are loaded next. `numpy` and `mlflow` are used throughout this tutorial. The `requests` package is used to query the deployed model, and `json` is used to encode the request and decode the server's response.

In [None]:
import os
import sys
import mlflow
import numpy as np

# Suppress warnings
import warnings
warnings.filterwarnings("ignore")

## MLflow for experiment tracking and model deployment

MLflow is an open source platform for managing the end-to-end machine learning lifecycle. Its primary functions include:

- Tracking experiments to record and compare parameters and results (MLflow Tracking).
- Managing and deploying models from a variety of ML libraries to a variety of model serving and inference platforms (MLflow Models).
- Providing a central model store to collaboratively manage the full lifecycle of an MLflow Model, including model versioning, stage transitions, and annotations (MLflow Model Registry).

More information [here](https://www.mlflow.org/docs/latest/index.html#)



![image.png](https://www.mlflow.org/docs/latest/_images/scenario_4.png)

- localhost maps to the server on which the current notebook is running

- Tracking server maps to the server at environment variable `TRACKING_URL` that can be printed using `os.environ.get("TRACKING_URL")`

- Create an mlflow client that communicates with the tracking server

In [None]:
from mlflow import pyfunc

# Point MLflow at the tracking server given by the TRACKING_URL environment variable
from mlflow.tracking import MlflowClient
tracking_uri = os.environ.get("TRACKING_URL")
client = MlflowClient(tracking_uri=tracking_uri)
mlflow.set_tracking_uri(tracking_uri)
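
Note that `os.environ.get` returns `None` when the variable is unset, so a missing `TRACKING_URL` would pass silently. A small sketch of a guard follows; `require_env` is a hypothetical helper (not part of MLflow), and `DEMO_TRACKING_URL` with its value is a placeholder used only for illustration.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Placeholder variable and value, set here only so the sketch runs on its own
os.environ["DEMO_TRACKING_URL"] = "http://localhost:5000"
print(require_env("DEMO_TRACKING_URL"))
```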

## Create an experiment in mlflow database using mlflow client

- Get the list of all the experiments (Click on **Experiments** tab on the sidebar to see the list)
- Create a new experiment named *numpy_deployment* if it doesn't exist
- Set *numpy_deployment* as the new experiment under which different **runs** are tracked

## MLflow Entity Hierarchy

- Experiment 1
    - Run 1
        - Parameters
        - Metrics
        - Artifacts
            - Folder 1
                - File 1
                - File 2
            - Folder 2 
    - Run 2
    - Run 3

- Experiment 2
- Experiment 3        

In [None]:
# Create the experiment if it does not exist yet, then make it the active
# experiment so that subsequent runs are tracked under it.
# Note: list_experiments() was removed in MLflow 2.0; on newer versions
# use client.search_experiments() instead.
experiments = client.list_experiments()
experiment_names = [exp.name for exp in experiments]
experiment_name = "numpy_deployment"
if experiment_name not in experiment_names:
    mlflow.create_experiment(experiment_name)
mlflow.set_experiment(experiment_name)


## Python Class for inference

- `ModelWrapper` is derived from `mlflow.pyfunc.PythonModel` ([more info](https://www.mlflow.org/docs/latest/python_api/mlflow.pyfunc.html))
- The `load_context()` member function loads the model. In this case, it loads a numpy file holding a dictionary with two arrays, **weights** and **bias**
- The `predict()` member function converts the request data to a numpy array and returns the predictions as a JSON string
- An object of this class will be saved as a pickle file in blob storage

In [None]:
# Model wrapper that loads the numpy weights and serves predictions
class ModelWrapper(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        import numpy as np
        # The .npy file stores a pickled dict; .tolist() on the 0-d object array returns it
        self.model = np.load(context.artifacts['model_path'], allow_pickle=True).tolist()
        print("Model initialized")

    def predict(self, context, model_input):
        import numpy as np
        import json
        # The client sends JSON text with a text/csv content type, so this code
        # expects the payload to arrive as comma-split DataFrame column names.
        # Joining the column names restores the JSON text.
        json_txt = ", ".join(model_input.columns)
        data_list = json.loads(json_txt)
        inputs = np.array(data_list)
        if len(inputs.shape) == 2:
            print('batch inference')
            predictions = []
            for idx in range(inputs.shape[0]):
                prediction = np.matmul(inputs[idx,:],self.model['weights'].T) + self.model['bias']
                predictions.append(prediction.tolist())
        elif len(inputs.shape) == 1:
            print('single inference')
            # Matrix product (not elementwise *) maps the 6 features to 3 outputs
            predictions = np.matmul(inputs, self.model['weights'].T) + self.model['bias']
            predictions = predictions.tolist()
        else:
            raise ValueError('invalid input shape')
        return json.dumps(predictions)
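
The `", ".join(model_input.columns)` step can look puzzling. Here is a minimal sketch of the idea, assuming the server parses the `text/csv` body so that its first line becomes comma-separated column names. The standard `csv` module stands in for that parsing, and `payload` is an illustrative value.

```python
import csv
import io
import json

payload = [[0.1, 0.2, 0.3, 0.4, 0.5, 0.6]]
body = json.dumps(payload)  # the JSON text the client sends

# Stand-in for the server-side CSV parsing: the single line of JSON text
# is split on commas, as if it were a header row of column names.
columns = next(csv.reader(io.StringIO(body)))

# Joining the pieces with commas yields valid JSON again
# (extra whitespace between tokens is harmless to the JSON parser).
restored = json.loads(", ".join(columns))
print(restored == payload)
```

This trick relies on the whole payload sitting on one line and on the column names passing through unchanged. A parser that renames duplicate column names could break it for payloads containing repeated values.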

## Register a model using mlflow

- Log user-defined parameters in a remote database through a remote server
- Create a `model_wrapper` object using the `ModelWrapper` class from the cell above
- Create a default conda environment that needs to be installed on the Docker container that serves the REST API
- Save the model object as a pickle file and conda environment as artifacts (files) in S3 or Blob Storage

In [None]:
# instantiate the python inference model wrapper for the server
model_wrapper = ModelWrapper()


# define the model weights randomly
np_weights = np.random.rand(3,6)
np_bias = np.random.rand(3)

# checkpointing and logging the model in mlflow
artifact_path = './np_model'
np.save(artifact_path, {'weights':np_weights, 'bias':np_bias})
model_artifacts = {"model_path" : artifact_path+'.npy'}

# Conda environment for the serving container (the scikit-learn flavor's default)
env = mlflow.sklearn.get_default_conda_env()
with mlflow.start_run():
    mlflow.log_param("features",6)
    mlflow.log_param("labels",3)
    mlflow.pyfunc.log_model("np_model", python_model=model_wrapper, artifacts=model_artifacts, conda_env=env)
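
`load_context` depends on the dictionary surviving a round trip through `np.save` and `np.load`. The sketch below shows that round trip. It uses a temporary directory and illustrative names rather than the notebook's `./np_model` path.

```python
import os
import tempfile

import numpy as np

params = {"weights": np.random.rand(3, 6), "bias": np.random.rand(3)}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "np_model")
    np.save(path, params)  # np.save appends ".npy" and pickles the dict

    # allow_pickle=True is required because the file holds a Python object;
    # the result is a 0-d object array, and .item() unwraps the dict.
    loaded = np.load(path + ".npy", allow_pickle=True).item()

print(sorted(loaded.keys()))
print(loaded["weights"].shape, loaded["bias"].shape)
```

Since loading with `allow_pickle=True` can execute arbitrary code, it should only be used on files from a trusted source.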

## 4. Deploying the model
The above code logs a model in the experiments tab. For more information, see [here](https://rocketml.gitbook.io/rocketml-user-guide/experiments).

### 4.1 Find experiment in experiment list and click on it
![experiments_list](https://github.com/rocketmlhq/sciml/raw/e8abbef269c5bee9d2b69398495fc5ced7457708/03_Deployment/experiments_list.png)

### 4.2 Find run in runs list and click on it
![runs_list](https://github.com/rocketmlhq/sciml/raw/e8abbef269c5bee9d2b69398495fc5ced7457708/03_Deployment/runs_list.png)

### 4.3 Get run details and click on artifacts
![run_details](https://github.com/rocketmlhq/sciml/raw/e8abbef269c5bee9d2b69398495fc5ced7457708/03_Deployment/run_details.png)

### 4.4 Check different files logged as artifacts
![artifacts](https://github.com/rocketmlhq/sciml/raw/e8abbef269c5bee9d2b69398495fc5ced7457708/03_Deployment/artifacts.png)

- An MLflow Model is a standard format for packaging machine learning models that can be used in a variety of downstream tools [More Details](https://www.mlflow.org/docs/latest/models.html#storage-format)
- ModelWrapper() object is saved as pkl file
- conda.yaml and requirements.txt file are used to manage Python environment
- Numpy file is saved in artifacts folder within the main folder (np_model)

### 4.5 Deploy ML model as a REST API service

Click on **Convert To Model** and fill in the form

![model_deployment](https://github.com/rocketmlhq/sciml/raw/e8abbef269c5bee9d2b69398495fc5ced7457708/03_Deployment/model_deployment.png)

### 4.6 Go to the models tab and wait until the model switches to the **ON** state
![model_list](https://github.com/rocketmlhq/sciml/raw/e8abbef269c5bee9d2b69398495fc5ced7457708/03_Deployment/model_list.png)

## 5. Use the Endpoint to Query the Server

There are two ways to query the model: with the `requests` library or with the `curl` shell command. This notebook uses `requests`.

In [None]:
import requests
import json

################################################################################
# *** SET MODEL URL HERE BEFORE RUNNING THIS CELL (instructions above) ***
# Example: https://<random_string>.sciml.rocketml.net/invocations
url = ""
################################################################################

if not url:
    raise ValueError('Model URL not set! Please read instructions on how to deploy model, set the correct URL, and try again.')

# The body is JSON text, but it is sent as text/csv because
# ModelWrapper.predict rebuilds the JSON from the parsed CSV column names.
headers = {"Content-Type":"text/csv"}

def query_model(np_array):
    """POST a nested list to the model endpoint and print the predictions."""
    json_data = json.dumps(np_array)
    response = requests.post(url,data=json_data,headers=headers)
    if response.status_code == 200:
        # predict returns a JSON string, so the response is decoded twice
        output = np.array(json.loads(response.json())).astype(np.float32)
        print(output)
    else:
        print(response.status_code)
        print("Make sure that the model is in ON state. If the REST API deployment is still in progress, please try again in a few minutes!")

# First case, run inference on single data point
query_model(np.random.rand(1,6).tolist())

# Second case, run inference on multiple data points
query_model(np.random.rand(20,6).tolist())
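
The `json.loads(response.json())` call decodes the response twice. The sketch below shows why, assuming the server JSON-encodes the string that `predict` returns, as the client code above implies. No request is made, and `predictions` is an illustrative value.

```python
import json

import numpy as np

predictions = [[1.5, 2.5, 3.5]]

inner = json.dumps(predictions)    # the string returned by predict
response_text = json.dumps(inner)  # assumed server behaviour: encode that string again

decoded_once = json.loads(response_text)  # what response.json() would give: a str
decoded_twice = json.loads(decoded_once)  # the nested list of predictions

output = np.array(decoded_twice).astype(np.float32)
print(type(decoded_once).__name__, output.shape)
```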
