# ML Deployment

### Steps

- Step 1: **Make the ML model in a virtual environment**
- Step 2: **Make an API endpoint for the ML model**
- Step 3: **Dockerise it**: this makes the API easier to productionise and the model easier to scale
- Step 4: **Deploy the Docker container on EC2 or Kubernetes**


#### Step 1: **Make the ML model in a virtual environment**
- Train the model, then serialise it to a pickle file (for example `classifier.pkl`) and save it so the API can load it without retraining.
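
As a minimal sketch of the save-and-reload cycle this step relies on: `pickle` handles any picklable Python object the same way, so a plain dictionary stands in here for the fitted classifier. The helper names `save_model` and `load_model`, the dictionary contents and the temporary folder are illustrative assumptions, not part of the original notes.

```python
import pickle
import tempfile
from pathlib import Path

# Stand-in for a fitted estimator (assumption): any picklable object works the same way.
fitted_params = {"feature": "variance", "threshold": 0.0}


def save_model(obj, path):
    """Serialise obj to path in binary mode."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def load_model(path):
    """Load an object back from path. Only unpickle files you trust:
    loading a pickle can execute arbitrary code."""
    with open(path, "rb") as f:
        return pickle.load(f)


# A temporary folder keeps the sketch from touching the project directory.
model_path = Path(tempfile.mkdtemp()) / "classifier.pkl"
save_model(fitted_params, model_path)
restored = load_model(model_path)
print(restored == fitted_params)
```

With a real scikit-learn estimator, load the pickle with the same library versions that created it, since scikit-learn does not guarantee that pickles load across versions.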

#### Step 2: **Make an API endpoint for the ML model**

- Create a Flask app and define its routes. Load the pickled model when the app starts, and add a prediction route that returns the model's output for the inputs the user passes in the URL (for example as query parameters).

- WSGI (Web Server Gateway Interface): the standard interface between a Python web server and a Python web application. A user who wants results from the model sends a request to the web server, which passes it through the WSGI entry point to the Flask app; the app's response travels back the same way.
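
To make the WSGI hand-off concrete, here is a minimal sketch using only the standard library. A Flask app object is itself a WSGI callable, so the web server invokes it in the same way as the `application` function shown here. The `predict_class` rule is a hypothetical stand-in for `classifier.predict`, and the helper `call_app` is an assumption added for illustration; the feature names come from the dataset columns.

```python
import json
from urllib.parse import parse_qs
from wsgiref.util import setup_testing_defaults

FEATURES = ("variance", "skewness", "curtosis", "entropy")


def predict_class(values):
    """Hypothetical stand-in for classifier.predict; swap in the unpickled model."""
    return 0 if values[0] > 0 else 1


def application(environ, start_response):
    """WSGI entry point: the web server calls this once per request."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    headers = [("Content-Type", "application/json")]
    try:
        values = [float(query[name][0]) for name in FEATURES]
    except (KeyError, ValueError):
        start_response("400 Bad Request", headers)
        message = "expected numeric " + ", ".join(FEATURES)
        return [json.dumps({"error": message}).encode()]
    start_response("200 OK", headers)
    return [json.dumps({"class": predict_class(values)}).encode()]


def call_app(query_string):
    """Invoke the app the way a WSGI server would, without opening a socket."""
    environ = {}
    setup_testing_defaults(environ)
    environ["QUERY_STRING"] = query_string
    captured = {}

    def start_response(status, response_headers):
        captured["status"] = status

    body = b"".join(application(environ, start_response))
    return captured["status"], json.loads(body)


print(call_app("variance=-2.88&skewness=3.90&curtosis=-0.19&entropy=-1.17"))
```

To serve it locally, `wsgiref.simple_server.make_server("", 8000, application).serve_forever()` is enough for experiments; a production setup would put a dedicated WSGI server in front of the app.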



#### Step 3: **Dockerise it**

Introduction

- When we build an ML model, it uses libraries whose behaviour can depend on the OS and the hardware configuration, so it may not work on every machine. The `requirements.txt` file that ships with the model lists Python packages only, not the Python runtime or OS-level libraries, so on its own it is not enough to recreate the environment.

- So we dockerise: we build an image that contains the ML app and all its dependencies, and run that image as a container. When we switch machines, we run the same image on the new machine instead of rebuilding the environment by hand, which standardises the environment. Build once, deploy anywhere a compatible Docker engine runs.

- Hence, a container is a unit of software that packages an application with all its dependencies so that the application runs reliably from one computing environment to another. Docker is the platform (and the company behind it) that provides the tooling to build and run containers.
 

- Virtual machines (an alternative to Docker containers): they also standardise environments, because each virtual machine can be configured with its own hardware and software. A hypervisor runs the virtual machines on a physical host, and each VM has its own guest OS and its own allocation of RAM, disk and network.

- Docker containers: Docker uses OS-level virtualisation. Each container has its own process ID namespace, network configuration and root filesystem, but all containers on a host share the host OS kernel. A container is just a set of processes isolated from the rest of the system, whereas each VM runs a full operating system of its own.

- Docker containers are similar to virtual machines (both standardise the environment), with one major difference. Resources allocated to a VM are largely reserved for it because each VM runs a separate OS, so an underutilised VM cannot easily lend spare capacity to an overutilised one. Containers share the host's kernel and resources, so idle capacity is available to whichever container needs it, unless limits are set.

How to make a docker container

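 - Write a `Dockerfile` and build an image from it with `docker build -t <image name> .` (the trailing `.` is the build context, here the current folder). The build executes the instructions in the Dockerfile in order.
 - Ensure the main API file is in the same folder as the Dockerfile.
 - A Dockerfile typically includes `FROM` (start from a base image, pulled from a registry such as Docker Hub, that provides the OS and runtime), `COPY` (copy your API file, `requirements.txt` and the classifier `.pkl` file into the image), `EXPOSE` (declare the port the container listens on; the port is actually published with `-p` at run time), `WORKDIR` (set the working directory, usually the app's root folder in the image), `RUN` (install the packages listed in `requirements.txt`) and `CMD` (start your main app).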
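
The instructions listed above could be assembled roughly as follows. To stay in Python like the rest of the code in these notes, the sketch writes the Dockerfile from a string; in practice you would normally create the file by hand. The base image `python:3.11-slim`, the entry point name `app.py`, port 5000 and the helper `write_dockerfile` are all assumptions, not taken from the original notes.

```python
import tempfile
from pathlib import Path

# Assumed names: python:3.11-slim base image, app.py as the API entry point,
# classifier.pkl as the pickled model, port 5000. Adjust them to your project.
DOCKERFILE = """\
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY app.py classifier.pkl ./
EXPOSE 5000
CMD ["python", "app.py"]
"""


def write_dockerfile(folder):
    """Write the Dockerfile into folder and return its path."""
    path = Path(folder) / "Dockerfile"
    path.write_text(DOCKERFILE)
    return path


# A temporary folder stands in for the project folder that holds the API file.
build_dir = tempfile.mkdtemp()
dockerfile_path = write_dockerfile(build_dir)
print(dockerfile_path.read_text())
```

Copying `requirements.txt` and installing packages before copying the app code lets Docker reuse the cached dependency layer when only the code changes. Note also that Flask's development server listens on 127.0.0.1 by default, so inside a container the app has to bind to `0.0.0.0` to be reachable through the published port.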
    
    
How to run a docker container

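- `docker run -p <host port>:<container port> <image name>`: starts a container from the image and maps a port on the host to the port the app listens on inside the container.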
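
A small sketch of how the build and run commands fit together, written as Python helpers that assemble the argument lists. The image name `banknote-api`, the ports 8000 and 5000, and the helper names are hypothetical. By default the script only prints the commands; it calls Docker only when `dry_run=False` and a `docker` executable is found on the PATH.

```python
import shlex
import shutil
import subprocess


def docker_build_cmd(image, context="."):
    """Arguments for building an image from the Dockerfile in context."""
    return ["docker", "build", "-t", image, context]


def docker_run_cmd(image, host_port, container_port):
    """Arguments for running the image with host_port mapped to container_port."""
    return ["docker", "run", "-p", f"{host_port}:{container_port}", image]


def run(cmd, dry_run=True):
    """Print the command; execute it only if asked to and Docker is installed."""
    print(shlex.join(cmd))
    if not dry_run and shutil.which("docker"):
        subprocess.run(cmd, check=True)


run(docker_build_cmd("banknote-api"))
run(docker_run_cmd("banknote-api", 8000, 5000))
```

With this mapping, a request to port 8000 on the host reaches the app listening on port 5000 inside the container.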



#### Step 4: **Deploy the Docker container on EC2 or Kubernetes**

- Kubernetes is a system for running and coordinating containerized applications across a cluster of machines (including scaling, load balancing, etc.).

- Docker is software that lets you containerize applications, while Kubernetes is a container orchestration system that lets you create, scale and monitor hundreds or thousands of containers.

- Google Kubernetes Engine (GKE) is Google Cloud Platform's managed implementation of Kubernetes, the open-source system originally developed at Google. Popular alternatives to GKE include Amazon EKS (Elastic Kubernetes Service), Amazon ECS (Amazon's own container orchestrator, which does not use Kubernetes) and Microsoft Azure Kubernetes Service (AKS).

- The steps are: upload the Docker image to Google Container Registry (GCR), create a cluster (a group of VM instances) running Kubernetes, then deploy the Docker image from GCR to the Google Kubernetes Engine cluster.
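
The deploy step is usually described to Kubernetes as a Deployment manifest. The sketch below builds one as a Python dictionary and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The app name `banknote-api`, the image path `gcr.io/my-project/banknote-api:v1`, port 5000 and the replica count are hypothetical placeholders, and the helper `deployment_manifest` is an assumption for illustration.

```python
import json


def deployment_manifest(name, image, port, replicas=2):
    """Build a minimal apps/v1 Deployment that runs `replicas` copies of image."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            # The selector must match the labels on the pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }


# Placeholder values (assumptions): replace with your own project and image tag.
manifest = deployment_manifest(
    "banknote-api", "gcr.io/my-project/banknote-api:v1", 5000
)
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

Saving the output as `deployment.json` and running `kubectl apply -f deployment.json` against the cluster would create the Deployment; exposing it to outside traffic additionally needs a Service, which is not shown here.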




In [33]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split

df = pd.read_csv('./Data/BankNote_Authentication.csv')
print(df.shape)
print(df['class'].value_counts())
df.head()

X = df.iloc[:, :-1]  # features: every column except the last
y = df.iloc[:, -1]   # label: the last column, 'class'

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)  # hold out 30% for testing
print(df.columns)
df.iloc[1000, :]

(1372, 5)
0    762
1    610
Name: class, dtype: int64
Index(['variance', 'skewness', 'curtosis', 'entropy', 'class'], dtype='object')


variance   -2.8829
skewness    3.8964
curtosis   -0.1888
entropy    -1.1672
class       1.0000
Name: 1000, dtype: float64

In [26]:
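from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

classifier = RandomForestClassifier()  # default settings, no random_state, so the score varies slightly between runs
classifier.fit(X_train, y_train)

y_pred = classifier.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)  # separate name so the imported function is not overwritten
print(accuracy)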

0.9878640776699029


In [27]:
import pickle

with open('classifier.pkl', 'wb') as pickle_out:  # the file is closed even if dump raises
    pickle.dump(classifier, pickle_out)