# 5. Deploying Machine Learning models 

We'll use the same model we trained and evaluated previously - the churn prediction model. Now we'll deploy it as a web service.

## 5.1 Intro / Session overview
We need to move the model out of our Jupyter Notebook and into production, so that other services can use its output to make decisions.

Suppose we have a service for running marketing campaigns. For each customer, it needs to determine the probability of churn, and if it's high enough, it will send a promotional email with discounts. This service needs to use our model to decide whether it should send an email. 

What we will cover this week: 
* Saving models with Pickle
* Serving models with Flask
* Managing dependencies with Pipenv
* Making the service self-contained with Docker
* Deploying it to the cloud using AWS Elastic Beanstalk

In [1]:
import pandas as pd
import numpy as np

from sklearn.model_selection import train_test_split
from sklearn.model_selection import KFold

from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

In [2]:
data = 'https://raw.githubusercontent.com/alexeygrigorev/mlbookcamp-code/master/chapter-03-churn-prediction/WA_Fn-UseC_-Telco-Customer-Churn.csv'

In [3]:
# df = pd.read_csv('data-week-3.csv')
df = pd.read_csv(data)

df.columns = df.columns.str.lower().str.replace(' ', '_')

categorical_columns = list(df.dtypes[df.dtypes == 'object'].index)

for c in categorical_columns:
    df[c] = df[c].str.lower().str.replace(' ', '_')

df.totalcharges = pd.to_numeric(df.totalcharges, errors='coerce')
df.totalcharges = df.totalcharges.fillna(0)

df.churn = (df.churn == 'yes').astype(int)

In [4]:
df_full_train, df_test = train_test_split(df, test_size=0.2, random_state=1)

In [5]:
numerical = ['tenure', 'monthlycharges', 'totalcharges']

categorical = [
    'gender',
    'seniorcitizen',
    'partner',
    'dependents',
    'phoneservice',
    'multiplelines',
    'internetservice',
    'onlinesecurity',
    'onlinebackup',
    'deviceprotection',
    'techsupport',
    'streamingtv',
    'streamingmovies',
    'contract',
    'paperlessbilling',
    'paymentmethod',
]

In [6]:
def train(df_train, y_train, C=1.0):
    dicts = df_train[categorical + numerical].to_dict(orient='records')

    dv = DictVectorizer(sparse=False)
    X_train = dv.fit_transform(dicts)

    model = LogisticRegression(C=C, max_iter=3000)
    model.fit(X_train, y_train)
    
    return dv, model

In [7]:
def predict(df, dv, model):
    dicts = df[categorical + numerical].to_dict(orient='records')

    X = dv.transform(dicts)
    y_pred = model.predict_proba(X)[:, 1]

    return y_pred

In [8]:
C = 1.0
n_splits = 5

In [9]:
kfold = KFold(n_splits=n_splits, shuffle=True, random_state=1)

scores = []

for train_idx, val_idx in kfold.split(df_full_train):
    df_train = df_full_train.iloc[train_idx]
    df_val = df_full_train.iloc[val_idx]

    y_train = df_train.churn.values
    y_val = df_val.churn.values

    dv, model = train(df_train, y_train, C=C)
    y_pred = predict(df_val, dv, model)

    auc = roc_auc_score(y_val, y_pred)
    scores.append(auc)

print('C=%s %.3f +- %.3f' % (C, np.mean(scores), np.std(scores)))

C=1.0 0.842 +- 0.007


In [10]:
scores

[np.float64(0.8443806862337213),
 np.float64(0.8449563799496754),
 np.float64(0.83351796106763),
 np.float64(0.8347649005563726),
 np.float64(0.8517892441404411)]

In [11]:
dv, model = train(df_full_train, df_full_train.churn.values, C=1.0)
y_pred = predict(df_test, dv, model)

y_test = df_test.churn.values
auc = roc_auc_score(y_test, y_pred)
auc

np.float64(0.8583598751990639)

## 5.2 Saving and loading the model

* Saving the model to pickle
* Loading the model from pickle
* Turning our notebook into a Python script

To use the model outside of our notebook, we need to save it so that another process can load and use it later.

Pickle is a serialization/deserialization module that's already built into Python: using it, we can save an arbitrary Python object (with a few exceptions) to a file. Once we have a file, we can load the model from there in a different process.

Because `pickle` is part of Python's standard library, there is nothing to install.

To save the model, we first import the `pickle` module, and then use the `dump` function:
  ```python
  import pickle

  with open('model.bin', 'wb') as f_out:  # 'wb' means write binary
      pickle.dump((dict_vectorizer, model), f_out)
  ```
  
To use the model, we need to open the binary file we saved and load the model using the `load` function.

  ```python
  import pickle

  with open('model.bin', 'rb') as f_in:  # 'rb' means read binary
      # Note: Never unpickle a file from an untrusted source!
      dict_vectorizer, model = pickle.load(f_in)
  ```

#### Save the model

In [12]:
import pickle

In [13]:
output_file = f'model_C={C}.bin'

In [14]:
output_file

'model_C=1.0.bin'

In [15]:
# Write to the file in binary
f_out = open(output_file, 'wb') 
pickle.dump((dv, model), f_out)
f_out.close()

In [16]:
# !ls -lh *.bin

In [17]:
with open(output_file, 'wb') as f_out: 
    pickle.dump((dv, model), f_out)

#### Load the model

In [18]:
import pickle

In [19]:
input_file = 'model_C=1.0.bin'

Be careful when specifying the mode. Accidentally specifying an incorrect mode may result in data loss: if you open an existing file with the `w` mode instead of `r`, it will overwrite the content.
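
One way to guard against accidental overwrites is the exclusive-creation mode `'xb'`, which raises `FileExistsError` when the file already exists. The sketch below is only an illustration: the helper name `save_once` and the temporary path are assumptions, not part of the course code.

```python
import os
import pickle
import tempfile


def save_once(obj, path):
    """Pickle obj to path, refusing to overwrite an existing file."""
    # 'x' = create exclusively, 'b' = binary; fails if path already exists
    with open(path, 'xb') as f_out:
        pickle.dump(obj, f_out)


path = os.path.join(tempfile.mkdtemp(), 'model.bin')
save_once({'C': 1.0}, path)          # first write succeeds

try:
    save_once({'C': 0.1}, path)      # second write is rejected
    overwritten = True
except FileExistsError:
    overwritten = False

with open(path, 'rb') as f_in:
    restored = pickle.load(f_in)

print(overwritten, restored)
```

The original content survives the second call because the file is never opened for writing again.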

Also, unpickling objects found on the internet is not secure: it can execute arbitrary code on your machine. Use it only for things you trust and things you saved yourself.
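
To see why loading a pickle can run code, here is a small, harmless sketch. The class `Payload` is hypothetical; it uses the `__reduce__` hook to tell `pickle` which callable to invoke when the data is loaded. Here that callable is just `print`, but it could be anything.

```python
import pickle


class Payload:
    """Object that tells pickle how to 'rebuild' itself via __reduce__."""

    def __reduce__(self):
        # On load, pickle calls print('this ran during unpickling').
        # A malicious file could name any other callable here.
        return (print, ('this ran during unpickling',))


data = pickle.dumps(Payload())

# Loading executes the call above; no Payload object is created at all
result = pickle.loads(data)

print(type(result).__name__)
```

The loaded value is whatever the callable returned (for `print`, that is `None`), which shows that the file's content, not the reader, decides what gets executed.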

In [20]:
with open(input_file, 'rb') as f_in: 
    dv, model = pickle.load(f_in)

In [21]:
dv, model

(DictVectorizer(sparse=False), LogisticRegression(max_iter=3000))

Notice that we didn't import scikit-learn here, but it still has to be installed for this to work. The pickle file refers to the `DictVectorizer` and `LogisticRegression` classes by name rather than containing their code, so without scikit-learn, loading fails because Python can't find these classes.
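
The same behavior can be observed with any class. As a rough illustration, the sketch below pickles a `Fraction` from the standard library (chosen only so the example needs no third-party packages) and checks that the payload mentions the module and class by name.

```python
import pickle
from fractions import Fraction

payload = pickle.dumps(Fraction(3, 4))

# The payload names the module and the class, but holds none of the class's code
print(b'fractions' in payload, b'Fraction' in payload)

# Loading works here only because the `fractions` module can be imported
restored = pickle.loads(payload)
print(restored)
```

For our model, the same lookup happens for scikit-learn's modules, which is why the library must be available wherever the file is loaded.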

In [22]:
customer = {
    'gender': 'female',
    'seniorcitizen': 0,
    'partner': 'yes',
    'dependents': 'no',
    'phoneservice': 'no',
    'multiplelines': 'no_phone_service',
    'internetservice': 'dsl',
    'onlinesecurity': 'no',
    'onlinebackup': 'yes',
    'deviceprotection': 'no',
    'techsupport': 'no',
    'streamingtv': 'no',
    'streamingmovies': 'no',
    'contract': 'month-to-month',
    'paperlessbilling': 'yes',
    'paymentmethod': 'electronic_check',
    'tenure': 1,
    'monthlycharges': 29.85,
    'totalcharges': 29.85
}

In [23]:
# Turn this customer into feature matrix
X = dv.transform([customer])
X

array([[ 1.  ,  0.  ,  0.  ,  1.  ,  0.  ,  1.  ,  0.  ,  0.  ,  1.  ,
         0.  ,  1.  ,  0.  ,  0.  , 29.85,  0.  ,  1.  ,  0.  ,  0.  ,
         0.  ,  1.  ,  1.  ,  0.  ,  0.  ,  0.  ,  1.  ,  0.  ,  1.  ,
         0.  ,  0.  ,  1.  ,  0.  ,  1.  ,  0.  ,  0.  ,  1.  ,  0.  ,
         0.  ,  1.  ,  0.  ,  0.  ,  1.  ,  0.  ,  0.  ,  1.  , 29.85]])

In [24]:
model.predict_proba(X)

array([[0.37255464, 0.62744536]])

In [25]:
model.predict_proba(X)[:, 1]

array([0.62744536])

In [26]:
# Get the probability that this particular customer is going to churn
y_pred = model.predict_proba(X)[0, 1]
y_pred

np.float64(0.6274453618230489)

In [27]:
print('input:', customer)
print('output:', y_pred)

input: {'gender': 'female', 'seniorcitizen': 0, 'partner': 'yes', 'dependents': 'no', 'phoneservice': 'no', 'multiplelines': 'no_phone_service', 'internetservice': 'dsl', 'onlinesecurity': 'no', 'onlinebackup': 'yes', 'deviceprotection': 'no', 'techsupport': 'no', 'streamingtv': 'no', 'streamingmovies': 'no', 'contract': 'month-to-month', 'paperlessbilling': 'yes', 'paymentmethod': 'electronic_check', 'tenure': 1, 'monthlycharges': 29.85, 'totalcharges': 29.85}
output: 0.6274453618230489


This way, we can load the model and apply it to the customer we specified in the script. 

Of course, we aren't going to manually put the information about customers in the script. In the next section, we'll cover a more practical approach where we will be putting the model into a web service. 

In [28]:
# Refer to train.py and predict.py

## 5.3 Web services: introduction to Flask
The easiest way to implement a web service in Python is to use Flask. It's quite light-weight, requires little code to get started, and hides most of the complexity of dealing with HTTP requests and responses. 

Before we put our model inside a web service, let's cover the basics of using Flask. For that, we'll create a simple function and make it available as a web service. 
* Writing a simple ping/pong app
* Querying it with `curl` and browser

Web service:
- A web service is a way for programs, often running on different machines, to communicate over a network, typically using HTTP.
- Below are some of the HTTP methods a web service can support:
    - **GET:** A method used to retrieve a resource. For example, when we search for a cat image on Google, the browser requests the images with the GET method.
    - **POST:** Another commonly used method. It sends data to a server, usually to create a resource or to have the data processed. For example, during a sign-up process, we submit our name, username, password, etc. to a server. (The request doesn't specify where the data is stored; the server decides.)
    - **PUT:** Similar to POST, but the URL of the request identifies the exact resource to create or replace.
    - **DELETE:** A method used to delete a resource from the server.
- For more information, search for "HTTP methods".
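
To make the difference between GET and POST concrete, here is a minimal sketch that uses only Python's standard library (not Flask, which is introduced next). The handler class, the `/customers` path, and the payload are made up for illustration.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    def _send_json(self, payload):
        body = json.dumps(payload).encode('utf-8')
        self.send_response(200)
        self.send_header('Content-Type', 'application/json')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        # GET: the client only asks for a resource; there is no request body
        self._send_json({'method': 'GET', 'path': self.path})

    def do_POST(self):
        # POST: the client sends data in the request body
        length = int(self.headers.get('Content-Length', 0))
        data = json.loads(self.rfile.read(length))
        self._send_json({'method': 'POST', 'received': data})

    def log_message(self, *args):
        pass  # keep the example output quiet


server = HTTPServer(('127.0.0.1', 0), Handler)   # port 0 = pick any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
base_url = f'http://127.0.0.1:{server.server_port}'

# Ignore any system proxy settings for this local demo
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))

with opener.open(f'{base_url}/ping') as resp:
    get_result = json.load(resp)

post_request = urllib.request.Request(
    f'{base_url}/customers',
    data=json.dumps({'tenure': 1}).encode('utf-8'),
    headers={'Content-Type': 'application/json'},
    method='POST',
)
with opener.open(post_request) as resp:
    post_result = json.load(resp)

server.shutdown()
server.server_close()

print(get_result)
print(post_result)
```

The GET request carries everything in the URL, while the POST request carries the customer data in its body, which is the pattern we'll rely on for predictions.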

In [29]:
# Refer to ping.py

This is the content of the `ping.py` file:
```python
# ping.py
def ping():
    return "PONG"
```

To try the function, start an interactive Python session by executing this command in the terminal:
```bash
ipython
```
Then, enter the following code in the Interactive Python mode:
```python
import ping

ping.ping()
```

Now we want to turn this function into a web service, and we'll use Flask for that:
```bash
pip install flask
```

A decorator is a way to add extra functionality to a function. The functionality we add here turns our function into a web service endpoint.
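
The idea can be sketched in plain Python. The `route` decorator and the `routes` dictionary below are hypothetical and greatly simplified; they only illustrate the mechanism and are not how Flask is implemented.

```python
routes = {}  # maps a URL path to the function that handles it


def route(path):
    """Hypothetical decorator: register the decorated function under path."""
    def decorator(func):
        routes[path] = func
        return func      # the function itself is returned unchanged
    return decorator


@route('/ping')
def ping():
    return "PONG"


# A framework can now look up the handler for an incoming request path
handler = routes['/ping']
print(handler())
```

The function still works when called directly; the decorator's only job in this sketch is to record which path it should answer.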

By putting `@app.route` on top of the function definition, we assign the `/ping` address of the web service to the `ping` function. 

```python
from flask import Flask

app = Flask('ping')  # Give an identity to your web service

# Use decorator to add Flask's functionality to our function
@app.route('/ping', methods=['GET'])  
def ping():
    return "PONG"

if __name__ == "__main__":
    # Run the code in local machine with debugging mode true and port 9696
    app.run(debug=True, host='0.0.0.0', port=9696)
```

The `run` method of `app` starts the service. We specify three parameters:
- `debug=True`: Restarts our application automatically when there are changes in the code.
- `host='0.0.0.0'`: Makes the web service public; otherwise, it won't be possible to reach it when it's hosted on a remote machine (e.g., in AWS).
- `port=9696`: The port that we use to access the application.

To start our service, execute this on the terminal:
```bash
python ping.py
```

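`curl` is a command-line utility for communicating with web services. 

With the service running, we can query it on port `9696`. On Linux and macOS, `0.0.0.0` typically reaches the local machine; if it doesn't work on your system (for example, on Windows), use `localhost` or `127.0.0.1` instead.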

```bash
curl http://0.0.0.0:9696/ping
```

In [30]:
!curl http://127.0.0.1:9696/ping

PONG


  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100     4  100     4    0     0    351      0 --:--:-- --:--:-- --:--:--   363


In [31]:
!curl http://localhost:9696/ping

PONG

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100     4  100     4    0     0     19      0 --:--:-- --:--:-- --:--:--    19





## 5.4 Serving the churn model with Flask
In this section, we add the prediction functionality to our churn web service and prepare it for use beyond the development environment.
* Wrapping the predict script into a Flask app
* Querying it with `requests` 
* Preparing for production: gunicorn
* Running it on Windows with waitress

To load the saved model, we use the code below:
```python
import pickle

with open('churn-model.bin', 'rb') as f_in:
    dv, model = pickle.load(f_in)
```

To predict a value for a customer, we need a function like below:
```python
def predict_single(customer, dv, model):
    # Apply the one-hot encoding feature to the customer data
    X = dv.transform([customer]) 
    y_pred = model.predict_proba(X)[:, 1]
    return y_pred[0]
```

Finally, we create the function that implements the web service endpoint:
```python
from flask import Flask, request, jsonify

# Assumes dv, model and predict_single from above are defined in this file
app = Flask('churn')

# To send the customer information, we need to post its data
@app.route('/predict', methods=['POST'])
def predict():
    # Web services work best with JSON format
    customer = request.get_json()  # Parse the JSON body of the request

    prediction = predict_single(customer, dv, model)
    churn = prediction >= 0.5

    result = {
        # Cast numpy float type to Python native float type
        'churn_probability': float(prediction), 
        'churn': bool(churn),  # Cast numpy bool type to Python native bool
    }
    # Send back the result in JSON format to the user
    return jsonify(result) 
```
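
The casts matter because the response has to be serialized to JSON, and NumPy's boolean type, for example, is not accepted by the standard `json` module. As a sketch, the decision logic can be isolated in a plain function so it can be checked without starting Flask. The name `build_response` and the `threshold` parameter are assumptions for illustration.

```python
import json


def build_response(prediction, threshold=0.5):
    """Turn a churn probability into a JSON-serializable dict."""
    probability = float(prediction)   # also converts NumPy scalars
    return {
        'churn_probability': probability,
        'churn': bool(probability >= threshold),
    }


high = build_response(0.6274453618230489)
low = build_response(0.2)

# json.dumps succeeds because every value is a native Python type
print(json.dumps(high))
print(json.dumps(low))
```

Keeping the threshold as a parameter also makes it easy to experiment with a cutoff other than 0.5.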

To get a response, we post customer data as `json`:
```python
# A new customer information
customer = {
    'gender': 'female',
    'seniorcitizen': 0,
    'partner': 'yes',
    'dependents': 'no',
    'phoneservice': 'no',
    'multiplelines': 'no_phone_service',
    'internetservice': 'dsl',
    'onlinesecurity': 'no',
    'onlinebackup': 'yes',
    'deviceprotection': 'no',
    'techsupport': 'no',
    'streamingtv': 'no',
    'streamingmovies': 'no',
    'contract': 'month-to-month',
    'paperlessbilling': 'yes',
    'paymentmethod': 'electronic_check',
    'tenure': 1,
    'monthlycharges': 29.85,
    'totalcharges': 29.85
}

import requests # We need the requests library to use the POST method

url = 'http://localhost:9696/predict' # The route we made for prediction
# Post the customer information in JSON format
response = requests.post(url, json=customer)
result = response.json() # Get the server response
print(result)
```

To get rid of the "This is a development server. Do not use it in a production deployment. Use a production WSGI server instead." warning, run the app with a production WSGI server:
* On Linux and macOS, use gunicorn. Install it with `pip install gunicorn` and start the server with `gunicorn --bind 0.0.0.0:9696 churn:app`. In **churn:app**, `churn` is the name of the Python file (e.g., churn.py) that contains the code `app = Flask('churn')`, and `app` is the Flask object. You may need to change the first part to match the name of your Flask app file.
* On Windows, use the alternative library `waitress`, because gunicorn depends on features that Windows doesn't support. Install it with `pip install waitress`.
* To run the waitress WSGI server, use the command `waitress-serve --listen=0.0.0.0:9696 churn:app`.
* To test either server, run the request code above; the result will be the same.
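
WSGI is the standard interface between Python web applications and servers such as gunicorn and waitress; a Flask `app` object is a WSGI callable, which is why these servers can run it. As a rough sketch of what that interface looks like, here is a bare WSGI application written with the standard library only. The name `simple_app` and the fake `start_response` are illustrative.

```python
from wsgiref.util import setup_testing_defaults


def simple_app(environ, start_response):
    """A bare WSGI application: a callable taking environ and start_response."""
    if environ['PATH_INFO'] == '/ping':
        status, body = '200 OK', b'PONG'
    else:
        status, body = '404 Not Found', b'not found'
    start_response(status, [('Content-Type', 'text/plain'),
                            ('Content-Length', str(len(body)))])
    return [body]


# Call the app the way a WSGI server would, without opening a socket
captured = {}


def start_response(status, headers):
    captured['status'] = status


environ = {}
setup_testing_defaults(environ)   # fills in a minimal, valid WSGI environ
environ['PATH_INFO'] = '/ping'

body = b''.join(simple_app(environ, start_response))
print(captured['status'], body)
```

A WSGI server does the same thing for every incoming HTTP request, which is what the `module:callable` argument (`churn:app`) points it to.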

#### Making requests

In [None]:
import requests

In [None]:
url = 'http://localhost:9696/predict'

In [None]:
customer = {
    'gender': 'female',
    'seniorcitizen': 0,
    'partner': 'yes',
    'dependents': 'no',
    'phoneservice': 'no',
    'multiplelines': 'no_phone_service',
    'internetservice': 'dsl',
    'onlinesecurity': 'no',
    'onlinebackup': 'yes',
    'deviceprotection': 'no',
    'techsupport': 'no',
    'streamingtv': 'no',
    'streamingmovies': 'no',
    'contract': 'two_year',
    'paperlessbilling': 'yes',
    'paymentmethod': 'electronic_check',
    'tenure': 1,
    'monthlycharges': 29.85,
    'totalcharges': 29.85
}

In [None]:
response = requests.post(url, json=customer).json()

In [None]:
response

In [None]:
if response['churn']:
    print('sending email to', 'asdx-123d')

## 5.5 Python virtual environment: Pipenv

* Dependency and environment management
* Why we need a virtual environment
* Installing Pipenv
* Installing libraries with Pipenv
* Running things with Pipenv
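
When working with virtual environments, it's easy to lose track of which interpreter is active. The sketch below is a small standard-library check, not a Pipenv feature; it assumes an environment created with `venv` or a recent version of `virtualenv`, and the function name `in_virtualenv` is made up.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a virtual environment."""
    # In a virtual environment, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print('interpreter:', sys.executable)
print('inside a virtual environment:', in_virtualenv())
```

Running it once with the system Python and once from inside the environment should show two different interpreter paths.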

## 5.6 Environment management: Docker

* Why we need Docker
* Running a Python image with Docker
* Dockerfile
* Building a Docker image
* Running a Docker image

## 5.7 Deployment to the cloud: AWS Elastic Beanstalk (optional)

* Installing the EB CLI
* Running EB locally
* Deploying the model

## 5.8 Summary

* Save models with Pickle
* Use Flask to turn the model into a web service
* Use a dependency & env manager
* Package it in Docker
* Deploy to the cloud

## 5.9 Explore more

* Flask is not the only framework for creating web services. Try others, e.g. FastAPI
* Experiment with other ways of managing environments, e.g. virtualenv, conda, Poetry.
* Explore other ways of deploying web services, e.g. GCP, Azure, Heroku, PythonAnywhere.