*Python Machine Learning 2nd Edition* by [Sebastian Raschka](https://sebastianraschka.com), Packt Publishing Ltd. 2017

Code Repository: https://github.com/trungngv/python-machine-learning-book-2nd-edition

Code License: [MIT License](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/LICENSE.txt)

# Python Machine Learning - Code Examples

# Week 6 - Deploying a Machine Learning Model

Slides: [https://](https://)

### Overview

- [Week 5 recap - Training a model for movie review classification](#Week-5-recap---Training-a-model-for-movie-review-classification)

- [Serializing fitted scikit-learn estimators](#Serializing-fitted-scikit-learn-estimators)
- [Deploying model as a Flask web application](#Deploying-model-as-a-Flask-web-application)
    - [Local deployment](#Local-deployment)
    - [Cloud deployment with Heroku](#Cloud-deployment-with-Heroku)
- [Using Amazon SageMaker](#Using-Amazon-SageMaker)
- [Cognitive services API](#Cognitive-Services)
    - [API made easy with Postman](#API-made-easy-with-Postman)


# Week 5 recap - Training a model for movie review classification

This section recaps the logistic regression model that was trained in the last section of Week 5. Execute the following code blocks to train the model that we will serialize in the next section.

In [1]:
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv('movie_data.csv.gz')
df.head()

Unnamed: 0,review,sentiment
0,"In 1974, the teenager Martha Moxley (Maggie Gr...",1
1,OK... so... I really like Kris Kristofferson a...,0
2,"***SPOILER*** Do not read this, if you think a...",0
3,hi for all the people who have seen this wonde...,1
4,"I recently bought the DVD, forgetting just how...",0


In [2]:
X_train, X_test, y_train, y_test = train_test_split(df.review, df.sentiment, test_size=0.3)

In [3]:
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline

vect = HashingVectorizer(decode_error='ignore', 
                         n_features=2**21,
                         preprocessor=None,
                         lowercase=True,
                         stop_words='english',
                         ngram_range=(1,2))
clf = SGDClassifier(loss='log', random_state=1, max_iter=1)

pipeline = Pipeline([
    ('feat', vect),
    ('clf', clf)
])

pipeline.fit(X_train, y_train)

Pipeline(memory=None,
     steps=[('feat', HashingVectorizer(alternate_sign=True, analyzer='word', binary=False,
         decode_error='ignore', dtype=<class 'numpy.float64'>,
         encoding='utf-8', input='content', lowercase=True,
         n_features=2097152, ngram_range=(1, 2), non_negative=False,
         norm='l2', pr...lty='l2', power_t=0.5, random_state=1, shuffle=True,
       tol=None, verbose=0, warm_start=False))])

**Note**

If you are using scikit-learn < 0.19, please replace `max_iter` by `n_iter` in the code example above. In scikit-learn 1.1 and later, the logistic loss is named `'log_loss'` instead of `'log'`.

In [4]:
print('Accuracy: %.3f' % pipeline.score(X_test, y_test))

Accuracy: 0.853


# Serializing fitted scikit-learn estimators

After we trained the logistic regression model as shown above, we now save the pipeline as a serialized object to our local disk so that we can use the fitted classifier in our web application later.

In [5]:
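# Note: `sklearn.externals.joblib` was removed in scikit-learn 0.23;
# with newer versions, use `import joblib` instead.
from sklearn.externals import joblib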
import os

joblib.dump(pipeline, os.path.join('movieclassifier', 'classifier.pkl'))

['movieclassifier/classifier.pkl']

In [6]:
!ls -thal movieclassifier/

total 32816
-rw-r--r--   1 trung  staff    16M Sep  4 21:22 classifier.pkl
drwxr-xr-x   8 trung  staff   272B Sep  4 21:21 ..
drwxr-xr-x  12 trung  staff   408B Sep  4 20:41 .
-rw-r--r--   1 trung  staff    50B Sep  4 20:41 requirements.txt
drwxr-xr-x   4 trung  staff   136B Sep  4 20:35 bin
-rw-r--r--   1 trung  staff    12B Sep  4 20:32 Procfile
-rwxr-xr-x   1 trung  staff   1.0K Sep  3 22:03 app.py
-rw-r--r--   1 trung  staff   2.0K Sep  3 22:03 movie_classifier.log
drwxr-xr-x   4 trung  staff   136B Sep  3 21:53 __pycache__
drwxr-xr-x   5 trung  staff   170B Sep  3 21:53 .ipynb_checkpoints
-rw-r--r--   1 trung  staff     0B Sep  3 21:53 __init__.py
-rw-r--r--   1 trung  staff   253B Sep  3 21:36 preprocess.py


Try loading the classifier and making a prediction.

In [7]:
clf = joblib.load(os.path.join('movieclassifier', 'classifier.pkl'))

In [8]:
import numpy as np
label = {0:'negative', 1:'positive'}

example = ['I love this movie']
print('Prediction: %s\nProbability: %.2f%%' %\
      (label[clf.predict(example)[0]], 
       np.max(clf.predict_proba(example))*100))

Prediction: positive
Probability: 90.96%
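To check that serialization preserves the fitted state, one option is to compare predictions before and after a dump/load round trip. The sketch below is only an illustration under stated assumptions: it uses a tiny made-up dataset and a `LogisticRegression` stand-in rather than the movie review pipeline, and it writes to a temporary directory instead of `movieclassifier/`.

```python
import os
import tempfile

try:
    import joblib  # standalone package, used with recent scikit-learn releases
except ImportError:
    from sklearn.externals import joblib  # bundled copy in old releases such as 0.19

from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Tiny made-up dataset (an assumption for illustration, not the movie review data).
texts = ["I love this movie", "great film, loved it",
         "terrible plot", "I hate this movie"]
labels = [1, 1, 0, 0]

original = Pipeline([
    ("feat", HashingVectorizer(n_features=2**10)),
    ("clf", LogisticRegression()),
])
original.fit(texts, labels)

# Serialize to a temporary file and load it back as a separate object.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "classifier.pkl")
    joblib.dump(original, path)
    restored = joblib.load(path)

# The restored pipeline should reproduce the original predictions exactly.
same = (restored.predict(texts) == original.predict(texts)).all()
print("Round trip preserved predictions:", same)
```

The same comparison can be applied to the real `pipeline` and `clf` objects from the cells above by predicting on a few rows of `X_test`.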


# Deploying model as a Flask web application

## Local deployment

We will use [Flask](http://flask.pocoo.org/), a microframework for web development, to expose the model as an API.

Install it with

    pip install Flask

To run the web application locally, `cd` into the application directory (`movieclassifier`, as listed above) and execute the main application script, for example,

    cd ./movieclassifier
    python3 app.py
    
Now, you should see something like
    
     * Running on http://127.0.0.1:5000/
     * Restarting with reloader
     
in your terminal.
Next, open a web browser and enter the address displayed in your terminal (typically http://127.0.0.1:5000/) to view the web application.
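The contents of `app.py` are not shown in this section. As a rough sketch of what such a script might contain, the example below exposes a `/review` route that reads a `text` query parameter and returns a JSON sentiment label, which matches the request and response shown in the live demo. The `create_app` factory, the error response, and the `KeywordModel` stand-in are assumptions made so that the example runs without the pickled classifier; they are not the actual application code.

```python
from flask import Flask, jsonify, request

LABELS = {0: "negative", 1: "positive"}


def create_app(model):
    """Build a Flask app around any object with a scikit-learn style predict()."""
    app = Flask(__name__)

    @app.route("/review")
    def review():
        text = request.args.get("text", "")
        if not text:
            # Hypothetical error format; the real app may respond differently.
            return jsonify(error="missing 'text' query parameter"), 400
        prediction = int(model.predict([text])[0])
        return jsonify(sentiment=LABELS[prediction])

    return app


class KeywordModel:
    """Hypothetical stand-in for the fitted pipeline, used only for this sketch."""

    def predict(self, texts):
        return [1 if "love" in t.lower() else 0 for t in texts]


# With the real model, this would instead be something like:
#   app = create_app(joblib.load(os.path.join('movieclassifier', 'classifier.pkl')))
app = create_app(KeywordModel())

# To serve locally with Flask's development server, call:
#   app.run(host="127.0.0.1", port=5000, debug=True)
```

Passing the model into a factory function keeps the route logic independent of how the model is loaded, which also makes it easy to exercise with Flask's test client.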

## Cloud deployment with Heroku

Local deployment has two limitations:

- A model deployed locally cannot be used by the public. We want to deploy the model on a machine hosted in a cloud such as AWS, Google Cloud, or Azure, or on your company's internal computing cluster.
- Flask's built-in server is intended mainly for development; it is not a full-fledged web server. For production deployment, we need a web server that supports Flask.



### Using Heroku

- Heroku is a cloud platform that offers fully managed services that let companies build, deliver, monitor, and scale apps quickly without having to manage infrastructure.
- Heroku runs your apps inside dynos, which are smart containers on a reliable, fully managed runtime environment.

We will use the Python Runtime to deploy our model as a Python web app.

Besides the web app code in `app.py` and the model artifact, we need to provide the following three files:

    - requirements.txt: libraries to install in the runtime environment
    - Procfile: configures the start script
    - bin/web: the start script


In [11]:
!cat movieclassifier/requirements.txt

numpy==1.14.3
scipy==1.1.0
pandas==0.23.0
scikit-learn==0.19.1
gunicorn==19.9.0
Babel==0.9.6
click==6.7
docutils==0.11
Flask==0.12.2
itsdangerous==0.24
Jinja2==2.6
MarkupSafe==0.11
nose==1.3.0
Pygments==1.5
simplejson==3.2.0
Sphinx==1.1.3
virtualenv==13.1.0
Werkzeug==0.14.0


These libraries will be installed by Heroku on the machine where the web app is hosted. They must be specified in the file named `requirements.txt`.
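Pinned versions matter here because a model pickled under one scikit-learn version is not guaranteed to load correctly under another. As a minimal sketch, assuming Python 3.8+ (for `importlib.metadata`), the helpers below parse `name==version` pins and report packages whose installed version differs. The function names `parse_pins` and `find_mismatches` are hypothetical and not part of the course code.

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Return {package: version} for every 'name==version' line."""
    pins = {}
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {package: (pinned, installed)} where the two versions differ.

    'installed' is None when the package is not installed at all.
    """
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


# Two of the pins from the requirements.txt shown above.
example = """numpy==1.14.3
scikit-learn==0.19.1
"""
pins = parse_pins(example)
print(pins)
```

Running `find_mismatches(parse_pins(open('movieclassifier/requirements.txt').read()))` in the training environment gives a quick indication of whether the model was fitted with the same library versions that will be installed on the server.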

In [10]:
!cat movieclassifier/Procfile

web: bin/web

This Procfile tells Heroku to run the script in `bin/web` to start the web app.

In [13]:
!cat movieclassifier/bin/web

python app.py &
gunicorn -b '0.0.0.0:'$PORT --log-level INFO app:app

The start script runs the Flask application and the gunicorn web server, which manages load and routes requests to the Flask app.
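The `app:app` argument follows gunicorn's `module:callable` convention: gunicorn imports the module `app` and uses the object named `app` inside it as the WSGI application, binding to the port that Heroku supplies in `$PORT`. To illustrate what "WSGI application" means, the sketch below defines a bare WSGI callable and invokes it by hand with the standard library. A Flask application object exposes this same interface; the `call_wsgi` helper and the response body are assumptions made for illustration only.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """A minimal WSGI callable: receives the request environ, returns body chunks."""
    body = b'{"status": "ok"}'
    start_response("200 OK", [("Content-Type", "application/json"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call_wsgi(application, path="/"):
    """Call a WSGI application directly, roughly as a server would per request."""
    environ = {}
    setup_testing_defaults(environ)  # fill in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(application(environ, start_response))
    return captured["status"], body


status, body = call_wsgi(app)
print(status, body)
```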

## Live demo 

Live demo to show how to deploy an app. It's as simple as pushing code to a git repo! The repo for this purpose can be cloned from [this repository](https://github.com/trungngv/heroku-python-ml-example).

To try the model API, open your browser and paste the following link:

https://movie-classifier.herokuapp.com/review?text=i%20love%20this%20movie

You should see the output `{"sentiment": "positive"}`
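The same request can be made from Python. The following is a minimal client sketch using only the standard library; it assumes the demo URL above is still being served, and the helper names (`build_review_url`, `parse_sentiment`, `classify`) are hypothetical. The network call is wrapped in a function so that the URL construction and response parsing can be tried without network access.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "https://movie-classifier.herokuapp.com/review"


def build_review_url(text, base_url=BASE_URL):
    """Percent-encode the review text into the 'text' query parameter."""
    query = urlencode({"text": text}, quote_via=quote)
    return f"{base_url}?{query}"


def parse_sentiment(response_body):
    """Extract the label from a JSON body such as {"sentiment": "positive"}."""
    return json.loads(response_body)["sentiment"]


def classify(text, timeout=10):
    """Send the request; needs network access and a running deployment."""
    with urlopen(build_review_url(text), timeout=timeout) as response:
        return parse_sentiment(response.read().decode("utf-8"))


url = build_review_url("i love this movie")
print(url)
```

Calling `classify("i love this movie")` performs the actual request and returns the sentiment string from the JSON response.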

# Using Amazon SageMaker 

Live walkthrough. See also the notebook `sagemaker_telecom_customer_churn.ipynb`.

# Cognitive Services 

Instead of building our own models, we can use services provided by others. This approach is becoming increasingly popular as AI capabilities become more widely accessible.

## Main service providers:
- Google
- Microsoft
- IBM 
- Amazon 
- Other vendors with specialized solutions

## Cognitive services:
- Natural language
    - Sentiment analysis
    - Key phrase detection
    - Entity recognition / identification
    - Language detection
    - Content classification
    - Topic modelling
    - Relation detection
- Computer Vision
    - Image classification
    - Object detection
    - Scene and activity recognition
    - Celebrity recognition
    - Landmark recognition
    - Optical character recognition (OCR)
    - Handwriting recognition
    - Face detection
    - Person identification 
    - Emotion recognition / facial analysis
    - Similar face recognition and grouping
    - Logo detection
- Speech
    - Speech-to-text
    - Text-to-speech
    - Translation
    
## API made easy with Postman


https://www.getpostman.com/