# ![](https://ga-dash.s3.amazonaws.com/production/assets/logo-9f88ae6c9c3871690e33280fcf557f33.png) Deploying your work



# Learning objectives
*You will be able to...*

- Identify use cases for *deploying* your ML models
- Stand up a Flask server and API endpoint
- Respond to an HTTP request with model predictions
- Pickle your trained models for later use
- Pass arguments to an endpoint via a simple web form


### Why deploy your models?

You've worked on amazing projects in this class.

Your notebooks are great for prototypes, but let's set these models free!

### Why deploy your models?

*Productionalizing* a model means making it ready for real-world usage. This might require:
- Dealing with high volumes of data
- Being robust to malformed inputs
- Integrating with existing tools through an API
- Allowing access only to certain users
- Handling edge cases, like empty inputs or outputs

*Deploying* a productionalized model means making it available to use.

### How do you productionalize?

This gets into *engineering*, and alas is not something we can cover in-depth. If you want to learn more, begin with:
- Logging
- Error handling
- Feature branching
- Database resiliency
- Credentials
- The causes of technical debt, in software engineering generally and machine learning specifically

There is recommended reading on all of this in the Further Reading section.

### How will *we* productionalize and deploy?

You can turn your models into usable (prototype) services with three steps:
- *Pickling* a trained model
- Setting up a local web server
- Hooking your trained model to an API endpoint on that server

# Pickling

This is the process of "serializing" a Python object, i.e. writing an object as a byte stream to a file.

You can save this to disk and use it later!
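To make the "byte stream" idea concrete, here is a minimal sketch that serializes an object to bytes in memory and rebuilds it, without touching the disk. The dictionary `original` is just an illustrative stand-in for whatever object you want to keep.

```python
import pickle

# An illustrative object; any picklable Python object would do
original = {"name": "sample", "values": [1, 2, 3]}

# dumps() returns the serialized object as a bytes value
blob = pickle.dumps(original)
print(type(blob))

# loads() rebuilds an equal, but separate, object from those bytes
restored = pickle.loads(blob)
print(restored == original)
print(restored is original)
```

`pickle.dump` and `pickle.load` do the same thing, except that they write to and read from an open binary file.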

Let's say you have an object you need to work with again. Take a list as an example:

```python
sample = ['This', 'list', 'is', 'too', 'big']
```

Maybe that object is too big and is clogging up your notebook. Or maybe it took you a while to create that object, and you want to save it.

Let's pickle it!

```python
import pickle

# Pickle files are binary, so open the file in 'wb' mode
with open('my_pickled_sample.pkl', 'wb') as picklefile:
    pickle.dump(sample, picklefile)
```

The object is still here:

```python
sample
```

And it's in our folder:

```python
!ls
```

But we can delete the object:

```python
del(sample)
```

It's gone!

```python
print(sample)
```

And reload it from the pickled file:

```python
# Read the pickle back in binary mode ('rb')
with open('my_pickled_sample.pkl', 'rb') as picklefile:
    the_same_sample = pickle.load(picklefile)
```

It's back!

```python
print(the_same_sample)
```

You might be asking **wait, why would I pickle something instead of writing it to a file?**

Good question. Saving things as .csv, .txt, or .json is still the better choice when you need a portable file, especially if other people need to use it.



But pickling is useful because it preserves your data **as is** -- you can pickle almost every Python object. That includes:
- functions
- classes
- numpy arrays
- fitted models

If you have a model that took a long time to fit, and you want to save that model instead of refitting it, you can do something like this:

```python
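from sklearn import linear_model as lm

X = df[features]
y = df['price']

# Instantiate the model (note the parentheses), then fit it
model = lm.LinearRegression()
model.fit(X, y)

# Now the model is fitted, so save it! Pickle files are binary, so use 'wb'
with open('pickled_model.pkl', 'wb') as picklefile:
    pickle.dump(model, picklefile)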
```

    

### Let's do that with a Titanic survival model

Our classifier algorithm will be a random forest, which as you know is relatively slow to train.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

df = pd.read_csv('assets/datasets/titanic.csv')
include = ['Pclass', 'Sex', 'Age', 'Fare', 'SibSp', 'Survived']

# Create dummies and drop NaNs
df['Sex'] = df['Sex'].apply(lambda x: 0 if x == 'male' else 1)
df = df[include].dropna()

X = df[['Pclass', 'Sex', 'Age', 'Fare', 'SibSp']]
y = df['Survived']

## Note we're fitting on the entire dataset; we'll assume
## that 100 estimators and default hyperparameters are optimal
PREDICTOR = RandomForestClassifier(n_estimators=100).fit(X, y)
```

```python
# Pickle files are binary, so open the file in 'wb' mode
with open('titanic_rfc.pkl', 'wb') as picklefile:
    pickle.dump(PREDICTOR, picklefile)
```

# Warning

### Never open a pickled file from an unknown source!
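The reason for this warning is that unpickling can call functions chosen by whoever created the file. Here is a harmless sketch, assuming a hypothetical class `Sneaky` (invented for this illustration, not part of the lesson code) whose `__reduce__` method tells pickle to call `eval` when the data is loaded:

```python
import pickle

class Sneaky:
    """Hypothetical class, for illustration only."""

    def __reduce__(self):
        # Tell pickle: "to rebuild this object, call eval('6 * 7')"
        return (eval, ("6 * 7",))

blob = pickle.dumps(Sneaky())

# Loading the bytes calls eval; we get its result back, not a Sneaky object
result = pickle.loads(blob)
print(result)
```

Here the payload is only arithmetic, but the same mechanism could run any code on your machine. Only load pickles that you created yourself or that come from a source you trust.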

## Introduction to Flask
![flask logo](http://flask.pocoo.org/static/logo/flask.png)


Flask is a fast, lightweight way to connect your Python scripts to a server. It's a simple and robust framework that can handle small tasks (creating a microblog, standing up a simple API) or complex ones (Pinterest's API, a Twitter clone).

Let's jump in with a simple example. Then, we'll expand it to show what it can do with your models. But first you may need to:

```bash
$ pip install Flask
```



## Hello, world.
Create a new file called `hello.py` . Type in this code line by line. No copy pasting!

```Python
import flask
app = flask.Flask(__name__)

@app.route("/")
def hello():
    return "Hello World!"

if __name__ == '__main__':
    app.run(debug=True)
```




Three things happen here:
- initialize the app
- use built-in decorators\* to define what happens on a page
- launch the app

\* I'll explain soon...

Now launch the file from your command line:

```bash
$ python hello.py
* Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
```

Go to that URL$^*$ in your browser to see your app running on your `localhost`. (Or use `curl`!)


$^*$ **Your port may not be 5000. There may also be a longer URL which you need to copy-paste exactly.**


### Arguments and styling

Add the following route underneath the hello() function:

```Python
@app.route('/greet/<name>')
def greet(name):
    '''Say hello to your first parameter'''
    return "Hello, {}!".format(name)
```

Save, and the app should automatically restart; if it doesn't, you can `ctrl-C` and launch it again yourself. Now navigate to `http://127.0.0.1:5000/greet/Roger`. Your function should respond to that input!
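If you'd rather check a route without opening a browser, Flask ships with a test client that sends requests to your app in-process. This is a minimal sketch that redefines the `greet` route in a standalone script; the variable names `client` and `response` are just illustrative.

```python
import flask

app = flask.Flask(__name__)

@app.route('/greet/<name>')
def greet(name):
    '''Say hello to your first parameter'''
    return "Hello, {}!".format(name)

# The test client calls the app directly; no running server is needed
client = app.test_client()
response = client.get('/greet/Roger')

print(response.status_code)
print(response.get_data(as_text=True))
```

The `<name>` part of the route is captured from the URL and passed to the function as the `name` argument.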



### Decorators

What's a decorator? The short story: a decorator wraps one function around another. So the previous definition is shorthand for defining `greet` normally and then running:

```Python
greet = app.route('/greet/<name>')(greet)
```

See the Further Reading section for a full tutorial. For now, we just need to know that the `@app.route(endpoint)` pattern is how you tell Flask to listen to a particular URL, and what to do if requests are sent there.
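To see the registration idea without any web framework, here is a toy sketch in plain Python. `TinyRouter` is a hypothetical class written for this illustration; it is not how Flask is implemented, only a picture of the same pattern: a decorator that records a function under a URL path.

```python
class TinyRouter:
    """Toy stand-in for a web framework's routing table (illustration only)."""

    def __init__(self):
        self.routes = {}

    def route(self, path):
        # route(path) returns the actual decorator...
        def decorator(func):
            # ...which records the function under that path and returns it unchanged
            self.routes[path] = func
            return func
        return decorator


router = TinyRouter()

@router.route("/hello")
def hello():
    return "Hello World!"

# The same registration written without the @ syntax
def goodbye():
    return "Goodbye!"

goodbye = router.route("/goodbye")(goodbye)

# "Handling a request" is now just a dictionary lookup followed by a call
print(router.routes["/hello"]())
print(router.routes["/goodbye"]())
```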

Since the `return` statement sends text to an HTML page, you can style it with HTML tags:

```Python
@app.route("/")
def hello():
    return '''
    <body>
    <h2> Hello World! </h2>
    </body>
    '''
```

(If you make any coding mistakes, your server may shut down with an error message. Fix the code and rerun!)



## Add in machine learning
We can use Flask to share our machine learning predictions.

Create a new file `application.py`. Import and initialize the flask app, and launch the server at the bottom. Leave room in the middle to add in your model and routes later on.




```Python
import flask
app = flask.Flask(__name__)

#-------- MODEL GOES HERE -----------#

#-------- ROUTES GO HERE -----------#

if __name__ == '__main__':
    # Start the development server on the host and port we choose

    HOST = '127.0.0.1'
    PORT = 4000
    app.run(HOST, PORT)
```

Note that this time we specified the host and port we want the app to run on.


### Deploying an SMS spam detector

Here's an idea for a first ML app: something to identify text message spam.

You already know how to make this! Let's hook it up to an endpoint.

Import the libraries we need, load our SMS spam dataset from the NLP lesson, and define our target and feature variables.

```Python

#-------- MODEL GOES HERE -----------#
import pandas as pd
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

df = pd.read_csv('./assets/datasets/SMSSpamCollection.txt', sep='\t', header=None)
df.columns = ['target', 'msg']
y = df['target']
X = df['msg']
```

Now vectorize the features (with help from TfidfVectorizer) and fit a Naive Bayes model.

```Python
# Tfidf, filter stop words, 300 features
cvec = TfidfVectorizer(stop_words='english', max_features = 300)
X = cvec.fit_transform(X)
clf = MultinomialNB()
clf.fit(X, y)
```


### Make a simple API
Here's the fun part. Now that we have a classifier, we need to get some values to make our predictions.

One way to do this is to get information from the **URL parameters**. These are the part of a URL that comes after the `?`, written as `key=value` pairs. For example, if you navigate to
`http://localhost:4000/is_spam?msg="50% off fidget spinners!"` Flask can retrieve that message data for you. Let's write a route to do just that.

The route will look for a `GET` request containing a `msg` key at the `/is_spam` endpoint.

So visiting `http://localhost:4000/is_spam?msg="50% off fidget spinners!"` will set `flask.request.args['msg']` to "50% off fidget spinners!". Then we transform this into something the model can read.

```Python
#-------- ROUTES GO HERE -----------#
@app.route('/is_spam', methods=["GET"])
def is_spam():
    msg = pd.Series(flask.request.args['msg'])
    X_new = cvec.transform(msg)
    score = clf.predict(X_new)
    results = {'prediction': score[0]}
    return flask.jsonify(results)
```

And... voila! Save the file. Launch your app. (Give it 5-10 seconds to start.) You now have a simple API for your model. Try visiting:

`http://localhost:4000/is_spam?msg="50% off fidget spinners!"`

Or

`http://localhost:4000/is_spam?msg="pick up milk at the store"`

Play with the message in the URL. You should get a single prediction of spam or ham in neatly formatted JSON. (`localhost` is just a hostname that resolves to 127.0.0.1 by default.)


### Customizing your API

`request.args` is a dictionary-like object holding the key:value pairs we sent to Flask.

Let's write a new route for a Titanic predictor endpoint! Add this above your `is_spam()` function.


```Python
#-------- ROUTES GO HERE -----------#
@app.route('/predict', methods=["GET"])
def predict():
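    # Query-string values arrive as strings, so convert them to numbers
    pclass = float(flask.request.args['pclass'])
    sex = float(flask.request.args['sex'])
    age = float(flask.request.args['age'])
    fare = float(flask.request.args['fare'])
    sibsp = float(flask.request.args['sibsp'])

    # scikit-learn expects a 2D array: one row per passenger
    item = np.array([[pclass, sex, age, fare, sibsp]])
    score = PREDICTOR.predict_proba(item)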
    results = {'survival chances': score[0,1], 'death chances': score[0,0]}
    return flask.jsonify(results)
```
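The route above assumes every parameter is present and numeric. As a minimal sketch of handling malformed inputs, the hypothetical helper `parse_features` below (along with the `FEATURES` list, both invented for this illustration) checks a dictionary of raw string values before they reach the model. It uses plain dictionaries so you can try it without running a server.

```python
# Parameter names in the order the model expects them (assumed for this sketch)
FEATURES = ["pclass", "sex", "age", "fare", "sibsp"]

def parse_features(args):
    """Return the features as a list of floats, or raise ValueError."""
    values = []
    for name in FEATURES:
        raw = args.get(name)
        if raw is None or raw == "":
            raise ValueError("missing parameter: {}".format(name))
        try:
            values.append(float(raw))
        except ValueError:
            raise ValueError(
                "parameter {} is not a number: {!r}".format(name, raw)
            ) from None
    return values

good = {"pclass": "2", "sex": "1", "age": "18", "fare": "500", "sibsp": "1"}
print(parse_features(good))

try:
    parse_features({"pclass": "2", "sex": "female"})
except ValueError as err:
    print("Rejected:", err)
```

Inside a route, you could pass `flask.request.args` to a helper like this and return an error message to the caller when it raises, instead of letting the server crash.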

### Aren't we forgetting something...?

We also need the model, `PREDICTOR`! We could rebuild it on the spot, like we did with our Naive Bayes model, but instead of doing all that work let's use the pickled version.

(Plus, random forest training is randomized unless you fix `random_state` -- so loading the pickled model ensures you get the same predictions every time.)

Add this above your spam/ham model code.

```python
#-------- MODEL GOES HERE -----------#
import pickle

# Pickle files are binary, so read them in 'rb' mode
with open('titanic_rfc.pkl', 'rb') as picklefile:
    PREDICTOR = pickle.load(picklefile)
```

Done! Save the file. Restart the app:

```bash
* Running on http://127.0.0.1:4000/ (Press CTRL+C to quit)
> ^C
> python application.py
```
Then try visiting:

http://localhost:4000/predict?pclass=2&sex=1&age=18&fare=500&sibsp=1

Play with the parameters in the URL. You should get the predicted probability of death and survival.



Now try with `curl` at your command line:

`curl -X GET "http://localhost:4000/predict?pclass=2&sex=1&age=18&fare=500&sibsp=1"`

(Hitting the spam/ham endpoint with `curl` is a little more complicated, because you need to encode special characters like whitespace.)
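One way to produce a correctly encoded URL is to let Python's standard library do it. This sketch assumes the `/is_spam` endpoint from earlier is running on port 4000; it only builds and inspects the URL, so it runs without the server.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

base = "http://localhost:4000/is_spam"

# urlencode escapes spaces, '%', '!' and other special characters
query = urlencode({"msg": "50% off fidget spinners!"})
url = base + "?" + query
print(url)

# Decoding the query string recovers the original message
decoded = parse_qs(urlsplit(url).query)
print(decoded["msg"][0])
```

You can paste the printed URL into a `curl` command (inside double quotes) or into your browser's address bar.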

## Make a simple webform.
Well that was exciting. But it doesn't look very nice. Let's create a simple webform to read in the inputs. Create a file named `page.html` with this HTML:

```html
<html>
  <head>
    <title> Titanic Survivor-O-Matic </title>
  </head>
   <body>
      <form action = "http://localhost:4000/result" method = "POST">
         <p>Class <input type = "text" name = "pclass" /></p>
         <p>Sex <input type = "text" name = "sex" /></p>
         <p>Age <input type = "text" name = "age" /></p>
         <p>Fare <input type = "text" name = "fare" /></p>
         <p># of siblings/spouses <input type = "text" name = "sibsp" /></p>

         <p><input type = "submit" value = "submit" /></p>
      </form>
   </body>
</html>
```



Flask can read the fields of an HTML `form` that has been `POST`ed to the server.

Add two new routes below your existing ones.

```Python
#---------- CREATING AN API, METHOD 2 ----------------#

# This method takes input via an HTML page
@app.route('/page')
def page():
   with open("page.html", 'r') as viz_file:
       return viz_file.read()

@app.route('/result', methods=['POST', 'GET'])
def result():
    '''Gets prediction using the HTML form'''
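    if flask.request.method == 'POST':

       inputs = flask.request.form

       # Form values arrive as whole strings; convert each one to a number
       pclass = float(inputs['pclass'])
       sex = float(inputs['sex'])
       age = float(inputs['age'])
       fare = float(inputs['fare'])
       sibsp = float(inputs['sibsp'])

       # scikit-learn expects a 2D array: one row per passenger
       item = np.array([[pclass, sex, age, fare, sibsp]])
       score = PREDICTOR.predict_proba(item)
       results = {'survival chances': score[0,1], 'death chances': score[0,0]}
       return flask.jsonify(results)

    # A plain GET carries no form data, so send the visitor to the form
    return flask.redirect('/page')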

```

Save, close, and relaunch the app. Go to `http://127.0.0.1:4000/page` and type in your inputs.

Both methods should still be there. You can either play with the URL parameters at `/predict` or enter them at `/page`.




## Takehome exercises
See if you can customize and play around with the app you just built. Try the following things:
- Comment through the code so you understand what's happening.
- Make the app look nicer by playing with the HTML.
- Change the model or the features used for prediction.
- See if you can return more values to the page, like the model's parameters.
- Modularize! Take the modeling code out of this file. For the spam/ham model, fit the vectorizer *and* the model in a different .py file; pickle both and load those files into the app.



## Examples
Here are some examples of Flask apps in action. Fork and clone the apps you like so you can play with them and edit them on your local machine.

Two apps using scikit-learn:
- [Visualizing the Iris dataset using Flask and Angular JS](https://github.com/ColCarroll/flask_angular_example)
- [Using Neural Nets to recognize images](https://github.com/mdlai/digit_recognition)

More websites built in Flask:
- [The Flask Website itself!](http://flask.pocoo.org/)
- [A reddit clone](https://github.com/codelucas/flask_reddit)



# Further reading

### Flask

- [The Flask Documentation](http://flask.pocoo.org/docs/0.11/)
- [A Flask tutorial to follow along with](https://github.com/miguelgrinberg/flask-pycon2014)
- [Another tutorial that gets into CSS styling](https://code.tutsplus.com/tutorials/an-introduction-to-pythons-flask-framework--net-28822)
- [The Flask mega tutorial](http://blog.miguelgrinberg.com/post/the-flask-mega-tutorial-part-ii-templates)
- [Flask's development server is not for production](https://stackoverflow.com/questions/12269537/is-the-server-bundled-with-flask-safe-to-use-in-production)
- [Setting up Flask on AWS EC2](http://bathompso.com/blog/Flask-AWS-Setup/). This should be your next step if you want to share your model with the world!
- [A great guide to those weird "decorators"](http://simeonfranklin.com/blog/2012/jul/1/python-decorators-in-12-steps/).

### Production coding

- Add [logging](https://fangpenlin.com/posts/2012/08/26/good-logging-practice-in-python/) to your code; you'll be very glad you did.
- Think ahead and include [error handling](https://eli.thegreenplace.net/2008/08/21/robust-exception-handling/), via [try/except clauses](https://jeffknupp.com/blog/2013/02/06/write-cleaner-python-use-exceptions/)
- Get more comfortable with git, including [feature branching](https://www.atlassian.com/git/tutorials/comparing-workflows/feature-branch-workflow).
- Include [unit tests](http://www.diveintopython.net/unit_testing/index.html); the [pytest module](http://pythontesting.net/framework/pytest/pytest-introduction/) is great.
- [Integrate databases](http://zetcode.com/db/sqlitepythontutorial/)!
- Beware technical debt, especially [machine learning technical debt](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43146.pdf).