# Colab Hosting Example
Sample Jupyter notebook demonstrating how to host an ML model for demos. Tested on Colab, but reusable anywhere Jupyter notebooks can be run.

## Install dependencies

In [1]:
!pip install numpy scikit-learn flask flask-ngrok



## Load and split a dataset

In [2]:
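# Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this cell needs scikit-learn < 1.2 to run as written.
from sklearn.datasets import load_boston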
from sklearn.model_selection import train_test_split

X, y = load_boston(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=8)

## Train some ML models

In [3]:
# Simple linear regression model
from sklearn.linear_model import LinearRegression

simple_est = LinearRegression()
simple_est.fit(X_train, y_train)
simple_est.score(X_test, y_test)

0.6921420412834411

In [4]:
# Boosting ensemble model
from sklearn.ensemble import GradientBoostingRegressor

boosting_est = GradientBoostingRegressor()
boosting_est.fit(X_train, y_train)
boosting_est.score(X_test, y_test)

0.8493915796629993

In [5]:
# Bagging ensemble model
from sklearn.ensemble import ExtraTreesRegressor

bagging_est = ExtraTreesRegressor()
bagging_est.fit(X_train, y_train)
bagging_est.score(X_test, y_test)

0.8804546279217711
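
The numbers printed by `score` in the cells above are coefficients of determination (R²) on the held-out test split, which is what scikit-learn regressors return from `score`. As a rough illustration of what that metric measures, here is a minimal pure-Python sketch. The function name `r2_score_sketch` and the toy values are made up for this example and are not part of the notebook.

```python
def r2_score_sketch(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after the model's predictions
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Toy values, not taken from the Boston dataset
print(r2_score_sketch([3, 5, 7], [2.5, 5, 7.5]))  # 0.9375
```

A score of 1.0 means perfect predictions and 0.0 means no better than always predicting the mean, so on this split the two ensemble models (roughly 0.85 and 0.88) fit better than the linear model (roughly 0.69).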

## (Optional) Serialize and save ML models to disk

In [6]:
import pickle

for model in (simple_est, boosting_est, bagging_est):
    filename = type(model).__name__ + '.pickle'
    with open(filename, 'wb') as f:
        pickle.dump(model, f)

## (Optional) Load ML models from disk

In [7]:
import pickle

models = []

# Assuming same naming convention used in the previous code block
for filename in ('LinearRegression.pickle', 'GradientBoostingRegressor.pickle', 'ExtraTreesRegressor.pickle'):
    with open(filename, 'rb') as f:
        models.append(pickle.load(f))

simple_est, boosting_est, bagging_est = models

## Start a Flask server and tunnel to ngrok to expose externally

In [8]:
import numpy as np
from flask import Flask, request
from flask_ngrok import run_with_ngrok

app = Flask(__name__)
run_with_ngrok(app)

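# Default to the dataset mean if the user does not provide a param.
# Key order must match the dataset's feature column order, because
# predict_price passes the values to the models positionally.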
DEFAULT_PARAMS = {
    'CRIM': 3.61352,
    'ZN': 11.36,
    'INDUS': 11.14,
    'CHAS': 0.06917,
    'NOX': 0.5547,
    'RM': 6.285,
    'AGE': 68.57,
    'DIS': 3.795,
    'RAD': 9.549,
    'TAX': 408.2,
    'PTRATIO': 18.46,
    'B': 356.67,
    'LSTAT': 12.65,
}

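def fill_params(initial, fill_with):
    # Return one value per key in fill_with (in that order). Missing or
    # empty values fall back to the default; explicit zeros are kept.
    # Keys in initial that are not in fill_with are ignored.
    out = {}
    for k in fill_with:
        value = initial.get(k)
        out[k] = fill_with[k] if value in (None, '') else value
    return out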

### Test fill_params
# test_input = {'INDUS': 19, 'NOX': 0.8}
# test_val = fill_params(test_input, DEFAULT_PARAMS)
# print(test_val)

def predict_price(model, X):
    # Models require data in shape [[x1, x2, x3, ...], [x1, x2, x3, ...], ...]
    X = [np.fromiter(X.values(), dtype=np.float64)]
    Y = None
    if model == 'simple':
        Y = simple_est.predict(X)
    elif model == 'boosting':
        Y = boosting_est.predict(X)
    else:
        Y = bagging_est.predict(X)
    # Models return data in shape [Y1, Y2, Y3, ...]
    Y = Y[0]
    return f'${Y*1000:,.2f}'

### Test predict_price
# test_val = {'CRIM': 3.61352, 'ZN': 11.36, 'INDUS': 19, 'CHAS': 0.06917, 'NOX': 0.8, 'RM': 6.285, 'AGE': 68.57, 'DIS': 3.795, 'RAD': 9.549, 'TAX': 408.2, 'PTRATIO': 18.46, 'B': 356.67, 'LSTAT': 12.65}
# test_price = predict_price('boosting', test_val)
# print(test_price)

@app.route('/api/v1/simple-est')
def simple_est_endpoint():
    return predict_price('simple', fill_params(request.args.to_dict(), DEFAULT_PARAMS))

@app.route('/api/v1/boosting-est')
def boosting_est_endpoint():
    return predict_price('boosting', fill_params(request.args.to_dict(), DEFAULT_PARAMS))

@app.route('/api/v1/bagging-est')
def bagging_est_endpoint():
    return predict_price('bagging', fill_params(request.args.to_dict(), DEFAULT_PARAMS))

app.run()

 * Serving Flask app "__main__" (lazy loading)
 * Environment: production
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
 * Running on http://017e51068ca4.ngrok.io
 * Traffic stats available on http://127.0.0.1:4040


## Using the endpoints
If the code block above ran successfully, you should see output that looks like this:
```
* Serving Flask app "__main__" (lazy loading)
* Environment: production
 WARNING: This is a development server. Do not use it in a production deployment.
 Use a production WSGI server instead.
* Debug mode: off
* Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
* Running on http://xxxxxxxxxxxx.ngrok.io
* Traffic stats available on http://127.0.0.1:4040
```

You can now query the `ngrok` URL like a normal REST API to use the ML models. Examples:

### Use the `bagging` model to predict the price of a Boston house that has 8 rooms:
```
http://xxxxxxxxxxxx.ngrok.io/api/v1/bagging-est?RM=8
```

### Use the `simple` model to predict the price of a Boston house on a tract that's next to the Charles River and has a pupil-teacher ratio of 13:
```
http://xxxxxxxxxxxx.ngrok.io/api/v1/simple-est?CHAS=1&PTRATIO=13
```
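The same URLs can be assembled programmatically. The sketch below uses only the standard library; `build_query_url` is a hypothetical helper and `BASE_URL` is the placeholder host from the examples above, to be replaced with the ngrok URL printed by your own run.

```python
from urllib.parse import urlencode

# Placeholder host: substitute the ngrok URL printed by app.run()
BASE_URL = 'http://xxxxxxxxxxxx.ngrok.io'


def build_query_url(endpoint, **features):
    """Return the full GET URL for one endpoint and a set of feature overrides."""
    url = f'{BASE_URL}/api/v1/{endpoint}'
    if features:
        url += '?' + urlencode(features)
    return url


print(build_query_url('bagging-est', RM=8))
print(build_query_url('simple-est', CHAS=1, PTRATIO=13))
```

While the server is running, the resulting URL can be fetched with any HTTP client, for example `urllib.request.urlopen(url).read().decode()`.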

## API Reference
See [this notebook](https://rstudio-pubs-static.s3.amazonaws.com/364346_811c9012a14847428c9b1fc1e956431a.html) for a more detailed explanation of this dataset.

### Parameters (same for all 3 endpoints)
- `CRIM`: per capita crime rate by town
- `ZN`: proportion of residential land zoned for lots over 25,000 sq.ft.
- `INDUS`: proportion of non-retail business acres per town
- `CHAS`: Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
- `NOX`: nitric oxides concentration (parts per 10 million)
- `RM`: average number of rooms per dwelling
- `AGE`: proportion of owner-occupied units built prior to 1940
- `DIS`: weighted distances to five Boston employment centres
- `RAD`: index of accessibility to radial highways
- `TAX`: full-value property-tax rate per \$10,000
- `PTRATIO`: pupil-teacher ratio by town
- `B`: 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town
- `LSTAT`: % lower status of the population

The target variable is `MEDV`, the median value of owner-occupied homes in \$1000's. It is what the models predict, not an input parameter.

If no input is passed for a specific parameter, that feature's mean over the dataset is used.
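
The fallback rule can be sketched independently of Flask, in the same spirit as the notebook's `fill_params`. The helper `merge_with_defaults` is hypothetical and, for brevity, uses only a three-feature subset of the defaults. It also converts values to `float` up front, on the assumption that query-string values arrive as strings.

```python
# Subset of the notebook's DEFAULT_PARAMS, shortened for illustration
DEFAULTS = {'CHAS': 0.06917, 'RM': 6.285, 'PTRATIO': 18.46}


def merge_with_defaults(query, defaults):
    """Return one float per default key, preferring values found in query."""
    merged = {}
    for name, default in defaults.items():
        raw = query.get(name)
        # Treat absent or empty values as "not provided"; keep explicit zeros
        merged[name] = default if raw in (None, '') else float(raw)
    return merged


print(merge_with_defaults({'CHAS': '0', 'RM': '8'}, DEFAULTS))
# {'CHAS': 0.0, 'RM': 8.0, 'PTRATIO': 18.46}
```

Iterating over the defaults rather than the query keeps the feature order fixed and silently drops any parameter name the models do not know.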

### Endpoints
- GET /api/v1/simple-est
- GET /api/v1/boosting-est
- GET /api/v1/bagging-est

All endpoints return the predicted price of the house as a dollar-formatted string (the model's `MEDV` prediction multiplied by 1000).
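
Since `predict_price` formats its result as a string rather than returning a number, a client that wants to do arithmetic on the prediction has to parse it first. A minimal sketch, assuming the `$1,234.56` style that the notebook's f-string produces; `parse_price` and the sample string are hypothetical.

```python
def parse_price(text):
    """Convert a string such as '$24,500.00' to a float."""
    return float(text.strip().replace('$', '').replace(',', ''))


# Sample value made up for illustration, not real model output
print(parse_price('$24,500.00'))  # 24500.0
```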
