# Deployment (Including Serialization)
This notebook walks through the basics of setting up a model to be served from a web server.

In [1]:
%matplotlib inline 
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import joblib
plt.style.use('ggplot')

We can use the `joblib` library to deserialize the serialized pipeline. However, every custom transformer class the pipeline references must first be loaded into the current scope, or deserialization will fail:

In [2]:
# Intentionally commented out: this call fails until the custom transformer classes are in scope
# pipe = joblib.load("train_pipe.joblib")
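
The failure mode is easy to reproduce in isolation. The sketch below uses the standard-library `pickle` module, which `joblib` builds on, together with a throwaway module called `demo_transformers_sketch` and a toy `Doubler` class. Both names are made up for illustration and are not part of `pipeline.py`; the exact exception you see with the real pipeline may differ depending on where its classes were defined.

```python
import pickle
import sys
import types

# Hypothetical stand-in for pipeline.py: a module holding one toy class.
demo = types.ModuleType("demo_transformers_sketch")
exec(
    "class Doubler:\n"
    "    def transform(self, x):\n"
    "        return 2 * x\n",
    demo.__dict__,
)
sys.modules["demo_transformers_sketch"] = demo

# Pickle stores the class by reference (module name + class name), not its code.
payload = pickle.dumps(demo.Doubler())

# Simulate a fresh session in which the defining module is not available.
del sys.modules["demo_transformers_sketch"]
try:
    pickle.loads(payload)
    load_failed = False
except ModuleNotFoundError:
    load_failed = True

# Once the module is importable again, the same bytes load without trouble.
sys.modules["demo_transformers_sketch"] = demo
restored = pickle.loads(payload)
doubled = restored.transform(21)
```

The serialized bytes never change between the two attempts; only the availability of the class definition does, which is why importing the transformers has to come before loading the pipeline.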

I've put all the relevant transformers in a separate script called `pipeline.py`, and we can import them all in one go:

In [3]:
from pipeline import *

In [24]:
import pipeline
dir(pipeline)

['BaseEstimator',
 'DateTimeExpander',
 'FeatureSelector',
 'FeelsLikeExpander',
 'LagExpander',
 'TargetDropper',
 'Temp',
 'TransformerMixin',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'feels_like',
 'pd']

In [4]:
! cat pipeline.py

import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin

class FeatureSelector(BaseEstimator, TransformerMixin):

    def __init__(self, feature_names, ts_index):
        self.feature_names = feature_names
        self.index = ts_index

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        X = X.set_index(pd.to_datetime(X[self.index]))
        return X[self.feature_names]

class DateTimeExpander(BaseEstimator, TransformerMixin):

    def __init__(self):
        pass

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        dts = pd.Series(X.index).dt
        X["dts_month"] = dts.month.values
        X["dts_hour"] = dts.hour.values
        X["dts_day_of_week"] = dts.dayofweek.values

        return X

from meteocalc import Temp, feels_like
class FeelsLikeExpander(BaseEstimator, TransformerMixin):

    def __init__(self, temp_col, hum_col, windspeed_col, atemp_col):
   

With the transformer classes in scope, the pipeline now deserializes correctly:

In [5]:
pipe = joblib.load("train_pipe.joblib")

We can see that the steps of the pipeline are preserved intact:

In [7]:
pipe.steps



[('feat_pipe',
  Pipeline(steps=[('feat_select',
                   FeatureSelector(feature_names=['temp', 'hum', 'windspeed',
                                                  'cnt'],
                                   ts_index=None)),
                  ('feat_dts', DateTimeExpander()),
                  ('feat_feels',
                   FeelsLikeExpander(atemp_col='atemp', hum_col=None,
                                     temp_col=None, windspeed_col=None)),
                  ('feat_lag', LagExpander(lag_col=None)),
                  ('target_dropper', TargetDropper(target_col=None))])),
 ('scaler', MinMaxScaler()),
 ('regressor', LinearRegression())]

Now we can load some data for testing the deserialized pipeline. We don't need a train/test split here; the goal is only to verify that the pipeline runs.

In [6]:
dat = pd.read_csv("../data/bike-hour-raw.csv")

Because scikit-learn's `predict` operates on whole arrays, we can request and retrieve many predictions in a single call:

In [8]:
pipe.predict(dat[:10])

array([  6.30861094,  10.18964081,  19.43346503,  33.97426817,
        43.16755554,  37.72667015,  56.25009971,  60.13230485,
        80.01827321, 110.57811815])

To make requests against a web server, we first need to *serialize* the data on our end so that it can be transmitted in the body of a web request.

(Launch the server from the other notebook before running the cells below.)

In [27]:
serialized_input = dat[:1].to_json()
serialized_input

'{"temp":{"0":3.28},"hum":{"0":81.0},"windspeed":{"0":0.0},"casual":{"0":3},"registered":{"0":13},"cnt":{"0":16},"dtetime":{"0":"2011-01-01 00:00:00"}}'
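
The string above is column-oriented: each column name maps to a dictionary of row label to value, which is the default layout produced by `DataFrame.to_json()`. As a rough sketch using only the standard-library `json` module (the variable names `columns`, `row_labels`, and `records` are illustrative), a receiver could turn that payload back into one record per row like this:

```python
import json

# The payload shown above, copied verbatim.
serialized_input = (
    '{"temp":{"0":3.28},"hum":{"0":81.0},"windspeed":{"0":0.0},'
    '"casual":{"0":3},"registered":{"0":13},"cnt":{"0":16},'
    '"dtetime":{"0":"2011-01-01 00:00:00"}}'
)

# Top level: {column_name: {row_label: value}}
columns = json.loads(serialized_input)

# Collect the row labels, then build one dictionary per row.
row_labels = sorted(
    {label for values in columns.values() for label in values}, key=int
)
records = [
    {name: values[label] for name, values in columns.items()}
    for label in row_labels
]
```

Note that the timestamp travels as a plain string, so whatever consumes the payload has to parse it back into a datetime itself.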

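With properly serialized data, we can pass the payload as *POST* data inside a request, and our server can pick it up from there. Since we send only the first row, the response should match the first value returned by the local `pipe.predict` call above.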

In [28]:
import requests

url = "http://127.0.0.1:5000"
# Send the JSON string as a form field named "input"
response = requests.post(url, data={"input": serialized_input})
response.json()

[6.308610938795027]
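
The server itself lives in the other notebook and is not shown here, so the following is only a sketch of the receiving side of this exchange, not the actual implementation. It assumes the payload arrives as a form field named `input` (as in the request above), uses the standard-library `http.server` rather than whatever framework the real server uses, and replaces the model with a hypothetical `stub_predict` function that returns a placeholder value per row.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlencode


def stub_predict(columns):
    """Hypothetical stand-in for pipe.predict: one placeholder value per row."""
    n_rows = len(next(iter(columns.values())))
    return [0.0] * n_rows


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the form-encoded body and pull out the "input" field.
        length = int(self.headers.get("Content-Length", 0))
        form = parse_qs(self.rfile.read(length).decode("utf-8"))
        columns = json.loads(form["input"][0])

        # Reply with a JSON list of predictions.
        body = json.dumps(stub_predict(columns)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the sketch quiet


# Port 0 asks the OS for any free port, so this does not clash with port 5000.
server = HTTPServer(("127.0.0.1", 0), PredictHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: post a one-row, column-oriented payload as form data.
payload = json.dumps({"temp": {"0": 3.28}, "hum": {"0": 81.0}})
form_body = urlencode({"input": payload})
conn = HTTPConnection("127.0.0.1", server.server_port, timeout=5)
conn.request(
    "POST", "/", body=form_body,
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)
predictions = json.loads(conn.getresponse().read())
conn.close()

server.shutdown()
server.server_close()
```

In a real deployment, `stub_predict` would be replaced by rebuilding a DataFrame from the payload and calling the deserialized pipeline's `predict`, with the pipeline loaded once at server startup rather than on every request.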