# MLflow Prophet Tutorial

This `train.ipynb` Jupyter notebook predicts page views of a Wikipedia page using [Prophet](https://facebook.github.io/prophet/).

> This is the Jupyter notebook version of the `train.py` example.

Attribution
* The data set used in this example is from https://github.com/facebook/prophet/blob/master/examples/example_wp_log_peyton_manning.csv
* The data was scraped from https://en.wikipedia.org/wiki/Peyton_Manning


In [30]:

import mlflow.pyfunc
import cloudpickle
import fbprophet
from fbprophet import Prophet

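class FbProphetWrapper(mlflow.pyfunc.PythonModel):
    """Wrap a fitted Prophet model so MLflow can log and serve it as a pyfunc model."""

    def __init__(self, model):
        # Store the fitted model on the instance; a bare `m = model` would only
        # create a local variable and the model would be lost.
        self.m = model
        super(FbProphetWrapper, self).__init__()

    def load_context(self, context):
        # Import Prophet when the wrapper is loaded for serving.
        from fbprophet import Prophet
        return

    def predict(self, context, model_input):
        # `periods` is the number of future days to forecast, read from the
        # first row of the input (assumes the input has a `periods` column).
        future = self.m.make_future_dataframe(periods=model_input['periods'][0])
        return self.m.predict(future)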

conda_env = {
    # fbprophet is distributed through the conda-forge channel
    'channels': ['defaults', 'conda-forge'],
    'dependencies': [
      'fbprophet={}'.format(fbprophet.__version__),
      'cloudpickle={}'.format(cloudpickle.__version__),
    ],
    'name': 'fbp_env'
}
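
The wrapper's job is to read `periods` from the model input, build a future frame, and delegate to the wrapped model. The following is a minimal sketch of that flow, runnable without Prophet or MLflow. `StubForecaster` and `ForecastWrapper` are hypothetical names introduced only for this illustration; the stub returns plain lists where Prophet would return DataFrames.

```python
class StubForecaster:
    """Hypothetical stand-in for a fitted Prophet model (not part of fbprophet)."""

    def __init__(self, history_len):
        self.history_len = history_len

    def make_future_dataframe(self, periods):
        # Prophet returns a DataFrame of dates; this stub returns step indices
        # covering the history plus `periods` future steps.
        return list(range(self.history_len + periods))

    def predict(self, future):
        # Return a constant dummy forecast for every step.
        return [{"step": step, "yhat": 0.0} for step in future]


class ForecastWrapper:
    """Same shape as FbProphetWrapper, minus the MLflow base class."""

    def __init__(self, model):
        # Instance attribute, so the model travels with this wrapper object.
        self.m = model

    def predict(self, context, model_input):
        # Read the forecast length from the first row of the input.
        periods = int(model_input["periods"][0])
        future = self.m.make_future_dataframe(periods=periods)
        return self.m.predict(future)


wrapper = ForecastWrapper(StubForecaster(history_len=5))
forecast = wrapper.predict(None, {"periods": [3]})
print(len(forecast))  # 5 history steps + 3 future steps
```

Keeping the model on `self` matters here: the wrapper object is what gets serialized, so anything the `predict` method needs has to be reachable from the instance.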

In [31]:
# page view stats
def train(rolling_window):
    import os
    import warnings
    import sys

    import pandas as pd
    import numpy as np
    # Python
    from fbprophet import Prophet
    from fbprophet.diagnostics import cross_validation
    from fbprophet.diagnostics import performance_metrics

    import mlflow
    import mlflow.pyfunc
    
    import logging
    logging.basicConfig(level=logging.WARN)
    logger = logging.getLogger(__name__)

 
    warnings.filterwarnings("ignore")
    np.random.seed(40)

    # Read the csv file from the URL
    csv_url =\
        'https://raw.githubusercontent.com/facebook/prophet/e21a05f4f9290649255a2a306855e8b4620816d7/examples/example_wp_log_peyton_manning.csv'
    try:
        df = pd.read_csv(csv_url)
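    except Exception:
        # logger.exception records the traceback; re-raise because `df` is
        # undefined without the data and training cannot continue.
        logger.exception(
            "Unable to download training & test CSV, check your internet connection.")
        raise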

    
    # Useful for multiple runs (only doing one run in this sample notebook)    
    with mlflow.start_run():
        m = Prophet()
        m.fit(df)

        # Evaluate Metrics
        df_cv = cross_validation(m, initial='730 days', period='180 days', horizon='365 days')
        df_p = performance_metrics(df_cv, rolling_window=rolling_window)

        # Print out metrics
        print("Prophet model (rolling_window=%f):" % (rolling_window))
        print("  CV: \n%s" % df_cv.head())
        print("  Perf: \n%s" % df_p.head())

        # Log parameter, metrics, and model to MLflow
        mlflow.log_param("rolling_window", rolling_window)
        mlflow.log_metric("rmse", df_p.loc[0,'rmse'])

        mlflow.pyfunc.log_model("model", conda_env=conda_env, python_model=FbProphetWrapper(m))
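
The `initial`, `period`, and `horizon` arguments to `cross_validation` determine how many simulated historical forecasts are made. The sketch below is a simplified reading of that rule using only the standard library, not Prophet's own implementation: the latest cutoff leaves one full horizon of data after it, earlier cutoffs step back by `period`, and no cutoff may leave less than `initial` of training history. The function name `simulated_cutoffs` is hypothetical, and the first and last dates of the CSV are assumptions.

```python
from datetime import date, timedelta


def simulated_cutoffs(first_day, last_day, initial, period, horizon):
    """Return cutoff dates, oldest first, for simulated historical forecasts."""
    earliest_allowed = first_day + initial  # need `initial` of training data
    cutoff = last_day - horizon             # latest cutoff leaves a full horizon
    cutoffs = []
    while cutoff >= earliest_allowed:
        cutoffs.append(cutoff)
        cutoff -= period                    # step back one period at a time
    return sorted(cutoffs)


cutoffs = simulated_cutoffs(
    first_day=date(2007, 12, 10),  # assumed first date in the CSV
    last_day=date(2016, 1, 20),    # assumed last date in the CSV
    initial=timedelta(days=730),
    period=timedelta(days=180),
    horizon=timedelta(days=365),
)
print(len(cutoffs), cutoffs[0], cutoffs[-1])
```

Under these assumed dates the sketch yields 11 cutoffs between 2010-02-15 and 2015-01-20, which agrees with the `Making 11 forecasts` line in the run output.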

In [32]:
train(0.1)

INFO:fbprophet:Disabling daily seasonality. Run prophet with daily_seasonality=True to override this.
INFO:fbprophet:Making 11 forecasts with cutoffs between 2010-02-15 00:00:00 and 2015-01-20 00:00:00
Prophet model (rolling_window=0.100000):
  CV: 
          ds      yhat  yhat_lower  yhat_upper         y     cutoff
0 2010-02-16  8.960441    8.459464    9.468232  8.242493 2010-02-15
1 2010-02-17  8.726966    8.233446    9.192014  8.008033 2010-02-15
2 2010-02-18  8.610869    8.096453    9.075837  8.045268 2010-02-15
3 2010-02-19  8.532795    8.037051    9.053431  7.928766 2010-02-15
4 2010-02-20  8.274904    7.807443    8.790022  7.745003 2010-02-15
  Perf: 
  horizon       mse      rmse       mae      mape  coverage
0 37 days  0.495161  0.703677  0.505237  0.058536  0.689127
1 38 days  0.500957  0.707783  0.510231  0.059114  0.687985
2 39 days  0.523235  0.723350  0.516340  0.059715  0.685244
3 40 days  0.530583  0.728411  0.519241  0.060026  0.688899
4 41 days  0.538145  0.733584  0.