<a href="https://colab.research.google.com/github/paullo0106/prophet_anomaly_detection/blob/master/prophet_anomaly_detection.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Anomaly detection using Facebook Prophet:

*This notebook is based on the following [tutorial](https://medium.com/analytics-vidhya/time-series-forecast-anomaly-detection-with-facebook-prophet-558136be4b8d) written by Paul Lo. It makes use of the open-source project [Prophet](https://facebook.github.io/prophet/), a forecasting procedure implemented in R and Python, based on the paper of [Taylor and Letham, 2017](https://peerj.com/preprints/3190/).*

**Goal of the script:**

Here, we aim to detect the activity of a resident over time. An activity pattern is defined as a **transient**, reversible and significant change in a single parameter.
Examples of patterns:
* Someone goes to bed (CO$_{2}$ decreases or increases and stays stable)
* Someone wakes up in the morning
* Someone wakes up at night for a bathroom break
* Unknown activities that nevertheless occur regularly

**Motivations to use a forecasting method to detect activity:**

In the notebook Data_Exploration_Guillaume.ipynb, I plotted the data in 2 and 3 dimensions. The limitations were the following:
- 2D plotting could be used to detect transient changes in time series, but the significant changes between day and night baselines make the detection difficult.
- 3D plotting and k-means clustering between three parameters (light, CO$_{2}$, temperature or humidity) might provide information on the physical relationships between the parameters, but not on the activity of the resident.

Since elderly people tend to keep the same rhythm every day (e.g. waking up, going for a walk and going to bed at the same time of day), we reasoned that the trend in CO$_{2}$ should be quite similar between days or weekdays. This seasonality is well handled by [Prophet](https://facebook.github.io/prophet/), an algorithm from Facebook dedicated to time series forecasting from seasonal data.

![Prophet_figure.png](Prophet_figure.png)

We took advantage of the continuous recording of CO$_{2}$ from Caru’s sensor to model the activity patterns of the resident. The data from the last ten days of recording were used to build a model of the resident's typical activity pattern. For instance, if the sensor is located in the resident's bedroom, the model will predict that the CO$_{2}$ will increase and stay stable from the time the resident goes to bed until they wake up and leave the bedroom.

A GridSearch function was designed to optimize the model by tuning the hyperparameters. The GridSearch returns a table of the tested parameters, sorted according to the **Mean Absolute Percentage Error** (MAPE). The optimized model is then used to predict what today's activity pattern should look like. The predicted pattern from the model is shown in blue. By comparing the predicted results with the latest measurements, we detect how often, and when, a resident had an unusual activity. Unusual activity is shown in red in the graph.
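
The comparison between prediction and measurement can be sketched as follows. This is a minimal illustration, assuming that the forecast provides a lower and an upper bound for each time point (Prophet's forecast dataframe exposes them as `yhat_lower` and `yhat_upper`). The function name `flag_outliers` and the sample values are hypothetical, and this is not the code of `get_outliers` in helper.py.

```python
def flag_outliers(observed, lower, upper):
    """Return 1 above the predicted interval, -1 below it, 0 inside it."""
    flags = []
    for y, lo, hi in zip(observed, lower, upper):
        if y > hi:
            flags.append(1)
        elif y < lo:
            flags.append(-1)
        else:
            flags.append(0)
    return flags


# Hypothetical CO2 readings (ppm) with the bounds of the predicted interval
observed = [450.0, 620.0, 480.0, 390.0]
lower = [430.0, 440.0, 450.0, 420.0]
upper = [500.0, 510.0, 520.0, 490.0]

flags = flag_outliers(observed, lower, upper)
print(flags)
```

A narrower predicted interval flags more points, which is why `interval_width` acts as the anomaly threshold in this notebook.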


**Structure of the script:**

This script is designed to:
1. Load a device-specific dataset:
    - either located in the folder ../data/interim of the repository
    - or located in an Amazon S3 bucket.
2. Pre-visualize one part of the device-specific dataset and choose:
    - the parameter to study
    - begin and end of dataframe
    - the period of data averaging (example: use one point every 1min instead of 1 point every 20sec)
3. Train a Prophet model:
    * **Run a single instance** with specific parameters of Prophet
    * **Hyperparameter tuning**: execute a GridSearch to test several ranges of parameters of Prophet. This returns a table of the tested parameters and the median of the mean absolute percentage error (MAPE) calculated on the predicted data. Since the data vary from one device to another, this function helps to determine the best "device-specific" parameters for prediction. As an example, I ran 4 tests using device 33:
        * Test 1: Testing different sampling periods, changepoint_prior_scale and daily_fo of the Prophet model
        * Test 2: Testing different shorter sampling periods, changepoint_prior_scale and daily_fo of the Prophet model
        * Test 3: Testing different training period durations, changepoint_prior_scale and daily_fo of the Prophet model
        * Test 4: Testing different training periods, daily_fo of the Prophet model
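
For reference, the metric used to rank the models is the mean absolute percentage error, which Prophet reports as `mape` in the output of `performance_metrics`. The sketch below shows the definition on two made-up values. The helper name `mape` and the numbers are illustrative only, and the metric assumes that the measured values are never zero, which is reasonable for CO$_{2}$.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, as a fraction (0.1 means 10%)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical measured and predicted CO2 values
actual = [100.0, 200.0]
predicted = [110.0, 180.0]

error = mape(actual, predicted)  # both points are off by 10%
print(error)
```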



**Run the script:**

How to run the script:
* Load the script in Jupyter Notebook (not JupyterLab, because the interactive plots are not shown in JupyterLab)
* **To load the data:** Choose your device number.
* **To visualize the data before analysis:** Look at the data using df_generator. The arguments are:
```python
df, predict_n, today_index, lookback_n = df_generator(
                df_dev, # device-specific dataframe
                device, # String. Device identifier (see load_ds)
                parameter, # String. among 'light', 'temperature', 'humidity', 'co2'
                begin, # String. Day of the beginning of the new dataframe
                end, # String. Day of the end of the new dataframe
                sampling_period_st, # String. Duration of bin for data downsampling
                sampling_period_num, # Float. Duration of bin for data downsampling
                graph=None, # Set to None to show the graph and a value not to show it
                predict_day=1 # Number of days predicted. 1 by default
)
```
* **To predict the data:** Use and customize the cell containing:
```python
df_p, df_pred = prophet(
                df_dev,
                device,
                parameter='co2',
                begin='2019-03-26',
                end='2019-04-03',
                sampling_period_min=5,
                graph=1, 
                predict_day=1,
                interval_width=0.6, # Anomaly threshold
                changepoint_prior_scale=0.01, # Adjusting trend flexibility
                daily_fo = 12 # Fourier order defined for the seasonality
)
```
* **Hyperparameter tuning:** Use and customize the cell containing:

```python
# Define the parameters to test
prophet_grid = {'df_dev' : [df_dev],
                'device' : [device],
                'parameter' : ['co2'],
                'begin' : ['2019-03-22', '2019-03-24'],
                'end' : ['2019-04-03'],
                'sampling_period_min' : [1, 5, 30],
                'graph' : [1],
                'predict_day' : [1],
                'interval_width' : [0.6],
                'changepoint_prior_scale' : list((0.01, 15)),
                'daily_fo' : [3, 9, 18],
               }

# Run GridSearch_Prophet
mape_table = GridSearch_Prophet(list(ParameterGrid(prophet_grid)), metric='mape')
```
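
To estimate how long a GridSearch will run, it helps to count the combinations first. The sketch below expands the same value lists with `itertools.product` from the standard library. The helper `expand_grid` is hypothetical, and iterating over the sorted keys is an assumption made to mimic how `ParameterGrid` orders its output.

```python
from itertools import product

# Same value lists as the grid above, without the single-valued entries
grid = {
    'begin': ['2019-03-22', '2019-03-24'],
    'sampling_period_min': [1, 5, 30],
    'changepoint_prior_scale': [0.01, 15],
    'daily_fo': [3, 9, 18],
}


def expand_grid(grid):
    """Yield one dict per combination of the listed values."""
    keys = sorted(grid)
    for values in product(*(grid[key] for key in keys)):
        yield dict(zip(keys, values))


combinations = list(expand_grid(grid))
print(len(combinations))  # 2 * 3 * 2 * 3 combinations
print(combinations[0])
```

With this ordering, `sampling_period_min` changes fastest and `begin` slowest, which is consistent with the log of Test 1, where prophet_instance nb 18 is the first run that starts on 2019-03-24.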

**Further thoughts and improvements:**

* Forecasting events such as human activity using Caru's sensor data is difficult because:
    1. They don't occur regularly. In this sense, Dynamic Time Warping may help, but it does not provide perfectly clean results.
    2. The baselines exhibit trends of significant amplitude (see the humidity data for instance). When the amplitude of an event is smaller than the trend, transient activities are difficult to detect.
* If time allows, take advantage of the following [tutorial](https://towardsdatascience.com/implementing-facebook-prophet-efficiently-c241305405a3) for Prophet optimization.
* Once the activity patterns are identified, the script should be extended to cluster them (make use of the dataframe **df_pred** produced by get_outliers, which contains all the data necessary to do so).
* Outlook of the next milestones:
    1. Detect the times someone goes to bed or wakes up in the morning
    2. Detect nightly bathroom breaks
    3. Identify the unusual days
        * Make a prediction over the last 7 days.
        * Count the outliers over the next day.
        * When the number of outliers is greater than the mean number of outliers + 2 SD, flag the day.
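
The last milestone could be prototyped along these lines. This is only a sketch of the rule stated above (mean + 2 SD), assuming that the daily outlier counts are already available. The function name `is_unusual_day`, the use of the sample standard deviation and the counts are all hypothetical.

```python
from statistics import mean, stdev


def is_unusual_day(previous_counts, today_count, n_sd=2):
    """True when today's outlier count exceeds mean + n_sd * SD of the previous days."""
    threshold = mean(previous_counts) + n_sd * stdev(previous_counts)
    return today_count > threshold


# Hypothetical outlier counts for the last 7 days
previous_counts = [3, 4, 5, 4, 3, 4, 5]

print(is_unusual_day(previous_counts, today_count=12))
print(is_unusual_day(previous_counts, today_count=5))
```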

> Questions:
> Contact Guillaume Azarias at guillaume.azarias@hotmail.com

## Import the relevant libraries

In [1]:
import pandas as pd
import numpy as np
import time
import re

import seaborn as sns
sns.set()
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter

# Note that the interactive plot may not work in Jupyter lab, but only in Jupyter Notebook (conflict of javascripts)
%matplotlib widget 

from datetime import datetime, timedelta
from pytz import timezone

In [2]:
import fbprophet
from fbprophet import Prophet
from fbprophet.diagnostics import cross_validation, performance_metrics
from fbprophet.plot import plot_cross_validation_metric

In [3]:
fbprophet.__version__

'0.6'

In [4]:
from sklearn.model_selection import ParameterGrid
import itertools
from random import sample

In [5]:
# Import the functions from the helper.py
from helper import load_ds, df_dev_formater, find_index, df_generator, prophet_fit, prophet_plot, get_outliers, execute_cross_validation_and_performance_loop, prophet, GridSearch_Prophet

## Load a device-specific dataset

**Note on the dataframe**
- To speed up the processing, this script was designed to work with a device-specific dataframe that should be located in the folder ../data/interim. The user enters the number of the device in a two-digit string format ('01', '02'... '51') and follows the instructions.
- If the device-specific dataframe does not exist yet, it will be generated automatically. This script expects to find the following files in the folder ../Data/raw:
    * 20200331_propulsionlab_caru_data_part1.csv
    * 20200331_propulsionlab_caru_data_part2.csv
    * 20200331_propulsionlab_caru_data_part3.csv
    * 20200331_propulsionlab_caru_data_part4.csv
    * 20200331_propulsionlab_caru_data_part5.csv

### Possibility 1: Load data from the Amazon S3 bucket:

This option makes it possible to load data stored in a cloud service such as AWS. Here is the
[link to the caru bucket on Amazon](https://s3.console.aws.amazon.com/s3/buckets/carudata/?region=eu-north-1&tab=overview) *(credentials required)*

### Possibility 2:  Load a local file

In [7]:
# Enter the number of the device as a string with two digits and execute the cell
device_nb = '33'

# Load the device-specific dataframe. Show the most important information
assert isinstance(device_nb, str) and len(device_nb)==2 and sum(d.isdigit() for d in device_nb)==2, 'WARNING: device_nb must be a string of 2-digits!'
assert int(device_nb)>=1 and int(device_nb)<=51, 'This device does not belong to the dataframe'
device, df_dev = load_ds(device_nb)

# Convert the variable device from a np.array to a string
regex = re.compile('[^A-Za-z0-9]')
device = regex.sub('', str(device))

# Show the full device-specific dataframe
print('\nShowing the dataframe. Use it to find the time window you would like to investigate:')
start = df_dev['ds'][0].strftime('%Y-%m-%d')
last = df_dev['ds'][df_dev.shape[0]-1].strftime('%Y-%m-%d')
df, predict_n, today_index, lookback_n = df_generator(df_dev, device, 'co2', start, last,  '1T', 0.016)
plt.show()

Check report:
##############################################
['Device contained in the dataset: device33']
['Tenant using the device: tenant09']

Data types:
device                                object
tenant                                object
ds             datetime64[ns, Europe/Zurich]
light                                float64
temperature                          float64
humidity                             float64
co2                                  float64
dtype: object

Available data from the 2019-03-07 to the 2019-05-01.

Showing the dataframe. Use it to find the time window you would like to investigate:



Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-07 to the 2019-05-01.


## Data visualisation

In [7]:
start = '2019-03-26'
last = '2019-04-03'
df, predict_n, today_index, lookback_n = df_generator(df_dev, device, 'co2', start, last,  '5T', 0.08)


Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-26 to the 2019-04-03.
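
The downsampling step averages the raw readings into fixed time bins (`'5T'` is the pandas alias for 5 minutes). The sketch below reproduces the idea with the standard library only, so that it runs without the dataset. In the notebook itself, a pandas resampling would be the natural tool. The helper `downsample_mean` and the readings are hypothetical and do not come from helper.py.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def downsample_mean(samples, bin_minutes):
    """Average (timestamp, value) pairs into fixed bins of bin_minutes."""
    bin_seconds = bin_minutes * 60
    bins = defaultdict(list)
    for ts, value in samples:
        # Seconds since midnight, floored to the start of the bin
        midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
        offset = int((ts - midnight).total_seconds()) // bin_seconds * bin_seconds
        bins[midnight + timedelta(seconds=offset)].append(value)
    return [(start, mean(values)) for start, values in sorted(bins.items())]


# Six hypothetical readings, one every 20 seconds
first = datetime(2019, 3, 26, 0, 0, 0)
raw = [(first + timedelta(seconds=20 * i), 400.0 + i) for i in range(6)]

downsampled = downsample_mean(raw, bin_minutes=1)
for start, value in downsampled:
    print(start, value)
```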


In [8]:
find_index(df, '2019-03-08', '20:30')

2291


## Model training

### Running a Prophet instance

In [9]:
# Single instance check
df_p, df_pred = prophet(
            df_dev,
            device,
            parameter='co2',
            begin='2019-03-26',
            end='2019-04-03',
            sampling_period_min=5,
            graph=1, 
            predict_day=1,
            interval_width=0.6,
            changepoint_prior_scale=0.01,
            daily_fo = 12
)

Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-26 to the 2019-04-03.
o Trained on the data from the 2019-03-26 to the 2019-04-02 (6 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 2 forecasts with cutoffs between 2019-04-01 06:54:43.412000 and 2019-04-01 18:54:43.412000


**Plotting the Cross validation performance metrics over time**

In [10]:
df_p.dtypes

horizon     timedelta64[ns]
mse                 float64
rmse                float64
mae                 float64
mape                float64
mdape               float64
coverage            float64
dtype: object

In [14]:
today = datetime.today()
df_p['dt'] = today + df_p['horizon']
plt.close()
ax = sns.lineplot(x="dt", y="mape", data=df_p)
myFmt = DateFormatter("%H:%M")
ax.xaxis.set_major_formatter(myFmt)
plt.xlabel('Daytime of the predicted day', fontsize=10)
plt.ylabel('Mean Absolute Percentage Error', fontsize=10)
plt.title('Mean Absolute Percentage Error of the predicted day', fontsize=10)
plt.show()


In [16]:
# Single instance example
df_p, df_pred = prophet(
            df_dev,
            device,
            parameter='co2',
            begin='2019-03-24',
            end='2019-04-03',
            sampling_period_min=1,
            graph=1, 
            predict_day=1,
            interval_width=0.6,
            changepoint_prior_scale=0.001,
            daily_fo = 3
)

Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-24 to the 2019-04-03.
o Trained on the data from the 2019-03-24 to the 2019-04-02 (8 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-30 06:58:44.135000 and 2019-04-01 18:58:44.135000


### Running a GridSearch of Prophet: Hyperparameter optimization

**See the result analysis in ../docs/Prophet_notes.md**

See the [official Prophet documentation](https://facebook.github.io/prophet/docs/seasonality,_holiday_effects,_and_regressors.html#fourier-order-for-seasonalities) and this [tutorial](https://towardsdatascience.com/implementing-facebook-prophet-efficiently-c241305405a3) for a comprehensive guide to seasonality settings.

**Parameters to optimize**
* *interval_width*=0.9: anomaly threshold or tolerance. Decreasing it increases the sensitivity of the outlier detection.
* *yearly_seasonality*=False, *weekly_seasonality*=False, *daily_seasonality*=False *--> Use your own seasonality.*
* *changepoint_prior_scale*=0.01: adjusts the trend flexibility. Low values make the trend more rigid (toward underfitting), high values make it more flexible (toward overfitting). *It does not make sense to use more than 0.1: this just increases the tolerance.*
* *model.add_seasonality*(name='daily', period=1, ...): you can add as many seasonalities as you want. The period is a float, expressed in days.
    * fourier_order=12 (it is better to use a Fourier order > 10)
    * prior_scale=0.1
* *n_changepoints* seems to be a parameter that would be helpful in this project (I did not have the time to check it yet).
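
To give an intuition for `fourier_order`: Prophet models each seasonality as a partial Fourier series, so a seasonality of order N contributes N sine and N cosine terms whose weights are fitted to the data. A higher order lets the daily curve follow faster changes, at the risk of overfitting. The sketch below only illustrates how these terms are built. It is not Prophet's internal code, and `fourier_features` is a hypothetical helper.

```python
import math


def fourier_features(t_days, period_days, fourier_order):
    """Return the 2 * fourier_order sine/cosine terms evaluated at time t_days."""
    features = []
    for n in range(1, fourier_order + 1):
        angle = 2.0 * math.pi * n * t_days / period_days
        features.append(math.sin(angle))
        features.append(math.cos(angle))
    return features


# 06:00 expressed in days, for a daily seasonality (period=1) with fourier_order=12
features = fourier_features(t_days=0.25, period_days=1, fourier_order=12)
print(len(features))
```

With `fourier_order=12`, the daily seasonality is therefore described by 24 fitted coefficients, against 6 for `fourier_order=3`.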

**Test 1: Testing different sampling periods, changepoint_prior_scale and daily_fo of the Prophet model**

In [20]:
# GridSearch.
start_time = time.time()

# Parameters
prophet_grid = {'df_dev' : [df_dev],
                'device' : [device],
                'parameter' : ['co2'],
                'begin' : ['2019-03-22', '2019-03-24'],
                'end' : ['2019-04-03'],
                'sampling_period_min' : [1, 5, 30],
                'graph' : [1],
                'predict_day' : [1],
                'interval_width' : [0.6],
                'changepoint_prior_scale' : list((0.01, 15)), # list(np.arange(0.01,30,1).tolist()),
                'daily_fo' : [3, 9, 18],
#                 'holidays_prior_scale' : list((1000,100,10,1,0.1)),
               }

# Run GridSearch_Prophet
mape_table = GridSearch_Prophet(list(ParameterGrid(prophet_grid)), metric='mape')


end_time = time.time()
dur_min = int((end_time - start_time)/60)


print('GridSearch finished in ' + str(dur_min) + ' minutes.')



prophet_instance nb 0
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-03.
o Trained on the data from the 2019-03-22 to the 2019-04-02 (10 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-28 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 1
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-03.
o Trained on the data from the 2019-03-22 to the 2019-04-02 (10 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-28 06:54:43.412000 and 2019-04-01 18:54:43.412000



prophet_instance nb 2
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-03.
o Trained on the data from the 2019-03-23 to the 2019-04-02 (10 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-28 06:29:58.738000 and 2019-04-01 18:29:58.738000



prophet_instance nb 3
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-03.
o Trained on the data from the 2019-03-22 to the 2019-04-02 (10 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-28 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 4
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-03.
o Trained on the data from the 2019-03-22 to the 2019-04-02 (10 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-28 06:54:43.412000 and 2019-04-01 18:54:43.412000



prophet_instance nb 5
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-03.
o Trained on the data from the 2019-03-23 to the 2019-04-02 (10 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-28 06:29:58.738000 and 2019-04-01 18:29:58.738000



prophet_instance nb 6
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-03.
o Trained on the data from the 2019-03-22 to the 2019-04-02 (10 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-28 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 7
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-03.
o Trained on the data from the 2019-03-22 to the 2019-04-02 (10 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-28 06:54:43.412000 and 2019-04-01 18:54:43.412000



prophet_instance nb 8
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-03.
o Trained on the data from the 2019-03-23 to the 2019-04-02 (10 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-28 06:29:58.738000 and 2019-04-01 18:29:58.738000



prophet_instance nb 9
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-03.
o Trained on the data from the 2019-03-22 to the 2019-04-02 (10 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-28 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 10
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-03.
o Trained on the data from the 2019-03-22 to the 2019-04-02 (10 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-28 06:54:43.412000 and 2019-04-01 18:54:43.412000



prophet_instance nb 11
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-03.
o Trained on the data from the 2019-03-23 to the 2019-04-02 (10 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-28 06:29:58.738000 and 2019-04-01 18:29:58.738000



prophet_instance nb 12
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-03.
o Trained on the data from the 2019-03-22 to the 2019-04-02 (10 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-28 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 13
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-03.
o Trained on the data from the 2019-03-22 to the 2019-04-02 (10 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-28 06:54:43.412000 and 2019-04-01 18:54:43.412000



prophet_instance nb 14
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-03.
o Trained on the data from the 2019-03-23 to the 2019-04-02 (10 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-28 06:29:58.738000 and 2019-04-01 18:29:58.738000



prophet_instance nb 15
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-03.
o Trained on the data from the 2019-03-22 to the 2019-04-02 (10 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-28 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 16
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-03.
o Trained on the data from the 2019-03-22 to the 2019-04-02 (10 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-28 06:54:43.412000 and 2019-04-01 18:54:43.412000



prophet_instance nb 17
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-03.
o Trained on the data from the 2019-03-23 to the 2019-04-02 (10 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-28 06:29:58.738000 and 2019-04-01 18:29:58.738000



prophet_instance nb 18
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-24 to the 2019-04-03.
o Trained on the data from the 2019-03-24 to the 2019-04-02 (8 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-30 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 19
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-24 to the 2019-04-03.
o Trained on the data from the 2019-03-24 to the 2019-04-02 (8 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-30 06:54:43.412000 and 2019-04-01 18:54:43.412000



prophet_instance nb 20
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-24 to the 2019-04-03.
o Trained on the data from the 2019-03-24 to the 2019-04-02 (8 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-30 06:29:58.738000 and 2019-04-01 18:29:58.738000



prophet_instance nb 21
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-24 to the 2019-04-03.
o Trained on the data from the 2019-03-24 to the 2019-04-02 (8 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-30 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 22
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-24 to the 2019-04-03.
o Trained on the data from the 2019-03-24 to the 2019-04-02 (8 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-30 06:54:43.412000 and 2019-04-01 18:54:43.412000



prophet_instance nb 23
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-24 to the 2019-04-03.
o Trained on the data from the 2019-03-24 to the 2019-04-02 (8 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-30 06:29:58.738000 and 2019-04-01 18:29:58.738000



prophet_instance nb 24
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-24 to the 2019-04-03.
o Trained on the data from the 2019-03-24 to the 2019-04-02 (8 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-30 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 25
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-24 to the 2019-04-03.
o Trained on the data from the 2019-03-24 to the 2019-04-02 (8 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-30 06:54:43.412000 and 2019-04-01 18:54:43.412000



prophet_instance nb 26
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-24 to the 2019-04-03.
o Trained on the data from the 2019-03-24 to the 2019-04-02 (8 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-30 06:29:58.738000 and 2019-04-01 18:29:58.738000



prophet_instance nb 27
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-24 to the 2019-04-03.
o Trained on the data from the 2019-03-24 to the 2019-04-02 (8 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-30 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 28
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-24 to the 2019-04-03.
o Trained on the data from the 2019-03-24 to the 2019-04-02 (8 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-30 06:54:43.412000 and 2019-04-01 18:54:43.412000



prophet_instance nb 29
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-24 to the 2019-04-03.
o Trained on the data from the 2019-03-24 to the 2019-04-02 (8 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-30 06:29:58.738000 and 2019-04-01 18:29:58.738000



prophet_instance nb 30
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-24 to the 2019-04-03.
o Trained on the data from the 2019-03-24 to the 2019-04-02 (8 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-30 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 31
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-24 to the 2019-04-03.
o Trained on the data from the 2019-03-24 to the 2019-04-02 (8 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-30 06:54:43.412000 and 2019-04-01 18:54:43.412000



prophet_instance nb 32
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-24 to the 2019-04-03.
o Trained on the data from the 2019-03-24 to the 2019-04-02 (8 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-30 06:29:58.738000 and 2019-04-01 18:29:58.738000



prophet_instance nb 33
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-24 to the 2019-04-03.
o Trained on the data from the 2019-03-24 to the 2019-04-02 (8 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-30 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 34
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-24 to the 2019-04-03.
o Trained on the data from the 2019-03-24 to the 2019-04-02 (8 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-30 06:54:43.412000 and 2019-04-01 18:54:43.412000



prophet_instance nb 35
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-24 to the 2019-04-03.
o Trained on the data from the 2019-03-24 to the 2019-04-02 (8 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-30 06:29:58.738000 and 2019-04-01 18:29:58.738000


GridSearch finished 98 minutes.


In [30]:
# Almost 3min per model on average
mape_table.shape[0]

36

In [22]:
mape_table.to_csv('dev33_co2_2019-04-03_SP_IW_FO_CPS.csv')

In [24]:
mape_table.sort_values('mape_average')

| | device | parameter | begin | end | sampling_period_min | interval_width | daily_fo | changepoint_prior_scale | mape_average |
|---|---|---|---|---|---|---|---|---|---|
| 11 | [device33] | co2 | 2019-03-22 | 2019-04-03 | 30 | 0.6 | 3 | 15.0 | 0.116839 |
| 10 | [device33] | co2 | 2019-03-22 | 2019-04-03 | 5 | 0.6 | 3 | 15.0 | 0.122786 |
| 9 | [device33] | co2 | 2019-03-22 | 2019-04-03 | 1 | 0.6 | 3 | 15.0 | 0.123446 |
| 0 | [device33] | co2 | 2019-03-22 | 2019-04-03 | 1 | 0.6 | 3 | 0.01 | 0.123622 |
| 17 | [device33] | co2 | 2019-03-22 | 2019-04-03 | 30 | 0.6 | 18 | 15.0 | 0.128037 |
| 29 | [device33] | co2 | 2019-03-24 | 2019-04-03 | 30 | 0.6 | 3 | 15.0 | 0.128436 |
| 14 | [device33] | co2 | 2019-03-22 | 2019-04-03 | 30 | 0.6 | 9 | 15.0 | 0.128777 |
| 15 | [device33] | co2 | 2019-03-22 | 2019-04-03 | 1 | 0.6 | 18 | 15.0 | 0.129947 |
| 16 | [device33] | co2 | 2019-03-22 | 2019-04-03 | 5 | 0.6 | 18 | 15.0 | 0.130079 |
| 32 | [device33] | co2 | 2019-03-24 | 2019-04-03 | 30 | 0.6 | 9 | 15.0 | 0.131068 |
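
The `mape_average` column is presumably a mean absolute percentage error averaged over the cross-validation forecasts; the exact aggregation happens inside `GridSearch_Prophet`, which is not shown in this section. As a minimal sketch under that assumption, the MAPE of a single forecast could be computed as below. The function name, the zero-handling rule and the sample values are illustrative, not taken from the notebook.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction (0.12 means 12 %)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Assumption: points with a true value of zero are skipped,
    # because the percentage error is undefined there.
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if t != 0]
    if not pairs:
        raise ValueError("no non-zero true values to evaluate")
    return sum(abs((t - p) / t) for t, p in pairs) / len(pairs)


# Hypothetical CO2 readings (ppm) and forecasts, for illustration only.
observed = [400.0, 500.0, 800.0]
forecast = [440.0, 450.0, 800.0]
print(round(mape(observed, forecast), 4))
```

Read this way, a `mape_average` of 0.12 means that the forecast is off by roughly 12 % of the measured value on average.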


In [41]:
mape_table.to_csv('dev33_co2_2019-04-03_SP_IW_FO_CPS_2.csv')

**Conclusions**

'changepoint_prior_scale' (CPS)
	- Values greater than 1 just produce an aberrantly wide prediction: 15 is not a good choice, even though it ranks first on mape_average in the table above!
	- Values higher than 1 significantly decrease the accuracy!
	- Try changepoint_prior_scale: 0.001, 0.01, 0.1

'sampling_period_min' : [1, 5, 30]:
	- A 30-min sampling period misses events, but runs faster
	- Try 1, 2 and 5

'begin':
	- 10 days is better than 6 days, but the analysis takes much longer
	- Keep 6 days, and maybe increase the number of days for the final round

'daily_fo' : [3, 9, 18]
	- Lower values are definitely better than higher ones!
	- Try 2, 3, 6
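
One plausible reason why lower `daily_fo` values do better: assuming `daily_fo` is the Fourier order of the daily seasonality (how `GridSearch_Prophet` uses it is not shown in this section), Prophet represents a seasonality as a truncated Fourier series, so an order of N contributes 2N sine and cosine terms. A higher order can then follow faster within-day fluctuations, including noise. A minimal pure-Python sketch of those terms, with a hypothetical helper name:

```python
import math


def fourier_terms(t_days, period_days, order):
    """Return the 2 * order Fourier features for a time t given in days."""
    feats = []
    for n in range(1, order + 1):
        angle = 2.0 * math.pi * n * t_days / period_days
        feats.append(math.sin(angle))
        feats.append(math.cos(angle))
    return feats


# For a daily period, print the number of features and the shortest
# cycle (in hours) that each tested order can represent.
for order in (3, 9, 18):
    n_feats = len(fourier_terms(0.25, 1.0, order))
    shortest_cycle_h = 24.0 / order
    print(order, n_feats, round(shortest_cycle_h, 2))
```

With an order of 18 the model can represent cycles as short as 80 minutes, which may be closer to sensor noise than to a resident's daily routine.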

**Test 2: Testing shorter sampling periods with different changepoint_prior_scale and daily_fo values of the Prophet model**

In [42]:
mape_table.sort_values('mape_average')

| | device | parameter | begin | end | sampling_period_min | interval_width | daily_fo | changepoint_prior_scale | mape_average |
|---|---|---|---|---|---|---|---|---|---|
| 18 | [device33] | co2 | 2019-03-24 | 2019-04-03 | 1 | 0.6 | 2 | 0.1 | 0.135834 |
| 24 | [device33] | co2 | 2019-03-24 | 2019-04-03 | 1 | 0.6 | 6 | 0.1 | 0.136738 |
| 21 | [device33] | co2 | 2019-03-24 | 2019-04-03 | 1 | 0.6 | 3 | 0.1 | 0.138159 |
| 19 | [device33] | co2 | 2019-03-24 | 2019-04-03 | 2 | 0.6 | 2 | 0.1 | 0.141241 |
| 25 | [device33] | co2 | 2019-03-24 | 2019-04-03 | 2 | 0.6 | 6 | 0.1 | 0.142547 |
| 22 | [device33] | co2 | 2019-03-24 | 2019-04-03 | 2 | 0.6 | 3 | 0.1 | 0.143527 |
| 23 | [device33] | co2 | 2019-03-24 | 2019-04-03 | 5 | 0.6 | 3 | 0.1 | 0.148306 |
| 26 | [device33] | co2 | 2019-03-24 | 2019-04-03 | 5 | 0.6 | 6 | 0.1 | 0.148327 |
| 12 | [device33] | co2 | 2019-03-24 | 2019-04-03 | 1 | 0.6 | 3 | 0.01 | 0.155955 |
| 15 | [device33] | co2 | 2019-03-24 | 2019-04-03 | 1 | 0.6 | 6 | 0.01 | 0.157499 |


**Conclusions**

'sampling_period_min' : [1, 2, 5]:
	- The smaller the better: use 1

'changepoint_prior_scale' : [0.001, 0.01, 0.1]
	- 0.1 is too large! The prediction becomes so wide that it is meaningless.
	- 0.001 seems to give visually better results than 0.01, even though the median MAPE is lower for 0.01 than for 0.001
	- Keep both 0.01 and 0.001

'daily_fo' : [2, 3, 6]:
	- My first choice would be 3. However, the dynamics are not simply binary, so 3 or 6 are both candidates, with a final preference for 6.
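
The effect of the sampling period can be illustrated without Prophet. The sketch below assumes that resampling is a plain average over fixed-width bins (the notebook's actual resampling code is not shown in this section) and uses made-up CO2 values: a 10-minute excursion keeps its full height at 1-minute and 5-minute resolution, but is diluted in a 30-minute bin.

```python
def downsample_mean(values, bin_size):
    """Average consecutive samples in bins of bin_size (last partial bin kept)."""
    out = []
    for start in range(0, len(values), bin_size):
        chunk = values[start:start + bin_size]
        out.append(sum(chunk) / len(chunk))
    return out


# Hypothetical hour of 1-min CO2 samples: 500 ppm baseline
# with a 10-minute rise to 800 ppm between minutes 20 and 29.
series = [500.0] * 60
for minute in range(20, 30):
    series[minute] = 800.0

for period in (1, 5, 30):
    resampled = downsample_mean(series, period)
    print(period, max(resampled))
```

In this toy series the 300 ppm excursion shrinks to 100 ppm at a 30-minute sampling period, which is consistent with the observation that short events are missed at coarse sampling.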

**Test 3: Testing different training period durations, changepoint_prior_scale and daily_fo values of the Prophet model**

In [47]:
# GridSearch. To do a randomSearch check below
start_time = time.time()

# Parameters
prophet_grid = {'df_dev' : [df_dev],
                'device' : [device],
                'parameter' : ['co2'],
                'begin' : ['2019-03-22', '2019-03-24', '2019-03-26'],
                'end' : ['2019-04-03'],
                'sampling_period_min' : [1],
                'graph' : [1],
                'predict_day' : [1],
                'interval_width' : [0.6],
                'changepoint_prior_scale' : [0.001, 0.01], # list(np.arange(0.01,30,1).tolist()),
                'daily_fo' : [3, 6, 9],
#                 'holidays_prior_scale' : list((1000,100,10,1,0.1)),
               }

# Run GridSearch_Prophet
mape_table = GridSearch_Prophet(list(ParameterGrid(prophet_grid)), metric='mape')


end_time = time.time()
dur_min = int((end_time - start_time)/60)

print('GridSearch finished '+ str(dur_min) + " minutes.")
mape_table.sort_values('mape_average')


prophet_instance nb 0
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-03.
o Trained on the data from the 2019-03-22 to the 2019-04-02 (10 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-28 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 1
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-03.
o Trained on the data from the 2019-03-22 to the 2019-04-02 (10 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-28 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 2
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-03.
o Trained on the data from the 2019-03-22 to the 2019-04-02 (10 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-28 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 3
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-03.
o Trained on the data from the 2019-03-22 to the 2019-04-02 (10 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-28 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 4
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-03.
o Trained on the data from the 2019-03-22 to the 2019-04-02 (10 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-28 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 5
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-03.
o Trained on the data from the 2019-03-22 to the 2019-04-02 (10 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-28 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 6
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-24 to the 2019-04-03.
o Trained on the data from the 2019-03-24 to the 2019-04-02 (8 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-30 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 7
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-24 to the 2019-04-03.
o Trained on the data from the 2019-03-24 to the 2019-04-02 (8 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-30 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 8
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-24 to the 2019-04-03.
o Trained on the data from the 2019-03-24 to the 2019-04-02 (8 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-30 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 9
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-24 to the 2019-04-03.
o Trained on the data from the 2019-03-24 to the 2019-04-02 (8 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-30 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 10
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-24 to the 2019-04-03.
o Trained on the data from the 2019-03-24 to the 2019-04-02 (8 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-30 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 11
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-24 to the 2019-04-03.
o Trained on the data from the 2019-03-24 to the 2019-04-02 (8 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-30 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 12
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-26 to the 2019-04-03.
o Trained on the data from the 2019-03-26 to the 2019-04-02 (6 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 2 forecasts with cutoffs between 2019-04-01 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 13
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-26 to the 2019-04-03.
o Trained on the data from the 2019-03-26 to the 2019-04-02 (6 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 2 forecasts with cutoffs between 2019-04-01 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 14
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-26 to the 2019-04-03.
o Trained on the data from the 2019-03-26 to the 2019-04-02 (6 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 2 forecasts with cutoffs between 2019-04-01 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 15
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-26 to the 2019-04-03.
o Trained on the data from the 2019-03-26 to the 2019-04-02 (6 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 2 forecasts with cutoffs between 2019-04-01 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 16
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-26 to the 2019-04-03.
o Trained on the data from the 2019-03-26 to the 2019-04-02 (6 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 2 forecasts with cutoffs between 2019-04-01 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 17
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-26 to the 2019-04-03.
o Trained on the data from the 2019-03-26 to the 2019-04-02 (6 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 2 forecasts with cutoffs between 2019-04-01 06:58:44.135000 and 2019-04-01 18:58:44.135000


GridSearch finished 78 minutes.


| | device | parameter | begin | end | sampling_period_min | interval_width | daily_fo | changepoint_prior_scale | mape_average |
|---|---|---|---|---|---|---|---|---|---|
| 3 | [device33] | co2 | 2019-03-22 | 2019-04-03 | 1 | 0.6 | 3 | 0.01 | 0.123622 |
| 4 | [device33] | co2 | 2019-03-22 | 2019-04-03 | 1 | 0.6 | 6 | 0.01 | 0.130818 |
| 5 | [device33] | co2 | 2019-03-22 | 2019-04-03 | 1 | 0.6 | 9 | 0.01 | 0.131206 |
| 9 | [device33] | co2 | 2019-03-24 | 2019-04-03 | 1 | 0.6 | 3 | 0.01 | 0.155955 |
| 10 | [device33] | co2 | 2019-03-24 | 2019-04-03 | 1 | 0.6 | 6 | 0.01 | 0.157499 |
| 11 | [device33] | co2 | 2019-03-24 | 2019-04-03 | 1 | 0.6 | 9 | 0.01 | 0.159407 |
| 12 | [device33] | co2 | 2019-03-26 | 2019-04-03 | 1 | 0.6 | 3 | 0.001 | 0.17232 |
| 14 | [device33] | co2 | 2019-03-26 | 2019-04-03 | 1 | 0.6 | 9 | 0.001 | 0.172851 |
| 13 | [device33] | co2 | 2019-03-26 | 2019-04-03 | 1 | 0.6 | 6 | 0.001 | 0.173043 |
| 6 | [device33] | co2 | 2019-03-24 | 2019-04-03 | 1 | 0.6 | 3 | 0.001 | 0.182858 |
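
The 18 `prophet_instance` runs logged for Test 3 correspond to the Cartesian product of the multi-valued entries of `prophet_grid`. The sketch below reproduces that expansion with `itertools.product` as a stand-in for `ParameterGrid`, keeping only the three keys that have more than one value. The helper name is hypothetical, and iterating over sorted keys is an assumption that happens to match the index values in the table.

```python
from itertools import product

# Only the multi-valued entries of the Test 3 grid.
grid = {
    "begin": ["2019-03-22", "2019-03-24", "2019-03-26"],
    "changepoint_prior_scale": [0.001, 0.01],
    "daily_fo": [3, 6, 9],
}


def expand_grid(param_grid):
    """Return one dict per parameter combination, last sorted key varying fastest."""
    keys = sorted(param_grid)
    value_lists = [param_grid[k] for k in keys]
    return [dict(zip(keys, combo)) for combo in product(*value_lists)]


combos = expand_grid(grid)
print(len(combos))   # number of models to fit
print(combos[3])     # combination at index 3
print(combos[12])    # combination at index 12
```

Since every extra value multiplies the number of fits, and the 18 models took 78 minutes in total, trimming one value from any list noticeably shortens a run.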


In [48]:
mape_table.sort_values('mape_average')

| | device | parameter | begin | end | sampling_period_min | interval_width | daily_fo | changepoint_prior_scale | mape_average |
|---|---|---|---|---|---|---|---|---|---|
| 3 | [device33] | co2 | 2019-03-22 | 2019-04-03 | 1 | 0.6 | 3 | 0.01 | 0.123622 |
| 4 | [device33] | co2 | 2019-03-22 | 2019-04-03 | 1 | 0.6 | 6 | 0.01 | 0.130818 |
| 5 | [device33] | co2 | 2019-03-22 | 2019-04-03 | 1 | 0.6 | 9 | 0.01 | 0.131206 |
| 9 | [device33] | co2 | 2019-03-24 | 2019-04-03 | 1 | 0.6 | 3 | 0.01 | 0.155955 |
| 10 | [device33] | co2 | 2019-03-24 | 2019-04-03 | 1 | 0.6 | 6 | 0.01 | 0.157499 |
| 11 | [device33] | co2 | 2019-03-24 | 2019-04-03 | 1 | 0.6 | 9 | 0.01 | 0.159407 |
| 12 | [device33] | co2 | 2019-03-26 | 2019-04-03 | 1 | 0.6 | 3 | 0.001 | 0.17232 |
| 14 | [device33] | co2 | 2019-03-26 | 2019-04-03 | 1 | 0.6 | 9 | 0.001 | 0.172851 |
| 13 | [device33] | co2 | 2019-03-26 | 2019-04-03 | 1 | 0.6 | 6 | 0.001 | 0.173043 |
| 6 | [device33] | co2 | 2019-03-24 | 2019-04-03 | 1 | 0.6 | 3 | 0.001 | 0.182858 |


**Conclusion**

'changepoint_prior_scale' : [0.001, 0.01]
	- 0.1 was already found too large in Test 2 (the prediction goes wide and is meaningless), so it was not retested here
	- 0.01 may be better than 0.001
	- Keep both 0.01 and 0.001


'begin' : ['2019-03-22', '2019-03-24', '2019-03-26']
	- In this case, the more data is used for model fitting, the lower the mape_average (compare the 2019-03-22 and 2019-03-24 rows at changepoint_prior_scale = 0.01)

**Test 4: Testing different training periods and daily_fo values of the Prophet model**

In [51]:
# GridSearch. To do a randomSearch check below
start_time = time.time()

# Parameters
prophet_grid = {'df_dev' : [df_dev],
                'device' : [device],
                'parameter' : ['co2'],
                'begin' : ['2019-03-22', '2019-03-24'],
                'end' : ['2019-04-03'],
                'sampling_period_min' : [1],
                'graph' : [1],
                'predict_day' : [1],
                'interval_width' : [0.6],
                'changepoint_prior_scale' : [0.01], # list(np.arange(0.01,30,1).tolist()),
                'daily_fo' : [3, 6, 9],
#                 'holidays_prior_scale' : list((1000,100,10,1,0.1)),
               }

# Run GridSearch_Prophet
mape_0403 = GridSearch_Prophet(list(ParameterGrid(prophet_grid)), metric='mape')


end_time = time.time()
dur_min = int((end_time - start_time)/60)

print('GridSearch finished '+ str(dur_min) + " minutes.")
mape_0403.sort_values('mape_average')


prophet_instance nb 0
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-03.
o Trained on the data from the 2019-03-22 to the 2019-04-02 (10 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-28 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 1
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-03.
o Trained on the data from the 2019-03-22 to the 2019-04-02 (10 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-28 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 2
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-03.
o Trained on the data from the 2019-03-22 to the 2019-04-02 (10 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-28 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 3
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-24 to the 2019-04-03.
o Trained on the data from the 2019-03-24 to the 2019-04-02 (8 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-30 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 4
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-24 to the 2019-04-03.
o Trained on the data from the 2019-03-24 to the 2019-04-02 (8 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-30 06:58:44.135000 and 2019-04-01 18:58:44.135000



prophet_instance nb 5
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-24 to the 2019-04-03.
o Trained on the data from the 2019-03-24 to the 2019-04-02 (8 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-30 06:58:44.135000 and 2019-04-01 18:58:44.135000


GridSearch finished 32 minutes.


| | device | parameter | begin | end | sampling_period_min | interval_width | daily_fo | changepoint_prior_scale | mape_average |
|---|---|---|---|---|---|---|---|---|---|
| 0 | [device33] | co2 | 2019-03-22 | 2019-04-03 | 1 | 0.6 | 3 | 0.01 | 0.123622 |
| 1 | [device33] | co2 | 2019-03-22 | 2019-04-03 | 1 | 0.6 | 6 | 0.01 | 0.130818 |
| 2 | [device33] | co2 | 2019-03-22 | 2019-04-03 | 1 | 0.6 | 9 | 0.01 | 0.131206 |
| 3 | [device33] | co2 | 2019-03-24 | 2019-04-03 | 1 | 0.6 | 3 | 0.01 | 0.155955 |
| 4 | [device33] | co2 | 2019-03-24 | 2019-04-03 | 1 | 0.6 | 6 | 0.01 | 0.157499 |
| 5 | [device33] | co2 | 2019-03-24 | 2019-04-03 | 1 | 0.6 | 9 | 0.01 | 0.159407 |
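
The `INFO:fbprophet:Making N forecasts with cutoffs between ...` lines come from Prophet's cross-validation. The first and last cutoffs reported in the log are consistent with evenly spaced cutoffs every 12 hours. That spacing is inferred from the log here, not read from the notebook's cross-validation call, which is not shown in this section. A small sketch that rebuilds the cutoff list under that assumption, with a hypothetical helper name:

```python
from datetime import datetime, timedelta


def cutoffs_between(first, last, period):
    """List evenly spaced cutoffs from first to last, both included."""
    out = []
    current = first
    while current <= last:
        out.append(current)
        current += period
    return out


period = timedelta(hours=12)  # assumption inferred from the log output
last_cutoff = datetime(2019, 4, 1, 18, 58, 44, 135000)

# First cutoffs as logged for the longer and shorter training windows.
long_window = cutoffs_between(datetime(2019, 3, 28, 6, 58, 44, 135000), last_cutoff, period)
short_window = cutoffs_between(datetime(2019, 3, 30, 6, 58, 44, 135000), last_cutoff, period)

print(len(long_window), len(short_window))
```

Under this reading, a shorter training window leaves fewer cutoffs, so its `mape_average` is based on fewer simulated forecasts and is likely noisier.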


In [56]:
mape_0403.to_csv('dev33_co2_2019-04-03.csv')

In [52]:
# GridSearch.
start_time = time.time()

# Parameters
prophet_grid = {'df_dev' : [df_dev],
                'device' : [device],
                'parameter' : ['co2'],
                'begin' : ['2019-03-21', '2019-03-23'],
                'end' : ['2019-04-02'],
                'sampling_period_min' : [1],
                'graph' : [1],
                'predict_day' : [1],
                'interval_width' : [0.6],
                'changepoint_prior_scale' : [0.01], # list(np.arange(0.01,30,1).tolist()),
                'daily_fo' : [3, 6, 9],
#                 'holidays_prior_scale' : list((1000,100,10,1,0.1)),
               }

# Run GridSearch_Prophet
mape_0402 = GridSearch_Prophet(list(ParameterGrid(prophet_grid)), metric='mape')


end_time = time.time()
dur_min = int((end_time - start_time)/60)

print('GridSearch finished '+ str(dur_min) + " minutes.")
mape_0402.sort_values('mape_average')


prophet_instance nb 0
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-21 to the 2019-04-02.
o Trained on the data from the 2019-03-21 to the 2019-04-01 (10 days).
o Predict from the 2019-04-01 to the 2019-04-02 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-27 06:58:54.472000 and 2019-03-31 18:58:54.472000



prophet_instance nb 1
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-21 to the 2019-04-02.
o Trained on the data from the 2019-03-21 to the 2019-04-01 (10 days).
o Predict from the 2019-04-01 to the 2019-04-02 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-27 06:58:54.472000 and 2019-03-31 18:58:54.472000



prophet_instance nb 2
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-21 to the 2019-04-02.
o Trained on the data from the 2019-03-21 to the 2019-04-01 (10 days).
o Predict from the 2019-04-01 to the 2019-04-02 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-27 06:58:54.472000 and 2019-03-31 18:58:54.472000



prophet_instance nb 3
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-23 to the 2019-04-02.
o Trained on the data from the 2019-03-23 to the 2019-04-01 (8 days).
o Predict from the 2019-04-01 to the 2019-04-02 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-29 06:58:54.472000 and 2019-03-31 18:58:54.472000



prophet_instance nb 4
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-23 to the 2019-04-02.
o Trained on the data from the 2019-03-23 to the 2019-04-01 (8 days).
o Predict from the 2019-04-01 to the 2019-04-02 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-29 06:58:54.472000 and 2019-03-31 18:58:54.472000



prophet_instance nb 5
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-23 to the 2019-04-02.
o Trained on the data from the 2019-03-23 to the 2019-04-01 (8 days).
o Predict from the 2019-04-01 to the 2019-04-02 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-29 06:58:54.472000 and 2019-03-31 18:58:54.472000


GridSearch finished 32 minutes.


| | device | parameter | begin | end | sampling_period_min | interval_width | daily_fo | changepoint_prior_scale | mape_average |
|---|---|---|---|---|---|---|---|---|---|
| 0 | [device33] | co2 | 2019-03-21 | 2019-04-02 | 1 | 0.6 | 3 | 0.01 | 0.112987 |
| 2 | [device33] | co2 | 2019-03-21 | 2019-04-02 | 1 | 0.6 | 9 | 0.01 | 0.118718 |
| 1 | [device33] | co2 | 2019-03-21 | 2019-04-02 | 1 | 0.6 | 6 | 0.01 | 0.119256 |
| 3 | [device33] | co2 | 2019-03-23 | 2019-04-02 | 1 | 0.6 | 3 | 0.01 | 0.122369 |
| 5 | [device33] | co2 | 2019-03-23 | 2019-04-02 | 1 | 0.6 | 9 | 0.01 | 0.132954 |
| 4 | [device33] | co2 | 2019-03-23 | 2019-04-02 | 1 | 0.6 | 6 | 0.01 | 0.13606 |


In [57]:
mape_0402.to_csv('dev33_co2_2019-04-02.csv')

In [53]:
# GridSearch
start_time = time.time()

# Parameters
prophet_grid = {'df_dev' : [df_dev],
                'device' : [device],
                'parameter' : ['co2'],
                'begin' : ['2019-03-20', '2019-03-22'],
                'end' : ['2019-04-01'],
                'sampling_period_min' : [1],
                'graph' : [1],
                'predict_day' : [1],
                'interval_width' : [0.6],
                'changepoint_prior_scale' : [0.01], # list(np.arange(0.01,30,1).tolist()),
                'daily_fo' : [3, 6, 9],
#                 'holidays_prior_scale' : list((1000,100,10,1,0.1)),
               }

# Run GridSearch_Prophet
mape_0401 = GridSearch_Prophet(list(ParameterGrid(prophet_grid)), metric='mape')


end_time = time.time()
dur_min = int((end_time - start_time)/60)

print('GridSearch finished '+ str(dur_min) + " minutes.")
mape_0401.sort_values('mape_average')


prophet_instance nb 0
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-20 to the 2019-04-01.
o Trained on the data from the 2019-03-20 to the 2019-03-31 (10 days).
o Predict from the 2019-03-31 to the 2019-04-01 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-26 06:58:45.925000 and 2019-03-30 18:58:45.925000



prophet_instance nb 1
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-20 to the 2019-04-01.
o Trained on the data from the 2019-03-20 to the 2019-03-31 (10 days).
o Predict from the 2019-03-31 to the 2019-04-01 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-26 06:58:45.925000 and 2019-03-30 18:58:45.925000



prophet_instance nb 2
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-20 to the 2019-04-01.
o Trained on the data from the 2019-03-20 to the 2019-03-31 (10 days).
o Predict from the 2019-03-31 to the 2019-04-01 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-26 06:58:45.925000 and 2019-03-30 18:58:45.925000



prophet_instance nb 3
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-01.
o Trained on the data from the 2019-03-22 to the 2019-03-31 (8 days).
o Predict from the 2019-03-31 to the 2019-04-01 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-28 06:58:45.925000 and 2019-03-30 18:58:45.925000



prophet_instance nb 4
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-01.
o Trained on the data from the 2019-03-22 to the 2019-03-31 (8 days).
o Predict from the 2019-03-31 to the 2019-04-01 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-28 06:58:45.925000 and 2019-03-30 18:58:45.925000



prophet_instance nb 5
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-01.
o Trained on the data from the 2019-03-22 to the 2019-03-31 (8 days).
o Predict from the 2019-03-31 to the 2019-04-01 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-28 06:58:45.925000 and 2019-03-30 18:58:45.925000


GridSearch finished 32 minutes.


| | device | parameter | begin | end | sampling_period_min | interval_width | daily_fo | changepoint_prior_scale | mape_average |
|---|---|---|---|---|---|---|---|---|---|
| 3 | [device33] | co2 | 2019-03-22 | 2019-04-01 | 1 | 0.6 | 3 | 0.01 | 0.106763 |
| 5 | [device33] | co2 | 2019-03-22 | 2019-04-01 | 1 | 0.6 | 9 | 0.01 | 0.122046 |
| 4 | [device33] | co2 | 2019-03-22 | 2019-04-01 | 1 | 0.6 | 6 | 0.01 | 0.124156 |
| 2 | [device33] | co2 | 2019-03-20 | 2019-04-01 | 1 | 0.6 | 9 | 0.01 | 0.269513 |
| 0 | [device33] | co2 | 2019-03-20 | 2019-04-01 | 1 | 0.6 | 3 | 0.01 | 0.271532 |
| 1 | [device33] | co2 | 2019-03-20 | 2019-04-01 | 1 | 0.6 | 6 | 0.01 | 0.273855 |
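
Test 4 repeats the same grid with the analysis window shifted back one day at a time, which makes it possible to compare `daily_fo` over several forecast days instead of a single one. A minimal sketch of such a comparison follows, using the `mape_average` values of the shorter training window from the three result tables for the end dates 2019-04-03, 2019-04-02 and 2019-04-01. Aggregating by a simple mean is an assumption for illustration, not a step the notebook performs in this section.

```python
from statistics import mean

# mape_average of the shorter training window, keyed by daily_fo.
# Each list holds one value per end date: 2019-04-03, 2019-04-02, 2019-04-01.
short_window = {
    3: [0.155955, 0.122369, 0.106763],
    6: [0.157499, 0.13606, 0.124156],
    9: [0.159407, 0.132954, 0.122046],
}

# Average across end dates, then pick the order with the lowest mean.
avg = {fo: mean(values) for fo, values in short_window.items()}
best_fo = min(avg, key=avg.get)

for fo in sorted(avg):
    print(fo, round(avg[fo], 4))
print("lowest mean MAPE: daily_fo =", best_fo)
```

The same aggregation could be repeated for the longer window, where the 2019-03-20 start date stands out with a `mape_average` above 0.26 and would dominate a plain mean.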


In [58]:
mape_0401.to_csv('dev33_co2_2019-04-01.csv')

In [54]:
# GridSearch.
start_time = time.time()

# Parameters
prophet_grid = {'df_dev' : [df_dev],
                'device' : [device],
                'parameter' : ['co2'],
                'begin' : ['2019-03-19', '2019-03-21'],
                'end' : ['2019-03-31'],
                'sampling_period_min' : [1],
                'graph' : [1],
                'predict_day' : [1],
                'interval_width' : [0.6],
                'changepoint_prior_scale' : [0.01], # list(np.arange(0.01,30,1).tolist()),
                'daily_fo' : [3, 6, 9],
#                 'holidays_prior_scale' : list((1000,100,10,1,0.1)),
               }

# Run GridSearch_Prophet
mape_0331 = GridSearch_Prophet(list(ParameterGrid(prophet_grid)), metric='mape')


end_time = time.time()
dur_min = int((end_time - start_time)/60)

print('GridSearch finished '+ str(dur_min) + " minutes.")
mape_0331.sort_values('mape_average')


prophet_instance nb 0
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-19 to the 2019-03-31.
o Trained on the data from the 2019-03-19 to the 2019-03-30 (10 days).
o Predict from the 2019-03-30 to the 2019-03-31 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-25 06:58:55.069000 and 2019-03-29 18:58:55.069000



prophet_instance nb 1
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-19 to the 2019-03-31.
o Trained on the data from the 2019-03-19 to the 2019-03-30 (10 days).
o Predict from the 2019-03-30 to the 2019-03-31 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-25 06:58:55.069000 and 2019-03-29 18:58:55.069000



prophet_instance nb 2
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-19 to the 2019-03-31.
o Trained on the data from the 2019-03-19 to the 2019-03-30 (10 days).
o Predict from the 2019-03-30 to the 2019-03-31 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-25 06:58:55.069000 and 2019-03-29 18:58:55.069000



prophet_instance nb 3
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-21 to the 2019-03-31.
o Trained on the data from the 2019-03-21 to the 2019-03-30 (8 days).
o Predict from the 2019-03-30 to the 2019-03-31 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-27 06:58:55.069000 and 2019-03-29 18:58:55.069000



prophet_instance nb 4
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-21 to the 2019-03-31.
o Trained on the data from the 2019-03-21 to the 2019-03-30 (8 days).
o Predict from the 2019-03-30 to the 2019-03-31 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-27 06:58:55.069000 and 2019-03-29 18:58:55.069000



prophet_instance nb 5
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-21 to the 2019-03-31.
o Trained on the data from the 2019-03-21 to the 2019-03-30 (8 days).
o Predict from the 2019-03-30 to the 2019-03-31 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-27 06:58:55.069000 and 2019-03-29 18:58:55.069000


GridSearch finished 35 minutes.


Unnamed: 0,device,parameter,begin,end,sampling_period_min,interval_width,daily_fo,changepoint_prior_scale,mape_average
3,[device33],co2,2019-03-21,2019-03-31,1,0.6,3,0.01,0.095746
5,[device33],co2,2019-03-21,2019-03-31,1,0.6,9,0.01,0.099277
4,[device33],co2,2019-03-21,2019-03-31,1,0.6,6,0.01,0.100746
1,[device33],co2,2019-03-19,2019-03-31,1,0.6,6,0.01,0.387377
0,[device33],co2,2019-03-19,2019-03-31,1,0.6,3,0.01,0.39095
2,[device33],co2,2019-03-19,2019-03-31,1,0.6,9,0.01,0.391391


In [59]:
mape_0331.to_csv('dev33_co2_2019-03-31.csv')

In [55]:
# GridSearch. To do a randomSearch check below
start_time = time.time()

# Parameters
prophet_grid = {'df_dev' : [df_dev],
                'device' : [device],
                'parameter' : ['co2'],
                'begin' : ['2019-03-18', '2019-03-20'],
                'end' : ['2019-03-30'],
                'sampling_period_min' : [1],
                'graph' : [1],
                'predict_day' : [1],
                'interval_width' : [0.6],
                'changepoint_prior_scale' : [0.01], # list(np.arange(0.01,30,1).tolist()),
                'daily_fo' : [3, 6, 9],
#                 'holidays_prior_scale' : list((1000,100,10,1,0.1)),
               }

# Run GridSearch_Prophet
mape_0330 = GridSearch_Prophet(list(ParameterGrid(prophet_grid)), metric='mape')


end_time = time.time()
dur_min = int((end_time - start_time)/60)

print('GridSearch finished '+ str(dur_min) + " minutes.")
mape_0330.sort_values('mape_average')


prophet_instance nb 0
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-18 to the 2019-03-30.
o Trained on the data from the 2019-03-18 to the 2019-03-29 (10 days).
o Predict from the 2019-03-29 to the 2019-03-30 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-24 07:58:56.151000 and 2019-03-28 19:58:56.151000



prophet_instance nb 1
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-18 to the 2019-03-30.
o Trained on the data from the 2019-03-18 to the 2019-03-29 (10 days).
o Predict from the 2019-03-29 to the 2019-03-30 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-24 07:58:56.151000 and 2019-03-28 19:58:56.151000



prophet_instance nb 2
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-18 to the 2019-03-30.
o Trained on the data from the 2019-03-18 to the 2019-03-29 (10 days).
o Predict from the 2019-03-29 to the 2019-03-30 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-24 07:58:56.151000 and 2019-03-28 19:58:56.151000



prophet_instance nb 3
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-20 to the 2019-03-30.
o Trained on the data from the 2019-03-20 to the 2019-03-29 (8 days).
o Predict from the 2019-03-29 to the 2019-03-30 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-26 07:58:56.151000 and 2019-03-28 19:58:56.151000



prophet_instance nb 4
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-20 to the 2019-03-30.
o Trained on the data from the 2019-03-20 to the 2019-03-29 (8 days).
o Predict from the 2019-03-29 to the 2019-03-30 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-26 07:58:56.151000 and 2019-03-28 19:58:56.151000



prophet_instance nb 5
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-20 to the 2019-03-30.
o Trained on the data from the 2019-03-20 to the 2019-03-29 (8 days).
o Predict from the 2019-03-29 to the 2019-03-30 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-26 07:58:56.151000 and 2019-03-28 19:58:56.151000


GridSearch finished 39 minutes.


Unnamed: 0,device,parameter,begin,end,sampling_period_min,interval_width,daily_fo,changepoint_prior_scale,mape_average
5,[device33],co2,2019-03-20,2019-03-30,1,0.6,9,0.01,0.298872
4,[device33],co2,2019-03-20,2019-03-30,1,0.6,6,0.01,0.300329
3,[device33],co2,2019-03-20,2019-03-30,1,0.6,3,0.01,0.30464
2,[device33],co2,2019-03-18,2019-03-30,1,0.6,9,0.01,0.388522
1,[device33],co2,2019-03-18,2019-03-30,1,0.6,6,0.01,0.390094
0,[device33],co2,2019-03-18,2019-03-30,1,0.6,3,0.01,0.393925


In [60]:
mape_0330.to_csv('dev33_co2_2019-03-30.csv')

**Conclusion**

- A Fourier order of 3 most often (but not always) gives the lowest MAPE.
- The prediction is not necessarily better with more training days (compare the predictions for 2019-04-01 and 2019-03-31, where the 8-day window gives a lower MAPE than the 10-day window). It therefore makes sense to run a grid search over an increasing number of training days. The best window probably depends on how variable the days used for model fitting were.
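
The comparison behind these two points can be reproduced directly from the three result tables above. The sketch below is only an illustration: the rows are copied from those tables, while the tuple layout and the variable names (`results`, `best`) are assumptions made for this example.

```python
# Rows copied from the three result tables: (end, begin, daily_fo, mape_average)
results = [
    ('2019-04-01', '2019-03-22', 3, 0.106763),
    ('2019-04-01', '2019-03-22', 9, 0.122046),
    ('2019-04-01', '2019-03-22', 6, 0.124156),
    ('2019-04-01', '2019-03-20', 9, 0.269513),
    ('2019-04-01', '2019-03-20', 3, 0.271532),
    ('2019-04-01', '2019-03-20', 6, 0.273855),
    ('2019-03-31', '2019-03-21', 3, 0.095746),
    ('2019-03-31', '2019-03-21', 9, 0.099277),
    ('2019-03-31', '2019-03-21', 6, 0.100746),
    ('2019-03-31', '2019-03-19', 6, 0.387377),
    ('2019-03-31', '2019-03-19', 3, 0.39095),
    ('2019-03-31', '2019-03-19', 9, 0.391391),
    ('2019-03-30', '2019-03-20', 9, 0.298872),
    ('2019-03-30', '2019-03-20', 6, 0.300329),
    ('2019-03-30', '2019-03-20', 3, 0.30464),
    ('2019-03-30', '2019-03-18', 9, 0.388522),
    ('2019-03-30', '2019-03-18', 6, 0.390094),
    ('2019-03-30', '2019-03-18', 3, 0.393925),
]

# Keep the row with the lowest MAPE for each prediction day ('end')
best = {}
for end, begin, daily_fo, mape in results:
    if end not in best or mape < best[end][2]:
        best[end] = (begin, daily_fo, mape)

for end in sorted(best):
    begin, daily_fo, mape = best[end]
    print(end, 'best begin:', begin, 'daily_fo:', daily_fo, 'mape:', mape)

# Number of prediction days on which a Fourier order of 3 was the best
fo3_wins = sum(1 for begin, daily_fo, mape in best.values() if daily_fo == 3)
print('daily_fo = 3 is best on', fo3_wins, 'of', len(best), 'days')
```

On these three prediction days the later `begin` date (shorter training window) gives the lowest MAPE each time, and a Fourier order of 3 is the best on two of them.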

## Lab for individual testing

In [8]:
# Single instance example
df_p, df_pred = prophet(
            df_dev,
            device,
            parameter='co2',
            begin='2019-03-15',
            end='2019-03-27',
            sampling_period_min=1,
            graph=1, 
            predict_day=1,
            interval_width=0.6,
            changepoint_prior_scale=0.01,
            daily_fo = 3
)
df_p.to_csv('df_p_03-26.csv')
df_pred.to_csv('df_pred_03-26.csv')

Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-15 to the 2019-03-27.
o Trained on the data from the 2019-03-15 to the 2019-03-26 (10 days).
o Predict from the 2019-03-26 to the 2019-03-27 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-21 07:58:44.425000 and 2019-03-25 19:58:44.425000


In [19]:
# Single instance example
df_p, df_pred = prophet(
            df_dev,
            device,
            parameter='co2',
            begin='2019-04-09',
            end='2019-04-21',
            sampling_period_min=1,
            graph=1, 
            predict_day=1,
            interval_width=0.6,
            changepoint_prior_scale=0.01,
            daily_fo = 3
)
df_p.to_csv('df_p_04-20.csv')
df_pred.to_csv('df_pred_04-20.csv')

Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-04-09 to the 2019-04-21.
o Trained on the data from the 2019-04-09 to the 2019-04-20 (10 days).
o Predict from the 2019-04-20 to the 2019-04-21 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-04-15 06:58:49.580000 and 2019-04-19 18:58:49.580000


### GridSearch over dataset:

*This function is not yet done. This is the workplan:*

1. Turn the single-instance example into a function that can be run in a GridSearch
2. Make a table with the different prediction dates
3. Feed the table to the GridSearch function. Save the parameters, data and graphs

In [None]:
# GridSearch. To do a randomSearch check below
start_time = time.time()

# Parameters
prophet_grid = {'df_dev' : [df_dev],
                'device' : [device],
                'parameter' : ['co2'],
                'begin' : ['2019-03-20', '2019-03-22'],
                'end' : ['2019-04-01'],
                'sampling_period_min' : [1],
                'graph' : [1],
                'predict_day' : [1],
                'interval_width' : [0.6],
                'changepoint_prior_scale' : [0.01], # list(np.arange(0.01,30,1).tolist()),
                'daily_fo' : [3, 6, 9],
#                 'holidays_prior_scale' : list((1000,100,10,1,0.1)),
               }

# Run GridSearch_Prophet
mape_0401 = GridSearch_Prophet(list(ParameterGrid(prophet_grid)), metric='mape')


end_time = time.time()
dur_min = int((end_time - start_time)/60)

print('GridSearch finished '+ str(dur_min) + " minutes.")
mape_0401.sort_values('mape_average')

In [139]:
# Create a dataframe with the dates to use

# Use a list (not a set) so that the column order is deterministic
dates = pd.DataFrame(columns=['date_minus_10',
                              'date_minus_8',
                              'date_minus_6',
                              'date_predict'])

# List of unique dates in the dataframe
dates['date_minus_10'] = df_dev['ds'].unique().strftime('%Y-%m-%d')
dates = dates.drop_duplicates(subset=['date_minus_10'])
dates = dates.reset_index(drop=True)

# Fill the other columns and drop the 10 last rows
dates['date_minus_8'] = dates.iloc[2:, 0].reset_index(drop=True)
dates['date_minus_6'] = dates.iloc[4:, 0].reset_index(drop=True)
dates['date_predict'] = dates.iloc[10:, 0].reset_index(drop=True)
dates = dates[:-10] # Drop the 10 last rows
dates

Unnamed: 0,date_minus_10,date_minus_8,date_minus_6,date_predict
0,2019-03-07,2019-03-09,2019-03-11,2019-03-17
1,2019-03-08,2019-03-10,2019-03-12,2019-03-18
2,2019-03-09,2019-03-11,2019-03-13,2019-03-19
3,2019-03-10,2019-03-12,2019-03-14,2019-03-20
4,2019-03-11,2019-03-13,2019-03-15,2019-03-21
5,2019-03-12,2019-03-14,2019-03-16,2019-03-22
6,2019-03-13,2019-03-15,2019-03-17,2019-03-23
7,2019-03-14,2019-03-16,2019-03-18,2019-03-24
8,2019-03-15,2019-03-17,2019-03-19,2019-03-25
9,2019-03-16,2019-03-18,2019-03-20,2019-03-26


### Storage of code
**Cell backup**

*This cell works! Keep it as backup.*

In [14]:


# GridSearch. To do a randomSearch check below
prophet_grid = {'df_dev' : [df_dev],
                'device' : [device],
                'parameter' : ['co2'],
                'begin' : ['2019-03-22', '2019-03-24'],
                'end' : ['2019-04-03'],
                'sampling_period_min' : [5],
                'graph' : [1],
                'predict_day' : [1],
                'interval_width' : [0.6],
                'changepoint_prior_scale' : list((0.6, 15)), # list(np.arange(0.01,30,1).tolist()),
                'daily_fo' : [12],
#                 'holidays_prior_scale' : list((1000,100,10,1,0.1)),
               }


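def GridSearch_Prophet(prophet_grid, metric='mape'):
    """
    Run prophet() once per parameter combination and summarize the MAPE.

    prophet_grid: list of dicts (e.g. list(ParameterGrid(...))), each holding
                  the keyword arguments of one prophet() call.
    metric: not used inside this function yet; the MAPE is always reported.
    Returns mape_table, with one row per combination and the median MAPE
    of that run in the 'mape_average' column.
    """

    # mape_table summarizes the MAPE according to the tested parameters
    mape_table = pd.DataFrame.from_dict(prophet_grid)
    mape_table = mape_table[['device',
                         'parameter',
                         'begin',
                         'end',
                         'sampling_period_min',
                         'interval_width',
                         'daily_fo',
                         'changepoint_prior_scale']].copy()

    mape_table['mape_average'] = np.nan
    # Look up the column position by name instead of hard-coding it
    mape_col = mape_table.columns.get_loc('mape_average')

    # Loop Prophet over the prophet_grid
    for a, prophet_instance in enumerate(prophet_grid):
        print('prophet_instance nb ' + str(a))
        print('Tested model:')
        df_p, df_pred = prophet(**prophet_instance)

        # Store the median of the MAPE for 1 day in the table
        # (the column is named 'mape_average' but holds a median)
        mape_table.iloc[a, mape_col] = df_p.mape.median()
        print(mape_table)

    return mape_table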

        
mape_table = GridSearch_Prophet(list(ParameterGrid(prophet_grid)), metric='mape')

prophet_instance nb 0
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-03.
o Trained on the data from the 2019-03-22 to the 2019-04-02 (10 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-28 06:54:43.412000 and 2019-04-01 18:54:43.412000


       device parameter       begin         end  sampling_period_min  \
0  [device33]       co2  2019-03-22  2019-04-03                    5   
1  [device33]       co2  2019-03-22  2019-04-03                    5   
2  [device33]       co2  2019-03-24  2019-04-03                    5   
3  [device33]       co2  2019-03-24  2019-04-03                    5   

   interval_width  daily_fo  changepoint_prior_scale  mape_average  
0             0.6        12                      0.6      0.128048  
1             0.6        12                     15.0           NaN  
2             0.6        12                      0.6           NaN  
3             0.6        12                     15.0           NaN  
prophet_instance nb 1
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-22 to the 2019-04-03.
o Trained on the data from the 2019-03-22 to the 2019-04-02 (10 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 10 forecasts with cutoffs between 2019-03-28 06:54:43.412000 and 2019-04-01 18:54:43.412000


       device parameter       begin         end  sampling_period_min  \
0  [device33]       co2  2019-03-22  2019-04-03                    5   
1  [device33]       co2  2019-03-22  2019-04-03                    5   
2  [device33]       co2  2019-03-24  2019-04-03                    5   
3  [device33]       co2  2019-03-24  2019-04-03                    5   

   interval_width  daily_fo  changepoint_prior_scale  mape_average  
0             0.6        12                      0.6      0.128048  
1             0.6        12                     15.0      0.130214  
2             0.6        12                      0.6           NaN  
3             0.6        12                     15.0           NaN  
prophet_instance nb 2
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-24 to the 2019-04-03.
o Trained on the data from the 2019-03-24 to the 2019-04-02 (8 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-30 06:54:43.412000 and 2019-04-01 18:54:43.412000


       device parameter       begin         end  sampling_period_min  \
0  [device33]       co2  2019-03-22  2019-04-03                    5   
1  [device33]       co2  2019-03-22  2019-04-03                    5   
2  [device33]       co2  2019-03-24  2019-04-03                    5   
3  [device33]       co2  2019-03-24  2019-04-03                    5   

   interval_width  daily_fo  changepoint_prior_scale  mape_average  
0             0.6        12                      0.6      0.128048  
1             0.6        12                     15.0      0.130214  
2             0.6        12                      0.6      0.137364  
3             0.6        12                     15.0           NaN  
prophet_instance nb 3
Tested model:
Full dataset: 2019-03-07 to the 2019-05-01. Analysed data the 2019-03-24 to the 2019-04-03.
o Trained on the data from the 2019-03-24 to the 2019-04-02 (8 days).
o Predict from the 2019-04-02 to the 2019-04-03 (1 days).



INFO:fbprophet:Making 6 forecasts with cutoffs between 2019-03-30 06:54:43.412000 and 2019-04-01 18:54:43.412000


       device parameter       begin         end  sampling_period_min  \
0  [device33]       co2  2019-03-22  2019-04-03                    5   
1  [device33]       co2  2019-03-22  2019-04-03                    5   
2  [device33]       co2  2019-03-24  2019-04-03                    5   
3  [device33]       co2  2019-03-24  2019-04-03                    5   

   interval_width  daily_fo  changepoint_prior_scale  mape_average  
0             0.6        12                      0.6      0.128048  
1             0.6        12                     15.0      0.130214  
2             0.6        12                      0.6      0.137364  
3             0.6        12                     15.0      0.136153  


Unnamed: 0,device,parameter,begin,end,sampling_period_min,interval_width,daily_fo,changepoint_prior_scale,mape_average
0,[device33],co2,2019-03-22,2019-04-03,5,0.6,12,0.6,0.128048
1,[device33],co2,2019-03-22,2019-04-03,5,0.6,12,15.0,0.130214
2,[device33],co2,2019-03-24,2019-04-03,5,0.6,12,0.6,0.137364
3,[device33],co2,2019-03-24,2019-04-03,5,0.6,12,15.0,0.136153


#### RandomSearch (with sample function) to set up if relevant

*Interactive Plot Mode from Prophet*

In [None]:
from fbprophet.plot import plot_plotly
import plotly.offline as py
py.init_notebook_mode()

fig_int = plot_plotly(model, forecast)  # This returns a plotly Figure
py.iplot(fig_int)

*Old function to test a model*

In [None]:


df, fig, predict_n, today_index, lookback_n = df_generator(df_dev, device, 'co2', '2019-03-26', '2019-04-03',  '5T', 0.08)
print(lookback_n)
# config the model
model = Prophet(interval_width=0.6, # anomaly threshold
                yearly_seasonality=False, weekly_seasonality=False, daily_seasonality=False,
                changepoint_prior_scale=0.01) # Trend flexibility: lower values give a more rigid trend (risk of underfitting), higher values a more flexible one (risk of overfitting)
model.add_seasonality(name='daily', period=1, fourier_order=12) # prior scale
# model.add_seasonality(name='half_day', period=0.5, fourier_order=10)

# Fit the model, flag outliers, and visualize
assert today_index>lookback_n, 'Not enough data for prediction (today_index must be greater than lookback_n)'
p_fig, forecast, model = prophet_fit(df, model, today_index, '5T', 0.08, lookback_days=lookback_n, predict_days=predict_n)   
outliers, df_pred = get_outliers(df, forecast, today_index, predict_days=predict_n)
prophet_plot(df, p_fig, today_index, predict_days=predict_n, outliers=outliers)
plt.show()
param_grid = {'model' : [model],
              'initial' : ['3 days'], # If not provided, 3 * horizon is used. Same units as horizon
              'period'  : ['0.5 days'], # Integer amount of time between cutoff dates. If not provided, 0.5 * horizon is used.
              'horizon' : ['1 days']} # A forecast is made for every observed point between cutoff and cutoff + horizon
execute_cross_validation_and_performance_loop(list(ParameterGrid(param_grid)), metric = 'mape')
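
The `interval_width` argument is used above as the anomaly threshold. As a rough sketch of that idea (not the notebook's `get_outliers`, whose body is not shown here), a point can be flagged when the observed value falls outside the predicted interval. The function name `flag_outliers` and the sample values are hypothetical.

```python
def flag_outliers(observed, lower, upper):
    """Return the indices of the points lying outside [lower, upper]."""
    return [i for i, (y, lo, hi) in enumerate(zip(observed, lower, upper))
            if y < lo or y > hi]


# Illustrative CO2 readings (ppm) with a made-up prediction interval
observed = [420, 455, 610, 430]
lower = [400, 410, 420, 415]
upper = [450, 460, 470, 465]

outlier_idx = flag_outliers(observed, lower, upper)
print(outlier_idx)
```

A narrower interval (smaller `interval_width`) leaves less room around the forecast, so more points are flagged; a wider one flags fewer.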

### Model evaluation by cross-validation

This corresponds to the following part of the test function:

`param_grid = {'model' : [model],
              'initial' : ['3 days'], # If not provided, 3 * horizon is used. Same units as horizon
              'period'  : ['0.5 days'], # Integer amount of time between cutoff dates. If not provided, 0.5 * horizon is used.
              'horizon' : ['1 days']} # A forecast is made for every observed point between cutoff and cutoff + horizon
execute_cross_validation_and_performance_loop(list(ParameterGrid(param_grid)), metric = 'mape')`

Check this [tutorial](https://medium.com/@jeanphilippemallette/prophet-auto-selection-with-cross-validation-7ba2c0a3beef) for the original code.

*What it does:*
* Shows the performance metrics, including MSE, RMSE, MAE and MAPE (see [Prophet docs](https://facebook.github.io/prophet/docs/diagnostics.html) for details)
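
As a reminder of what these metrics measure, here is a minimal sketch that computes MSE, RMSE, MAE and MAPE for a handful of points. The function name `error_metrics` and the sample values are assumptions made for illustration; Prophet's own `performance_metrics` additionally aggregates over a rolling window of the horizon, which this sketch leaves out.

```python
import math


def error_metrics(y_true, y_pred):
    """Compute MSE, RMSE, MAE and MAPE (as a fraction, not a percentage)."""
    errors = [p - t for t, p in zip(y_true, y_pred)]
    n = len(errors)
    mse = sum(e ** 2 for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mape = sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n
    return {'mse': mse, 'rmse': math.sqrt(mse), 'mae': mae, 'mape': mape}


# Illustrative observed and predicted CO2 values (ppm)
y_true = [400.0, 500.0, 800.0]
y_pred = [440.0, 450.0, 800.0]

metrics = error_metrics(y_true, y_pred)
print(metrics)
```

MAPE divides each error by the observed value, so it is expressed relative to the CO2 level; the `mape_average` values reported in the tables above are fractions of this kind.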

*Args:*
- model: Prophet model defined above.
- initial: String formatted as 'n days', where n is a number of days. Size of the initial training period, i.e. the amount of data available before the first cutoff. *Initial should be smaller than the total duration of the dataframe.*
- period: Same format as initial. Spacing between cutoff dates.
- horizon: Same format as initial. Duration of the prediction. A forecast is made for every observed point between cutoff and cutoff + horizon.
    
*Returns:*
- dataframe containing: time (ds), predicted value, lower bound (predicted value - tolerance), upper bound (predicted value + tolerance), real value
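
To make the roles of `initial`, `period` and `horizon` concrete, the sketch below lists the cutoff dates that such a cross-validation would use. It assumes that cutoffs are placed backward from the end of the data, starting one `horizon` before the last observation and stepping by `period` for as long as at least `initial` of training data remains. This is a simplified reading of Prophet's behaviour, and `simulated_cutoffs` is a hypothetical helper, not part of the library.

```python
from datetime import datetime, timedelta


def simulated_cutoffs(start, end, initial, period, horizon):
    """List cutoff dates, oldest first, for a backward-stepping cross-validation."""
    cutoff = end - horizon            # the last cutoff leaves one full horizon to predict
    cutoffs = []
    while cutoff >= start + initial:  # keep at least `initial` of training data
        cutoffs.append(cutoff)
        cutoff -= period
    return sorted(cutoffs)


# Illustrative 6-day dataset with the settings used in param_grid above
cutoffs = simulated_cutoffs(
    start=datetime(2019, 3, 19),
    end=datetime(2019, 3, 25),
    initial=timedelta(days=3),
    period=timedelta(days=0.5),
    horizon=timedelta(days=1),
)
for c in cutoffs:
    print(c)
```

Under these assumptions, a longer dataset or a smaller `period` gives more cutoffs, and therefore more forecasts to fit, which is one reason the grid searches above take tens of minutes.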