# Diagnostics

Prophet includes functionality for time series cross validation to measure forecast error using historical data. This is done by selecting cutoff points in the history and, for each of them, fitting the model using data only up to that cutoff point. We can then compare the forecasted values to the actual values. This figure illustrates a simulated historical forecast on the Peyton Manning dataset, where the model was fit to an initial history of 5 years, and a forecast was made on a one-year horizon.

[The Prophet paper](https://peerj.com/preprints/3190.pdf) gives further description of simulated historical forecasts.

The cross validation procedure can be done automatically for a range of historical cutoffs using the `cross_validation` function. We specify the forecast horizon (`horizon`), and then optionally the size of the initial training period (`initial`) and the spacing between cutoff dates (`period`). By default, the initial training period is set to three times the horizon, and cutoffs are made every half a horizon.
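The defaults and the cutoff spacing described above can be sketched with plain pandas. This is a simplified illustration, not Prophet's implementation: the helper `sketch_cutoffs` is hypothetical, the rule of stepping backwards from `last_date - horizon` is an assumption, and the date range is made up for the example rather than read from any dataset.

```python
import pandas as pd

def sketch_cutoffs(first_date, last_date, horizon, initial=None, period=None):
    """Hypothetical helper: list cutoff dates for simulated historical forecasts."""
    horizon = pd.Timedelta(horizon)
    # Defaults stated in the text: initial = 3 * horizon, period = horizon / 2.
    initial = 3 * horizon if initial is None else pd.Timedelta(initial)
    period = 0.5 * horizon if period is None else pd.Timedelta(period)

    first_date = pd.Timestamp(first_date)
    # Assumption: the latest cutoff leaves exactly one horizon of data after it.
    cutoff = pd.Timestamp(last_date) - horizon
    cutoffs = []
    # Step backwards while at least `initial` of history precedes the cutoff.
    while cutoff >= first_date + initial:
        cutoffs.append(cutoff)
        cutoff -= period
    return sorted(cutoffs)

# Assumed date range, chosen only for illustration.
cutoffs = sketch_cutoffs('2008-01-01', '2015-12-31',
                         horizon='365 days', initial='730 days', period='180 days')
print(len(cutoffs), cutoffs[0].date(), cutoffs[-1].date())
```

With this assumed date range, the sketch produces 11 cutoffs spaced 180 days apart, the earliest on 2010-01-26 and the latest on 2014-12-31.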

The output of `cross_validation` is a dataframe with the true values `y` and the out-of-sample forecast values `yhat`, at each simulated forecast date and for each cutoff date. In particular, a forecast is made for every observed point between `cutoff` and `cutoff + horizon`. This dataframe can then be used to compute error measures of `yhat` vs. `y`.
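As a minimal sketch of computing error measures by hand, the snippet below builds a small dataframe with the same columns as the `cross_validation` output, using the five rows from the `df_cv.head()` output in this section. The name `df_cv_sample` is illustrative, and the metrics are plain averages over these rows, which is not necessarily how `performance_metrics` aggregates them.

```python
import pandas as pd

# Five rows copied from the df_cv.head() output in this section.
df_cv_sample = pd.DataFrame({
    'ds': pd.to_datetime(['2010-01-27', '2010-01-28', '2010-01-29',
                          '2010-01-30', '2010-01-31']),
    'yhat': [6.960101, 6.928621, 6.807895, 6.432575, 6.406211],
    'y': [6.886532, 6.823286, 6.767343, 6.447306, 6.487684],
    'cutoff': pd.to_datetime(['2010-01-26'] * 5),
})

# How far ahead of its cutoff each forecast was made.
df_cv_sample['horizon'] = df_cv_sample['ds'] - df_cv_sample['cutoff']

# Forecast error and a few common summary measures.
err = df_cv_sample['yhat'] - df_cv_sample['y']
metrics = {
    'mse': (err ** 2).mean(),
    'rmse': (err ** 2).mean() ** 0.5,
    'mae': err.abs().mean(),
    'mape': (err / df_cv_sample['y']).abs().mean(),
}
print(df_cv_sample[['ds', 'horizon']])
print(metrics)
```

Grouping the errors by `horizon` before averaging would show how accuracy changes with forecast distance.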

Here we do cross-validation to assess prediction performance on a horizon of 365 days, starting with 730 days of training data for the first cutoff and then making predictions every 180 days. On this 8-year time series, this corresponds to 11 total forecasts.

In [2]:
import pandas as pd
from fbprophet import Prophet

In [5]:
# Python
df = pd.read_csv('./examples/example_wp_log_R.csv')
m = Prophet().fit(df)

INFO:fbprophet.forecaster:Disabling daily seasonality. Run prophet with daily_seasonality=True to override this.


In [6]:
from fbprophet.diagnostics import cross_validation
df_cv = cross_validation(m, initial='730 days', period='180 days', horizon='365 days')
df_cv.head()

INFO:fbprophet.diagnostics:Making 11 forecasts with cutoffs between 2010-01-26 00:00:00 and 2014-12-31 00:00:00


| | ds | yhat | yhat_lower | yhat_upper | y | cutoff |
|---|---|---|---|---|---|---|
| 0 | 2010-01-27 | 6.960101 | 6.616322 | 7.323974 | 6.886532 | 2010-01-26 |
| 1 | 2010-01-28 | 6.928621 | 6.55958 | 7.263112 | 6.823286 | 2010-01-26 |
| 2 | 2010-01-29 | 6.807895 | 6.463627 | 7.161609 | 6.767343 | 2010-01-26 |
| 3 | 2010-01-30 | 6.432575 | 6.102169 | 6.774662 | 6.447306 | 2010-01-26 |
| 4 | 2010-01-31 | 6.406211 | 6.030778 | 6.731408 | 6.487684 | 2010-01-26 |


In [7]:
# Python
from fbprophet.diagnostics import performance_metrics
df_p = performance_metrics(df_cv)
df_p.head()

| | horizon | mse | rmse | mae | mape | coverage |
|---|---|---|---|---|---|---|
| 1118 | 37 days | 0.058103 | 0.241046 | 0.183102 | 0.025146 | 0.824121 |
| 3658 | 37 days | 0.058093 | 0.241026 | 0.183016 | 0.025132 | 0.824121 |
| 2569 | 37 days | 0.058766 | 0.242417 | 0.184232 | 0.025297 | 0.821608 |
| 1842 | 37 days | 0.058758 | 0.2424 | 0.184092 | 0.025279 | 0.821608 |
| 3296 | 37 days | 0.059156 | 0.243221 | 0.184741 | 0.025361 | 0.819095 |


In [9]:
# Python
from fbprophet.plot import plot_cross_validation_metric
fig = plot_cross_validation_metric(df_cv, metric='mape')
fig

TypeError: cannot astype a timedelta from [timedelta64[ns]] to [int32]