# Using FBProphet to forecast the total price 
Based on [Sales Forecasting Using Facebook’s Prophet](https://medium.com/fritzheartbeat/sales-forecasting-using-facebooks-prophet-f9ae0214f196)

Sales forecasting is one of the most common tasks in sales-driven organizations. When done well, it enables organizations to plan for the future with a degree of confidence. In this tutorial, we’ll use Prophet, a package developed by Facebook, to show how one can achieve this. The package is available in both Python and R. We assume that the reader has a basic understanding of handling time series data in Python.

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)


In [None]:
transactions_train = pd.read_csv("/kaggle/input/h-and-m-personalized-fashion-recommendations/transactions_train.csv")

In [None]:
transactions_train.tail()

In [None]:
transactions = transactions_train.groupby('t_dat')['price'].sum().reset_index()

In [None]:
transactions['ds'] = transactions['t_dat']
transactions['y'] = transactions['price']
transactions = transactions[['ds','y']]

In [None]:
transactions.tail()

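Prophet works best with time series that have strong seasonal effects and several seasons of historical data. Hourly, daily, or weekly observations covering at least several months are suitable, and a year or more of history is preferred so that yearly seasonality can be estimated.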
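Before fitting, it can help to check how much history is available and at what frequency it was sampled. The following is a minimal sketch using only pandas; the `toy` dataframe is hypothetical and merely stands in for the aggregated `transactions` frame.

```python
import pandas as pd

# Hypothetical toy data standing in for the aggregated `transactions` frame.
toy = pd.DataFrame({
    "ds": pd.date_range("2020-01-01", periods=120, freq="D").astype(str),
    "y": range(120),
})

# Parse the date strings so pandas can treat them as timestamps.
toy["ds"] = pd.to_datetime(toy["ds"])

span_days = (toy["ds"].max() - toy["ds"].min()).days  # length of the history in days
freq = pd.infer_freq(toy["ds"])                       # inferred sampling frequency

print(f"{span_days} days of history at frequency {freq}")
```

The same two checks could be run on `transactions` after converting its `ds` column with `pd.to_datetime`.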

In [None]:
# In Prophet 1.0 and later the package is named `prophet` (from prophet import Prophet)
from fbprophet import Prophet

Start by creating an instance of the `Prophet` class, add the built-in US holiday calendar so that holiday effects are modelled, and then fit the model to our dataset.

In [None]:
model = Prophet()
model.add_country_holidays(country_name='US')
model.fit(transactions)

### Making Future Predictions


The next step is to prepare a dataframe of dates for which we want predictions. This is achieved using the `Prophet.make_future_dataframe` method, passing the number of days we’d like to predict via the `periods` argument. By default the resulting dataframe also includes the historical dates in its `ds` column, so we can later compare the predictions for those dates with the actual values.

In [None]:
future = model.make_future_dataframe(periods=7)
future.tail()
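To make the shape of this dataframe concrete, here is a rough pandas-only sketch of what `make_future_dataframe(periods=7)` produces for daily data: the historical dates followed by seven new ones. The `history` frame and the `future_sketch` name are hypothetical and are used only for illustration.

```python
import pandas as pd

# Hypothetical history of 10 daily dates, standing in for transactions['ds'].
history = pd.DataFrame({"ds": pd.date_range("2020-09-01", periods=10, freq="D")})

periods = 7  # number of future days, as in make_future_dataframe(periods=7)

# Future dates start one day after the last observed date.
future_dates = pd.date_range(
    start=history["ds"].max() + pd.Timedelta(days=1), periods=periods, freq="D"
)

# Historical dates first, then the new future dates, in a single 'ds' column.
future_sketch = pd.concat(
    [history[["ds"]], pd.DataFrame({"ds": future_dates})], ignore_index=True
)

print(future_sketch.tail())
```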

### Obtaining the Forecasts

We use the `predict` method to make future predictions. This will generate a dataframe with a `yhat` column that will contain the predictions.

In [None]:
forecast = model.predict(future)

If we inspect the forecast dataframe, we’ll notice that it has a lot of columns. However, we are mainly interested in `ds`, `yhat`, `yhat_lower`, and `yhat_upper`. `yhat` is the predicted value, while `yhat_lower` and `yhat_upper` are the lower and upper bounds of the uncertainty interval around it (an 80% interval by default).

In [None]:
forecast.sample()

In [None]:
forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail()
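Because the forecast covers the historical dates as well, the predictions can be lined up against the actual values by joining on `ds`. The sketch below uses two small hypothetical frames, `actuals` and `preds`, in place of `transactions` and `forecast`; all the numbers are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for `transactions` (actuals) and `forecast` (predictions).
actuals = pd.DataFrame({
    "ds": pd.date_range("2020-09-01", periods=3, freq="D"),
    "y": [100.0, 120.0, 90.0],
})
preds = pd.DataFrame({
    "ds": pd.date_range("2020-09-01", periods=5, freq="D"),  # 3 historical + 2 future
    "yhat": [105.0, 110.0, 95.0, 101.0, 99.0],
    "yhat_lower": [95.0, 100.0, 92.0, 90.0, 88.0],
    "yhat_upper": [115.0, 125.0, 105.0, 112.0, 110.0],
})

# Left-join on the date: future rows have no actual value, so y is NaN there.
compared = preds.merge(actuals, on="ds", how="left")

# Residual, and whether the actual value fell inside the uncertainty interval.
compared["error"] = compared["y"] - compared["yhat"]
compared["in_interval"] = compared["y"].between(
    compared["yhat_lower"], compared["yhat_upper"]
)

print(compared)
```

With the notebook's own frames, `transactions['ds']` would likely need converting with `pd.to_datetime` first, since it was read from the CSV as strings while the forecast's `ds` column holds timestamps.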


### Plotting the Forecasts


Prophet has a built-in feature that enables us to plot the forecasts we just generated. This is achieved using `model.plot()` and passing in our forecasts as the argument. The blue line in the graph represents the predicted values, while the black dots represent the observed data in our dataset.

In [None]:
plot = model.plot(forecast)
# Blue line: predicted values; black dots: observed data points.

#### Plotting the Forecast Components

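The `plot_components` method plots the individual components of the forecast: the trend, the holiday effects (since we added country holidays), and the weekly and yearly seasonality of the time series.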

In [None]:
plot2 = model.plot_components(forecast)

In [None]:
customers = pd.read_csv("/kaggle/input/h-and-m-personalized-fashion-recommendations/customers.csv")

### Cross Validation


Next, let’s measure the forecast error using the historical data. We’ll do this by comparing the predicted values with the actual values. To perform this operation, we select cutoff points in the data history and, for each one, fit the model using only the data up to that cutoff.
Afterwards, we compare the actual values to the predicted values. The `cross_validation` function allows us to do this in Prophet. It takes the following parameters:

- `horizon`: the forecast horizon

- `initial`: the size of the initial training period

- `period`: the spacing between cutoff dates

The output of the cross_validation method is a dataframe containing `y` (the true values) and `yhat` (the predicted values). We’ll use this dataframe to compute the prediction errors.
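To see how `horizon`, `initial`, and `period` interact, the sketch below approximates how cutoff dates could be laid out over a date range. It is a simplified illustration, not Prophet's own implementation: `sketch_cutoffs` is a hypothetical helper, the dates are made up, and the defaults (`initial` of three horizons, `period` of half a horizon) are the ones described in this notebook.

```python
import pandas as pd

def sketch_cutoffs(dates, horizon, initial=None, period=None):
    """Approximate the cutoff dates used by time series cross validation."""
    horizon = pd.Timedelta(horizon)
    # Defaults described in this notebook: initial = 3 * horizon, period = horizon / 2.
    initial = pd.Timedelta(initial) if initial is not None else 3 * horizon
    period = pd.Timedelta(period) if period is not None else horizon / 2

    first, last = dates.min(), dates.max()
    cutoff = last - horizon  # the latest cutoff leaves one full horizon to score
    cutoffs = []
    # Walk backwards while at least `initial` worth of training data remains.
    while cutoff >= first + initial:
        cutoffs.append(cutoff)
        cutoff -= period
    return sorted(cutoffs)

# Hypothetical one year of daily dates.
dates = pd.Series(pd.date_range("2020-01-01", "2020-12-31", freq="D"))
cutoffs = sketch_cutoffs(dates, horizon="50 days")

for c in cutoffs:
    print(c.date())
```

Each cutoff corresponds to one model fit: the data up to the cutoff is used for training and the following `horizon` is used for scoring, so a smaller `period` means more fits and a longer run time.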

### Time series cross validation to measure forecast error using historical data.
- Select cutoff points in the past.

- Fit the model to the data up to each cutoff point.

- Compare the forecasted values to the actual values.

In [None]:
from fbprophet.diagnostics import cross_validation #measure forecast error using historical data
# This is done by selecting cutoff points in the history, and for each of them fitting the model using 
# data only up to that cutoff point. We can then compare the forecasted values to the actual values
df_cv = cross_validation(model, horizon='50 days')  # forecast horizon
# By default, the initial training period is set to three times the horizon
# cutoffs are made every half a horizon
df_cv.head()

In [None]:
from fbprophet.diagnostics import performance_metrics
df_p = performance_metrics(df_cv)
df_p.tail()

In [None]:
from fbprophet.plot import plot_cross_validation_metric
fig = plot_cross_validation_metric(df_cv, metric='rmse')

### Obtaining the Performance Metrics
We use the `performance_metrics` utility to compute the Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and the coverage of the `yhat_lower` and `yhat_upper` estimates.

In [None]:
from fbprophet.diagnostics import performance_metrics
df_p = performance_metrics(df_cv)
df_p.head()
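As a sanity check on what these metrics mean, they can be computed by hand from columns shaped like the cross validation output. The sketch below uses a hypothetical four-row frame `cv` with made-up numbers, and it averages over all rows at once, whereas `performance_metrics` aggregates over a rolling window of the horizon.

```python
import numpy as np
import pandas as pd

# Hypothetical rows shaped like the cross_validation output.
cv = pd.DataFrame({
    "y":          [100.0, 200.0, 400.0, 500.0],
    "yhat":       [110.0, 180.0, 400.0, 450.0],
    "yhat_lower": [ 95.0, 165.0, 380.0, 430.0],
    "yhat_upper": [125.0, 195.0, 420.0, 470.0],
})

err = cv["y"] - cv["yhat"]

mse = float(np.mean(err ** 2))                # mean squared error
rmse = float(np.sqrt(mse))                    # root mean squared error
mae = float(np.mean(np.abs(err)))             # mean absolute error
mape = float(np.mean(np.abs(err / cv["y"])))  # mean absolute percentage error
# Share of actual values that fall inside [yhat_lower, yhat_upper].
coverage = float(
    np.mean((cv["y"] >= cv["yhat_lower"]) & (cv["y"] <= cv["yhat_upper"]))
)

print(f"mse={mse:.1f} rmse={rmse:.2f} mae={mae:.1f} mape={mape:.3f} coverage={coverage:.2f}")
```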