In [2]:
import pandas as pd
import numpy as np
import datetime
import fbprophet
from sklearn import metrics
import warnings
warnings.filterwarnings("ignore")

First, we load the demand data. We do not use the other data we collected, so the demand file is the only one loaded here.

In [3]:
demand = pd.read_csv("data/demand.csv")
# Print the first rows to check for obvious issues
demand.head()

Unnamed: 0.1,Unnamed: 0,date,US48,CAL,CAR,CENT,FLA,MIDA,MIDW,NE,...,SWPP,SPA,TEC,TVA,TEPC,TIDC,NSB,WALC,WACM,WAUW
0,0,2015-07-01 05:00:00+00:00,162827.0,0.0,22945.0,0.0,26384.0,84024.0,0.0,12583.0,...,0.0,0.0,2541.0,0.0,0.0,0.0,47.0,0.0,0.0,0.0
1,1,2015-07-01 06:00:00+00:00,335153.0,0.0,21396.0,28985.0,24336.0,79791.0,73432.0,12349.0,...,28891.0,94.0,2367.0,16136.0,0.0,0.0,46.0,0.0,0.0,0.0
2,2,2015-07-01 07:00:00+00:00,333837.0,0.0,20627.0,27498.0,22842.0,76760.0,70211.0,12445.0,...,27413.0,85.0,2246.0,15503.0,0.0,0.0,43.0,0.0,0.0,0.0
3,3,2015-07-01 08:00:00+00:00,398386.0,38210.0,20102.0,26384.0,21906.0,74931.0,68163.0,12385.0,...,26291.0,93.0,2179.0,14896.0,1605.0,408.0,40.0,1119.0,0.0,0.0
4,4,2015-07-01 09:00:00+00:00,388954.0,35171.0,19931.0,25663.0,21615.0,74368.0,67309.0,12387.0,...,25582.0,81.0,2157.0,14663.0,1537.0,380.0,38.0,1018.0,0.0,0.0
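
Since only the date and US48 columns are used later, the load could be restricted to those two columns. The following is a minimal sketch, assuming pandas' `usecols` parameter of `read_csv`; the in-memory `sample_csv` is a stand-in built from the first rows shown above (most columns omitted), not the real file.

```python
import io
import pandas as pd

# Stand-in for data/demand.csv: two rows copied from the preview above,
# with most region columns left out to keep the example short.
sample_csv = io.StringIO(
    "Unnamed: 0.1,Unnamed: 0,date,US48,CAL\n"
    "0,0,2015-07-01 05:00:00+00:00,162827.0,0.0\n"
    "1,1,2015-07-01 06:00:00+00:00,335153.0,0.0\n"
)

# usecols tells read_csv to parse only the named columns,
# so the leftover "Unnamed" index columns are never loaded.
subset = pd.read_csv(sample_csv, usecols=["date", "US48"])
print(subset.columns.tolist())
```

With the real file, the same call would be `pd.read_csv("data/demand.csv", usecols=["date", "US48"])`.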


We are using Facebook's Prophet module for the actual forecasting.

https://facebook.github.io/prophet/

So we need to produce a data frame in the form that Prophet expects.

In [4]:
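# fbprophet only needs a ds column of datetimes and a y column of values to fit.
# .copy() gives an independent frame, so the later column assignments do not act on a slice of demand.
df = demand[['date','US48']].copy()
df.columns = ['ds','y']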
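
As a quick sanity check on that layout, something like the following could be run before fitting. This is only a sketch: `check_prophet_frame` is a hypothetical helper written for this example (it is not part of fbprophet), and `demo` is a two-row stand-in for the real frame.

```python
import pandas as pd

def check_prophet_frame(frame):
    """Hypothetical helper: verify the ds/y layout described above."""
    missing = {"ds", "y"} - set(frame.columns)
    if missing:
        raise ValueError(f"missing required columns: {sorted(missing)}")
    if not pd.api.types.is_numeric_dtype(frame["y"]):
        raise TypeError("column y must be numeric")
    return frame

# Two-row stand-in using values from the preview above.
demo = pd.DataFrame({
    "date": ["2015-07-01 05:00:00+00:00", "2015-07-01 06:00:00+00:00"],
    "US48": [162827.0, 335153.0],
})

# rename maps columns by name, so it does not depend on column order.
demo = demo.rename(columns={"date": "ds", "US48": "y"})
check_prophet_frame(demo)
print(demo.columns.tolist())
```

Renaming by name rather than assigning to `df.columns` is a small design choice: it keeps working if the column selection is ever reordered.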

Note that the date column carries timezone information. Prophet does not support timezone-aware datetimes, so we strip the timezone now to prevent an error later.

In [5]:
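# Parse the strings into timezone-aware datetimes (the +00:00 offset means UTC)
df['ds'] = pd.to_datetime(df['ds'])
# Convert to UTC and drop the timezone, leaving naive datetimes
df['ds'] = df['ds'].dt.tz_convert(None)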
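
To see what the timezone removal does, here is a small sketch on two timestamps from the preview. It assumes pandas' `utc=True` option to make the parsing explicit, and the conversion to `America/New_York` is added purely for illustration; it is not part of our pipeline.

```python
import pandas as pd

# Two timestamps in the same format as the date column.
stamps = pd.Series(pd.to_datetime(
    ["2015-07-01 05:00:00+00:00", "2015-07-01 06:00:00+00:00"],
    utc=True,
))

# tz_convert(None) converts to UTC and then drops the timezone.
naive = stamps.dt.tz_convert(None)
print(naive.dt.tz)

# Illustration only: on a non-UTC series the two removal methods differ.
eastern = stamps.dt.tz_convert("America/New_York")
utc_naive = eastern.dt.tz_convert(None)     # back to UTC clock time
local_wall = eastern.dt.tz_localize(None)   # keeps the local clock time
print(utc_naive.iloc[0], local_wall.iloc[0])
```

Because our data is already in UTC, both methods would give the same result here; `tz_convert(None)` is the safer habit since it always lands on UTC.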

That completes the preprocessing. The data is ready for Prophet's model; we just need to save it to a file for later retrieval.

In [6]:
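# index=False keeps the row index out of the file, so no extra "Unnamed: 0" column appears on reload
df.to_csv("data/preprocessed.csv", index=False)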
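
For later retrieval, the round trip through CSV might look like the sketch below. It uses in-memory buffers instead of data/preprocessed.csv so it can run anywhere, and `frame` is a two-row stand-in for the preprocessed data. It also shows how writing the row index produces an "Unnamed: 0" column on reload, like the ones visible in the demand file.

```python
import io
import pandas as pd

# Stand-in for the preprocessed frame: naive datetimes and demand values.
frame = pd.DataFrame({
    "ds": pd.to_datetime(["2015-07-01 05:00:00", "2015-07-01 06:00:00"]),
    "y": [162827.0, 335153.0],
})

# Default to_csv writes the row index as an unnamed first column.
with_index = io.StringIO()
frame.to_csv(with_index)
with_index.seek(0)
cols_with_index = pd.read_csv(with_index).columns.tolist()
print(cols_with_index)

# index=False leaves the index out; parse_dates restores ds as datetimes.
without_index = io.StringIO()
frame.to_csv(without_index, index=False)
without_index.seek(0)
restored = pd.read_csv(without_index, parse_dates=["ds"])
print(restored.dtypes)
```

CSV stores datetimes as text, so `parse_dates` (or a later `pd.to_datetime`) is needed when reading the file back.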