# Data Setup

In this notebook, we demonstrate how to set up the time series data for the examples included in this book. The data comes from the GEFCom2014 forecasting competition<sup>1</sup> (see the reference below) and consists of 3 years of hourly electricity load and temperature values between 2012 and 2014.

<sup>1</sup>Tao Hong, Pierre Pinson, Shu Fan, Hamidreza Zareipour, Alberto Troccoli and Rob J. Hyndman, "Probabilistic energy forecasting: Global Energy Forecasting Competition 2014 and beyond", International Journal of Forecasting, vol. 32, no. 3, pp. 896-913, July-September 2016.

In [None]:
import os
import shutil
import matplotlib.pyplot as plt
from common.utils import load_data, extract_data, download_file
%matplotlib inline

In [None]:
data_dir = './data'

# Create the data directory if it does not exist yet
os.makedirs(data_dir, exist_ok=True)

# Download and extract the dataset only if energy.csv is not already present
if not os.path.exists(os.path.join(data_dir, 'energy.csv')):
    download_file("https://mlftsfwp.blob.core.windows.net/mlftsfwp/GEFCom2014.zip")
    shutil.move("GEFCom2014.zip", os.path.join(data_dir, "GEFCom2014.zip"))
    extract_data(data_dir)

In [None]:
ts_data_load = load_data(data_dir)[['load']]
ts_data_load.head()
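
Because the data is described as hourly, it can be worth confirming that the index really is a regular hourly series before plotting or modeling. The following is a minimal sketch, not part of the original notebook: `check_hourly_index` is a hypothetical helper, and `demo` is a small synthetic stand-in for `ts_data_load` (its values are illustrative and not taken from GEFCom2014). It assumes the frame is indexed by a pandas `DatetimeIndex`.

```python
import numpy as np
import pandas as pd


def check_hourly_index(df):
    """Return (is_sorted, n_duplicates, n_missing) for an hourly DatetimeIndex.

    Hypothetical helper for illustration; assumes df.index is a DatetimeIndex.
    """
    idx = df.index
    # Every hourly timestamp we would expect between the first and last rows
    expected = pd.date_range(idx.min(), idx.max(), freq=pd.Timedelta(hours=1))
    n_missing = len(expected.difference(idx))
    n_duplicates = int(idx.duplicated().sum())
    return bool(idx.is_monotonic_increasing), n_duplicates, n_missing


# Synthetic stand-in for ts_data_load: 48 hourly rows with a 'load' column.
# These numbers are made up for the example and are not GEFCom2014 values.
demo_index = pd.date_range("2012-01-01", periods=48, freq=pd.Timedelta(hours=1))
demo = pd.DataFrame({"load": np.linspace(2500.0, 3500.0, 48)}, index=demo_index)

is_sorted, n_duplicates, n_missing = check_hourly_index(demo)
print(is_sorted, n_duplicates, n_missing)
```

In the notebook itself, the same check could be run as `check_hourly_index(ts_data_load)`. A nonzero count of missing or duplicated timestamps would suggest cleaning the index before going further.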

In [None]:
ts_data_load.plot(y='load', subplots=True, figsize=(15, 8), fontsize=12)
plt.xlabel('timestamp', fontsize=12)
plt.ylabel('load', fontsize=12)
plt.show()

In [None]:
# Zoom in on the first week of July 2014
ts_data_load.loc['2014-07-01':'2014-07-07'].plot(y='load', subplots=True, figsize=(15, 8), fontsize=12)
plt.xlabel('timestamp', fontsize=12)
plt.ylabel('load', fontsize=12)
plt.show()