# Energy Price Prediction Project

## Previous Notebooks

- [Energy data import and cleaning](1.0-GME-Data.ipynb)
- [Weather data import and cleaning](1.1-Weather-Data.ipynb)
- [Energy price futures import and cleaning](1.2-Futures-Data.ipynb)
- [Gas price import and cleaning](1.3-Gas-Data.ipynb)

In [1]:
import numpy as np
import pandas as pd

Load the interim pickles and merge them into a single dataset:

In [2]:
market = pd.read_pickle('../data/interim/market.pkl')
weather = pd.read_pickle('../data/interim/weather.pkl')
# futures = pd.read_pickle('../data/interim/futures.pkl')  # not used for now
gas = pd.read_pickle('../data/interim/gas.pkl')

In [3]:
# Weather data is only available up to the previous day, so shift the weather
# index forward by one day: each PUN date is joined with the previous day's weather.
energy = market.merge(weather.shift(periods=1, freq='D'), left_on='date', right_index=True, how='left')
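
To make the one-day lag concrete, here is a minimal sketch on toy data. The frames `market_demo` and `weather_demo` and their values are made up for illustration and are not the project's data; only the `shift` and `merge` calls mirror the cell above.

```python
import pandas as pd

# Hypothetical stand-ins for the real `market` and `weather` frames.
market_demo = pd.DataFrame({
    'date': pd.to_datetime(['2017-10-13', '2017-10-14', '2017-10-14']),
    'hour': [24, 1, 2],
    'pun': [50.0, 45.0, 44.0],
})
weather_demo = pd.DataFrame(
    {'hdd': [1.2, 1.5]},
    index=pd.to_datetime(['2017-10-12', '2017-10-13']),
)

# With `freq` set, shift moves the index labels forward by one day
# and leaves the values untouched.
weather_lagged = weather_demo.shift(periods=1, freq='D')

# Each market date now picks up the weather observed the day before.
merged_demo = market_demo.merge(
    weather_lagged, left_on='date', right_index=True, how='left'
)
print(merged_demo)
```

In this toy case the rows dated 2017-10-14 receive the weather recorded on 2017-10-13, which is the behaviour the comment in the cell describes.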

In [4]:
energy = (
    energy.merge(gas, left_on='date', right_on='market_date', how='left')
          .drop('market_date', axis=1)
          .rename(columns={'control_price': 'gas_price'})
)
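
A left merge like this one silently leaves `NaN` where no gas price exists for a date, and it duplicates rows if the right side has repeated keys. The sketch below shows one way to check for both, using the `validate` and `indicator` parameters of `merge`. It assumes `gas` holds at most one row per `market_date`, which the notebook does not confirm; `energy_demo` and `gas_demo` are hypothetical toy frames.

```python
import pandas as pd

# Hypothetical stand-ins for the real `energy` and `gas` frames.
energy_demo = pd.DataFrame({
    'date': pd.to_datetime(['2017-10-13', '2017-10-13', '2017-10-14']),
    'pun': [50.0, 48.0, 45.0],
})
gas_demo = pd.DataFrame({
    'market_date': pd.to_datetime(['2017-10-13']),
    'control_price': [19.29],
})

checked = (
    energy_demo
    .merge(gas_demo, left_on='date', right_on='market_date', how='left',
           validate='many_to_one',  # raises MergeError if gas dates repeat
           indicator=True)          # adds a `_merge` column per row
    .drop('market_date', axis=1)
    .rename(columns={'control_price': 'gas_price'})
)

# Rows flagged `left_only` found no gas price for their date.
unmatched = checked[checked['_merge'] == 'left_only']
print(unmatched)
```

If such a check were kept in the pipeline, the `_merge` column would need to be dropped before saving the dataset.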

In [5]:
energy.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 33600 entries, 0 to 33599
Data columns (total 61 columns):
date         33600 non-null datetime64[ns]
hour         33600 non-null int64
pun          33600 non-null float64
italy        33600 non-null int64
cnorth       33600 non-null int64
csouth       33600 non-null int64
north        33600 non-null int64
sardinia     33600 non-null int64
sicily       33600 non-null int64
south        33600 non-null int64
AUST-NORD    33600 non-null float64
AUST-XAUS    23856 non-null float64
BRNN-GREC    33600 non-null float64
BRNN-SUD     33600 non-null float64
BSP-SLOV     33600 non-null float64
CNOR-CORS    33600 non-null float64
CNOR-CSUD    33600 non-null float64
CNOR-NORD    33600 non-null float64
COAC-SARD    33600 non-null float64
CORS-CNOR    33600 non-null float64
CORS-SARD    33600 non-null float64
CSUD-CNOR    33600 non-null float64
CSUD-SARD    33600 non-null float64
CSUD-SUD     33600 non-null float64
FOGN-SUD     33600 non-null float64


In [6]:
energy.tail()

| | date | hour | pun | italy | cnorth | csouth | north | sardinia | sicily | south | ... | SVIZ-NORD | XAUS-AUST | XFRA-FRAN | hdd_liml | hdd_lira | hdd_lirn | cdd_liml | cdd_lira | cdd_lirn | gas_price |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 33595 | 2017-10-14 | 20 | 65.51901 | 34858 | 3872 | 5843 | 18188 | 1068 | 2304 | 3583 | ... | 10000.0 | 276.0 | 3321.0 | 1.5 | 1.0 | 0.6 | 1.0 | 1.3 | 1.8 | 19.29 |
| 33596 | 2017-10-14 | 21 | 57.52219 | 32855 | 3678 | 5505 | 16994 | 1034 | 2232 | 3412 | ... | 10000.0 | 276.0 | 3321.0 | 1.5 | 1.0 | 0.6 | 1.0 | 1.3 | 1.8 | 19.29 |
| 33597 | 2017-10-14 | 22 | 51.64213 | 30160 | 3349 | 5040 | 15773 | 959 | 2043 | 2996 | ... | 10000.0 | 276.0 | 3205.0 | 1.5 | 1.0 | 0.6 | 1.0 | 1.3 | 1.8 | 19.29 |
| 33598 | 2017-10-14 | 23 | 49.99348 | 27857 | 2964 | 4651 | 14741 | 894 | 1893 | 2714 | ... | 10000.0 | 276.0 | 3205.0 | 1.5 | 1.0 | 0.6 | 1.0 | 1.3 | 1.8 | 19.29 |
| 33599 | 2017-10-14 | 24 | 46.80189 | 25891 | 2795 | 4236 | 13739 | 846 | 1734 | 2541 | ... | 10000.0 | 256.0 | 3057.0 | 1.5 | 1.0 | 0.6 | 1.0 | 1.3 | 1.8 | 19.29 |


In [7]:
energy.to_pickle('../data/interim/energy.pkl')

## Following Notebooks

- [Exploratory data analysis](2.0-EDA.ipynb)
- [Feature engineering](3.0-Feature-Engineering.ipynb)
- [More exploratory data analysis](4.0-EDA-Bis.ipynb)
- [Predictive model](5.0-Model.ipynb)