# ASHRAE - Great Energy Predictor III

This notebook contains a basic exploration of the provided data and builds a recurrent neural network model with TensorFlow. <br>It is deliberately kept simple in order to provide a baseline LSTM solution.
<br><br>

_Context_

With advancing technology and a growing world population, energy consumption keeps rising. To limit the negative impact of this growth, we can look for ways to optimize the energy consumed by buildings. Doing so requires some insight into the energy efficiency of buildings.

In [None]:
!pip install tensorflow==2.0.0
!pip install chart_studio

import numpy as np 
import pandas as pd
import os

import tensorflow as tf
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight') 

%matplotlib inline
from pylab import rcParams
from plotly import tools
import chart_studio.plotly as py
from plotly.offline import init_notebook_mode, iplot
init_notebook_mode(connected=True)
import plotly.graph_objs as go
import plotly.figure_factory as ff
import statsmodels.api as sm

import math
import random

### Preprocessing

The following files are available for exploration.

In [None]:
!ls ../input/ashrae-energy-prediction/

Let's start the analysis by reading the datasets.

In [None]:
%%time
train_df = pd.read_csv('../input/ashrae-energy-prediction/train.csv')
weather_train_df = pd.read_csv('../input/ashrae-energy-prediction/weather_train.csv')
test_df = pd.read_csv('../input/ashrae-energy-prediction/test.csv')
weather_test_df = pd.read_csv('../input/ashrae-energy-prediction/weather_test.csv')
building_meta_df = pd.read_csv('../input/ashrae-energy-prediction/building_metadata.csv')

In [None]:
train_df.head()

In [None]:
weather_train_df.head()

In [None]:
test_df.head()

In [None]:
weather_test_df.head()

In [None]:
building_meta_df.head()

In [None]:
train_df = train_df.merge(building_meta_df, on="building_id")

In [None]:
train_df = train_df.merge(weather_train_df, on=["site_id", "timestamp"])

Let's take a look at sample rows from the combined dataframe.

In [None]:
sample_df = train_df.sample(20, random_state=0)
sample_df

Now let's convert the 'primary_use' column to a categorical type and replace its values with integer codes.

In [None]:
train_df['primary_use'] = pd.Categorical(train_df['primary_use'])
train_df['primary_use'] = train_df['primary_use'].cat.codes
train_df.head(10)
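
As a small illustration of what the category codes look like, here is a sketch on a toy series. The values in `uses` are made up for the example and are not taken from the dataset. Keeping the code-to-label mapping around is one way to apply the same encoding to the test set later.

```python
import pandas as pd

# Toy stand-in for the 'primary_use' column (hypothetical values).
uses = pd.Series(['Office', 'Education', 'Office', 'Retail'])

cat = pd.Categorical(uses)
codes = cat.codes  # integer code per row, categories are sorted alphabetically

# Mapping from integer code back to the original label.
mapping = dict(enumerate(cat.categories))

print(list(codes))
print(mapping)
```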

Let's check how many NaN values each column has.

In [None]:
train_df.isnull().sum()

In [None]:
100 * train_df.isnull().sum() / len(train_df)

As we can see, some of the columns contain quite a lot of NaNs. To use this data in a neural network, we have to either replace them with numerical values or drop them.

In [None]:
del train_df['floor_count']
del train_df['year_built']
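
Instead of naming the columns by hand, the same decision could be driven by a missing-value threshold. The sketch below works on a toy dataframe; the 50% threshold and the toy values are assumptions for illustration, not something taken from this notebook.

```python
import numpy as np
import pandas as pd

# Toy dataframe with different amounts of missing data per column.
toy = pd.DataFrame({
    'floor_count': [np.nan, np.nan, np.nan, 2.0],
    'air_temperature': [20.0, np.nan, 22.0, 23.0],
    'square_feet': [100, 200, 300, 400],
})

# Percentage of missing values per column, as computed above for train_df.
missing_pct = 100 * toy.isnull().sum() / len(toy)

threshold = 50.0  # hypothetical cut-off, in percent
to_drop = missing_pct[missing_pct > threshold].index.tolist()
cleaned = toy.drop(columns=to_drop)

print(to_drop)
print(list(cleaned.columns))
```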

### Shaping data
To train an LSTM neural network, the data has to be in a three-dimensional shape, where the axes correspond to the following:
* x-axis time steps
* y-axis data examples
* z-axis features for single point in time

![image.png](attachment:image.png)

There are well-written resources online describing how LSTM networks work and how they are structured.
* e.g. [here](http://colah.github.io/posts/2015-08-Understanding-LSTMs/)
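
To make the three-dimensional layout concrete, here is a minimal NumPy sketch that turns a 2D series (time steps by features) into overlapping windows. Note that Keras LSTM layers expect the array ordered as (examples, time steps, features). The sizes used here are arbitrary.

```python
import numpy as np

n_steps, n_features, window = 6, 2, 3

# 2D series: one row per time step, one column per feature.
series = np.arange(n_steps * n_features, dtype=np.float32).reshape(n_steps, n_features)

# Stack overlapping windows of `window` consecutive time steps.
windows = np.stack([series[i:i + window] for i in range(n_steps - window + 1)])

# Axis 0: examples, axis 1: time steps, axis 2: features.
print(windows.shape)
```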

Let's take the data for the first 10 building_ids and preprocess it for the LSTM neural network.

In [None]:
def extract_data(building_id):
    # Return a copy so the column assignments below do not write into a view of train_df.
    return train_df[train_df['building_id'] == building_id].copy()

extracted_df = {}
for i in range(10):
    df = extract_data(i)
    sea_lev_pressure_mean = df['sea_level_pressure'].mean()
    dew_temperature_mean = df['dew_temperature'].mean()
    air_temperature_mean = df['air_temperature'].mean()
    precip_depth_1_hr_mean = df['precip_depth_1_hr'].mean()
    wind_speed_mean = df['wind_speed'].median()  # note: this is the median, despite the variable name

    df['meter'] = pd.to_numeric(df['meter'], errors='coerce').fillna(0).astype(np.float32)
    df['meter_reading'] = pd.to_numeric(df['meter_reading'], errors='coerce').fillna(0).astype(np.float32)
    df['site_id'] = pd.to_numeric(df['site_id'], errors='coerce').fillna(0).astype(np.float32)
    df['primary_use'] = pd.to_numeric(df['primary_use'], errors='coerce').fillna(0).astype(np.float32)
    df['square_feet'] = pd.to_numeric(df['square_feet'], errors='coerce').fillna(0).astype(np.float32)
    
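    # Missing cloud coverage is encoded as -1. air_temperature is filled with its mean
    # below; filling it with -1 here first would turn that mean fill into a no-op.
    df['cloud_coverage'] = df['cloud_coverage'].fillna(-1)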
    df['sea_level_pressure'] = df['sea_level_pressure'].fillna(sea_lev_pressure_mean)
    df['dew_temperature'] = df['dew_temperature'].fillna(dew_temperature_mean)
    df['air_temperature'] = df['air_temperature'].fillna(air_temperature_mean)
    df['precip_depth_1_hr'] = df['precip_depth_1_hr'].fillna(precip_depth_1_hr_mean)
    df['wind_direction'] = df['wind_direction'].apply(lambda x: random.random() * 360.0 if (math.isnan(x)) else x)
    df['wind_speed'] = df['wind_speed'].fillna(wind_speed_mean).astype(np.float32)
    extracted_df[i] = df
    
extracted_df[4]

The dictionary 'extracted_df' contains the data for building_ids in the range \[0, 10), which has to be reshaped before it can be fed into the LSTM neural network.

### Building simple LSTM network

In [None]:
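def input_function(x_tr, y_tr, x_d, y_d, x_t, y_t, bs_tr):
    # Wrap each (features, labels) pair in a tf.data pipeline. repeat() makes the
    # datasets infinite, so fit() has to be called with explicit step counts.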
    t_d = tf.data.Dataset.from_tensor_slices((x_tr, y_tr))
    t_d = t_d.cache().batch(bs_tr).repeat()
    d_d = tf.data.Dataset.from_tensor_slices((x_d, y_d))
    d_d = d_d.batch(bs_tr).repeat()
    v_d = tf.data.Dataset.from_tensor_slices((x_t, y_t))
    v_d = v_d.batch(1).repeat()
    return t_d, d_d, v_d

Split the data into sliding windows of past observations, each paired with a future target value, according to the provided parameters.

In [None]:
def split_data(ds, tar, st_ind, en_ind, hs,
               ts, step):
    data = []
    lab = []

    st_ind = st_ind + hs
    if en_ind is None:
        en_ind = len(ds) - ts

    for i in range(st_ind, en_ind):
        indices = range(i - hs, i, step)
        data.append(ds[indices])
        lab.append(tar[i + ts])

    return np.array(data), np.array(lab)
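
To see what the windowing produces, here is a self-contained sketch on a toy array. `make_windows` is a hypothetical helper that mirrors the logic of `split_data` above; the toy data and parameter values are made up for illustration.

```python
import numpy as np

def make_windows(ds, tar, start, end, hs, ts, step=1):
    """Pair windows of `hs` past rows with the target value at index i + ts."""
    data, labels = [], []
    start = start + hs
    if end is None:
        end = len(ds) - ts
    for i in range(start, end):
        data.append(ds[i - hs:i:step])  # rows i-hs .. i-1
        labels.append(tar[i + ts])      # target ts steps after index i
    return np.array(data), np.array(labels)

# Toy series: 10 time steps, 2 features; row k is [2k, 2k + 1].
toy = np.arange(20, dtype=np.float32).reshape(10, 2)
x, y = make_windows(toy, toy[:, 1], 0, None, hs=3, ts=1)

print(x.shape, y.shape)
print(x[0], y[0])
```

With `hs=3` and `ts=1`, the first window covers rows 0 to 2 while its label comes from row 4, so the label lies two steps after the last row in the window. This is worth keeping in mind when choosing the prediction horizon.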

Prepare three sets of data (train, dev, test) for every 'building_id'.

In [None]:
def prepare_data(data, hn, tv_split,
                 ve_split, tn, step=1, bs=64):
    ds = {}

    for j in range(10):
        # The target is 'meter_reading', which is column 1 in the `fc` feature order
        # (column 3 would be 'primary_use').
        x_tr, y_tr = split_data(data[j], data[j][:, 1], 0,
                                          tv_split, hn,
                                          tn, step)
        x_d, y_d = split_data(data[j], data[j][:, 1],
                                      tv_split, ve_split, hn,
                                      tn, step)
        x_t, y_t = split_data(data[j], data[j][:, 1],
                                        ve_split, None, hn,
                                        tn, step)
        t_d, d_d, v_d = input_function(x_tr, y_tr, x_d, y_d, x_t, y_t, bs)
        ds[j] = {}
        ds[j]['train'] = t_d
        ds[j]['dev'] = d_d
        ds[j]['test'] = v_d

    inp_d = x_tr.shape[-2:]

    return inp_d, ds

Standardize the features by subtracting the mean and dividing by the standard deviation.

In [None]:
def scale_data(data_dict, fc, fc_n):
    td = {}
    norm_param = {}

    for i in range(10):
        norm_param[i] = {}
        # Copy so that scaling does not modify the source dataframe.
        features = data_dict[i][fc].copy()
        features.index = data_dict[i]['timestamp']
        for j in fc_n:
            data_mean = features[j].mean(axis=0)
            data_std = features[j].std(axis=0)
            # Keep the parameters so predictions can be mapped back to original units.
            norm_param[i][j] = {'mean': data_mean, 'std': data_std}
            features[j] = (features[j] - data_mean) / data_std
        td[i] = features.values
    
    return td, norm_param
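
A minimal sketch of standardization and its inverse on a toy array follows. Here the statistics are computed on the first three values only, as if they were the training part; that choice is an assumption made for this example (the function above uses the whole series). `ddof=1` matches the default of pandas' `std`.

```python
import numpy as np

values = np.array([10.0, 20.0, 30.0, 40.0])
train = values[:3]  # hypothetical training portion

mean = train.mean()
std = train.std(ddof=1)  # sample standard deviation, like pandas' default

scaled = (values - mean) / std
restored = scaled * std + mean  # inverse transform back to original units

print(scaled)
print(restored)
```

Keeping the mean and standard deviation around is what makes it possible to convert model outputs back into meter readings.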
        


Fit the model to the dataset of every 'building_id' in turn.

In [None]:
def fit_model(ds, model, eval_int, epochs=10, trg='meter'):
    tm_histories = {}
    for i in range(10):
        print("Fitting to %s" % i)
        tm_histories[i] = model.fit(ds[i]["train"],
                                   validation_data=ds[i]["dev"],
                                   steps_per_epoch=eval_int,
                                   epochs=epochs,
                                   validation_steps=1).history
    return model, tm_histories

Here we prepare the data using the previously defined functions.

In [None]:
fc = ['meter', 'meter_reading', 'site_id',
      'primary_use', 'air_temperature', 'cloud_coverage',
      'dew_temperature', 'precip_depth_1_hr', 'sea_level_pressure',
      'wind_direction', 'wind_speed']

fc_n = ['meter_reading','air_temperature', 'cloud_coverage',
       'dew_temperature', 'precip_depth_1_hr', 'sea_level_pressure',
       'wind_direction', 'wind_speed']

tv_split = int(len(extracted_df[4])* 0.7)
ve_split = int(len(extracted_df[4])* 0.85)
past = 30
eval_int = 1

td_set, norm_params = scale_data(extracted_df, fc, fc_n)
# print(td_set[1][:10])
inp_d, ds = prepare_data(td_set, past, tv_split, ve_split, 1, step=1, bs=32)



Below we create an LSTM model with TensorFlow Keras. The loss function used here (MSE) does not yet match the evaluation metric specified by the competition.

In [None]:
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.LSTM(units=30,
                               return_sequences=True,
                               input_shape=inp_d))
model.add(tf.keras.layers.LSTM(units=32, return_sequences=True))
model.add(tf.keras.layers.Dropout(0.8))
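# The last LSTM returns only its final state, so the model outputs one value per window,
# matching the scalar labels produced by split_data.
model.add(tf.keras.layers.LSTM(units=16))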
model.add(tf.keras.layers.Dropout(0.8))
model.add(tf.keras.layers.Dense(1))
model.compile(optimizer=tf.keras.optimizers.Adam(clipvalue=1.0), loss='mse')
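
The model above is compiled with MSE, while the competition is, to my understanding, scored with the Root Mean Squared Logarithmic Error (RMSLE). Below is a NumPy sketch of that metric for offline evaluation. The function name `rmsle` and the clipping of negative predictions to zero are choices made for this example.

```python
import numpy as np

def rmsle(y_true, y_pred):
    """Root Mean Squared Logarithmic Error between two arrays."""
    y_true = np.asarray(y_true, dtype=np.float64)
    # Negative predictions would break the logarithm, so clip them to zero.
    y_pred = np.clip(np.asarray(y_pred, dtype=np.float64), 0.0, None)
    return float(np.sqrt(np.mean((np.log1p(y_pred) - np.log1p(y_true)) ** 2)))

print(rmsle([0.0, 9.0], [0.0, 9.0]))
print(rmsle([0.0], [np.e - 1.0]))
```

One common way to bring training in line with this metric is to fit the model on `log1p(meter_reading)` with an MSE loss and apply `expm1` to the predictions.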

In [None]:
# model, hist_dict = fit_model(ds, model, eval_int, 500)

### Further analysis
Coming soon ...