<b>EnergyUsagePrediction.model_def.asum.v1_0_7.ipynb</b>
<br/>Use case: "Energy usage prediction based on historical weather and energy usage data". The original dataset can be downloaded from <a href="https://www.kaggle.com/taranvee/smart-home-dataset-with-weather-information">Kaggle</a>.
<br/>The dataset used in this step (model definition) has already been transformed in the ETL and feature engineering steps.
<br/>Data exploration is described/performed in "EnergyUsagePrediction.data_exp.asum.1_0_5.Ipynb"
<br/>ETL is described/performed in "EnergyUsagePrediction.etl.asum.1_0_8.Ipynb"
<br/>Feature engineering is described/performed in "EnergyUsagePrediction.feature_eng.asum.1_0_8.Ipynb"
<br/>
<br/>This task defines the machine learning or deep learning model.
<br/>
<br/>Load the <i>smart-home-dataset-with-weather-information_post_feature_eng.csv</i> file into a pandas DataFrame.


In [None]:
import types
import numpy as np
import pandas as pd
from botocore.client import Config
import ibm_boto3

def __iter__(self): return 0

# @hidden_cell
# The following code accesses a file in your IBM Cloud Object Storage. It includes your credentials.
# You might want to remove those credentials before you share the notebook.
client_x = ibm_boto3.client(service_name='s3',
    ibm_api_key_id='[credentials]',
    ibm_auth_endpoint="https://iam.cloud.ibm.com/oidc/token",
    config=Config(signature_version='oauth'),
    endpoint_url='https://s3.eu-geo.objectstorage.service.networklayer.com')

body = client_x.get_object(Bucket='default-donotdelete-pr-dczw8ajohz6wjh',Key='smart-home-dataset-with-weather-information_post_feature_eng.csv')['Body']
# add missing __iter__ method, so pandas accepts body as file-like object
if not hasattr(body, "__iter__"): body.__iter__ = types.MethodType( __iter__, body )

df = pd.read_csv(body)
df.head()



In [None]:
# import the necessary packages
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dropout
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.layers.experimental import preprocessing
#from keras import backend as K


In [None]:
%matplotlib inline
import matplotlib.pyplot as plt


For usability we define constants for the labels


In [None]:
lbTimestamp = 'Timestamp'
lbTotalEneryUsage = 'TotalUsage_kW'
lbTemperature = 'Temperature_F'
lbTemperatureNormalized = 'Temperature_F_normalized'
lbHumidity = 'Humidity'
lbHumidityNormalized = 'Humidity_normalized'
lbPressure = 'Pressure_hPa'
lbPressureNormalized = 'Pressure_hPa_normalized'
lbWindSpeed = 'WindSpeed'
lbWindSpeedNormalized = 'WindSpeed_normalized'
lbCloudCover = 'cloudCover'
lbCloudCoverNormalized = 'cloudCover_normalized'
lbWindBearing = 'WindBearing'
lbWindBearingNormalized = 'WindBearing_normalized'
lbPrecipIntensity = 'PrecipIntensity'
lbPrecipIntensityNormalized = 'PrecipIntensity_normalized'
lbDewPoint = 'dewPoint_F'
lbDewPointNormalized = 'dewPoint_F_normalized'
lbDayOfYear='dayOfYear'
lbDayOfYearNormalized='dayOfYear_normalized'
lbHourOfDay='hourOfDay'
lbHourOfDayNormalized='hourOfDay_normalized'
lbMinuteOfDay='minuteOfDay'
lbMinuteOfDayNormalized='minuteOfDay_normalized'

lbWeatherIndicatorClearDay = 'weatherIndicator_clear-day'
lbWeatherIndicatorClearNight = 'weatherIndicator_clear-night'
lbWeatherIndicatorCloudy = 'weatherIndicator_cloudy'
lbWeatherIndicatorFog = 'weatherIndicator_fog'
lbWeatherIndicatorPartlyCloudyDay = 'weatherIndicator_partly-cloudy-day'
lbWeatherIndicatorPartlyCloudyNight = 'weatherIndicator_partly-cloudy-night'
lbWeatherIndicatorRain = 'weatherIndicator_rain'
lbWeatherIndicatorSnow = 'weatherIndicator_snow'
lbWeatherIndicatorWind = 'weatherIndicator_wind'

In [None]:
df.info()

We take 80% of the dataset for training and the remaining 20% for testing.
<br/>We use a seed to build deterministic training and test data.
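
As a minimal sketch of how the seeded split behaves, the snippet below applies the same `sample`/`drop` pattern to a small toy frame. The frame `toy` and its column names are hypothetical and stand in for the real dataset. It assumes the index labels are unique, since `drop` removes rows by label.

```python
import pandas as pd

# Toy frame standing in for the real dataset (hypothetical values).
toy = pd.DataFrame({"feature": range(10), "target": range(10, 20)})

# sample() draws 80% of the rows; random_state fixes the draw.
toy_train = toy.sample(frac=0.8, random_state=42)
# drop() removes the sampled rows by index label, leaving the other 20%.
toy_test = toy.drop(toy_train.index)

# Repeating the draw with the same seed selects the same rows again.
toy_train_again = toy.sample(frac=0.8, random_state=42)

print(len(toy_train), len(toy_test))
print(toy_train.index.equals(toy_train_again.index))
```

Because the test set is defined as "everything not sampled", the two sets never share a row as long as the index has no duplicate labels.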

In [None]:
inputColumns = [lbTemperatureNormalized,
                lbHumidityNormalized,
                lbWindSpeedNormalized,
                lbWindBearingNormalized,
                lbDewPointNormalized,
                lbPrecipIntensityNormalized, 
                lbDayOfYearNormalized,
                lbHourOfDayNormalized, 
                lbMinuteOfDayNormalized, 
                lbWeatherIndicatorClearDay,
                lbWeatherIndicatorClearNight,
                lbWeatherIndicatorCloudy,
                lbWeatherIndicatorFog,
                lbWeatherIndicatorPartlyCloudyDay,
                lbWeatherIndicatorPartlyCloudyNight,
                lbWeatherIndicatorRain,
                lbWeatherIndicatorSnow,
                lbWeatherIndicatorWind]


In [None]:
outputColumns = [lbTotalEneryUsage]

In [None]:
train=df.sample(frac=0.8,random_state=42) #random state is a seed value
test=df.drop(train.index)

In [None]:
train_x=train[inputColumns]
test_x=test[inputColumns]

In [None]:
train_x.info()


In [None]:
train_y=train[outputColumns]
test_y=test[outputColumns]

In [None]:
train_y.info()

First we will use a traditional machine learning algorithm: LinearRegression
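
For orientation, scikit-learn's `LinearRegression` fits an ordinary least squares model. The sketch below shows the same idea with NumPy on synthetic data. The data, the coefficients, and all variable names are assumptions made for illustration and are not taken from the energy dataset.

```python
import numpy as np

# Synthetic data with known coefficients (hypothetical, for illustration only).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 3))
true_coef = np.array([2.0, -1.0, 0.5])
true_intercept = 0.3
y = X @ true_coef + true_intercept + rng.normal(0.0, 0.01, size=200)

# Append a column of ones so the intercept is fitted as an extra weight.
X_design = np.hstack([X, np.ones((X.shape[0], 1))])

# Solve the least squares problem: minimise ||X_design @ w - y||^2.
weights, *_ = np.linalg.lstsq(X_design, y, rcond=None)
coef, intercept = weights[:-1], weights[-1]

print("Coefficients:", coef)
print("Intercept:", intercept)
```

With little noise in the synthetic targets, the fitted values land close to the coefficients used to generate the data.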

In [None]:
from sklearn import linear_model

In [None]:
# with sklearn

regr = linear_model.LinearRegression()
regr.fit(train_x, train_y)


In [None]:
print('Intercept: \n', regr.intercept_)
print('Coefficients: \n', regr.coef_)

In [None]:
predicted = regr.predict(test_x)

In [None]:
import sklearn.metrics as sm
print("Mean absolute error =", round(sm.mean_absolute_error(test_y, predicted), 2)) 
print("Mean squared error =", round(sm.mean_squared_error(test_y, predicted), 2)) 
print("Median absolute error =", round(sm.median_absolute_error(test_y, predicted), 2)) 
print("Explained variance score =", round(sm.explained_variance_score(test_y, predicted), 2)) 
print("R2 score =", round(sm.r2_score(test_y, predicted), 2))

In [None]:
from sklearn.metrics import mean_squared_error
mean_squared_error(test_y, predicted)

In [None]:
samples = test.sample(200,random_state=42)
samples_x=samples[inputColumns]
samples_y=samples[outputColumns]
predictedSamples = regr.predict(samples_x)

In [None]:
figure=plt.figure(figsize=(12,12))
samples_y = samples.reset_index()
plt.plot(samples_y[lbTotalEneryUsage], figure=figure)
plt.xlabel("x")
plt.ylabel("actual+predicted")
plt.plot(predictedSamples, figure=figure)
plt.show()

Now let's try a deep learning approach using a Keras Sequential model.

In [None]:
# Create Keras model
model = Sequential()
batch_size = 32
input_dim=18
model.add(Dense(batch_size*input_dim, kernel_initializer = "uniform",input_dim=input_dim, name="input"))
model.add(Dense(256, activation="relu", name="hiddenlayer1"))
model.add(Dense(1, name="output"))

# Gradient descent algorithm
adam = Adam(0.001)

model.compile(loss='mse', optimizer=adam)
history = model.fit(train_x, train_y, epochs=15, batch_size=batch_size)


In [None]:
plt.plot(history.history['loss'])
plt.xlabel("Epoch")
plt.ylabel("Training loss (MSE)")
plt.show()

In [None]:
model.evaluate(test_x, test_y)

In [None]:
predicted = model.predict(test_x)


In [None]:
print("Mean absolute error =", round(sm.mean_absolute_error(test_y, predicted), 2)) 
print("Mean squared error =", round(sm.mean_squared_error(test_y, predicted), 2)) 
print("Median absolute error =", round(sm.median_absolute_error(test_y, predicted), 2)) 
print("Explained variance score =", round(sm.explained_variance_score(test_y, predicted), 2)) 
print("R2 score =", round(sm.r2_score(test_y, predicted), 2))

In [None]:
def printSamplePredictedVsActual(myModel, testData):
    test_sample= testData.sample(200,random_state=42)
    test_sample = test_sample.reset_index()
    test_sample_x = test_sample[inputColumns]
    test_sample_y = test_sample[outputColumns]
    test_predicted = myModel.predict(test_sample_x)
    figure=plt.figure(figsize=(12,12))
    plt.plot(test_sample[outputColumns], figure=figure)
    plt.xlabel("x")
    plt.ylabel("actual+predicted")
    plt.plot(test_predicted, figure=figure)
    plt.show()
    

In [None]:
printSamplePredictedVsActual(model, test)

In [None]:
# Print min, max and mean of the predictions, comma-separated
print(str(predicted.min()) + ',' + str(predicted.max()) + ',' + str(predicted.mean()))

In [None]:
# Create Keras model2
model2 = Sequential()
batch_size = 32
input_dim=18
model2.add(Dense(batch_size*input_dim, kernel_initializer = "uniform",input_dim=input_dim, name="input"))
model2.add(Dropout(0.2))
model2.add(Dense(256, activation="relu", name="hiddenlayer1"))
model2.add(Dense(256, activation="relu", name="hiddenlayer2"))
model2.add(Dense(256, activation="relu", name="hiddenlayer3"))
model2.add(Dense(256, activation="relu", name="hiddenlayer4"))
model2.add(Dense(256, activation="relu", name="hiddenlayer5"))
model2.add(Dense(1, name="output"))

# Gradient descent algorithm
#adam = Adam(0.1)
adam = Adam(0.00001)

model2.compile(loss='mse', optimizer=adam)
history = model2.fit(train_x, train_y, epochs=25, batch_size=batch_size)

In [None]:
plt.plot(history.history['loss'])
plt.xlabel("Epoch")
plt.ylabel("Training loss (MSE)")
plt.show()

In [None]:
test_x=test[inputColumns]
test_y=test[outputColumns]
model2.evaluate(test_x, test_y)


In [None]:
predicted = model2.predict(test_x)

In [None]:
print("Mean absolute error =", round(sm.mean_absolute_error(test_y, predicted), 2)) 
print("Mean squared error =", round(sm.mean_squared_error(test_y, predicted), 2)) 
print("Median absolute error =", round(sm.median_absolute_error(test_y, predicted), 2)) 
print("Explained variance score =", round(sm.explained_variance_score(test_y, predicted), 2)) 
print("R2 score =", round(sm.r2_score(test_y, predicted), 2))

In [None]:
printSamplePredictedVsActual(model2, test)

So far, the dense neural network performs best.
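
One way to make such a comparison explicit is to compute the same scores for each model's predictions and put them side by side. The following is a sketch only: the helper `regression_scores`, the model names, and all numbers are hypothetical and are not results from this notebook.

```python
import numpy as np

def regression_scores(actual, predicted):
    """Return MAE, MSE and R2 for two equally long sequences of values."""
    actual = np.asarray(actual, dtype=float).ravel()
    predicted = np.asarray(predicted, dtype=float).ravel()
    errors = actual - predicted
    ss_res = float(np.sum(errors ** 2))                     # residual sum of squares
    ss_tot = float(np.sum((actual - actual.mean()) ** 2))   # total sum of squares
    return {
        "mae": float(np.mean(np.abs(errors))),
        "mse": float(np.mean(errors ** 2)),
        "r2": 1.0 - ss_res / ss_tot,
    }

# Hypothetical targets and predictions from two hypothetical models.
actual = [1.0, 2.0, 3.0, 4.0]
candidates = {
    "model_a": [1.1, 1.9, 3.2, 3.8],
    "model_b": [2.5, 2.5, 2.5, 2.5],  # always predicts the mean of the targets
}

scores = {name: regression_scores(actual, pred) for name, pred in candidates.items()}
for name, result in scores.items():
    print(name, result)
```

A model that always predicts the mean of the targets gets an R2 of 0, which gives a useful baseline when reading the scores of the real models.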