## Household Power Consumption

Power outages can cause large economic losses, so being able to predict power consumption is very important.

Given the rise of smart electricity meters and the wide adoption of electricity generation technology like solar panels, there is a wealth of electricity usage data available.

### Data Description
---

It is a multivariate series comprising seven variables:

* global_active_power: The total active power consumed by the household (kilowatts).

* global_reactive_power: The total reactive power consumed by the household (kilowatts).

* voltage: Average voltage (volts).

* global_intensity: Average current intensity (amps).

* sub_metering_1: Active energy for kitchen (watt-hours of active energy).

* sub_metering_2: Active energy for laundry (watt-hours of active energy).

* sub_metering_3: Active energy for climate control systems (watt-hours of active energy).

This data represents a multivariate time series of power-related variables that can be used to model and forecast future electricity consumption.

### Import the Necessary Libraries

In [24]:
# Data Manipulation
import numpy as np
from numpy import nan
import pandas as pd

# Data Visualization
import matplotlib.pyplot as plt

# Tensorflow
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Model performance evaluation
from sklearn.metrics import mean_squared_error

# Scaling
from sklearn.preprocessing import MinMaxScaler

In [25]:
# fix random seed for reproducibility
tf.random.set_seed(7)

### Import the Data

In [26]:
# Read the second column of the "Data 1" sheet, skipping the first two rows
data = pd.read_excel("RNGC1d.xls", sheet_name="Data 1", skiprows=2, usecols=[1])
# Rename the single column to "price"
data = data.rename(columns={data.columns[0]: "price"})

In [27]:
data.head() # see the top rows

Unnamed: 0,price
0,2.194
1,2.268
2,2.36
3,2.318
4,2.252


In [28]:
# load the dataset
dataset = data.copy()
dataset

Unnamed: 0,price
0,2.194
1,2.268
2,2.360
3,2.318
4,2.252
...,...
7325,2.439
7326,2.514
7327,2.338
7328,2.223


Observations:
1. From the above diagram, power consumption is higher in November, December, January, February, and March, where the distribution has a longer tail than in other months.

2. This suggests that heating systems are used during the winter months and not in summer.

3. The distribution in the above graph is concentrated around 0.3W and 1.3W.

### Active Power Uses Prediction
---

What can we predict?

* Forecast hourly consumption for the next day.
* Forecast daily consumption for the next week.
* Forecast daily consumption for the next month.
* Forecast monthly consumption for the next year.
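
Each of these horizons corresponds to aggregating the raw readings to a different frequency. The following is a minimal sketch, assuming minute-level active power readings held in a pandas Series with a `DatetimeIndex`. The synthetic values and the names `readings` and `energy_kwh` are illustrative assumptions, not part of the dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical minute-level active power readings (kW) covering 60 days.
rng = np.random.default_rng(7)
index = pd.date_range("2021-01-01", periods=60 * 24 * 60, freq="min")
readings = pd.Series(rng.uniform(0.2, 1.5, size=len(index)), index=index)

# Average power over one minute (kW) times 1/60 hour gives energy in kWh.
energy_kwh = readings / 60.0

# Summing the per-minute energy gives total consumption per period.
hourly = energy_kwh.resample("h").sum()
daily = energy_kwh.resample("D").sum()
monthly = energy_kwh.resample("MS").sum()

print(len(hourly), len(daily), len(monthly))
```

The hourly series would feed a next-day forecast, the daily series a next-week or next-month forecast, and the monthly series a next-year forecast.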

#### Modeling Methods
---
There are many modeling methods; a few of them are as follows:

* Naive Methods -> Methods that make very simple, but often very effective, assumptions.

* Classical Linear Methods -> Techniques that are very effective for univariate time series forecasting.

* Machine Learning Methods -> These require that the problem be framed as a supervised learning problem.
    * K-nearest neighbors.
    * SVM
    * Decision trees
    * Random forest
    * Gradient boosting machines
    
* Deep Learning Methods -> CNNs, LSTMs, and combinations such as CNN-LSTM and ConvLSTM have proven effective on time series classification tasks.
    * CNN
    * LSTM
    * CNN - LSTM
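
To make the naive category concrete, here is a sketch of a weekly persistence baseline, which forecasts the next seven days by repeating the last seven observed days. The function name `persistence_forecast` and the sample values are illustrative assumptions; a baseline like this is mainly useful as a reference point that more complex models should beat.

```python
import numpy as np

def persistence_forecast(history, horizon=7):
    """Forecast the next `horizon` steps by repeating the last `horizon` values."""
    history = np.asarray(history, dtype=float)
    if len(history) < horizon:
        raise ValueError("history must contain at least `horizon` values")
    return history[-horizon:].copy()

# Two weeks of illustrative daily totals, followed by the week to be predicted.
history = np.array([10, 12, 11, 13, 15, 20, 18,
                    11, 12, 12, 14, 16, 21, 19], dtype=float)
actual_next_week = np.array([12, 12, 13, 14, 15, 22, 20], dtype=float)

forecast = persistence_forecast(history)

# Root mean squared error of the baseline over the seven forecast days.
rmse = np.sqrt(np.mean((forecast - actual_next_week) ** 2))
print(forecast, rmse)
```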

#### Problem Framing:
Given recent power consumption, what is the expected power consumption for the week ahead?
This requires that a predictive model forecast the total active power for each day over the next seven days.

A model of this type could be helpful within the household in planning expenditures. It could also be helpful on the supply side for planning electricity demand for a specific household.

* Input -> Predict

* [Week1] -> Week2
 
* [Week2] -> Week3

* [Week3] -> Week4
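
The week-to-week framing can be sketched as follows, assuming a series of daily values split into whole, non-overlapping weeks. The helper name `weekly_pairs` and the sample data are hypothetical. A sliding window that moves one day at a time is a common alternative that yields more samples from the same series.

```python
import numpy as np

def weekly_pairs(daily, days_per_week=7):
    """Split a daily series into whole weeks and pair each week with the next."""
    daily = np.asarray(daily, dtype=float)
    n_weeks = len(daily) // days_per_week      # drop any incomplete trailing week
    weeks = daily[: n_weeks * days_per_week].reshape(n_weeks, days_per_week)
    return weeks[:-1], weeks[1:]               # inputs, targets

daily = np.arange(30)                          # 30 illustrative daily values
X, y = weekly_pairs(daily)                     # 4 whole weeks -> 3 (input, target) pairs
print(X.shape, y.shape)
```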

### Modeling

In [29]:
dataset.head()

Unnamed: 0,price
0,2.194
1,2.268
2,2.36
3,2.318
4,2.252


In [30]:
dataset.tail()

Unnamed: 0,price
7325,2.439
7326,2.514
7327,2.338
7328,2.223
7329,2.348


In [32]:
dataset.shape

(7330, 1)

In [33]:
dataset

Unnamed: 0,price
0,2.194
1,2.268
2,2.360
3,2.318
4,2.252
...,...
7325,2.439
7326,2.514
7327,2.338
7328,2.223


In [34]:
# split into train and test sets
train_size = int(len(dataset) * 0.67)
test_size = len(dataset) - train_size
data_train, data_test = dataset.iloc[0:train_size,:]["price"], dataset.iloc[train_size:len(dataset),:]["price"]
print(len(data_train), len(data_test))

4911 2419


In [35]:
data_train

0       2.194
1       2.268
2       2.360
3       2.318
4       2.252
        ...  
4906    3.230
4907    3.310
4908    3.285
4909    3.342
4910    3.419
Name: price, Length: 4911, dtype: float64

In [36]:
data_test

4911    3.368
4912    3.463
4913    3.444
4914    3.460
4915    3.545
        ...  
7325    2.439
7326    2.514
7327    2.338
7328    2.223
7329    2.348
Name: price, Length: 2419, dtype: float64

### Preparing the Data

In [37]:
#training data

data_train.head(14)

0     2.194
1     2.268
2     2.360
3     2.318
4     2.252
5     2.250
6     2.305
7     2.470
8     2.246
9     2.359
10    2.417
11    2.528
12    2.554
13    2.639
Name: price, dtype: float64

In [38]:
#converting the data into numpy array

data_train = np.array(data_train)

In [39]:
data_train

array([2.194, 2.268, 2.36 , ..., 3.285, 3.342, 3.419])

In [41]:
# Build sliding windows: 7 days of input (X) followed by the next 7 days as the target (y)

X_train, y_train = [], []

for i in range(7, len(data_train)-7):
    X_train.append(data_train[i-7:i])
    y_train.append(data_train[i:i+7])

In [43]:
#converting list to numpy array

X_train, y_train = np.array(X_train), np.array(y_train)

In [44]:
# Shapes of the training inputs and targets

X_train.shape, y_train.shape

((4897, 7), (4897, 7))
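
The same windows can be built without an explicit loop. Below is a sketch using NumPy's `sliding_window_view` (available from NumPy 1.20), assuming the same 7-day input and 7-day target. `make_windows` is a hypothetical helper, not part of the notebook. Because the loop stops at `len(data_train) - 7`, it yields `n - 14` samples (4897 for `n = 4911`) and skips the final complete window; the helper reproduces that behaviour by default.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def make_windows(series, n_in=7, n_out=7, drop_last=True):
    """Return (X, y) where each row of X holds n_in values and y the next n_out."""
    series = np.asarray(series, dtype=float)
    # Every run of n_in + n_out consecutive values, one row per start position.
    windows = sliding_window_view(series, n_in + n_out)
    if drop_last:
        # Match a loop of the form range(n_in, len(series) - n_out).
        windows = windows[:-1]
    return windows[:, :n_in].copy(), windows[:, n_in:].copy()

series = np.arange(4911, dtype=float)   # stand-in with the same length as data_train
X, y = make_windows(series)
print(X.shape, y.shape)
```

Setting `drop_last=False` keeps the final window and returns one more sample.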

In [45]:
#printing the ytrain value

pd.DataFrame(y_train).head()

Unnamed: 0,0,1,2,3,4,5,6
0,2.47,2.246,2.359,2.417,2.528,2.554,2.639
1,2.246,2.359,2.417,2.528,2.554,2.639,2.585
2,2.359,2.417,2.528,2.554,2.639,2.585,2.383
3,2.417,2.528,2.554,2.639,2.585,2.383,2.369
4,2.528,2.554,2.639,2.585,2.383,2.369,2.347


In [46]:
# Scale the input windows to the range [0, 1]; each column is scaled independently

x_scaler = MinMaxScaler()
X_train = x_scaler.fit_transform(X_train)

In [47]:
# Scale the target windows with a separate scaler

y_scaler = MinMaxScaler()
y_train = y_scaler.fit_transform(y_train)
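
`MinMaxScaler` learns a minimum and maximum for each column during `fit` and reuses them in `transform`, so later data must be transformed with the already fitted scaler rather than refitted. A small sketch with made-up numbers (the arrays `train` and `test` are illustrative assumptions):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

train = np.array([[2.0, 3.0], [4.0, 7.0], [6.0, 5.0]])
test = np.array([[8.0, 3.0]])

scaler = MinMaxScaler()
train_scaled = scaler.fit_transform(train)   # learns per-column min and max
test_scaled = scaler.transform(test)         # reuses the training min and max

# Values outside the training range land outside [0, 1] after scaling.
print(test_scaled)

# inverse_transform maps scaled values back to the original units.
restored = scaler.inverse_transform(test_scaled)
print(restored)
```

The same `inverse_transform` call is what converts scaled model outputs back into the original units before computing an error metric.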

In [48]:
pd.DataFrame(X_train).head()

Unnamed: 0,0,1,2,3,4,5,6
0,0.061971,0.067236,0.073782,0.070793,0.066097,0.065955,0.069868
1,0.067236,0.073782,0.070793,0.066097,0.065955,0.069868,0.081608
2,0.073782,0.070793,0.066097,0.065955,0.069868,0.081608,0.065671
3,0.070793,0.066097,0.065955,0.069868,0.081608,0.065671,0.07371
4,0.066097,0.065955,0.069868,0.081608,0.065671,0.07371,0.077837


In [67]:
pd.DataFrame(y_train)

Unnamed: 0,0,1,2,3,4,5,6
0,0.081608,0.065671,0.073710,0.077837,0.085735,0.087584,0.093632
1,0.065671,0.073710,0.077837,0.085735,0.087584,0.093632,0.089790
2,0.073710,0.077837,0.085735,0.087584,0.093632,0.089790,0.075418
3,0.077837,0.085735,0.087584,0.093632,0.089790,0.075418,0.074422
4,0.085735,0.087584,0.093632,0.089790,0.075418,0.074422,0.072857
...,...,...,...,...,...,...,...
4892,0.151049,0.146852,0.144006,0.142014,0.141942,0.136891,0.140448
4893,0.146852,0.144006,0.142014,0.141942,0.136891,0.140448,0.135681
4894,0.144006,0.142014,0.141942,0.136891,0.140448,0.135681,0.141373
4895,0.142014,0.141942,0.136891,0.140448,0.135681,0.141373,0.139594


In [68]:
# Add a trailing axis so the targets have shape (samples, 7, 1)
y_train = y_train.reshape(y_train.shape[0], y_train.shape[1], 1)

In [50]:
# Reshape to 3 dimensions: (samples, timesteps, features)

X_train = X_train.reshape(X_train.shape[0], X_train.shape[1], 1)

In [51]:
X_train.shape

(4897, 7, 1)
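
Keras recurrent layers expect input of shape `(samples, timesteps, features)`, which is why a trailing axis of size 1 is added for a univariate series. A minimal sketch of the reshape that reads the sizes from the array instead of hard-coding them (the array `X` here is illustrative):

```python
import numpy as np

X = np.arange(21, dtype=float).reshape(3, 7)   # 3 illustrative samples of 7 days

# Add a trailing feature axis: (samples, timesteps) -> (samples, timesteps, 1).
X_3d = X.reshape(X.shape[0], X.shape[1], 1)

# Equivalent alternatives: X[..., np.newaxis] or np.expand_dims(X, axis=-1).
print(X_3d.shape)
```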

### Build LSTM Model

In [53]:
# Build a sequential Keras model: one LSTM layer followed by a dense layer with 7 outputs (one per forecast day)

reg = Sequential()
reg.add(LSTM(units = 5, activation = 'relu', input_shape=(7,1)))
reg.add(Dense(7))
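
As a sanity check on the model size, the number of trainable parameters can be worked out by hand. This is a sketch of the arithmetic, assuming the standard LSTM layout (four gates, each with input weights, recurrent weights, and a bias) and default layer settings. The helper names are hypothetical.

```python
def lstm_params(units, input_dim):
    """Parameters of an LSTM layer: 4 gates x (input weights + recurrent weights + bias)."""
    return 4 * (units * input_dim + units * units + units)

def dense_params(inputs, outputs):
    """Parameters of a dense layer: weight matrix plus one bias per output."""
    return inputs * outputs + outputs

lstm = lstm_params(units=5, input_dim=1)    # 4 * (5 + 25 + 5) = 140
dense = dense_params(inputs=5, outputs=7)   # 35 + 7 = 42
print(lstm, dense, lstm + dense)            # 140 42 182
```

A model this small trains quickly, which is consistent with the sub-second epochs in the training log.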

In [57]:
# Use mean squared error as the loss and Adam as the optimizer

reg.compile(loss='mean_squared_error', optimizer='adam')

In [58]:
#training the model

reg.fit(X_train, y_train, epochs = 50, verbose=2)

Epoch 1/50
154/154 - 2s - loss: 0.0524 - 2s/epoch - 11ms/step
Epoch 2/50
154/154 - 0s - loss: 0.0199 - 348ms/epoch - 2ms/step
Epoch 3/50
154/154 - 0s - loss: 0.0088 - 287ms/epoch - 2ms/step
Epoch 4/50
154/154 - 0s - loss: 0.0045 - 294ms/epoch - 2ms/step
Epoch 5/50
154/154 - 0s - loss: 0.0021 - 286ms/epoch - 2ms/step
Epoch 6/50
154/154 - 0s - loss: 0.0013 - 285ms/epoch - 2ms/step
Epoch 7/50
154/154 - 0s - loss: 0.0011 - 292ms/epoch - 2ms/step
Epoch 8/50
154/154 - 0s - loss: 9.8759e-04 - 287ms/epoch - 2ms/step
Epoch 9/50
154/154 - 0s - loss: 9.4487e-04 - 299ms/epoch - 2ms/step
Epoch 10/50
154/154 - 0s - loss: 9.0973e-04 - 332ms/epoch - 2ms/step
Epoch 11/50
154/154 - 0s - loss: 8.8698e-04 - 421ms/epoch - 3ms/step
Epoch 12/50
154/154 - 0s - loss: 8.7760e-04 - 301ms/epoch - 2ms/step
Epoch 13/50
154/154 - 0s - loss: 8.6582e-04 - 280ms/epoch - 2ms/step
Epoch 14/50
154/154 - 0s - loss: 8.7212e-04 - 283ms/epoch - 2ms/step
Epoch 15/50
154/154 - 0s - loss: 8.5444e-04 - 360ms/epoch - 2ms/step
...

<keras.callbacks.History at 0x24455ee07c0>

### Prepare test dataset and test LSTM model

In [59]:
# Convert the test data to a NumPy array

data_test = np.array(data_test)

In [60]:
# Build the same 7-day sliding windows for the test set

X_test, y_test = [], []

for i in range(7, len(data_test)-7):
    X_test.append(data_test[i-7:i])
    y_test.append(data_test[i:i+7])

In [61]:
X_test, y_test = np.array(X_test), np.array(y_test)

In [62]:
# Reuse the scalers fitted on the training data; do not refit on the test set
X_test = x_scaler.transform(X_test)
y_test = y_scaler.transform(y_test)

In [63]:
y_test.shape

(2405, 7)

In [64]:
X_test.shape

(2405, 7)

In [65]:
# Reshape to 3 dimensions: (samples, timesteps, features)

X_test = X_test.reshape(X_test.shape[0], X_test.shape[1], 1)

### Predicting