# SEP 769 - Deep Learning Project - Optimizing Energy Usage in Buildings Using IoT Data and Deep Learning Algorithms
> Hongqing Cao 400053625  
Sushant Shailesh Panchal 400614293  
Yanyi He 400651032  
Yash Parab 400611922

## Introduction
This project aims to develop a machine learning-based system to optimize energy usage in buildings using data from IoT sensors. The goal is to reduce energy consumption and costs while maintaining occupant comfort and health. The project involves preprocessing data from IoT sensors, developing a deep learning model to optimize energy usage, and testing the model on new data to evaluate its effectiveness.

The [**Individual Household Electric Power Consumption dataset**](https://archive.ics.uci.edu/dataset/235) is retrieved from the UCI Machine Learning Repository. It contains measurements of electric power consumption in one household at a one-minute sampling rate over a period of almost 4 years, including several electrical quantities and sub-metering values.

### Objectives
- Compare two models on a subset of the dataset and select the better-performing one. 
- Tune, train, and evaluate the selected model, then use it to forecast future power consumption. 
- Visualize predicted vs. actual load and identify peak/valley load time frames to inform grid operation. 


### Dataset
- **Source**: UCI Machine Learning Repository (https://archive.ics.uci.edu/dataset/235).
- **Information**: This archive contains 2,075,259 measurements gathered in a house located in Sceaux (7 km from Paris, France) between December 2006 and November 2010 (47 months). 

- **Features**: 
    - `date: dd/mm/yyyy`
    - `time: hh:mm:ss`
    - `global_active_power: float` Total household active power usage in kilowatts (kW)
    - `global_reactive_power: float` Total household reactive power (power not used for work), in kW
    - `voltage: float` Voltage across the circuit, in V
    - `global_intensity: float` Current drawn, in A
    - `sub_metering_1: float` Energy drawn for kitchen, in Wh
    - `sub_metering_2: float` Energy drawn for laundry, in Wh
    - `sub_metering_3: float` Energy drawn for water heater and AC, in Wh

- **Note**:  
`global_active_power * 1000 / 60 - sub_metering_1 - sub_metering_2 - sub_metering_3`  
represents the active energy consumed every minute (in watt hour) in the household by electrical equipment not measured in sub-meterings 1, 2 and 3.

- **Missing Data**: Nearly 1.25% of the rows contain missing measurement values. All calendar timestamps are present in the dataset, but for some timestamps the measurement values are missing (filled with `?`). 
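
As a quick check of the note above, the remainder formula can be worked through for a single minute. The readings below are illustrative values chosen for this sketch, not results reported in this notebook, and `remainder_wh` is a hypothetical helper name.

```python
def remainder_wh(global_active_power_kw, sub_1_wh, sub_2_wh, sub_3_wh):
    """Active energy (Wh) used in one minute by equipment outside sub-meterings 1-3."""
    # kW * 1000 -> W; one minute is 1/60 h, so W / 60 -> Wh consumed in that minute
    total_wh = global_active_power_kw * 1000 / 60
    return total_wh - sub_1_wh - sub_2_wh - sub_3_wh

# Illustrative reading: 4.216 kW total, sub-meters at 0, 1 and 17 Wh
print(round(remainder_wh(4.216, 0.0, 1.0, 17.0), 3))  # 52.267
```

The division by 60 is what converts an average power over one minute into the energy consumed during that minute, which puts the total in the same unit (Wh) as the sub-meter readings.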

### References
- Marino et al. (2016), *Building Energy Load Forecasting using Deep Neural Networks*. 
- Bonetto & Rossi (2017), *Machine Learning Approaches to Energy Consumption Forecasting in Households*. 
- Gasparin et al. (2019), *Deep Learning for Time Series Forecasting: The Electric Load Case*. 

### Essential Notes
- **Hardware**: The notebook code is designed to run best on CUDA GPUs, targeting an RTX 4070. 

- **Environment**: Use `Python 3.11.7`, `NVIDIA-SMI 575.64.04`, `Driver Version: 577.00`, `CUDA Version: 12.9`. 

## 0. Environment Setup
### Libraries Imported
`numpy 2.1.3`  
`matplotlib 3.10.3`  
`pandas 2.3.1`  
`tensorflow 2.19.0`  
`scikit-learn 1.7.1`  
`keras 3.10.0`  
`tensorboard 2.19.0`  
`seaborn 0.13.2`  

### GPU Config
Enable `set_memory_growth` for each GPU and set the `mixed_float16` policy through `mixed_precision`.

### Utility Functions
`delete_model`, `inspect_data`, `plot_time_series`, `plot_lagged_sample`, `plot_forecast`

In [1]:
### Libraries Imported
from os import path
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import tensorflow as tf
import seaborn as sns

from sklearn.preprocessing import StandardScaler

from tensorflow import keras
from keras import Sequential, layers
from keras import backend as K
from keras.optimizers import Adam, RMSprop, SGD
from tensorboard.plugins.hparams import api as hp
from keras.callbacks import EarlyStopping
from keras import mixed_precision
import gc

%load_ext tensorboard

np.random.seed(42)

2025-07-27 21:37:02.560467: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-07-27 21:37:02.575085: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1753666622.587391  131760 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1753666622.590941  131760 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1753666622.602750  131760 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking 

In [2]:
### GPU Config
# use set_memory_growth to allocate VRAM on demand

gpus = tf.config.list_physical_devices('GPU')

if gpus:
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)

# use mixed_precision to accelerate GPU computation

mixed_precision.set_global_policy('mixed_float16')

In [3]:
### Global Variables
DATA_DIR = "data"
RAW_DIR = DATA_DIR+"/raw"
TRAIN_DIR = DATA_DIR+"/train"
TEST_DIR = DATA_DIR+"/test"

MODEL_DIR = "model"

LOG_DIR = "log"
HPARAM_DIR = LOG_DIR+"/hparam"

In [4]:
### Utility functions

def delete_model(modelname, historyname):
    '''Delete a model and its history from the global namespace and free VRAM.'''
    global_vars = globals()
    if modelname in globals():
        del global_vars[modelname]
    if historyname in globals():
        del global_vars[historyname]
    gc.collect()
    K.clear_session()

def inspect_data(df, name):
    '''Print summary statistics of a DataFrame (count, mean, std, min,
    25%, 50%, 75%, max), followed by the number of missing values per column.
    '''
    print(f"\n{name} summary:")
    print(df.describe())
    print(df.isnull().sum())

def plot_time_series(df, column, start=None, end=None, title=None):
    '''Plot one column of the raw time series between start and end.

    column = global_active_power, sub_metering_other, sub_metering_1/2/3'''
    df_subset = df[start:end] if start or end else df
    df_subset[column].plot(figsize=(12, 4), title=title or column)

# def plot_hourly_avg(df, column):
#     hourly_avg = df.groupby(df.index.hour)[column].mean()
#     hourly_avg.plot(kind="bar", figsize=(10, 4), title=f"Average {column} by Hour")

# def plot_dayofweek_avg(df, column):
#     dow_avg = df.groupby(df.index.dayofweek)[column].mean()
#     dow_avg.plot(kind="bar", figsize=(8, 4), title=f"Average {column} by Day of Week")

# def plot_distribution(df, column):
#     df[column].hist(bins=50, figsize=(6, 4), title=f"Distribution of {column}")

# def plot_correlation_heatmap(df):
#     plt.figure(figsize=(10, 6))
#     sns.heatmap(df.corr(), annot=True, cmap="coolwarm")
#     plt.title("Feature Correlation Heatmap")

def plot_lagged_sample(X, y, feature_idx=0, sample_idx=0):
    '''Visualize what an LSTM/S2S input-output pair looks like'''
    plt.plot(X[sample_idx, :, feature_idx], label="Input sequence")
    # y has shape (samples, horizon); draw the first forecast step as a scalar
    plt.axhline(np.ravel(y[sample_idx])[0], color='r', linestyle='--', label="Target")
    plt.legend()
    plt.title("Lagged input and target")
    plt.show()

def plot_forecast(y_true, y_pred, title="Forecast vs Actual"):
    '''plot forecast vs actual lines'''
    plt.figure(figsize=(12, 4))
    plt.plot(y_true, label="Actual")
    plt.plot(y_pred, label="Forecast")
    plt.title(title)
    plt.legend()
    plt.show()


## 1. Data Preprocessing

### Load Data
 - Load raw dataset
 - Convert data type

### Resample Data
Resample data to 1-hour intervals using the hourly mean

### Handle Missing Data
Reconstruct missing data with linear interpolation
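
A toy series makes the resampling and interpolation steps concrete. This is a minimal sketch with made-up values, assuming a whole hour of minute readings is missing; it does not use the notebook's data.

```python
import numpy as np
import pandas as pd

# Four hours of synthetic one-minute readings (values are made up)
idx = pd.date_range("2007-01-01 00:00", periods=240, freq="min")
minute = pd.Series(np.repeat([1.0, np.nan, 3.0, 4.0], 60), index=idx)

# The hourly mean skips NaN; an hour with no valid reading stays NaN
hourly = minute.resample("h").mean()

# Linear interpolation fills the gap from the neighbouring hours
filled = hourly.interpolate(method="linear", limit_direction="both")

print(hourly.tolist())  # [1.0, nan, 3.0, 4.0]
print(filled.tolist())  # [1.0, 2.0, 3.0, 4.0]
```

Because the fill is a straight line between the nearest valid hours, a long gap loses any daily pattern it would have contained, which is worth keeping in mind when inspecting interpolated stretches.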

### Save Raw Data
Save to `data/raw`

### Feature Engineering
 - `season_spring: binary` one-hot coded season transformed from `date`
 - `season_summer: binary` one-hot coded season transformed from `date`
 - `season_autumn: binary` one-hot coded season transformed from `date`
 - `season_winter: binary` one-hot coded season transformed from `date`
 - `day_of_week: int` categorical in `range(7)`, 0 = Sunday, 1 = Monday, ..., 6 = Saturday
 - `hour_sin, hour_cos: float` hour of day transformed from `time` with cyclical encoding, range [-1, 1]
 - `global_active_power: float` target variable
 - `sub_metering_other: float` calculated by `global_active_power * 1000 / 60 - sub_metering_1 - sub_metering_2 - sub_metering_3`
 - `global_reactive_power: float` dropped
 - `voltage: float` dropped
 - `global_intensity: float` dropped
 - `sub_metering_1: float` existing raw feature
 - `sub_metering_2: float` existing raw feature
 - `sub_metering_3: float` existing raw feature
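
The reason for encoding the hour as a sine/cosine pair can be shown with a small sketch: on the unit circle, 23:00 and 00:00 are neighbours, whereas the raw integers 23 and 0 are far apart. `encode_hour` and `encoded_distance` are hypothetical helpers used only for this illustration.

```python
import numpy as np

def encode_hour(hour):
    """Map an hour of day (0-23) to a point on the unit circle."""
    angle = 2 * np.pi * np.asarray(hour) / 24
    return np.sin(angle), np.cos(angle)

def encoded_distance(h1, h2):
    """Euclidean distance between two hours in (sin, cos) space."""
    s1, c1 = encode_hour(h1)
    s2, c2 = encode_hour(h2)
    return float(np.hypot(s1 - s2, c1 - c2))

# 23:00 -> 00:00 is as short a step as 00:00 -> 01:00
print(round(encoded_distance(23, 0), 4), round(encoded_distance(0, 1), 4))  # 0.2611 0.2611
```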

### Save Feature Engineered Data
Save to `data/raw`

### Create Subsets (distinct)
 - `model_selection`: 1 month (~720 samples)
 - `hparam_tuning`: 1 month (~720 samples)
 - `train`: 12 months (8,760 samples)
 - `test`: 3 months (2,160 samples)
 - `test_1`: 3 months (~2,160 samples), 2009-01-15 to 2009-04-15, for forecasting
 - `test_2`: 3 months (~2,160 samples), 2009-07-15 to 2009-10-15, for forecasting

### Save Subset Data
 - Store model selection, tuning, training data in `data/train` 
 - Store testing data in `data/test` 

### Normalization and Standardization
Use `StandardScaler` for `global_active_power`, `sub_metering_other`, `sub_metering_1`, `sub_metering_2`, `sub_metering_3` 

### Save Normalized Data
Store scaled model selection, tuning, and training data in `data/train`, and scaled testing data in `data/test` 

### Lagging Data
Lag data by 1 week = 168 hrs = 168 samples. 
Specify the target variable (`global_active_power`). 
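
The window arithmetic behind the lagging step can be sketched as follows. `window_count` and `lagged_windows` are hypothetical helpers for illustration; the vectorized form relies on NumPy's `sliding_window_view` and is offered as an alternative to a loop-based construction, under the assumption that the features are a 2-D array of shape `(rows, n_features)`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def window_count(n_rows, lag_size=168, forecast_horizon=1):
    """Number of (input window, target) pairs a subset of n_rows can yield."""
    return max(n_rows - lag_size - forecast_horizon + 1, 0)

def lagged_windows(features, lag_size):
    """Return every window of lag_size consecutive rows, shape (n, lag_size, n_features)."""
    # sliding_window_view puts the window axis last: (n, n_features, lag_size)
    return sliding_window_view(features, lag_size, axis=0).transpose(0, 2, 1)

for name, rows in [("model_selection", 720), ("train", 8760), ("test", 2160)]:
    print(name, window_count(rows))
# model_selection 552
# train 8592
# test 1992
```

`lagged_windows` returns `len(features) - lag_size + 1` windows; the last `forecast_horizon` of them have no target and would need to be dropped before pairing with `y`.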

### Save Final Processed Data
 - Store raw data in `data/raw` 
 - Store model selection, tuning, training data in `data/train` 
 - Store testing data in `data/test` 
 - Store models in `model` 
 - Store hyperparameter tuning log in `log/hparam`. 

In [5]:
### Load Data
# Load raw data
chunks = []
reader = pd.read_csv(
    RAW_DIR+"/household_power_consumption.txt", 
    sep=";", 
    na_values='?',
    low_memory=False,
    chunksize=500_000
    )

for chunk in reader:
    chunks.append(chunk)

df = pd.concat(chunks)

df.columns = df.columns.str.lower()
df['datetime_str'] = df['date'] + ' ' + df['time']

df['datetime'] = (pd.to_datetime(df['datetime_str'], dayfirst=True))
df = df.drop(columns=['date', 'time', 'datetime_str'])
df.set_index("datetime", inplace=True)
df.sort_index(inplace=True)

# Convert from string/object to float
df = df.astype(np.float32)

In [6]:
### Resample Data
# Resample to hourly mean
df_hourly = df.resample('h').mean()

In [7]:
### Handle Missing Data
df_hourly_reco = df_hourly.interpolate(method='linear', limit_direction='both')
df_hourly_reco.dropna(inplace=True)

In [8]:
# Save raw data
df_hourly_reco.to_pickle(RAW_DIR+"/hourly_reco.pkl")


In [9]:
# Load raw data
df_hourly_reco = pd.read_pickle(RAW_DIR+"/hourly_reco.pkl")

In [10]:
### Feature Engineering
df_hourly_reco['month'] = df_hourly_reco.index.month
df_hourly_reco['season'] = df_hourly_reco.index.month % 12 // 3 
# 0=winter (Dec-Feb), 1=spring (Mar-May), 2=summer (Jun-Aug), 3=autumn (Sep-Nov)

season_map = {0: 'winter', 1: 'spring', 2: 'summer', 3: 'autumn'}
df_hourly_reco['season_label'] = df_hourly_reco['season'].map(season_map)

# dtype=np.float32 keeps the one-hot columns numeric (the pandas default is bool)
df_hourly_reco = pd.get_dummies(df_hourly_reco, columns=['season_label'], prefix='season', dtype=np.float32)

dow_mon0 = df_hourly_reco.index.dayofweek # 0=Monday
dow_sun0 = (dow_mon0 + 1) % 7 # 0=Sunday
df_hourly_reco['day_of_week'] = dow_sun0 # 0=Sunday


df_hourly_reco['hour'] = df_hourly_reco.index.hour
# For smoother time-of-day signals:
df_hourly_reco['hour_sin'] = np.sin(2 * np.pi * df_hourly_reco['hour'] / 24)
df_hourly_reco['hour_cos'] = np.cos(2 * np.pi * df_hourly_reco['hour'] / 24)

df_hourly_reco['sub_metering_other'] = (
    df_hourly_reco['global_active_power'] * 1000 / 60
    - df_hourly_reco['sub_metering_1']
    - df_hourly_reco['sub_metering_2']
    - df_hourly_reco['sub_metering_3']
)

df_fe = df_hourly_reco.copy()

In [11]:
### Feature Engineering Continued
drop_cols = ['month', 'season', 'hour', 'voltage', 'global_reactive_power', 'global_intensity']
df_fe.drop(columns=drop_cols, inplace=True)

In [12]:
### Save Feature Engineered Data
df_fe.to_pickle(RAW_DIR+"/feature_engineered.pkl")


In [13]:
### Load Feature Engineered Data
df_fe=pd.read_pickle(RAW_DIR+"/feature_engineered.pkl")

In [14]:
### Create Subsets (distinct)
total_len = len(df_fe)

# Subset sizes (in number of rows)
model_sel_len = 720     # 1 month
tune_len      = 720     # 1 month
train_len     = 8760    # 12 months
test_len      = 2160    # 3 months

# Sequential slicing
model_sel_df = df_fe.iloc[:model_sel_len]

tune_df = df_fe.iloc[model_sel_len : model_sel_len + tune_len]

train_df = df_fe.iloc[model_sel_len + tune_len : model_sel_len + tune_len + train_len]

test_df = df_fe.iloc[model_sel_len + tune_len + train_len : model_sel_len + tune_len + train_len + test_len]

test1_start = '2009-01-15'
test1_end = '2009-04-15'

test2_start = '2009-07-15'
test2_end = '2009-10-15'

test1_df = df_fe.loc[test1_start:test1_end]
test2_df = df_fe.loc[test2_start:test2_end]


In [15]:
### Save Subsets
model_sel_df.to_pickle(TRAIN_DIR+"/model_selection.pkl")
tune_df.to_pickle(TRAIN_DIR+"/hparam_tuning.pkl")
train_df.to_pickle(TRAIN_DIR+"/train.pkl")
test_df.to_pickle(TEST_DIR+"/test.pkl")
test1_df.to_pickle(TEST_DIR+"/test1.pkl")
test2_df.to_pickle(TEST_DIR+"/test2.pkl")


In [16]:
### Load Subsets
model_sel_df = pd.read_pickle(TRAIN_DIR+"/model_selection.pkl")
tune_df = pd.read_pickle(TRAIN_DIR+"/hparam_tuning.pkl")
train_df = pd.read_pickle(TRAIN_DIR+"/train.pkl")
test_df = pd.read_pickle(TEST_DIR+"/test.pkl")
test1_df = pd.read_pickle(TEST_DIR+"/test1.pkl")
test2_df = pd.read_pickle(TEST_DIR+"/test2.pkl")

In [17]:
### Normalization and Standardization
cols_to_scale = ['global_active_power', 'sub_metering_other', 'sub_metering_1', 'sub_metering_2', 'sub_metering_3']

scaler = StandardScaler()
scaler.fit(train_df[cols_to_scale])

def std_scale(df, scaler, cols_to_scale):
    df_scaled = df.copy()
    df_scaled[cols_to_scale] = scaler.transform(df[cols_to_scale])
    return df_scaled


model_sel_scaled = std_scale(model_sel_df, scaler, cols_to_scale)
tune_scaled = std_scale(tune_df, scaler, cols_to_scale)
train_scaled = std_scale(train_df, scaler, cols_to_scale)
test_scaled = std_scale(test_df, scaler, cols_to_scale)
test1_scaled = std_scale(test1_df, scaler, cols_to_scale)
test2_scaled = std_scale(test2_df, scaler, cols_to_scale)

In [18]:
### Save Normalized Data
model_sel_scaled.to_pickle(TRAIN_DIR+"/model_selection_scaled.pkl")
tune_scaled.to_pickle(TRAIN_DIR+"/hparam_tuning_scaled.pkl")
train_scaled.to_pickle(TRAIN_DIR+"/train_scaled.pkl")
test_scaled.to_pickle(TEST_DIR+"/test_scaled.pkl")
test1_scaled.to_pickle(TEST_DIR+"/test1_scaled.pkl")
test2_scaled.to_pickle(TEST_DIR+"/test2_scaled.pkl")


In [19]:
### Load Normalized Data
model_sel_scaled = pd.read_pickle(TRAIN_DIR+"/model_selection_scaled.pkl")
tune_scaled = pd.read_pickle(TRAIN_DIR+"/hparam_tuning_scaled.pkl")
train_scaled = pd.read_pickle(TRAIN_DIR+"/train_scaled.pkl")
test_scaled = pd.read_pickle(TEST_DIR+"/test_scaled.pkl")
test1_scaled = pd.read_pickle(TEST_DIR+"/test1_scaled.pkl")
test2_scaled = pd.read_pickle(TEST_DIR+"/test2_scaled.pkl")

In [20]:
### Lagging Data

def create_lagged_sequences(df, target_col, lag_size=168, forecast_horizon=1):
    features = df.drop(columns=[target_col]).values
    target = df[target_col].values

    num_samples = len(df) - lag_size - forecast_horizon + 1
    num_features = features.shape[1]

    X = np.zeros((num_samples, lag_size, num_features), dtype=np.float32)
    y = np.zeros((num_samples, forecast_horizon), dtype=np.float32)

    for i in range(num_samples):
        X[i] = features[i : i + lag_size]
        y[i] = target[i + lag_size : i + lag_size + forecast_horizon]

    return X, y

target_col='global_active_power'


X_model_sel, y_model_sel = create_lagged_sequences(model_sel_scaled, target_col)
X_tune, y_tune = create_lagged_sequences(tune_scaled, target_col)
X_train, y_train = create_lagged_sequences(train_scaled, target_col)
X_test, y_test = create_lagged_sequences(test_scaled, target_col)
X_test1, y_test1 = create_lagged_sequences(test1_scaled, target_col)
X_test2, y_test2 = create_lagged_sequences(test2_scaled, target_col)

In [21]:
### Save Processed Data
np.save(TRAIN_DIR+"/X_model_sel.npy", X_model_sel)
np.save(TRAIN_DIR+"/y_model_sel.npy", y_model_sel)
np.save(TRAIN_DIR+"/X_tune.npy", X_tune)
np.save(TRAIN_DIR+"/y_tune.npy", y_tune)
np.save(TRAIN_DIR+"/X_train.npy", X_train)
np.save(TRAIN_DIR+"/y_train.npy", y_train)
# Testing arrays go to data/test, matching the directory layout described above
np.save(TEST_DIR+"/X_test.npy", X_test)
np.save(TEST_DIR+"/y_test.npy", y_test)
np.save(TEST_DIR+"/X_test1.npy", X_test1)
np.save(TEST_DIR+"/y_test1.npy", y_test1)
np.save(TEST_DIR+"/X_test2.npy", X_test2)
np.save(TEST_DIR+"/y_test2.npy", y_test2)

In [22]:
### Load Processed Data
X_model_sel = np.load(TRAIN_DIR+"/X_model_sel.npy")
y_model_sel = np.load(TRAIN_DIR+"/y_model_sel.npy")
X_tune = np.load(TRAIN_DIR+"/X_tune.npy")
y_tune = np.load(TRAIN_DIR+"/y_tune.npy")
X_train = np.load(TRAIN_DIR+"/X_train.npy")
y_train = np.load(TRAIN_DIR+"/y_train.npy")
X_test = np.load(TEST_DIR+"/X_test.npy")
y_test = np.load(TEST_DIR+"/y_test.npy")
X_test1 = np.load(TEST_DIR+"/X_test1.npy")
y_test1 = np.load(TEST_DIR+"/y_test1.npy")
X_test2 = np.load(TEST_DIR+"/X_test2.npy")
y_test2 = np.load(TEST_DIR+"/y_test2.npy")

## 2. Model Selection
Train both models on the `model_selection` subset with `validation_split = 0.2`. 
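
Keras takes the validation samples from the end of the arrays passed to `fit`, before any shuffling, so with chronologically ordered windows the validation set is the most recent 20%. The sketch below mirrors that arithmetic; `split_index` is a hypothetical helper, and the window count of 552 assumes a 720-row subset with a 168-step lag and a one-step horizon.

```python
import math

def split_index(n_samples, validation_split=0.2):
    """Index where a tail validation split starts (floor of the training share)."""
    return int(math.floor(n_samples * (1.0 - validation_split)))

n_windows = 552  # assumed: 720 hourly rows - 168 lag steps
cut = split_index(n_windows)
print(cut, n_windows - cut)  # 441 111
```

Adjacent windows overlap by up to 167 hours, so the first validation windows share most of their inputs with the last training windows; the validation score may therefore look somewhat optimistic.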

### LSTM

### S2S
Sequence-to-sequence (seq2seq) encoder-decoder model 

### Evaluation
Use $\mathrm{RMSE}$ to compare models
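
For $n$ validation samples with targets $y_i$ and predictions $\hat{y}_i$, the metric is

$$
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}
$$

Since `build_LSTM` compiles with `loss='mse'`, the square root of the validation loss should give the validation RMSE, expressed in standardized units because the target is scaled.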

### Save Model
Save both models in `model`


In [None]:
### LSTM
def build_LSTM(input_shape, units=64, dropout_rate=0.2, learning_rate=0.01):
    '''Single-layer LSTM regressor that predicts one step ahead.'''
    model = Sequential([
        layers.Input(shape=input_shape),
        layers.LSTM(units),
        layers.Dropout(dropout_rate),
        # keep the output in float32 for numeric stability under mixed_float16
        layers.Dense(1, dtype='float32')
    ])
    model.compile(optimizer=Adam(learning_rate=learning_rate), loss='mse', metrics=['mae'])
    return model

In [None]:
# ### S2S
# class Encoder(layers.Layer):
#     def __init__(self, input_shape, units):
#         self.input_shape = input_shape
#         self.units = units

#         # The input layer converts tokens to vectors
#         self.input = layers.Input(shape = input_shape)

#         # The LSTM layer processes those vectors sequentially.
#         self.lstm = layers.LSTM(
#             units,
#             return_state = True, 
#         )

#     def call(self, X):
#         X = self.input(X)
#         state = self.lstm
#         return state


In [None]:

# class Decoder(layers.Layer):
#     def __init__(self, output_shape, units):
#         super(Decoder, self).__init__()
#         self.output_shape = output_shape
#         self.units = units

#         # 1. The input layer converts token IDs to vectors
#         self.input = layers.Input(shape=output_shape)

#         # 2. The LSTM keeps track of what's been generated so far.
#         self.lstm = layers.LSTM(
#             units,
#             return_sequences=True,
#             return_state=True,
#         )
#         # 3. This fully connected layer produces the logits for each
#         # output token.
#         self.dense  = layers.Dense(output_shape)

#     def call(self, X,
#          state=None,
#          return_state=False):  
#         # 1. Lookup the embeddings
#         X = self.input(X)
#         # 2. Process the target sequence.
#         X, state = self.lstm(X, initial_state=state)
#         # 3. Generate logit predictions for the next token.
#         logits = self.dense(X)

#         if return_state:
#             return logits, state
#         else:
#             return logits

In [None]:
delete_model('model_sel_LSTM', 'history_sel_LSTM')
delete_model('model_sel_S2S', 'history_sel_S2S')

input_shape = X_model_sel.shape[1:]
output_shape = y_model_sel.shape[1:]

model_sel_LSTM = build_LSTM(input_shape)
model_sel_LSTM.summary()

# encoder = Encoder(input_shape, units=64)
# decoder = Decoder(output_shape, units=64)

# init_state = encoder(X_model_sel)
# output, state = decoder(X_model_sel, init_state)

AttributeError: property 'input' of 'Encoder' object has no setter
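
The error above comes from assigning to `self.input`: `input` is a read-only property on Keras layers, so the attribute cannot be set. One way to avoid custom layer classes altogether is the functional API. The following is a minimal sketch of an LSTM encoder-decoder under that assumption, not the notebook's final model; `build_S2S` and its `horizon` parameter are hypothetical names, and the random array with 11 features only stands in for `X_model_sel`.

```python
import numpy as np
import keras
from keras import layers

def build_S2S(input_shape, horizon=1, units=64, learning_rate=0.01):
    """LSTM encoder-decoder mapping a lagged window to `horizon` future values."""
    encoder_in = layers.Input(shape=input_shape)
    # The encoder keeps only its final hidden and cell states
    _, state_h, state_c = layers.LSTM(units, return_state=True)(encoder_in)

    # The decoder receives the encoder summary once per forecast step
    decoder_in = layers.RepeatVector(horizon)(state_h)
    decoder_seq = layers.LSTM(units, return_sequences=True)(
        decoder_in, initial_state=[state_h, state_c]
    )

    # One regression output per step, kept in float32 in case mixed precision is on
    step_out = layers.TimeDistributed(layers.Dense(1, dtype="float32"))(decoder_seq)
    outputs = layers.Reshape((horizon,))(step_out)

    model = keras.Model(encoder_in, outputs)
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
        loss="mse",
        metrics=["mae"],
    )
    return model

# Stand-in for X_model_sel: 4 windows of 168 hours with 11 features (assumed count)
X_demo = np.random.rand(4, 168, 11).astype("float32")
model = build_S2S(X_demo.shape[1:])
print(model.predict(X_demo, verbose=0).shape)  # (4, 1)
```

With `horizon=1` the output shape matches `build_LSTM`, so both models can be trained on the same `y` arrays and compared on the same metric; a larger `horizon` would require targets built with a matching `forecast_horizon`.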

In [25]:
### Evaluation

## 3. Model Train
### Hyperparameter Tuning
Tune hyperparameters on `hparam_tuning` subset with `validation_split = 0.2`. 

Hyperparameters to be tuned:
 - `learning_rate`
 - `num_layers`
 - `hidden_size`
 - `dropout`
 - etc.

Save logs in `log/hparam`
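
A grid over the hyperparameters listed above could be enumerated as sketched below, with one log directory per run under `log/hparam`. The candidate values and the `run-<id>` naming are hypothetical choices for illustration; the notebook does not fix the search space.

```python
from itertools import product

# Hypothetical candidate values; the actual search space is not fixed by the notebook
search_space = {
    "learning_rate": [1e-2, 1e-3],
    "num_layers": [1, 2],
    "hidden_size": [32, 64],
    "dropout": [0.1, 0.2],
}

def iter_runs(space, log_root="log/hparam"):
    """Yield (log directory, hyperparameter dict) for every combination in the grid."""
    names = list(space)
    for run_id, values in enumerate(product(*(space[n] for n in names))):
        yield f"{log_root}/run-{run_id}", dict(zip(names, values))

runs = list(iter_runs(search_space))
print(len(runs))  # 16
print(runs[0])
```

Each yielded dictionary can be passed to a model builder, and the directory can be handed to whatever logger records the run, so results for different combinations stay separated.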

### Train Tuned Model
Train model on `train` subset with `validation_split = 0.2`. 

### Save Model
Save model in `model`. 

In [26]:
### Hyperparameter Tuning


In [27]:
### Train Tuned Model


In [28]:
### Save Model


## 4. Model Evaluation

### Evaluate Model
Use `test` subset to evaluate model with metrics $\mathrm{RMSE}$ and $\mathrm{MAE}$. 
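
Because the target is standardized, metrics computed directly on model output are in standard deviations rather than kW. A minimal sketch of both metrics, with an optional conversion back to kW, is given below. The function names are hypothetical, and the numbers are made up; in practice the mean and scale would come from the fitted `StandardScaler` (its `mean_` and `scale_` entries for `global_active_power`).

```python
import numpy as np

def inverse_scale(y_scaled, mean, scale):
    """Undo standardization: y = y_scaled * scale + mean."""
    return np.asarray(y_scaled, dtype=np.float64) * scale + mean

def rmse(y_true, y_pred):
    """Root mean squared error."""
    diff = np.asarray(y_true, dtype=np.float64) - np.asarray(y_pred, dtype=np.float64)
    return float(np.sqrt(np.mean(diff ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error."""
    diff = np.asarray(y_true, dtype=np.float64) - np.asarray(y_pred, dtype=np.float64)
    return float(np.mean(np.abs(diff)))

# Made-up standardized values and scaler statistics, for illustration only
y_true_scaled = np.array([0.0, 1.0, -1.0, 2.0])
y_pred_scaled = np.array([0.5, 1.0, -2.0, 2.0])
mean_kw, scale_kw = 1.0, 0.5

y_true_kw = inverse_scale(y_true_scaled, mean_kw, scale_kw)
y_pred_kw = inverse_scale(y_pred_scaled, mean_kw, scale_kw)
print(rmse(y_true_kw, y_pred_kw), mae(y_true_kw, y_pred_kw))
```

Both metrics scale linearly with the target, so multiplying a standardized RMSE or MAE by the scaler's `scale_` value gives the same result as converting the series first.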

### Visualization
Plots to be visualized
 - Actual vs Predicted 
 - Loss over Epochs

### Analyze Model Performance


In [29]:
### Evaluate Model

In [30]:
### Visualization

In [31]:
### Analyze Model Performance


## 5. Forecast for Energy Optimization

### Forecast
Use the `test1` and `test2` subsets to predict `global_active_power` one day/week ahead. 

### Metrics
RMSE, MAE

### Visualize Result
Predicted vs Actual
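
Once a forecast is available, peak and valley time frames can be read off by averaging the predicted load per hour of day. The sketch below assumes the forecast is a pandas Series in kW with an hourly `DatetimeIndex`; `peak_valley_hours` is a hypothetical helper and the load profile is synthetic, not model output.

```python
import numpy as np
import pandas as pd

def peak_valley_hours(forecast, top_n=3):
    """Return the hours of day with the highest and lowest mean forecast load."""
    hourly_mean = forecast.groupby(forecast.index.hour).mean()
    peaks = hourly_mean.nlargest(top_n).index.tolist()
    valleys = hourly_mean.nsmallest(top_n).index.tolist()
    return peaks, valleys

# Synthetic one-week forecast (kW): flat load with an evening peak and a night dip
idx = pd.date_range("2009-01-15", periods=7 * 24, freq="h")
profile = np.full(24, 1.0)
profile[[18, 19, 20]] = [2.5, 3.0, 2.0]
profile[[2, 3, 4]] = [0.4, 0.2, 0.3]
forecast = pd.Series(np.tile(profile, 7), index=idx)

print(peak_valley_hours(forecast))  # ([19, 18, 20], [3, 4, 2])
```

The returned hours are ordered from most to least extreme, which gives a simple ranking of when load shifting would matter most.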
