## Step 1: Load and Inspect the Dataset

In this step, we will:

- Import necessary Python libraries
- Load the dataset using `pandas`
- Combine the `Date` and `Time` columns into a single `Datetime` column
- Set the `Datetime` column as the index for easier time series manipulation
- Display the first few rows of the cleaned dataset to verify formatting


In [None]:
import pandas as pd

# Load the dataset (update the path as needed)
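df = pd.read_csv('household_power_consumption.txt', sep=';',
                 na_values='?',
                 low_memory=False)

# Combine Date (dd/mm/yyyy) and Time into one Datetime column with an explicit
# format; nested parse_dates and infer_datetime_format are deprecated in pandas 2.x
df['Datetime'] = pd.to_datetime(df['Date'] + ' ' + df['Time'],
                                format='%d/%m/%Y %H:%M:%S')
df = df.drop(columns=['Date', 'Time'])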

# Set datetime as the index
df.set_index('Datetime', inplace=True)

# Convert all columns to numeric (some may be loaded as object due to 'na_values')
df = df.apply(pd.to_numeric, errors='coerce')

# Preview the dataset
print("Shape of dataset:", df.shape)
df.head()




Shape of dataset: (2075259, 7)




Datetime,Global_active_power,Global_reactive_power,Voltage,Global_intensity,Sub_metering_1,Sub_metering_2,Sub_metering_3
2006-12-16 17:24:00,4.216,0.418,234.84,18.4,0.0,1.0,17.0
2006-12-16 17:25:00,5.36,0.436,233.63,23.0,0.0,1.0,16.0
2006-12-16 17:26:00,5.374,0.498,233.29,23.0,0.0,2.0,17.0
2006-12-16 17:27:00,5.388,0.502,233.74,23.0,0.0,1.0,17.0
2006-12-16 17:28:00,3.666,0.528,235.68,15.8,0.0,1.0,17.0


## Step 2: Handle Missing Values and Optional Resampling

In this step, we will:

- Identify and handle missing values in the dataset
- Fill missing values with forward-fill followed by back-fill, which keeps the one-minute index unbroken for resampling
- Optionally resample the data to a coarser time interval (e.g., 5-minute average) to reduce data size and smooth noise
- Display the dataset shape and a sample to confirm preprocessing
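
The fill-then-resample steps listed above can be sketched on a toy series. The values and the `power` column name are made up for illustration; only the pandas calls mirror what this step uses. The sketch also shows how 5-minute bins are labelled by their left edge, so a series that starts at 17:24 produces a first bin at 17:20 holding a single reading.

```python
import numpy as np
import pandas as pd

# Toy minute-level readings with one gap (hypothetical values).
index = pd.date_range("2006-12-16 17:24", periods=6, freq="min")
toy = pd.DataFrame({"power": [4.2, np.nan, 5.4, 5.4, 3.6, 3.6]}, index=index)

# Forward-fill copies the last valid reading into the gap; the back-fill only
# matters if the series begins with missing values.
filled = toy.ffill().bfill()

# Bins are left-closed and labelled by their left edge: 17:24 sits alone in the
# 17:20 bin, and 17:25 through 17:29 form the 17:25 bin.
resampled = filled.resample("5min").mean()
print(resampled)
```

Averaging smooths minute-level noise but also hides short spikes, so the interval is a trade-off rather than a fixed choice.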


In [None]:
# Check for missing values
missing_counts = df.isna().sum()
print("Missing values per column:\n", missing_counts)




Missing values per column:
 Global_active_power      25979
Global_reactive_power    25979
Voltage                  25979
Global_intensity         25979
Sub_metering_1           25979
Sub_metering_2           25979
Sub_metering_3           25979
dtype: int64


In [None]:
# Check for missing values again
print("Missing values before imputation:\n", df.isna().sum())

# Fill missing values using forward-fill and then back-fill
df_filled = df.ffill().bfill()

# Confirm that no missing values remain
print("\nMissing values after imputation:\n", df_filled.isna().sum())

# Optional: Resample to 5-minute intervals
df_resampled = df_filled.resample('5min').mean()  # '5T' is a deprecated alias for '5min'

print(f"\nDataset shape after optional resampling: {df_resampled.shape}")
df_resampled.head()


Missing values before imputation:
 Global_active_power      25979
Global_reactive_power    25979
Voltage                  25979
Global_intensity         25979
Sub_metering_1           25979
Sub_metering_2           25979
Sub_metering_3           25979
dtype: int64

Missing values after imputation:
 Global_active_power      0
Global_reactive_power    0
Voltage                  0
Global_intensity         0
Sub_metering_1           0
Sub_metering_2           0
Sub_metering_3           0
dtype: int64

Dataset shape after optional resampling: (415053, 7)




Datetime,Global_active_power,Global_reactive_power,Voltage,Global_intensity,Sub_metering_1,Sub_metering_2,Sub_metering_3
2006-12-16 17:20:00,4.216,0.418,234.84,18.4,0.0,1.0,17.0
2006-12-16 17:25:00,4.6616,0.4972,234.272,19.96,0.0,1.4,16.8
2006-12-16 17:30:00,3.836,0.5116,234.204,16.56,0.0,1.2,16.8
2006-12-16 17:35:00,4.6684,0.41,234.212,20.0,0.0,1.0,16.8
2006-12-16 17:40:00,3.9176,0.0616,235.89,16.76,0.0,0.0,17.0


## Step 3: Feature Scaling (Normalization)

To prepare the data for training a deep learning model, we normalize all features to a common scale.

- We'll use `MinMaxScaler` to scale all features between 0 and 1
- This helps the model train faster and prevents features with large values from dominating the learning
- We keep the scaler object to inverse-transform predictions later
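
As a minimal sketch of the last point, the helper below undoes min-max scaling for a single column, which is what a prediction for `Global_active_power` would need. The function name `inverse_scale_column` and the toy data are assumptions for illustration, and the arithmetic assumes the scaler's default 0 to 1 `feature_range`. The sketch fits the scaler on the first rows only, a common way to keep later (test-period) minima and maxima out of the scaling; the cell below instead fits on the full series.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

def inverse_scale_column(scaler, scaled_values, column_index):
    """Undo min-max scaling for one column (assumes the default 0-1 feature_range)."""
    scaled_values = np.asarray(scaled_values, dtype=float)
    return scaled_values * scaler.data_range_[column_index] + scaler.data_min_[column_index]

# Made-up two-feature data in chronological order.
raw = np.array([[1.0, 230.0], [3.0, 240.0], [2.0, 235.0], [5.0, 250.0]])
train_rows = 3

# Fit on the training rows only, then transform everything.
scaler = MinMaxScaler().fit(raw[:train_rows])
scaled = scaler.transform(raw)

# Later rows can fall outside [0, 1] when they exceed the training range.
restored = inverse_scale_column(scaler, scaled[:, 0], column_index=0)
print(scaled[:, 0], restored)
```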


In [None]:
from sklearn.preprocessing import MinMaxScaler

# Initialize the scaler
scaler = MinMaxScaler()

# Fit and transform the resampled data
scaled_data = scaler.fit_transform(df_resampled)

# Convert back to DataFrame to preserve column names and structure
df_scaled = pd.DataFrame(scaled_data, columns=df_resampled.columns, index=df_resampled.index)

# Preview scaled data
df_scaled.head()


Datetime,Global_active_power,Global_reactive_power,Voltage,Global_intensity,Sub_metering_1,Sub_metering_2,Sub_metering_3
2006-12-16 17:20:00,0.429205,0.389199,0.363893,0.435407,0.0,0.012626,0.548387
2006-12-16 17:25:00,0.475406,0.462942,0.34436,0.472727,0.0,0.017677,0.541935
2006-12-16 17:30:00,0.389806,0.47635,0.342022,0.391388,0.0,0.015152,0.541935
2006-12-16 17:35:00,0.476111,0.38175,0.342297,0.473684,0.0,0.012626,0.541935
2006-12-16 17:40:00,0.398266,0.057356,0.4,0.396172,0.0,0.0,0.548387


## Step 4: Creating Sliding Windows for Supervised Learning

Neural networks like RNNs expect input data to be in the form of sequences. Since our data is a continuous time series, we convert it into a supervised learning problem using sliding windows.

Here's what we do:

- We define a `sequence_length` — how many past time steps to include as input (e.g., 12 past readings)
- For every window of `sequence_length` timesteps, we extract:
  - An input `X` of shape `(sequence_length, num_features)`
  - A target `y`, which is the value of **`Global_active_power` at the next time step (t+1)**
- This results in a dataset of `(num_samples, sequence_length, num_features)` input sequences and a `(num_samples,)` target array
- This structure is ideal for feeding into RNNs or CNNs

We'll split the windows into training and test sets in the next step.
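
The windowing described above can also be written without a Python loop. The sketch below is one possible vectorized version using NumPy's `sliding_window_view` (available from NumPy 1.20); the helper name `make_windows` and the toy array are assumptions, not part of the notebook.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def make_windows(values, sequence_length, target_index):
    """Return (X, y) where y[i] is the target one step after window X[i]."""
    # sliding_window_view appends the window axis last:
    # (num_windows, num_features, sequence_length)
    windows = sliding_window_view(values, sequence_length, axis=0)
    # Drop the final window (it has no next step) and move time to axis 1.
    X = windows[:-1].transpose(0, 2, 1)
    y = values[sequence_length:, target_index]
    return X, y

# Toy data: 10 time steps, 2 features.
values = np.arange(20, dtype=float).reshape(10, 2)
X, y = make_windows(values, sequence_length=3, target_index=0)
print(X.shape, y.shape)
```

`X` here is a read-only view onto `values`, which saves memory on a series of this size; call `X.copy()` if a writable array is needed.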


In [None]:
import numpy as np

# Set parameters
sequence_length = 12  # 12 timesteps (60 minutes if 5-minute intervals)
target_column = 'Global_active_power'

# Convert to NumPy array for faster slicing
values = df_scaled.values
target_index = df_scaled.columns.get_loc(target_column)

X, y = [], []

for i in range(len(values) - sequence_length):
    X.append(values[i:i+sequence_length])                  # all features for seq_len
    y.append(values[i+sequence_length, target_index])      # target at t+1

X = np.array(X)
y = np.array(y)

print(f"X shape: {X.shape} — (samples, seq_len, num_features)")
print(f"y shape: {y.shape} — (samples,)")

# Quick sanity check
print("\nExample input sequence shape:", X[0].shape)
print("Corresponding target value:", y[0])


X shape: (415041, 12, 7) — (samples, seq_len, num_features)
y shape: (415041,) — (samples,)

Example input sequence shape: (12, 7)
Corresponding target value: 0.3068596549435965


## Step 5: Train/Test Split

To evaluate our forecasting model, we split the dataset into a training set and a test set.

- We use an **80/20 chronological split** to ensure that the model is trained on past data and tested on future data (important in time series forecasting).
- We do **not shuffle** the data because doing so would break the temporal order.
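
If a separate validation set is wanted for model selection, the same chronological idea extends to three blocks. This is a sketch only: the helper `chronological_split` and the 70/15/15 fractions are assumptions, and the notebook itself uses the two-way 80/20 split below.

```python
import numpy as np

def chronological_split(X, y, train_frac=0.7, val_frac=0.15):
    """Split windows into train/validation/test blocks without shuffling."""
    n = len(X)
    train_end = int(n * train_frac)
    val_end = train_end + int(n * val_frac)
    return ((X[:train_end], y[:train_end]),
            (X[train_end:val_end], y[train_end:val_end]),
            (X[val_end:], y[val_end:]))

# Stand-in arrays; in the notebook these would be the X and y built in Step 4.
X_demo = np.zeros((1000, 12, 7))
y_demo = np.arange(1000, dtype=float)  # increasing values make the time order visible

(train_X, train_y), (val_X, val_y), (test_X, test_y) = chronological_split(X_demo, y_demo)
print(len(train_X), len(val_X), len(test_X))
```

With three blocks, hyperparameters and checkpoints can be chosen on the validation block while the test block stays untouched until the end.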


In [None]:
# Define split index (80% train, 20% test)
split_idx = int(len(X) * 0.8)

X_train, X_test = X[:split_idx], X[split_idx:]
y_train, y_test = y[:split_idx], y[split_idx:]

print(f"Train set: {X_train.shape}, {y_train.shape}")
print(f"Test set: {X_test.shape}, {y_test.shape}")


Train set: (332032, 12, 7), (332032,)
Test set: (83009, 12, 7), (83009,)


## Step 6: Define PyTorch LSTM Model and Training Function with W&B Logging

In this step, we:

- Define an LSTM model using PyTorch for forecasting `Global_active_power`
- Wrap the training loop in a sweep-compatible `train()` function
- Track:
  - Training loss (MSE) per epoch
  - MSE, MAE, and R² on the held-out split after each epoch (logged as `val_mse`, `val_mae`, `val_r2`)
  - Final MSE, MAE, and R² from the last epoch
  - Model artifact for the checkpoint with the lowest held-out MSE
- Log dataset details automatically within each W&B run
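
The training function below scores predictions with scikit-learn's `mean_squared_error`, `mean_absolute_error`, and `r2_score`. For reference, here is a small NumPy sketch of what those three numbers measure; the helper name `regression_metrics` and the sample values are illustrative assumptions.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Compute MSE, MAE and R² for 1-D arrays of targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = y_true - y_pred
    mse = float(np.mean(errors ** 2))
    mae = float(np.mean(np.abs(errors)))
    # R² compares the model's squared error with that of always predicting
    # the mean of y_true; it is undefined when y_true is constant.
    ss_res = float(np.sum(errors ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    return {"mse": mse, "mae": mae, "r2": r2}

metrics = regression_metrics([0.2, 0.4, 0.6, 0.8], [0.25, 0.35, 0.65, 0.75])
print(metrics)
```

Because the targets are min-max scaled, MSE and MAE come out in scaled units; R² is unaffected by that scaling.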


In [None]:
import torch
import torch.nn as nn
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
import wandb
import numpy as np

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Define PyTorch LSTM model
class LSTMForecaster(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, dropout):
        super(LSTMForecaster, self).__init__()
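        # nn.LSTM applies dropout only between stacked layers, so pass 0 for a single layer
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True,
                            dropout=dropout if num_layers > 1 else 0.0)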
        self.fc = nn.Linear(hidden_size, 1)

    def forward(self, x):
        out, _ = self.lstm(x)
        out = self.fc(out[:, -1])
        return out.squeeze(1)

# Sweep-compatible training function
def train(config=None):
    with wandb.init(config=config) as run:
        config = run.config

        # Dataset info logging
        run.log({
            "dataset_name": "Household Electric Power Consumption (resampled 5-min)",
            "sequence_length": config.sequence_length,
            "features": list(df_scaled.columns),
            "train_size": X_train.shape[0],
            "test_size": X_test.shape[0],
            "sample_input_sequence": wandb.Table(
                data=X_train[:5].reshape(5, -1),
                columns=[f"f{i}" for i in range(X_train.shape[1] * X_train.shape[2])]
            ),
            "sample_targets": wandb.Histogram(y_train[:100])
        })

        # Model
        model = LSTMForecaster(
            input_size=X_train.shape[2],
            hidden_size=config.hidden_size,
            num_layers=config.num_layers,
            dropout=config.dropout
        ).to(device)

        wandb.watch(model)

        # DataLoader
        train_data = torch.utils.data.TensorDataset(
            torch.tensor(X_train, dtype=torch.float32),
            torch.tensor(y_train, dtype=torch.float32)
        )
        test_data = torch.utils.data.TensorDataset(
            torch.tensor(X_test, dtype=torch.float32),
            torch.tensor(y_test, dtype=torch.float32)
        )
        train_loader = torch.utils.data.DataLoader(train_data, batch_size=config.batch_size, shuffle=False)
        test_loader = torch.utils.data.DataLoader(test_data, batch_size=config.batch_size, shuffle=False)

        # Training setup
        criterion = nn.MSELoss()
        optimizer = torch.optim.Adam(model.parameters(), lr=config.learning_rate)
        best_mse = float("inf")

        for epoch in range(config.epochs):
            model.train()
            train_losses = []
            for xb, yb in train_loader:
                xb, yb = xb.to(device), yb.to(device)
                optimizer.zero_grad()
                y_pred = model(xb)
                loss = criterion(y_pred, yb)
                loss.backward()
                optimizer.step()
                train_losses.append(loss.item())

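            # Evaluation on the held-out test split (there is no separate
            # validation set, so the "val_" metrics below are test-split metrics)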
            model.eval()
            y_preds, y_trues = [], []
            with torch.no_grad():
                for xb, yb in test_loader:
                    xb = xb.to(device)
                    y_pred = model(xb).cpu().numpy()
                    y_preds.extend(y_pred)
                    y_trues.extend(yb.numpy())

            mse = mean_squared_error(y_trues, y_preds)
            mae = mean_absolute_error(y_trues, y_preds)
            r2 = r2_score(y_trues, y_preds)

            # Log per-epoch metrics
            wandb.log({
                "epoch": epoch + 1,
                "train_loss": np.mean(train_losses),
                "val_mse": mse,
                "val_mae": mae,
                "val_r2": r2
            })

            # Save best model based on MSE
            if mse < best_mse:
                best_mse = mse
                torch.save(model.state_dict(), "best_model.pt")

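        # Final metrics reuse the last epoch's test predictions, so they describe
        # the last-epoch model, not necessarily the saved best checkpoint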
        final_mse = mean_squared_error(y_trues, y_preds)
        final_mae = mean_absolute_error(y_trues, y_preds)
        final_r2 = r2_score(y_trues, y_preds)

        # Log final evaluation metrics
        wandb.log({
            "final_mse": final_mse,
            "final_mae": final_mae,
            "final_r2": final_r2
        })

        # Log to summary for sweep comparison table
        run.summary["best_val_mse"] = best_mse
        run.summary["final_val_mse"] = final_mse
        run.summary["final_val_mae"] = final_mae
        run.summary["final_val_r2"] = final_r2

        # Upload model artifact
        artifact = wandb.Artifact("lstm_forecasting_model", type="model")
        artifact.add_file("best_model.pt")
        run.log_artifact(artifact)


## Step 7: Define and Launch W&B Sweep

We now define a hyperparameter sweep using Weights & Biases to optimize our LSTM model.

### 🔍 Sweep Variables:

| Hyperparameter | Description | Values |
|----------------|-------------|--------|
| `learning_rate` | How fast the model learns | [0.001, 0.005, 0.01] |
| `hidden_size` | Size of LSTM's memory | [32, 64, 128] |
| `dropout` | Dropout to prevent overfitting | [0.1, 0.3, 0.5] |
| `batch_size` | Batch size for training | [32, 64] |
| `num_layers` | Number of LSTM layers | [1, 2] |
| `epochs` | Number of training epochs | 10 (fixed) |

The sweep uses **Bayesian optimization** to **minimize `final_mse`**, the test-split MSE logged at the end of each run.
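
To put the run budget in context, the sketch below counts how many distinct combinations the table above allows. It uses only the standard library; the dictionary name `search_space` is an assumption for illustration.

```python
from itertools import product
from math import prod

# Values copied from the sweep variables table (epochs is fixed, so it is omitted).
search_space = {
    "learning_rate": [0.001, 0.005, 0.01],
    "hidden_size": [32, 64, 128],
    "dropout": [0.1, 0.3, 0.5],
    "batch_size": [32, 64],
    "num_layers": [1, 2],
}

total = prod(len(options) for options in search_space.values())
combos = [dict(zip(search_space, choice)) for choice in product(*search_space.values())]
print(total, combos[0])
```

The agent is launched with `count=10`, so the sweep can visit at most 10 of these 108 combinations, and Bayesian search may pick the same combination more than once.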


In [None]:
sweep_config = {
    'method': 'bayes',  # You can switch to 'random' for faster results
    'metric': {
        'name': 'final_mse',   # ✅ This is what we now track and want to minimize
        'goal': 'minimize'
    },
    'parameters': {
        'learning_rate': {
            'values': [0.001, 0.005, 0.01]
        },
        'hidden_size': {
            'values': [32, 64, 128]
        },
        'dropout': {
            'values': [0.1, 0.3, 0.5]
        },
        'batch_size': {
            'values': [32, 64]
        },
        'num_layers': {
            'values': [1, 2]
        },
        'epochs': {
            'value': 10  # fixed across all runs
        },
        'sequence_length': {
            'value': 12  # just for logging clarity
        },
        'features': {
            'value': list(df_scaled.columns)  # static for logging
        }
    }
}

# Initialize sweep
sweep_id = wandb.sweep(sweep_config, project="GoTG_Assignment06_RNN_Take2", entity="usf-guardians")
print("Sweep initialized with ID:", sweep_id)

# Launch agent – this will run 10 sweep iterations using the `train()` function
wandb.agent(sweep_id, function=train, count=10)


[34m[1mwandb[0m: Using wandb-core as the SDK backend.  Please refer to https://wandb.me/wandb-core for more information.


<IPython.core.display.Javascript object>

[34m[1mwandb[0m: Logging into wandb.ai. (Learn how to deploy a W&B server locally: https://wandb.me/wandb-server)
[34m[1mwandb[0m: You can find your API key in your browser here: https://wandb.ai/authorize
wandb: Paste an API key from your profile and hit enter:

 ··········


[34m[1mwandb[0m: No netrc file found, creating one.
[34m[1mwandb[0m: Appending key for api.wandb.ai to your netrc file: /root/.netrc


Create sweep with ID: 295fthxo
Sweep URL: https://wandb.ai/usf-guardians/GoTG_Assignment06_RNN_Take2/sweeps/295fthxo
Sweep initialized with ID: 295fthxo


[34m[1mwandb[0m: Agent Starting Run: nykfkeh8 with config:
[34m[1mwandb[0m: 	batch_size: 32
[34m[1mwandb[0m: 	dropout: 0.3
[34m[1mwandb[0m: 	epochs: 10
[34m[1mwandb[0m: 	features: ['Global_active_power', 'Global_reactive_power', 'Voltage', 'Global_intensity', 'Sub_metering_1', 'Sub_metering_2', 'Sub_metering_3']
[34m[1mwandb[0m: 	hidden_size: 128
[34m[1mwandb[0m: 	learning_rate: 0.01
[34m[1mwandb[0m: 	num_layers: 2
[34m[1mwandb[0m: 	sequence_length: 12
[34m[1mwandb[0m: Currently logged in as: [33mprincepraveen[0m ([33musf-guardians[0m) to [32mhttps://api.wandb.ai[0m. Use [1m`wandb login --relogin`[0m to force relogin


Run history:
epoch,▁▂▃▃▄▅▆▆▇█
final_mae,▁
final_mse,▁
final_r2,▁
sequence_length,▁
test_size,▁
train_loss,▂▁▁▁▁▁▁█▇▇
train_size,▁
val_mae,▂▁▁▁▁▁▁▇██
val_mse,▂▁▁▁▁▁▁▇▇█

Run summary:
dataset_name,Household Electric P...
epoch,10
final_mae,0.04434
final_mse,0.00402
final_r2,0.491
sequence_length,12
test_size,83009
train_loss,0.00556
train_size,332032
val_mae,0.04434


[34m[1mwandb[0m: Agent Starting Run: 7boqt1l1 with config:
[34m[1mwandb[0m: 	batch_size: 64
[34m[1mwandb[0m: 	dropout: 0.1
[34m[1mwandb[0m: 	epochs: 10
[34m[1mwandb[0m: 	features: ['Global_active_power', 'Global_reactive_power', 'Voltage', 'Global_intensity', 'Sub_metering_1', 'Sub_metering_2', 'Sub_metering_3']
[34m[1mwandb[0m: 	hidden_size: 32
[34m[1mwandb[0m: 	learning_rate: 0.005
[34m[1mwandb[0m: 	num_layers: 1
[34m[1mwandb[0m: 	sequence_length: 12




Run history:
epoch,▁▂▃▃▄▅▆▆▇█
final_mae,▁
final_mse,▁
final_r2,▁
sequence_length,▁
test_size,▁
train_loss,█▅▄▃▃▂▂▁▁▁
train_size,▁
val_mae,▅▂▃▅▆█▂▅▁▁
val_mse,█▅▄▃▃▄▁▃▂▁

Run summary:
dataset_name,Household Electric P...
epoch,10
final_mae,0.0181
final_mse,0.00109
final_r2,0.86149
sequence_length,12
test_size,83009
train_loss,0.00131
train_size,332032
val_mae,0.0181


[34m[1mwandb[0m: Agent Starting Run: gqivi3u8 with config:
[34m[1mwandb[0m: 	batch_size: 64
[34m[1mwandb[0m: 	dropout: 0.5
[34m[1mwandb[0m: 	epochs: 10
[34m[1mwandb[0m: 	features: ['Global_active_power', 'Global_reactive_power', 'Voltage', 'Global_intensity', 'Sub_metering_1', 'Sub_metering_2', 'Sub_metering_3']
[34m[1mwandb[0m: 	hidden_size: 64
[34m[1mwandb[0m: 	learning_rate: 0.005
[34m[1mwandb[0m: 	num_layers: 1
[34m[1mwandb[0m: 	sequence_length: 12




Run history:
epoch,▁▂▃▃▄▅▆▆▇█
final_mae,▁
final_mse,▁
final_r2,▁
sequence_length,▁
test_size,▁
train_loss,█▄▃▃▃▂▂▁▁▁
train_size,▁
val_mae,▃▇▄▅▇▁▃▄▁█
val_mse,█▇▅▅█▃▃▂▁▅

Run summary:
dataset_name,Household Electric P...
epoch,10
final_mae,0.01897
final_mse,0.00113
final_r2,0.85688
sequence_length,12
test_size,83009
train_loss,0.0013
train_size,332032
val_mae,0.01897


[34m[1mwandb[0m: Agent Starting Run: njrywq95 with config:
[34m[1mwandb[0m: 	batch_size: 64
[34m[1mwandb[0m: 	dropout: 0.3
[34m[1mwandb[0m: 	epochs: 10
[34m[1mwandb[0m: 	features: ['Global_active_power', 'Global_reactive_power', 'Voltage', 'Global_intensity', 'Sub_metering_1', 'Sub_metering_2', 'Sub_metering_3']
[34m[1mwandb[0m: 	hidden_size: 64
[34m[1mwandb[0m: 	learning_rate: 0.01
[34m[1mwandb[0m: 	num_layers: 2
[34m[1mwandb[0m: 	sequence_length: 12


Run history:
epoch,▁▂▃▃▄▅▆▆▇█
final_mae,▁
final_mse,▁
final_r2,▁
sequence_length,▁
test_size,▁
train_loss,█▄▃▃▂▂▂▁▁▁
train_size,▁
val_mae,█▃▃▂▂▁▁▁▂█
val_mse,▆▄▃▄▂▂▂▁▃█

Run summary:
dataset_name,Household Electric P...
epoch,10
final_mae,0.02114
final_mse,0.00137
final_r2,0.82686
sequence_length,12
test_size,83009
train_loss,0.00147
train_size,332032
val_mae,0.02114


[34m[1mwandb[0m: Agent Starting Run: 6z9z24ta with config:
[34m[1mwandb[0m: 	batch_size: 64
[34m[1mwandb[0m: 	dropout: 0.5
[34m[1mwandb[0m: 	epochs: 10
[34m[1mwandb[0m: 	features: ['Global_active_power', 'Global_reactive_power', 'Voltage', 'Global_intensity', 'Sub_metering_1', 'Sub_metering_2', 'Sub_metering_3']
[34m[1mwandb[0m: 	hidden_size: 64
[34m[1mwandb[0m: 	learning_rate: 0.005
[34m[1mwandb[0m: 	num_layers: 2
[34m[1mwandb[0m: 	sequence_length: 12


Run history:
epoch,▁▂▃▃▄▅▆▆▇█
final_mae,▁
final_mse,▁
final_r2,▁
sequence_length,▁
test_size,▁
train_loss,█▃▃▂▂▂▁▁▁▁
train_size,▁
val_mae,█▇█▆▆▄▅▂▃▁
val_mse,█▇▆▅▆▄▄▁▅▁

Run summary:
dataset_name,Household Electric P...
epoch,10
final_mae,0.01746
final_mse,0.00105
final_r2,0.86751
sequence_length,12
test_size,83009
train_loss,0.00147
train_size,332032
val_mae,0.01746


[34m[1mwandb[0m: Agent Starting Run: re1h6ztw with config:
[34m[1mwandb[0m: 	batch_size: 64
[34m[1mwandb[0m: 	dropout: 0.5
[34m[1mwandb[0m: 	epochs: 10
[34m[1mwandb[0m: 	features: ['Global_active_power', 'Global_reactive_power', 'Voltage', 'Global_intensity', 'Sub_metering_1', 'Sub_metering_2', 'Sub_metering_3']
[34m[1mwandb[0m: 	hidden_size: 64
[34m[1mwandb[0m: 	learning_rate: 0.005
[34m[1mwandb[0m: 	num_layers: 2
[34m[1mwandb[0m: 	sequence_length: 12


Run history:
epoch,▁▂▃▃▄▅▆▆▇█
final_mae,▁
final_mse,▁
final_r2,▁
sequence_length,▁
test_size,▁
train_loss,█▄▃▂▂▂▂▁▁▁
train_size,▁
val_mae,▆▆██▅▄▅▂▁▂
val_mse,█▆▅▅▄▄▄▂▁▁

Run summary:
dataset_name,Household Electric P...
epoch,10
final_mae,0.01742
final_mse,0.00105
final_r2,0.86706
sequence_length,12
test_size,83009
train_loss,0.00146
train_size,332032
val_mae,0.01742


[34m[1mwandb[0m: Agent Starting Run: 9yqezjl1 with config:
[34m[1mwandb[0m: 	batch_size: 64
[34m[1mwandb[0m: 	dropout: 0.5
[34m[1mwandb[0m: 	epochs: 10
[34m[1mwandb[0m: 	features: ['Global_active_power', 'Global_reactive_power', 'Voltage', 'Global_intensity', 'Sub_metering_1', 'Sub_metering_2', 'Sub_metering_3']
[34m[1mwandb[0m: 	hidden_size: 64
[34m[1mwandb[0m: 	learning_rate: 0.005
[34m[1mwandb[0m: 	num_layers: 2
[34m[1mwandb[0m: 	sequence_length: 12


Run history:
epoch,▁▂▃▃▄▅▆▆▇█
final_mae,▁
final_mse,▁
final_r2,▁
sequence_length,▁
test_size,▁
train_loss,█▃▃▂▂▂▁▁▁▁
train_size,▁
val_mae,▄▃▄▄▃▃▂▁█▁
val_mse,█▆▅▅▄▃▁▁▇▂

Run summary:
dataset_name,Household Electric P...
epoch,10
final_mae,0.01782
final_mse,0.00109
final_r2,0.86244
sequence_length,12
test_size,83009
train_loss,0.00149
train_size,332032
val_mae,0.01782


[34m[1mwandb[0m: Agent Starting Run: ieszdaer with config:
[34m[1mwandb[0m: 	batch_size: 64
[34m[1mwandb[0m: 	dropout: 0.5
[34m[1mwandb[0m: 	epochs: 10
[34m[1mwandb[0m: 	features: ['Global_active_power', 'Global_reactive_power', 'Voltage', 'Global_intensity', 'Sub_metering_1', 'Sub_metering_2', 'Sub_metering_3']
[34m[1mwandb[0m: 	hidden_size: 64
[34m[1mwandb[0m: 	learning_rate: 0.005
[34m[1mwandb[0m: 	num_layers: 2
[34m[1mwandb[0m: 	sequence_length: 12


Run history:
epoch,▁▂▃▃▄▅▆▆▇█
final_mae,▁
final_mse,▁
final_r2,▁
sequence_length,▁
test_size,▁
train_loss,█▃▃▂▂▂▁▁▁▁
train_size,▁
val_mae,█▃▃▃▃▃▃▂▂▁
val_mse,█▆▅▄▄▃▃▂▁▁

Run summary:
dataset_name,Household Electric P...
epoch,10
final_mae,0.01727
final_mse,0.00103
final_r2,0.8702
sequence_length,12
test_size,83009
train_loss,0.00144
train_size,332032
val_mae,0.01727


[34m[1mwandb[0m: Agent Starting Run: 082k5fae with config:
[34m[1mwandb[0m: 	batch_size: 64
[34m[1mwandb[0m: 	dropout: 0.5
[34m[1mwandb[0m: 	epochs: 10
[34m[1mwandb[0m: 	features: ['Global_active_power', 'Global_reactive_power', 'Voltage', 'Global_intensity', 'Sub_metering_1', 'Sub_metering_2', 'Sub_metering_3']
[34m[1mwandb[0m: 	hidden_size: 128
[34m[1mwandb[0m: 	learning_rate: 0.005
[34m[1mwandb[0m: 	num_layers: 2
[34m[1mwandb[0m: 	sequence_length: 12


Run history:
epoch,▁▂▃▃▄▅▆▆▇█
final_mae,▁
final_mse,▁
final_r2,▁
sequence_length,▁
test_size,▁
train_loss,█▄▃▃▂▂▂▁▁▁
train_size,▁
val_mae,▃█▃▂▃▂▂▁▁▃
val_mse,██▄▅▆▄▂▁▃▃

Run summary:
dataset_name,Household Electric P...
epoch,10
final_mae,0.01804
final_mse,0.0011
final_r2,0.86097
sequence_length,12
test_size,83009
train_loss,0.0014
train_size,332032
val_mae,0.01804


[34m[1mwandb[0m: Agent Starting Run: rjjnlbg4 with config:
[34m[1mwandb[0m: 	batch_size: 64
[34m[1mwandb[0m: 	dropout: 0.5
[34m[1mwandb[0m: 	epochs: 10
[34m[1mwandb[0m: 	features: ['Global_active_power', 'Global_reactive_power', 'Voltage', 'Global_intensity', 'Sub_metering_1', 'Sub_metering_2', 'Sub_metering_3']
[34m[1mwandb[0m: 	hidden_size: 32
[34m[1mwandb[0m: 	learning_rate: 0.005
[34m[1mwandb[0m: 	num_layers: 1
[34m[1mwandb[0m: 	sequence_length: 12




Run history:
epoch,▁▂▃▃▄▅▆▆▇█
final_mae,▁
final_mse,▁
final_r2,▁
sequence_length,▁
test_size,▁
train_loss,█▄▃▂▂▂▂▁▁▁
train_size,▁
val_mae,▁▇█▂▁▄▃▅▅▆
val_mse,█▆▄▂▁▂▂▂▁▁

Run summary:
dataset_name,Household Electric P...
epoch,10
final_mae,0.01864
final_mse,0.00112
final_r2,0.85834
sequence_length,12
test_size,83009
train_loss,0.00135
train_size,332032
val_mae,0.01864


## 📊 Model Evaluation and Performance Interpretation

This section analyzes the LSTM models from the W&B sweep. The primary metric is **Mean Squared Error (MSE)**, complemented by **Mean Absolute Error (MAE)** and **R² score** for interpretability. All three are computed on the held-out 20% split, in min-max-scaled units rather than kilowatts.

---

### ✅ Best Performing Models

| Run Name         | final_mse | final_r2 | final_mae | Key Configuration |
|------------------|------------|-----------|-----------|--------------------|
| **jolly-sweep-8** | **0.00103** | 0.8702    | 0.01727   | hidden_size=64, dropout=0.5, batch_size=64, learning_rate=0.005, num_layers=2 |
| **sparkling-sweep-5** | **0.00105** | 0.8675    | 0.01746   | hidden_size=64, dropout=0.5, batch_size=64, learning_rate=0.005, num_layers=2 |

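**Interpretation:**
- Both runs reach low error and an R² of about 0.87 on the held-out split. That same split was also used to pick the best checkpoint and to compare runs, so these figures are likely somewhat optimistic.
- The two runs share one configuration: `hidden_size=64`, `dropout=0.5`, `num_layers=2`, `learning_rate=0.005`, `batch_size=64`. The sweep sampled this configuration four times, with `final_mse` between 0.00103 and 0.00109.
- Every other run with `learning_rate=0.005` landed between 0.00109 and 0.00113, including single-layer models with `hidden_size=32`. The advantage of this configuration is therefore small and comparable to its own run-to-run spread.
- A learning rate of **0.005** converged steadily: training loss fell throughout in all eight runs that used it.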
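
How much of the gap between runs is noise can be checked directly from the logs. The sketch below copies `final_mse` from the run summaries above: four runs share the best configuration, and four others also used `learning_rate=0.005`. The variable names are illustrative, and the training code sets no random seed, so repeated runs are expected to differ.

```python
from statistics import mean

# Runs with hidden_size=64, dropout=0.5, batch_size=64, learning_rate=0.005, num_layers=2.
same_config_mse = {"6z9z24ta": 0.00105, "re1h6ztw": 0.00105,
                   "9yqezjl1": 0.00109, "ieszdaer": 0.00103}

# The other runs that used learning_rate=0.005.
other_lr005_mse = {"7boqt1l1": 0.00109, "gqivi3u8": 0.00113,
                   "082k5fae": 0.0011, "rjjnlbg4": 0.00112}

# Spread among identical configurations versus the average gap to other configurations.
within_spread = max(same_config_mse.values()) - min(same_config_mse.values())
gap = mean(other_lr005_mse.values()) - mean(same_config_mse.values())
print(f"within-config spread: {within_spread:.6f}, gap to other configs: {gap:.6f}")
```

The two quantities are of similar size, so ranking these configurations reliably would take repeated runs with fixed seeds.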

---

### ❌ Worst Performing Models

| Run Name         | final_mse | final_r2 | final_mae | Key Configuration |
|------------------|------------|-----------|-----------|--------------------|
| **restful-sweep-4** | 0.00137   | 0.8269    | 0.02114   | hidden_size=64, dropout=0.3, batch_size=64, learning_rate=0.01, num_layers=2 |
| **serene-sweep-1** | 0.00402   | 0.4910    | 0.04434   | hidden_size=128, dropout=0.3, batch_size=32, learning_rate=0.01, num_layers=2 |

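**Interpretation:**
- **serene-sweep-1** has the highest MSE and lowest R². Its training loss stayed low for the first seven epochs and then jumped, ending near 0.0056, and `val_mse` rose with it. That pattern points to training diverging late rather than to underfitting.
- **restful-sweep-4** trained smoothly, but its `val_mse` spiked in the final epoch, which is the epoch `final_mse` reports.
- These are the only two runs with `learning_rate=0.01`, the highest value in the sweep, and also the only two with `dropout=0.3`, so the sweep cannot separate the two effects. The instability in both runs makes the learning rate the more likely cause.
- **serene-sweep-1** was also the only run with `batch_size=32` and one of two with `hidden_size=128`. With so few runs, their contribution cannot be isolated.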

---

### 🧠 Conclusion

- The best results came from two-layer models with `hidden_size=64`, **dropout 0.5**, and a **learning rate of 0.005**, although every run at that learning rate finished within about 0.0001 MSE of the best.
- The two runs with a **learning rate of 0.01** were the two worst. They also used dropout 0.3, so this sweep alone cannot attribute the drop to weaker regularization.
- Overall, the best model reached an **R² of about 0.87** on the held-out split, measured on scaled values.
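
One way to put an R² figure in context is a persistence baseline, which predicts that the next value equals the last value in the input window. This check is not part of the original notebook; the helper `persistence_baseline` and the synthetic sine series below are assumptions for illustration, and the same function could be applied to `X_test`, `y_test`, and `target_index` from the earlier steps.

```python
import numpy as np

def persistence_baseline(X, y, target_index):
    """Score the naive forecast 'next value = last observed value'."""
    y_pred = X[:, -1, target_index]
    squared_errors = (y - y_pred) ** 2
    mse = float(np.mean(squared_errors))
    r2 = 1.0 - float(np.sum(squared_errors)) / float(np.sum((y - y.mean()) ** 2))
    return mse, r2

# Synthetic smooth series standing in for the real windows.
series = np.sin(np.linspace(0, 20, 500))
seq_len = 12
X_demo = np.stack([series[i:i + seq_len] for i in range(len(series) - seq_len)])[:, :, None]
y_demo = series[seq_len:]

mse, r2 = persistence_baseline(X_demo, y_demo, target_index=0)
print(f"persistence MSE: {mse:.6f}, R²: {r2:.4f}")
```

On a smooth series the naive forecast already scores a high R², so a model's R² is most informative when compared against this baseline on the same split.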
