# Homework 6

Edit this `.ipynb` file by replacing the placeholders "# Your code", "# Your answer", etc. Then click "Restart & Run All" in Jupyter Notebook to generate your results, and download the notebook as an `.html` file. Please submit both the `.ipynb` and `.html` files (not a `.zip` file) on Moodle. If you have questions about the homework, please email the TA, Saumil Shah (sashah8@ncsu.edu), or attend our office hours.

In this problem, we will observe overfitting by revising the code of "Example: load forecasting" in the "Deep Learning with PyTorch" lecture notes. Then we will employ ridge regularization (L2 regularization) and dropout to mitigate overfitting.

In the following code, we load the data and define the training and test functions. Note that we use a smaller training set, containing only the data from 2006, so that overfitting is easier to observe.
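For reference, the `test` function in the cell below reports two quantities: the loss averaged over batches, and the mean absolute percentage error (MAPE) averaged over samples after targets and predictions are mapped back to the original load units. A sketch of the MAPE it accumulates is given here. The symbols are notation introduced for this note rather than taken from the lecture notes: $N$ is the number of test samples, $y_i$ the actual load, and $\hat{y}_i$ the predicted load.

$$\text{MAPE} = \frac{100\%}{N} \sum_{i=1}^{N} \left| \frac{y_i - \hat{y}_i}{y_i} \right|$$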

In [1]:
import torch
from torch import nn
from torch.utils.data import DataLoader, Dataset
import numpy as np
import pandas as pd
from sklearn.preprocessing import OneHotEncoder, StandardScaler
# Load the data and derive the day of the week from the date.
df = pd.read_csv('/content/bse_clean (2).csv', parse_dates=['Date'])
df['Day'] = df['Date'].dt.dayofweek
# Features: one-hot encoded hour, month, and day of week, plus the temperature T.
X = np.hstack((OneHotEncoder(sparse_output=False).fit_transform(df[['Hour', 'Month', 'Day']]), df[['T']])).astype('float32')
y = df[['Load']].to_numpy().astype('float32')
# Train on 2006 only; test on 2007 and 2008.
train_index = (df['Date'].dt.year == 2006)
test_index = ((df['Date'].dt.year >= 2007) & (df['Date'].dt.year <= 2008))
# Standardize features and target using statistics of the training set only.
scaler_X = StandardScaler()
scaler_X.fit(X[train_index])
X = scaler_X.transform(X)
scaler_y = StandardScaler()
scaler_y.fit(y[train_index])
y = scaler_y.transform(y)
# Split into training and test arrays.
X_train = X[train_index]
X_test = X[test_index]
y_train = y[train_index]
y_test = y[test_index]

class LoadDataset(Dataset):
    def __init__(self, X, y):
        self.X, self.y = X, y

    def __len__(self):
        return len(self.X)

    def __getitem__(self, idx):
        return self.X[idx], self.y[idx]

batch_size = 128
train_dataloader = DataLoader(LoadDataset(X_train, y_train), batch_size=batch_size)
test_dataloader = DataLoader(LoadDataset(X_test, y_test), batch_size=batch_size)

def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    model.train()
    for batch, (X, y) in enumerate(dataloader):
        pred = model(X)
        loss = loss_fn(pred, y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if batch % 100 == 0:
            loss, current = loss.item(), batch * len(X)
            print(f'loss: {loss:f}  [{current:5d}/{size:5d}]')

def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()
    test_loss, test_mape = 0, 0
    with torch.no_grad():
        for X, y in dataloader:
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            # Convert to NumPy and undo the target scaling so MAPE is in original load units.
            y_orig = scaler_y.inverse_transform(y.numpy())
            pred_orig = scaler_y.inverse_transform(pred.numpy())
            test_mape += np.abs((y_orig - pred_orig) / y_orig).sum()
    test_loss /= num_batches
    test_mape /= size
    print(f'Test Error: \n MAPE: {(100 * test_mape):0.2f}%, Avg loss: {test_loss:f}\n')

In all the problems, you may optionally plot the test loss versus the epoch index to better visualize the overfitting and how it is mitigated.
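One possible way to produce such a plot is sketched below. It assumes the per-epoch test losses have been collected in a list, which would require changing `test` to return `test_loss` (a change not made in the code above). The list here simply reuses the first eight test losses printed in the Problem 1 output, and `best_epoch` is a hypothetical helper name.

```python
# Test losses of the first 8 epochs, copied from the Problem 1 output.
# In practice, `test` would need to return test_loss (an assumed change)
# so that the values can be appended to this list inside the epoch loop.
test_losses = [0.316012, 0.315054, 0.311366, 0.294236,
               0.273093, 0.253059, 0.236675, 0.223100]

def best_epoch(losses):
    """Return (epoch, loss) of the smallest loss; epochs are counted from 1."""
    idx = min(range(len(losses)), key=lambda i: losses[i])
    return idx + 1, losses[idx]

epoch, loss = best_epoch(test_losses)
print(f"lowest test loss {loss:f} at epoch {epoch}")

try:
    import matplotlib.pyplot as plt
except ImportError:
    plt = None  # plotting is optional, so skip it if matplotlib is missing

if plt is not None:
    epochs_axis = range(1, len(test_losses) + 1)
    plt.plot(epochs_axis, test_losses, marker="o")
    plt.xlabel("Epoch")
    plt.ylabel("Test loss (MSE on standardized load)")
    plt.title("Test loss versus epoch")
    plt.show()
```

When the model overfits, the curve is expected to fall at first and then rise again, so the epoch returned by `best_epoch` marks where the rise starts.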

## Problem 1

In the lecture notes, there are two hidden layers with $64$ and $32$ units, respectively, as shown below. Now change the number of units to $1024$ and $512$, respectively, and run the code. Do not change other parts such as the number of epochs. You should see that the test loss decreases at the beginning, and then increases (though slowly).
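To get a feel for why the larger network overfits more easily, it may help to count trainable parameters. The sketch below assumes both architectures take the same $44$ input features used in the code of this homework, and takes the training set size of $8760$ from the training output; `count_parameters` is a hypothetical helper, not part of the lecture code.

```python
def count_parameters(layer_sizes):
    """Weights plus biases of a fully connected network with these layer sizes."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

small = count_parameters([44, 64, 32, 1])      # architecture from the lecture notes
large = count_parameters([44, 1024, 512, 1])   # architecture used in this problem
n_train = 8760                                 # training samples reported in the output

print(small, large, round(large / n_train, 1))
```

Under these assumptions the small network has 4993 parameters and the large one has 571393, roughly 65 parameters per training sample, which leaves ample capacity to fit noise in the 2006 data.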

In [2]:
class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(44, 1024),
            nn.ReLU(),
            nn.Linear(1024, 512),
            nn.ReLU(),
            nn.Linear(512, 1)
        )

    def forward(self, x):
        x = self.fc(x)
        return x

model = NeuralNetwork()
loss_fn = nn.MSELoss()
optimizer = torch.optim.RMSprop(model.parameters(), lr=1e-3)

epochs = 200
for t in range(epochs):
    print('Epoch', t + 1)
    print('-------------------------------')
    train(train_dataloader, model, loss_fn, optimizer)
    test(test_dataloader, model, loss_fn)
print("Done!")

Epoch 1
-------------------------------
loss: 0.788004  [    0/ 8760]
Test Error: 
 MAPE: 8.46%, Avg loss: 0.316012

Epoch 2
-------------------------------
loss: 0.211846  [    0/ 8760]
Test Error: 
 MAPE: 8.10%, Avg loss: 0.315054

Epoch 3
-------------------------------
loss: 0.166261  [    0/ 8760]
Test Error: 
 MAPE: 8.08%, Avg loss: 0.311366

Epoch 4
-------------------------------
loss: 0.120823  [    0/ 8760]
Test Error: 
 MAPE: 7.91%, Avg loss: 0.294236

Epoch 5
-------------------------------
loss: 0.087558  [    0/ 8760]
Test Error: 
 MAPE: 7.65%, Avg loss: 0.273093

Epoch 6
-------------------------------
loss: 0.069165  [    0/ 8760]
Test Error: 
 MAPE: 7.39%, Avg loss: 0.253059

Epoch 7
-------------------------------
loss: 0.060840  [    0/ 8760]
Test Error: 
 MAPE: 7.16%, Avg loss: 0.236675

Epoch 8
-------------------------------
loss: 0.058244  [    0/ 8760]
Test Error: 
 MAPE: 6.97%, Avg loss: 0.223100

Epoch 9
-------------------------------
loss: 0.059599  [    0/ 

The following two problems are independent. Each is based on your revised code above (with $1024$ and $512$ units).

## Problem 2

One method to mitigate the overfitting in Problem 1 is to use ridge regularization (L2 regularization). Based on your solution to Problem 1, set `weight_decay=0.01` in the proper function, and run your code.
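As a rough illustration of what `weight_decay` does, the sketch below applies one RMSprop update to a single scalar weight in plain Python. It follows the update rule as described in the PyTorch documentation for `torch.optim.RMSprop` (without momentum or centering), to the best of my understanding; `rmsprop_step` is a hypothetical helper written for this note, not a PyTorch function.

```python
import math

def rmsprop_step(w, grad, square_avg, lr=1e-3, alpha=0.99, eps=1e-8, weight_decay=0.0):
    """One scalar RMSprop update (no momentum, not centered)."""
    grad = grad + weight_decay * w   # the L2 penalty enters through the gradient
    square_avg = alpha * square_avg + (1 - alpha) * grad * grad
    w = w - lr * grad / (math.sqrt(square_avg) + eps)
    return w, square_avg

# With a zero data gradient, only the penalty term acts on the weight.
w_plain, _ = rmsprop_step(1.0, 0.0, 0.0)
w_decay, _ = rmsprop_step(1.0, 0.0, 0.0, weight_decay=0.01)
print(w_plain, w_decay)
```

Without weight decay the weight stays where it is, while with `weight_decay=0.01` it is pulled toward zero even though the data gradient vanishes. This is the shrinkage effect that ridge regularization relies on.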

In [3]:
# Your code
class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(44, 1024),
            nn.ReLU(),
            nn.Linear(1024, 512),
            nn.ReLU(),
            nn.Linear(512, 1)
        )

    def forward(self, x):
        return self.fc(x)

model = NeuralNetwork()
loss_fn = nn.MSELoss()
optimizer = torch.optim.RMSprop(model.parameters(), lr=1e-3, weight_decay=0.01)

epochs = 200
for t in range(epochs):
    print('Epoch', t + 1)
    print('-------------------------------')
    train(train_dataloader, model, loss_fn, optimizer)
    test(test_dataloader, model, loss_fn)
print("Done!")

Epoch 1
-------------------------------
loss: 0.701757  [    0/ 8760]
Test Error: 
 MAPE: 8.91%, Avg loss: 0.328127

Epoch 2
-------------------------------
loss: 0.187449  [    0/ 8760]
Test Error: 
 MAPE: 8.45%, Avg loss: 0.316508

Epoch 3
-------------------------------
loss: 0.172342  [    0/ 8760]
Test Error: 
 MAPE: 8.27%, Avg loss: 0.303339

Epoch 4
-------------------------------
loss: 0.133997  [    0/ 8760]
Test Error: 
 MAPE: 7.99%, Avg loss: 0.290551

Epoch 5
-------------------------------
loss: 0.131401  [    0/ 8760]
Test Error: 
 MAPE: 7.58%, Avg loss: 0.262186

Epoch 6
-------------------------------
loss: 0.103009  [    0/ 8760]
Test Error: 
 MAPE: 7.26%, Avg loss: 0.239464

Epoch 7
-------------------------------
loss: 0.091093  [    0/ 8760]
Test Error: 
 MAPE: 6.97%, Avg loss: 0.221263

Epoch 8
-------------------------------
loss: 0.091037  [    0/ 8760]
Test Error: 
 MAPE: 6.78%, Avg loss: 0.209646

Epoch 9
-------------------------------
loss: 0.100194  [    0/ 

## Problem 3

Another method to mitigate the overfitting in Problem 1 is to use dropout. Based on your solution to Problem 1, add a dropout layer with dropout rate $0.5$ after each of the two hidden layers (more specifically, after each activation function), and run your code.
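To make the behavior of a dropout layer concrete, here is a minimal plain-Python sketch of inverted dropout, which is the scheme `nn.Dropout` uses: during training each activation is zeroed with probability $p$ and the survivors are scaled by $1/(1-p)$, while in evaluation mode the input passes through unchanged. The function `dropout` below is a hypothetical stand-in for illustration only.

```python
import random

def dropout(values, p=0.5, training=True, rng=random):
    """Inverted dropout on a list of activations (requires 0 <= p < 1)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    if not training or p == 0.0:
        return list(values)          # evaluation mode: identity
    keep_scale = 1.0 / (1.0 - p)     # rescale so the expected value is unchanged
    return [v * keep_scale if rng.random() >= p else 0.0 for v in values]

random.seed(0)
activations = [1.0, 2.0, 3.0, 4.0]
print(dropout(activations, p=0.5, training=True))   # some entries zeroed, the rest doubled
print(dropout(activations, p=0.5, training=False))  # unchanged
```

In the homework code, the switch between the two modes is handled by `model.train()` in `train` and `model.eval()` in `test`, so the dropout layers are active only while training.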

In [4]:
# Your code
class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(44, 1024),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(1024, 512),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(512, 1)
        )

    def forward(self, x):
        return self.fc(x)

model = NeuralNetwork()
loss_fn = nn.MSELoss()
optimizer = torch.optim.RMSprop(model.parameters(), lr=1e-3)

epochs = 200
for t in range(epochs):
    print('Epoch', t + 1)
    print('-------------------------------')
    train(train_dataloader, model, loss_fn, optimizer)
    test(test_dataloader, model, loss_fn)
print("Done!")

Epoch 1
-------------------------------
loss: 0.716121  [    0/ 8760]
Test Error: 
 MAPE: 9.20%, Avg loss: 0.355622

Epoch 2
-------------------------------
loss: 0.599660  [    0/ 8760]
Test Error: 
 MAPE: 8.31%, Avg loss: 0.316055

Epoch 3
-------------------------------
loss: 0.408592  [    0/ 8760]
Test Error: 
 MAPE: 7.77%, Avg loss: 0.276318

Epoch 4
-------------------------------
loss: 0.270881  [    0/ 8760]
Test Error: 
 MAPE: 7.31%, Avg loss: 0.248580

Epoch 5
-------------------------------
loss: 0.191594  [    0/ 8760]
Test Error: 
 MAPE: 6.99%, Avg loss: 0.225856

Epoch 6
-------------------------------
loss: 0.145112  [    0/ 8760]
Test Error: 
 MAPE: 6.74%, Avg loss: 0.208312

Epoch 7
-------------------------------
loss: 0.128689  [    0/ 8760]
Test Error: 
 MAPE: 6.52%, Avg loss: 0.196386

Epoch 8
-------------------------------
loss: 0.116127  [    0/ 8760]
Test Error: 
 MAPE: 6.47%, Avg loss: 0.188354

Epoch 9
-------------------------------
loss: 0.127898  [    0/ 

In [5]:
%%shell
jupyter nbconvert --to html /content/hw6_slee93.ipynb

CalledProcessError: Command 'jupyter nbconvert --to html /content/hw6_slee93.ipynb
' returned non-zero exit status 255.