# Task 3. 
Based on what you learned in task 2, to train a neural network model on the BBBP dataset used in HW5 by using Morgan fingerprints as input features and accuracy as the performance measure.   Please do similar explorations as what you did for task 2,  summarize what you have learned from this experiment, and discuss what are similar or different from what you observed in Task 2.  Please  compare your results of you optimal NN model (accuracy) with your optimal random forest model that have been developed in HW5.  (40 Points, HW8c.ipynb)

## Download data

In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [3]:
from rdkit import Chem
from rdkit.Chem import AllChem

In [4]:
import torch
import torch.nn as nn

from torch.utils.data import random_split
from torch.utils.data import TensorDataset
from torch.utils.data import DataLoader

In [5]:
# download data
bbbp = pd.read_csv('BBBP.csv')
bbbp.head()

Unnamed: 0,num,name,p_np,smiles
0,1,Propanolol,1,[Cl].CC(C)NCC(O)COc1cccc2ccccc12
1,2,Terbutylchlorambucil,1,C(=O)(OC(C)(C)C)CCCc1ccc(cc1)N(CCCl)CCCl
2,3,40730,1,c12c3c(N4CCN(C)CC4)c(F)cc1c(c(C(O)=O)cn2C(C)CO...
3,4,24,1,C1CCN(CC1)Cc1cccc(c1)OCCCNC(=O)C
4,5,cloxacillin,1,Cc1onc(c2ccccc2Cl)c1C(=O)N[C@H]3[C@H]4SC(C)(C)...


## Calculate morgan fingerprints

In [6]:
# suppress warnings from invalid molecules
from rdkit import RDLogger
RDLogger.DisableLog('rdApp.*')

In [7]:
# function to generate canon SMILES
def gen_canon_smiles(smiles_list):
    
    invalid_ids = []
    canon_smiles = []

    for i in range(len(smiles_list)):   
        mol = Chem.MolFromSmiles(smiles_list[i])
        
        # do not append NoneType if invalid
        if mol is None: 
            invalid_ids.append(i)
            continue

        canon_smiles.append(Chem.MolToSmiles(mol))

    return canon_smiles, invalid_ids


# function to calculate morgan fingerprints from SMILES
def calc_morgan_fpts(smiles_list):
    morgan_fingerprints = []
    
    for i in smiles_list:
        mol = Chem.MolFromSmiles(i)
        
        # do not try to calculate if invalid
        if mol is None: continue
            
        fpts = AllChem.GetMorganFingerprintAsBitVect(mol,2,2048)
        mfpts = np.array(fpts)
        morgan_fingerprints.append(mfpts) 
        
    return np.array(morgan_fingerprints)

In [8]:
# generate canon smiles
canon_smiles, invalid_ids = gen_canon_smiles(bbbp.smiles)

# drop rows with invalid SMILES
bbbp = bbbp.drop(invalid_ids)

# replace SMILES with canon SMILES
bbbp.smiles = canon_smiles

# drop duplicates to prevent train/valid/test contamination
bbbp.drop_duplicates(subset=['smiles'], inplace=True)

In [9]:
# create TensorDataset
X = torch.from_numpy(calc_morgan_fpts(bbbp.smiles)).float()
y = torch.from_numpy(bbbp.p_np.values)
bbbp_ds = TensorDataset(X, y)

In [10]:
# split data into training, validation, and test sets
init_train_ds, train_ds, valid_ds, test_ds = random_split(bbbp_ds, [0.08, 0.72, 0.10, 0.10])

In [11]:
# create DataLoaders
batch_size = 64

init_train_dl = DataLoader(init_train_ds, batch_size, shuffle=True)
train_dl = DataLoader(train_ds, batch_size, shuffle=True)
valid_dl = DataLoader(valid_ds, batch_size, shuffle=True)
test_dl = DataLoader(test_ds, batch_size, shuffle=True)

## Explore neural networks

#### Base model

In [23]:
# construct a basic model architecture
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.flatten = nn.Flatten()
        self.stack = nn.Sequential(
            nn.Linear(2048, 64), 
            nn.ReLU(), 
            nn.Linear(64, 64),
            nn.ReLU(), 
            nn.Linear(64, 2)
        )
        
    def forward(self, x):
        x = self.flatten(x)
        logits = self.stack(x)
        return logits
    
model = Model()
model

Model(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (stack): Sequential(
    (0): Linear(in_features=2048, out_features=64, bias=True)
    (1): ReLU()
    (2): Linear(in_features=64, out_features=64, bias=True)
    (3): ReLU()
    (4): Linear(in_features=64, out_features=2, bias=True)
  )
)

In [24]:
# function to train the model and evaluate on the validation set
def train_and_validate(model, train_dl, valid_dl, optimizer, loss_fn, num_epochs=20):
    for epoch in range(num_epochs):
        # Training loop
        acc_hist_train = 0
        loss_hist_train = 0
        for x_batch, y_batch in train_dl:
            pred = model(x_batch)
            loss = loss_fn(pred, y_batch)
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
            is_correct = (torch.argmax(pred, dim=1) == y_batch).float()
            acc_hist_train += is_correct.sum()
            loss_hist_train += loss.item() * x_batch.size(0)  # accumulate loss
        acc_hist_train /= len(train_dl.dataset)
        loss_hist_train /= len(train_dl.dataset)

        # Validation loop
        acc_hist_valid = 0
        loss_hist_valid = 0
        with torch.no_grad():
            for x_batch, y_batch in valid_dl:
                pred = model(x_batch)
                loss = loss_fn(pred, y_batch)
                is_correct = (torch.argmax(pred, dim=1) == y_batch).float()
                acc_hist_valid += is_correct.sum()
                loss_hist_valid += loss.item() * x_batch.size(0)
            acc_hist_valid /= len(valid_dl.dataset)
            loss_hist_valid /= len(valid_dl.dataset)

        print(f'Epoch [{epoch+1:0>2}/{num_epochs}], Train Loss: {loss_hist_train:.4f}, Train Acc: {acc_hist_train:.4f}, Valid Loss: {loss_hist_valid:.4f}, Valid Acc: {acc_hist_valid:.4f}')

In [25]:
# evaluate the model
loss_fn = nn.CrossEntropyLoss()
learning_rate = 0.001
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

train_and_validate(model, init_train_dl, valid_dl, optimizer, loss_fn)

Epoch [01/20], Train Loss: 0.7119, Train Acc: 0.2767, Valid Loss: 0.6868, Valid Acc: 0.6345
Epoch [02/20], Train Loss: 0.6666, Train Acc: 0.8050, Valid Loss: 0.6529, Valid Acc: 0.7766
Epoch [03/20], Train Loss: 0.6204, Train Acc: 0.8491, Valid Loss: 0.6144, Valid Acc: 0.7766
Epoch [04/20], Train Loss: 0.5625, Train Acc: 0.8491, Valid Loss: 0.5721, Valid Acc: 0.7868
Epoch [05/20], Train Loss: 0.4937, Train Acc: 0.8553, Valid Loss: 0.5327, Valid Acc: 0.7817
Epoch [06/20], Train Loss: 0.4260, Train Acc: 0.8742, Valid Loss: 0.5003, Valid Acc: 0.7817
Epoch [07/20], Train Loss: 0.3599, Train Acc: 0.8805, Valid Loss: 0.4771, Valid Acc: 0.7817
Epoch [08/20], Train Loss: 0.2962, Train Acc: 0.9057, Valid Loss: 0.4587, Valid Acc: 0.7868
Epoch [09/20], Train Loss: 0.2368, Train Acc: 0.9245, Valid Loss: 0.4422, Valid Acc: 0.8020
Epoch [10/20], Train Loss: 0.1836, Train Acc: 0.9560, Valid Loss: 0.4328, Valid Acc: 0.8122
Epoch [11/20], Train Loss: 0.1362, Train Acc: 0.9874, Valid Loss: 0.4342, Valid 

#### Number of dimensions of hidden layer

hidden layer: 256, 256

In [28]:
# construct a nn model with higher dimension hidden layer
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.flatten = nn.Flatten()
        self.stack = nn.Sequential(
            nn.Linear(2048, 256), 
            nn.ReLU(), 
            nn.Linear(256, 256),
            nn.ReLU(), 
            nn.Linear(256, 2)
        )
        
    def forward(self, x):
        x = self.flatten(x)
        logits = self.stack(x)
        return logits
    
model = Model()
model

Model(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (stack): Sequential(
    (0): Linear(in_features=2048, out_features=256, bias=True)
    (1): ReLU()
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): ReLU()
    (4): Linear(in_features=256, out_features=2, bias=True)
  )
)

In [29]:
# evaluate the model
loss_fn = nn.CrossEntropyLoss()
learning_rate = 0.001
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

train_and_validate(model, init_train_dl, valid_dl, optimizer, loss_fn)

Epoch [01/20], Train Loss: 0.6721, Train Acc: 0.6918, Valid Loss: 0.6215, Valid Acc: 0.7462
Epoch [02/20], Train Loss: 0.5746, Train Acc: 0.7547, Valid Loss: 0.5467, Valid Acc: 0.7462
Epoch [03/20], Train Loss: 0.4524, Train Acc: 0.7987, Valid Loss: 0.4977, Valid Acc: 0.7665
Epoch [04/20], Train Loss: 0.3295, Train Acc: 0.8428, Valid Loss: 0.4855, Valid Acc: 0.7919
Epoch [05/20], Train Loss: 0.2124, Train Acc: 0.9119, Valid Loss: 0.4499, Valid Acc: 0.8274
Epoch [06/20], Train Loss: 0.1275, Train Acc: 0.9748, Valid Loss: 0.4316, Valid Acc: 0.8223
Epoch [07/20], Train Loss: 0.0678, Train Acc: 1.0000, Valid Loss: 0.4732, Valid Acc: 0.8223
Epoch [08/20], Train Loss: 0.0320, Train Acc: 1.0000, Valid Loss: 0.5714, Valid Acc: 0.8274
Epoch [09/20], Train Loss: 0.0156, Train Acc: 1.0000, Valid Loss: 0.7050, Valid Acc: 0.8274
Epoch [10/20], Train Loss: 0.0073, Train Acc: 1.0000, Valid Loss: 0.8475, Valid Acc: 0.8274
Epoch [11/20], Train Loss: 0.0040, Train Acc: 1.0000, Valid Loss: 0.9814, Valid 

hidden layer: 256, 64

In [30]:
# construct a nn model with variable dimension hidden layer
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.flatten = nn.Flatten()
        self.stack = nn.Sequential(
            nn.Linear(2048, 256), 
            nn.ReLU(), 
            nn.Linear(256, 64),
            nn.ReLU(), 
            nn.Linear(64, 2)
        )
        
    def forward(self, x):
        x = self.flatten(x)
        logits = self.stack(x)
        return logits
    
model = Model()
model

Model(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (stack): Sequential(
    (0): Linear(in_features=2048, out_features=256, bias=True)
    (1): ReLU()
    (2): Linear(in_features=256, out_features=64, bias=True)
    (3): ReLU()
    (4): Linear(in_features=64, out_features=2, bias=True)
  )
)

In [31]:
# evaluate the model
loss_fn = nn.CrossEntropyLoss()
learning_rate = 0.001
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

train_and_validate(model, init_train_dl, valid_dl, optimizer, loss_fn)

Epoch [01/20], Train Loss: 0.6563, Train Acc: 0.7421, Valid Loss: 0.6171, Valid Acc: 0.7462
Epoch [02/20], Train Loss: 0.5821, Train Acc: 0.7421, Valid Loss: 0.5585, Valid Acc: 0.7462
Epoch [03/20], Train Loss: 0.4846, Train Acc: 0.7673, Valid Loss: 0.5106, Valid Acc: 0.7462
Epoch [04/20], Train Loss: 0.3819, Train Acc: 0.8176, Valid Loss: 0.4895, Valid Acc: 0.7716
Epoch [05/20], Train Loss: 0.3016, Train Acc: 0.8679, Valid Loss: 0.4810, Valid Acc: 0.7919
Epoch [06/20], Train Loss: 0.2294, Train Acc: 0.9057, Valid Loss: 0.4614, Valid Acc: 0.8071
Epoch [07/20], Train Loss: 0.1653, Train Acc: 0.9748, Valid Loss: 0.4595, Valid Acc: 0.8173
Epoch [08/20], Train Loss: 0.1175, Train Acc: 0.9937, Valid Loss: 0.4855, Valid Acc: 0.8223
Epoch [09/20], Train Loss: 0.0741, Train Acc: 1.0000, Valid Loss: 0.5401, Valid Acc: 0.8223
Epoch [10/20], Train Loss: 0.0445, Train Acc: 1.0000, Valid Loss: 0.6088, Valid Acc: 0.8223
Epoch [11/20], Train Loss: 0.0254, Train Acc: 1.0000, Valid Loss: 0.6815, Valid 

hidden layer: 512, 512

In [32]:
# construct a nn model with hidden layer dimensions: 512, 512
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.flatten = nn.Flatten()
        self.stack = nn.Sequential(
            nn.Linear(2048, 512), 
            nn.ReLU(), 
            nn.Linear(512, 512),
            nn.ReLU(), 
            nn.Linear(512, 2)
        )
        
    def forward(self, x):
        x = self.flatten(x)
        logits = self.stack(x)
        return logits
    
model = Model()
model

Model(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (stack): Sequential(
    (0): Linear(in_features=2048, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=2, bias=True)
  )
)

In [33]:
# evaluate the model
loss_fn = nn.CrossEntropyLoss()
learning_rate = 0.001
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

train_and_validate(model, init_train_dl, valid_dl, optimizer, loss_fn)

Epoch [01/20], Train Loss: 0.6656, Train Acc: 0.5975, Valid Loss: 0.5761, Valid Acc: 0.7462
Epoch [02/20], Train Loss: 0.5033, Train Acc: 0.7421, Valid Loss: 0.5228, Valid Acc: 0.7462
Epoch [03/20], Train Loss: 0.3430, Train Acc: 0.7987, Valid Loss: 0.5073, Valid Acc: 0.7868
Epoch [04/20], Train Loss: 0.2052, Train Acc: 0.8994, Valid Loss: 0.4779, Valid Acc: 0.8223
Epoch [05/20], Train Loss: 0.1105, Train Acc: 0.9874, Valid Loss: 0.4705, Valid Acc: 0.8274
Epoch [06/20], Train Loss: 0.0486, Train Acc: 1.0000, Valid Loss: 0.5609, Valid Acc: 0.8426
Epoch [07/20], Train Loss: 0.0180, Train Acc: 1.0000, Valid Loss: 0.7070, Valid Acc: 0.8426
Epoch [08/20], Train Loss: 0.0063, Train Acc: 1.0000, Valid Loss: 0.8708, Valid Acc: 0.8325
Epoch [09/20], Train Loss: 0.0025, Train Acc: 1.0000, Valid Loss: 1.0349, Valid Acc: 0.8274
Epoch [10/20], Train Loss: 0.0012, Train Acc: 1.0000, Valid Loss: 1.1822, Valid Acc: 0.8274
Epoch [11/20], Train Loss: 0.0007, Train Acc: 1.0000, Valid Loss: 1.3090, Valid 

hidden layer: 512, 256

In [34]:
# construct a nn model with hidden layer dimensions: 512, 256
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.flatten = nn.Flatten()
        self.stack = nn.Sequential(
            nn.Linear(2048, 512), 
            nn.ReLU(), 
            nn.Linear(512, 512),
            nn.ReLU(), 
            nn.Linear(512, 2)
        )
        
    def forward(self, x):
        x = self.flatten(x)
        logits = self.stack(x)
        return logits
    
model = Model()
model

Model(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (stack): Sequential(
    (0): Linear(in_features=2048, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=2, bias=True)
  )
)

In [35]:
# evaluate the model
loss_fn = nn.CrossEntropyLoss()
learning_rate = 0.001
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

train_and_validate(model, init_train_dl, valid_dl, optimizer, loss_fn)

Epoch [01/20], Train Loss: 0.6685, Train Acc: 0.5849, Valid Loss: 0.5831, Valid Acc: 0.7462
Epoch [02/20], Train Loss: 0.5220, Train Acc: 0.7421, Valid Loss: 0.5133, Valid Acc: 0.7462
Epoch [03/20], Train Loss: 0.3645, Train Acc: 0.7925, Valid Loss: 0.4942, Valid Acc: 0.7716
Epoch [04/20], Train Loss: 0.2344, Train Acc: 0.8805, Valid Loss: 0.4498, Valid Acc: 0.8173
Epoch [05/20], Train Loss: 0.1299, Train Acc: 0.9874, Valid Loss: 0.4502, Valid Acc: 0.8376
Epoch [06/20], Train Loss: 0.0657, Train Acc: 1.0000, Valid Loss: 0.5083, Valid Acc: 0.8426
Epoch [07/20], Train Loss: 0.0239, Train Acc: 1.0000, Valid Loss: 0.6262, Valid Acc: 0.8426
Epoch [08/20], Train Loss: 0.0082, Train Acc: 1.0000, Valid Loss: 0.7815, Valid Acc: 0.8325
Epoch [09/20], Train Loss: 0.0029, Train Acc: 1.0000, Valid Loss: 0.9360, Valid Acc: 0.8325
Epoch [10/20], Train Loss: 0.0011, Train Acc: 1.0000, Valid Loss: 1.0860, Valid Acc: 0.8274
Epoch [11/20], Train Loss: 0.0006, Train Acc: 1.0000, Valid Loss: 1.2192, Valid 

hidden layer size: 1024, 1024

In [36]:
# construct a nn model with hidden layer dimensions: 1024, 1024
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.flatten = nn.Flatten()
        self.stack = nn.Sequential(
            nn.Linear(2048, 1024), 
            nn.ReLU(), 
            nn.Linear(1024, 1024),
            nn.ReLU(), 
            nn.Linear(1024, 2)
        )
        
    def forward(self, x):
        x = self.flatten(x)
        logits = self.stack(x)
        return logits
    
model = Model()
model

Model(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (stack): Sequential(
    (0): Linear(in_features=2048, out_features=1024, bias=True)
    (1): ReLU()
    (2): Linear(in_features=1024, out_features=1024, bias=True)
    (3): ReLU()
    (4): Linear(in_features=1024, out_features=2, bias=True)
  )
)

In [37]:
# evaluate the model
loss_fn = nn.CrossEntropyLoss()
learning_rate = 0.001
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

train_and_validate(model, init_train_dl, valid_dl, optimizer, loss_fn)

Epoch [01/20], Train Loss: 0.6249, Train Acc: 0.6918, Valid Loss: 0.5416, Valid Acc: 0.7462
Epoch [02/20], Train Loss: 0.4290, Train Acc: 0.7673, Valid Loss: 0.4808, Valid Acc: 0.7766
Epoch [03/20], Train Loss: 0.2175, Train Acc: 0.9057, Valid Loss: 0.3913, Valid Acc: 0.8325
Epoch [04/20], Train Loss: 0.1012, Train Acc: 1.0000, Valid Loss: 0.4524, Valid Acc: 0.8376
Epoch [05/20], Train Loss: 0.0296, Train Acc: 1.0000, Valid Loss: 0.7064, Valid Acc: 0.8376
Epoch [06/20], Train Loss: 0.0079, Train Acc: 1.0000, Valid Loss: 1.0153, Valid Acc: 0.8376
Epoch [07/20], Train Loss: 0.0028, Train Acc: 1.0000, Valid Loss: 1.3142, Valid Acc: 0.8376
Epoch [08/20], Train Loss: 0.0009, Train Acc: 1.0000, Valid Loss: 1.5775, Valid Acc: 0.8376
Epoch [09/20], Train Loss: 0.0004, Train Acc: 1.0000, Valid Loss: 1.8045, Valid Acc: 0.8376
Epoch [10/20], Train Loss: 0.0002, Train Acc: 1.0000, Valid Loss: 1.9936, Valid Acc: 0.8376
Epoch [11/20], Train Loss: 0.0001, Train Acc: 1.0000, Valid Loss: 2.1473, Valid 

The performance of the model seems the best with a hidden layer of at least 512 dimensions. Increasing the dimensions past 512 does not significantly improve performance on the validation set.

#### Number of hidden layers 

Two hidden layers with dimensions (512, 512), (512, 512)

In [38]:
# construct a nn model with 2 hidden layers
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.flatten = nn.Flatten()
        self.stack = nn.Sequential(
            nn.Linear(2048, 512), 
            nn.ReLU(), 
            nn.Linear(512, 512),
            nn.ReLU(), 
            nn.Linear(512, 512),
            nn.ReLU(), 
            nn.Linear(512, 2)
        )
        
    def forward(self, x):
        x = self.flatten(x)
        logits = self.stack(x)
        return logits
    
model = Model()
model

Model(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (stack): Sequential(
    (0): Linear(in_features=2048, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=512, bias=True)
    (5): ReLU()
    (6): Linear(in_features=512, out_features=2, bias=True)
  )
)

In [39]:
# evaluate the model
loss_fn = nn.CrossEntropyLoss()
learning_rate = 0.001
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

train_and_validate(model, init_train_dl, valid_dl, optimizer, loss_fn)

Epoch [01/20], Train Loss: 0.6772, Train Acc: 0.5786, Valid Loss: 0.5917, Valid Acc: 0.7462
Epoch [02/20], Train Loss: 0.5467, Train Acc: 0.7421, Valid Loss: 0.5342, Valid Acc: 0.7462
Epoch [03/20], Train Loss: 0.3665, Train Acc: 0.7736, Valid Loss: 0.4915, Valid Acc: 0.7716
Epoch [04/20], Train Loss: 0.2377, Train Acc: 0.8679, Valid Loss: 0.5274, Valid Acc: 0.8223
Epoch [05/20], Train Loss: 0.1478, Train Acc: 0.9874, Valid Loss: 0.6124, Valid Acc: 0.8173
Epoch [06/20], Train Loss: 0.0809, Train Acc: 1.0000, Valid Loss: 0.7396, Valid Acc: 0.8274
Epoch [07/20], Train Loss: 0.0277, Train Acc: 1.0000, Valid Loss: 0.8555, Valid Acc: 0.8274
Epoch [08/20], Train Loss: 0.0067, Train Acc: 1.0000, Valid Loss: 1.0122, Valid Acc: 0.8325
Epoch [09/20], Train Loss: 0.0016, Train Acc: 1.0000, Valid Loss: 1.2218, Valid Acc: 0.8274
Epoch [10/20], Train Loss: 0.0004, Train Acc: 1.0000, Valid Loss: 1.4595, Valid Acc: 0.8173
Epoch [11/20], Train Loss: 0.0001, Train Acc: 1.0000, Valid Loss: 1.6826, Valid 

Two hidden layers with dimensions (512, 512, 512, 256)

In [40]:
# construct a nn model with 2 hidden layers
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.flatten = nn.Flatten()
        self.stack = nn.Sequential(
            nn.Linear(2048, 512), 
            nn.ReLU(), 
            nn.Linear(512, 512),
            nn.ReLU(), 
            nn.Linear(512, 256),
            nn.ReLU(), 
            nn.Linear(256, 2)
        )
        
    def forward(self, x):
        x = self.flatten(x)
        logits = self.stack(x)
        return logits
    
model = Model()
model

Model(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (stack): Sequential(
    (0): Linear(in_features=2048, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=256, bias=True)
    (5): ReLU()
    (6): Linear(in_features=256, out_features=2, bias=True)
  )
)

In [41]:
# evaluate the model
loss_fn = nn.CrossEntropyLoss()
learning_rate = 0.001
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

train_and_validate(model, init_train_dl, valid_dl, optimizer, loss_fn)

Epoch [01/20], Train Loss: 0.6596, Train Acc: 0.7421, Valid Loss: 0.5810, Valid Acc: 0.7462
Epoch [02/20], Train Loss: 0.5240, Train Acc: 0.7421, Valid Loss: 0.5244, Valid Acc: 0.7462
Epoch [03/20], Train Loss: 0.3527, Train Acc: 0.7736, Valid Loss: 0.5044, Valid Acc: 0.7716
Epoch [04/20], Train Loss: 0.2272, Train Acc: 0.8868, Valid Loss: 0.4837, Valid Acc: 0.8274
Epoch [05/20], Train Loss: 0.1488, Train Acc: 0.9937, Valid Loss: 0.6155, Valid Acc: 0.8274
Epoch [06/20], Train Loss: 0.0903, Train Acc: 1.0000, Valid Loss: 0.7961, Valid Acc: 0.8274
Epoch [07/20], Train Loss: 0.0388, Train Acc: 1.0000, Valid Loss: 0.9268, Valid Acc: 0.8325
Epoch [08/20], Train Loss: 0.0096, Train Acc: 1.0000, Valid Loss: 1.0491, Valid Acc: 0.8223
Epoch [09/20], Train Loss: 0.0022, Train Acc: 1.0000, Valid Loss: 1.2051, Valid Acc: 0.8325
Epoch [10/20], Train Loss: 0.0006, Train Acc: 1.0000, Valid Loss: 1.3711, Valid Acc: 0.8173
Epoch [11/20], Train Loss: 0.0001, Train Acc: 1.0000, Valid Loss: 1.5308, Valid 

Two hidden layers with dimensions (1024, 512, 512, 512)

In [42]:
# construct a nn model with 2 hidden layers
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.flatten = nn.Flatten()
        self.stack = nn.Sequential(
            nn.Linear(2048, 1024), 
            nn.ReLU(), 
            nn.Linear(1024, 512),
            nn.ReLU(), 
            nn.Linear(512, 512),
            nn.ReLU(), 
            nn.Linear(512, 2)
        )
        
    def forward(self, x):
        x = self.flatten(x)
        logits = self.stack(x)
        return logits
    
model = Model()
model

Model(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (stack): Sequential(
    (0): Linear(in_features=2048, out_features=1024, bias=True)
    (1): ReLU()
    (2): Linear(in_features=1024, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=512, bias=True)
    (5): ReLU()
    (6): Linear(in_features=512, out_features=2, bias=True)
  )
)

In [43]:
# evaluate the model
loss_fn = nn.CrossEntropyLoss()
learning_rate = 0.001
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

train_and_validate(model, init_train_dl, valid_dl, optimizer, loss_fn)

Epoch [01/20], Train Loss: 0.6725, Train Acc: 0.5157, Valid Loss: 0.5636, Valid Acc: 0.7462
Epoch [02/20], Train Loss: 0.4878, Train Acc: 0.7421, Valid Loss: 0.5574, Valid Acc: 0.7462
Epoch [03/20], Train Loss: 0.3236, Train Acc: 0.7736, Valid Loss: 0.5025, Valid Acc: 0.7868
Epoch [04/20], Train Loss: 0.2005, Train Acc: 0.9119, Valid Loss: 0.5410, Valid Acc: 0.8274
Epoch [05/20], Train Loss: 0.1339, Train Acc: 1.0000, Valid Loss: 0.6292, Valid Acc: 0.8325
Epoch [06/20], Train Loss: 0.0680, Train Acc: 1.0000, Valid Loss: 0.7521, Valid Acc: 0.8325
Epoch [07/20], Train Loss: 0.0182, Train Acc: 1.0000, Valid Loss: 0.9011, Valid Acc: 0.8325
Epoch [08/20], Train Loss: 0.0027, Train Acc: 1.0000, Valid Loss: 1.0938, Valid Acc: 0.8325
Epoch [09/20], Train Loss: 0.0005, Train Acc: 1.0000, Valid Loss: 1.3479, Valid Acc: 0.8173
Epoch [10/20], Train Loss: 0.0001, Train Acc: 1.0000, Valid Loss: 1.6674, Valid Acc: 0.8122
Epoch [11/20], Train Loss: 0.0000, Train Acc: 1.0000, Valid Loss: 1.9999, Valid 

Three hidden layers with dimensions (512, 512), (512, 512), (512, 512)

In [44]:
# construct a nn model with 2 hidden layers
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.flatten = nn.Flatten()
        self.stack = nn.Sequential(
            nn.Linear(2048, 512), 
            nn.ReLU(), 
            nn.Linear(512, 512),
            nn.ReLU(), 
            nn.Linear(512, 512),
            nn.ReLU(), 
            nn.Linear(512, 512),
            nn.ReLU(), 
            nn.Linear(512, 2)
        )
        
    def forward(self, x):
        x = self.flatten(x)
        logits = self.stack(x)
        return logits
    
model = Model()
model

Model(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (stack): Sequential(
    (0): Linear(in_features=2048, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=512, bias=True)
    (5): ReLU()
    (6): Linear(in_features=512, out_features=512, bias=True)
    (7): ReLU()
    (8): Linear(in_features=512, out_features=2, bias=True)
  )
)

In [45]:
# evaluate the model
loss_fn = nn.CrossEntropyLoss()
learning_rate = 0.001
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

train_and_validate(model, init_train_dl, valid_dl, optimizer, loss_fn)

Epoch [01/20], Train Loss: 0.6666, Train Acc: 0.7421, Valid Loss: 0.5994, Valid Acc: 0.7462
Epoch [02/20], Train Loss: 0.5389, Train Acc: 0.7421, Valid Loss: 0.6116, Valid Acc: 0.7462
Epoch [03/20], Train Loss: 0.4015, Train Acc: 0.7421, Valid Loss: 0.4855, Valid Acc: 0.7462
Epoch [04/20], Train Loss: 0.2551, Train Acc: 0.7925, Valid Loss: 0.4568, Valid Acc: 0.8173
Epoch [05/20], Train Loss: 0.1795, Train Acc: 0.9811, Valid Loss: 0.6923, Valid Acc: 0.8274
Epoch [06/20], Train Loss: 0.1230, Train Acc: 1.0000, Valid Loss: 1.1063, Valid Acc: 0.8274
Epoch [07/20], Train Loss: 0.0483, Train Acc: 1.0000, Valid Loss: 1.4581, Valid Acc: 0.8325
Epoch [08/20], Train Loss: 0.0062, Train Acc: 1.0000, Valid Loss: 1.7529, Valid Acc: 0.8325
Epoch [09/20], Train Loss: 0.0004, Train Acc: 1.0000, Valid Loss: 2.1248, Valid Acc: 0.8274
Epoch [10/20], Train Loss: 0.0000, Train Acc: 1.0000, Valid Loss: 2.5419, Valid Acc: 0.8173
Epoch [11/20], Train Loss: 0.0000, Train Acc: 1.0000, Valid Loss: 3.0555, Valid 

The performance does not seem to improve with more hidden layers. I will keep the model with only one hidden layer.

#### Activation functions

ReLU activation function

In [50]:
# construct a nn model with ReLU activation function
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.flatten = nn.Flatten()
        self.stack = nn.Sequential(
            nn.Linear(2048, 512), 
            nn.ReLU(), 
            nn.Linear(512, 512),
            nn.ReLU(), 
            nn.Linear(512, 2)
        )
        
    def forward(self, x):
        x = self.flatten(x)
        logits = self.stack(x)
        return logits
    
model = Model()
model

Model(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (stack): Sequential(
    (0): Linear(in_features=2048, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=2, bias=True)
  )
)

In [51]:
# evaluate the model
loss_fn = nn.CrossEntropyLoss()
learning_rate = 0.001
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

train_and_validate(model, init_train_dl, valid_dl, optimizer, loss_fn)

Epoch [01/20], Train Loss: 0.6588, Train Acc: 0.6352, Valid Loss: 0.5781, Valid Acc: 0.7462
Epoch [02/20], Train Loss: 0.4998, Train Acc: 0.7484, Valid Loss: 0.5217, Valid Acc: 0.7462
Epoch [03/20], Train Loss: 0.3462, Train Acc: 0.7987, Valid Loss: 0.5155, Valid Acc: 0.7868
Epoch [04/20], Train Loss: 0.1955, Train Acc: 0.9119, Valid Loss: 0.4659, Valid Acc: 0.8223
Epoch [05/20], Train Loss: 0.0981, Train Acc: 0.9937, Valid Loss: 0.4486, Valid Acc: 0.8223
Epoch [06/20], Train Loss: 0.0455, Train Acc: 1.0000, Valid Loss: 0.5116, Valid Acc: 0.8376
Epoch [07/20], Train Loss: 0.0177, Train Acc: 1.0000, Valid Loss: 0.6396, Valid Acc: 0.8376
Epoch [08/20], Train Loss: 0.0068, Train Acc: 1.0000, Valid Loss: 0.8055, Valid Acc: 0.8325
Epoch [09/20], Train Loss: 0.0030, Train Acc: 1.0000, Valid Loss: 0.9819, Valid Acc: 0.8274
Epoch [10/20], Train Loss: 0.0014, Train Acc: 1.0000, Valid Loss: 1.1462, Valid Acc: 0.8274
Epoch [11/20], Train Loss: 0.0008, Train Acc: 1.0000, Valid Loss: 1.2920, Valid 

Sigmoid activation function

In [52]:
# construct a nn model with Sigmoid activation function
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.flatten = nn.Flatten()
        self.stack = nn.Sequential(
            nn.Linear(2048, 512), 
            nn.Sigmoid(), 
            nn.Linear(512, 512),
            nn.Sigmoid(), 
            nn.Linear(512, 2)
        )
        
    def forward(self, x):
        x = self.flatten(x)
        logits = self.stack(x)
        return logits
    
model = Model()
model

Model(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (stack): Sequential(
    (0): Linear(in_features=2048, out_features=512, bias=True)
    (1): Sigmoid()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): Sigmoid()
    (4): Linear(in_features=512, out_features=2, bias=True)
  )
)

In [53]:
# evaluate the model
loss_fn = nn.CrossEntropyLoss()
learning_rate = 0.001
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

train_and_validate(model, init_train_dl, valid_dl, optimizer, loss_fn)

Epoch [01/20], Train Loss: 0.5799, Train Acc: 0.6164, Valid Loss: 0.7237, Valid Acc: 0.7462
Epoch [02/20], Train Loss: 0.6981, Train Acc: 0.7421, Valid Loss: 0.5599, Valid Acc: 0.7462
Epoch [03/20], Train Loss: 0.5640, Train Acc: 0.7421, Valid Loss: 0.5856, Valid Acc: 0.7462
Epoch [04/20], Train Loss: 0.5551, Train Acc: 0.7421, Valid Loss: 0.5525, Valid Acc: 0.7462
Epoch [05/20], Train Loss: 0.5404, Train Acc: 0.7421, Valid Loss: 0.5675, Valid Acc: 0.7462
Epoch [06/20], Train Loss: 0.5151, Train Acc: 0.7421, Valid Loss: 0.5284, Valid Acc: 0.7462
Epoch [07/20], Train Loss: 0.4891, Train Acc: 0.7736, Valid Loss: 0.5322, Valid Acc: 0.7513
Epoch [08/20], Train Loss: 0.4457, Train Acc: 0.8113, Valid Loss: 0.4962, Valid Acc: 0.7513
Epoch [09/20], Train Loss: 0.3978, Train Acc: 0.7799, Valid Loss: 0.5076, Valid Acc: 0.7513
Epoch [10/20], Train Loss: 0.3557, Train Acc: 0.7987, Valid Loss: 0.4515, Valid Acc: 0.7766
Epoch [11/20], Train Loss: 0.2809, Train Acc: 0.9057, Valid Loss: 0.4262, Valid 

Tanh activation function

In [54]:
# construct a nn model with Tanh activation function
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.flatten = nn.Flatten()
        self.stack = nn.Sequential(
            nn.Linear(2048, 512), 
            nn.Tanh(), 
            nn.Linear(512, 512),
            nn.Tanh(), 
            nn.Linear(512, 2)
        )
        
    def forward(self, x):
        x = self.flatten(x)
        logits = self.stack(x)
        return logits
    
model = Model()
model

Model(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (stack): Sequential(
    (0): Linear(in_features=2048, out_features=512, bias=True)
    (1): Tanh()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): Tanh()
    (4): Linear(in_features=512, out_features=2, bias=True)
  )
)

In [55]:
# evaluate the model
loss_fn = nn.CrossEntropyLoss()
learning_rate = 0.001
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

train_and_validate(model, init_train_dl, valid_dl, optimizer, loss_fn)

Epoch [01/20], Train Loss: 0.6462, Train Acc: 0.5535, Valid Loss: 0.5074, Valid Acc: 0.7563
Epoch [02/20], Train Loss: 0.3459, Train Acc: 0.8176, Valid Loss: 0.4371, Valid Acc: 0.8223
Epoch [03/20], Train Loss: 0.1284, Train Acc: 0.9623, Valid Loss: 0.4397, Valid Acc: 0.8274
Epoch [04/20], Train Loss: 0.0407, Train Acc: 1.0000, Valid Loss: 0.5697, Valid Acc: 0.8376
Epoch [05/20], Train Loss: 0.0109, Train Acc: 1.0000, Valid Loss: 0.7758, Valid Acc: 0.8325
Epoch [06/20], Train Loss: 0.0030, Train Acc: 1.0000, Valid Loss: 1.0044, Valid Acc: 0.8325
Epoch [07/20], Train Loss: 0.0011, Train Acc: 1.0000, Valid Loss: 1.2145, Valid Acc: 0.8376
Epoch [08/20], Train Loss: 0.0005, Train Acc: 1.0000, Valid Loss: 1.3936, Valid Acc: 0.8325
Epoch [09/20], Train Loss: 0.0003, Train Acc: 1.0000, Valid Loss: 1.5434, Valid Acc: 0.8325
Epoch [10/20], Train Loss: 0.0002, Train Acc: 1.0000, Valid Loss: 1.6636, Valid Acc: 0.8274
Epoch [11/20], Train Loss: 0.0001, Train Acc: 1.0000, Valid Loss: 1.7581, Valid 

LeakyReLU activation function

In [56]:
# construct a nn model with LeakyReLU activation function
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.flatten = nn.Flatten()
        self.stack = nn.Sequential(
            nn.Linear(2048, 512), 
            nn.LeakyReLU(), 
            nn.Linear(512, 512),
            nn.LeakyReLU(), 
            nn.Linear(512, 2)
        )
        
    def forward(self, x):
        x = self.flatten(x)
        logits = self.stack(x)
        return logits
    
model = Model()
model

Model(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (stack): Sequential(
    (0): Linear(in_features=2048, out_features=512, bias=True)
    (1): LeakyReLU(negative_slope=0.01)
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): LeakyReLU(negative_slope=0.01)
    (4): Linear(in_features=512, out_features=2, bias=True)
  )
)

In [57]:
# evaluate the model
loss_fn = nn.CrossEntropyLoss()
learning_rate = 0.001
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

train_and_validate(model, init_train_dl, valid_dl, optimizer, loss_fn)

Epoch [01/20], Train Loss: 0.6585, Train Acc: 0.7044, Valid Loss: 0.5750, Valid Acc: 0.7462
Epoch [02/20], Train Loss: 0.4919, Train Acc: 0.7421, Valid Loss: 0.5180, Valid Acc: 0.7462
Epoch [03/20], Train Loss: 0.3297, Train Acc: 0.8239, Valid Loss: 0.5274, Valid Acc: 0.7868
Epoch [04/20], Train Loss: 0.1826, Train Acc: 0.9182, Valid Loss: 0.4931, Valid Acc: 0.8223
Epoch [05/20], Train Loss: 0.0833, Train Acc: 0.9937, Valid Loss: 0.4864, Valid Acc: 0.8274
Epoch [06/20], Train Loss: 0.0322, Train Acc: 1.0000, Valid Loss: 0.5836, Valid Acc: 0.8376
Epoch [07/20], Train Loss: 0.0105, Train Acc: 1.0000, Valid Loss: 0.7525, Valid Acc: 0.8376
Epoch [08/20], Train Loss: 0.0036, Train Acc: 1.0000, Valid Loss: 0.9479, Valid Acc: 0.8325
Epoch [09/20], Train Loss: 0.0014, Train Acc: 1.0000, Valid Loss: 1.1346, Valid Acc: 0.8274
Epoch [10/20], Train Loss: 0.0007, Train Acc: 1.0000, Valid Loss: 1.3037, Valid Acc: 0.8223
Epoch [11/20], Train Loss: 0.0004, Train Acc: 1.0000, Valid Loss: 1.4495, Valid 

ELU activation function

In [58]:
# construct a nn model with ELU activation function
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.flatten = nn.Flatten()
        self.stack = nn.Sequential(
            nn.Linear(2048, 512), 
            nn.ELU(), 
            nn.Linear(512, 512),
            nn.ELU(), 
            nn.Linear(512, 2)
        )
        
    def forward(self, x):
        x = self.flatten(x)
        logits = self.stack(x)
        return logits
    
model = Model()
model

Model(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (stack): Sequential(
    (0): Linear(in_features=2048, out_features=512, bias=True)
    (1): ELU(alpha=1.0)
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ELU(alpha=1.0)
    (4): Linear(in_features=512, out_features=2, bias=True)
  )
)

In [59]:
# evaluate the model
loss_fn = nn.CrossEntropyLoss()
learning_rate = 0.001
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

train_and_validate(model, init_train_dl, valid_dl, optimizer, loss_fn)

Epoch [01/20], Train Loss: 0.6270, Train Acc: 0.6792, Valid Loss: 0.5097, Valid Acc: 0.7614
Epoch [02/20], Train Loss: 0.3422, Train Acc: 0.8428, Valid Loss: 0.4461, Valid Acc: 0.8020
Epoch [03/20], Train Loss: 0.1161, Train Acc: 0.9623, Valid Loss: 0.4349, Valid Acc: 0.8325
Epoch [04/20], Train Loss: 0.0360, Train Acc: 1.0000, Valid Loss: 0.5703, Valid Acc: 0.8376
Epoch [05/20], Train Loss: 0.0093, Train Acc: 1.0000, Valid Loss: 0.7942, Valid Acc: 0.8274
Epoch [06/20], Train Loss: 0.0021, Train Acc: 1.0000, Valid Loss: 1.0404, Valid Acc: 0.8274
Epoch [07/20], Train Loss: 0.0008, Train Acc: 1.0000, Valid Loss: 1.2753, Valid Acc: 0.8325
Epoch [08/20], Train Loss: 0.0004, Train Acc: 1.0000, Valid Loss: 1.4841, Valid Acc: 0.8274
Epoch [09/20], Train Loss: 0.0003, Train Acc: 1.0000, Valid Loss: 1.6604, Valid Acc: 0.8274
Epoch [10/20], Train Loss: 0.0001, Train Acc: 1.0000, Valid Loss: 1.8038, Valid Acc: 0.8274
Epoch [11/20], Train Loss: 0.0001, Train Acc: 1.0000, Valid Loss: 1.9187, Valid 

The performance is best using ReLU as the activation function.

#### Batch normalization

no batch normalization

In [62]:
# construct a nn model with no batch normalization
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.flatten = nn.Flatten()
        self.stack = nn.Sequential(
            nn.Linear(2048, 512), 
            nn.ReLU(), 
            nn.Linear(512, 512),
            nn.ReLU(), 
            nn.Linear(512, 2)
        )
        
    def forward(self, x):
        x = self.flatten(x)
        logits = self.stack(x)
        return logits
    
model = Model()
model

Model(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (stack): Sequential(
    (0): Linear(in_features=2048, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=2, bias=True)
  )
)

In [63]:
# evaluate the model
loss_fn = nn.CrossEntropyLoss()
learning_rate = 0.001
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

train_and_validate(model, init_train_dl, valid_dl, optimizer, loss_fn)

Epoch [01/20], Train Loss: 0.6834, Train Acc: 0.5283, Valid Loss: 0.5901, Valid Acc: 0.7462
Epoch [02/20], Train Loss: 0.5226, Train Acc: 0.7421, Valid Loss: 0.5144, Valid Acc: 0.7462
Epoch [03/20], Train Loss: 0.3552, Train Acc: 0.8050, Valid Loss: 0.5158, Valid Acc: 0.7766
Epoch [04/20], Train Loss: 0.2179, Train Acc: 0.8868, Valid Loss: 0.4939, Valid Acc: 0.8223
Epoch [05/20], Train Loss: 0.1215, Train Acc: 0.9748, Valid Loss: 0.4697, Valid Acc: 0.8223
Epoch [06/20], Train Loss: 0.0595, Train Acc: 1.0000, Valid Loss: 0.5239, Valid Acc: 0.8376
Epoch [07/20], Train Loss: 0.0253, Train Acc: 1.0000, Valid Loss: 0.6431, Valid Acc: 0.8426
Epoch [08/20], Train Loss: 0.0085, Train Acc: 1.0000, Valid Loss: 0.7980, Valid Acc: 0.8376
Epoch [09/20], Train Loss: 0.0033, Train Acc: 1.0000, Valid Loss: 0.9563, Valid Acc: 0.8325
Epoch [10/20], Train Loss: 0.0013, Train Acc: 1.0000, Valid Loss: 1.0968, Valid Acc: 0.8274
Epoch [11/20], Train Loss: 0.0007, Train Acc: 1.0000, Valid Loss: 1.2242, Valid 

1D batch normalization

In [64]:
# construct a nn model with 1D batch normalization
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.flatten = nn.Flatten()
        self.stack = nn.Sequential(
            nn.Linear(2048, 512), 
            nn.BatchNorm1d(512), 
            nn.ReLU(), 
            nn.Linear(512, 512), 
            nn.BatchNorm1d(512), 
            nn.ReLU(), 
            nn.Linear(512, 2)
        )
        
    def forward(self, x):
        x = self.flatten(x)
        logits = self.stack(x)
        return logits
    
model = Model()
model

Model(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (stack): Sequential(
    (0): Linear(in_features=2048, out_features=512, bias=True)
    (1): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU()
    (3): Linear(in_features=512, out_features=512, bias=True)
    (4): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (5): ReLU()
    (6): Linear(in_features=512, out_features=2, bias=True)
  )
)

In [65]:
# evaluate the model
loss_fn = nn.CrossEntropyLoss()
learning_rate = 0.001
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

train_and_validate(model, init_train_dl, valid_dl, optimizer, loss_fn)

Epoch [01/20], Train Loss: 0.5032, Train Acc: 0.7421, Valid Loss: 0.4159, Valid Acc: 0.8579
Epoch [02/20], Train Loss: 0.0412, Train Acc: 1.0000, Valid Loss: 0.4682, Valid Acc: 0.8426
Epoch [03/20], Train Loss: 0.0193, Train Acc: 0.9937, Valid Loss: 0.4929, Valid Acc: 0.8426
Epoch [04/20], Train Loss: 0.0098, Train Acc: 1.0000, Valid Loss: 0.5267, Valid Acc: 0.8426
Epoch [05/20], Train Loss: 0.0047, Train Acc: 1.0000, Valid Loss: 0.4775, Valid Acc: 0.8782
Epoch [06/20], Train Loss: 0.0049, Train Acc: 1.0000, Valid Loss: 0.5779, Valid Acc: 0.8477
Epoch [07/20], Train Loss: 0.0018, Train Acc: 1.0000, Valid Loss: 0.5850, Valid Acc: 0.8274
Epoch [08/20], Train Loss: 0.0017, Train Acc: 1.0000, Valid Loss: 0.5829, Valid Acc: 0.8629
Epoch [09/20], Train Loss: 0.0011, Train Acc: 1.0000, Valid Loss: 0.6238, Valid Acc: 0.8173
Epoch [10/20], Train Loss: 0.0010, Train Acc: 1.0000, Valid Loss: 0.6197, Valid Acc: 0.8426
Epoch [11/20], Train Loss: 0.0011, Train Acc: 1.0000, Valid Loss: 0.6725, Valid 

Batch normalization significantly improves the model performance on the validation set.

#### Learning rates

learning_rate = 0.001

In [68]:
# construct the nn model
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.flatten = nn.Flatten()
        self.stack = nn.Sequential(
            nn.Linear(2048, 512), 
            nn.BatchNorm1d(512), 
            nn.ReLU(), 
            nn.Linear(512, 512), 
            nn.BatchNorm1d(512), 
            nn.ReLU(), 
            nn.Linear(512, 2)
        )
        
    def forward(self, x):
        x = self.flatten(x)
        logits = self.stack(x)
        return logits
    
model = Model()
model

Model(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (stack): Sequential(
    (0): Linear(in_features=2048, out_features=512, bias=True)
    (1): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU()
    (3): Linear(in_features=512, out_features=512, bias=True)
    (4): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (5): ReLU()
    (6): Linear(in_features=512, out_features=2, bias=True)
  )
)

In [69]:
# evaluate the model
loss_fn = nn.CrossEntropyLoss()
learning_rate = 0.001
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

train_and_validate(model, init_train_dl, valid_dl, optimizer, loss_fn)

Epoch [01/20], Train Loss: 0.5204, Train Acc: 0.7610, Valid Loss: 0.3814, Valid Acc: 0.8579
Epoch [02/20], Train Loss: 0.0642, Train Acc: 0.9811, Valid Loss: 0.3901, Valid Acc: 0.8629
Epoch [03/20], Train Loss: 0.0304, Train Acc: 1.0000, Valid Loss: 0.4346, Valid Acc: 0.8629
Epoch [04/20], Train Loss: 0.0102, Train Acc: 1.0000, Valid Loss: 0.4792, Valid Acc: 0.8629
Epoch [05/20], Train Loss: 0.0080, Train Acc: 1.0000, Valid Loss: 0.4913, Valid Acc: 0.8629
Epoch [06/20], Train Loss: 0.0047, Train Acc: 1.0000, Valid Loss: 0.5352, Valid Acc: 0.8528
Epoch [07/20], Train Loss: 0.0037, Train Acc: 1.0000, Valid Loss: 0.5658, Valid Acc: 0.8629
Epoch [08/20], Train Loss: 0.0033, Train Acc: 1.0000, Valid Loss: 0.5584, Valid Acc: 0.8477
Epoch [09/20], Train Loss: 0.0019, Train Acc: 1.0000, Valid Loss: 0.5447, Valid Acc: 0.8629
Epoch [10/20], Train Loss: 0.0025, Train Acc: 1.0000, Valid Loss: 0.6401, Valid Acc: 0.8376
Epoch [11/20], Train Loss: 0.0024, Train Acc: 1.0000, Valid Loss: 0.6022, Valid 

learning_rate = 0.01

In [70]:
# construct the nn model
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.flatten = nn.Flatten()
        self.stack = nn.Sequential(
            nn.Linear(2048, 512), 
            nn.BatchNorm1d(512), 
            nn.ReLU(), 
            nn.Linear(512, 512), 
            nn.BatchNorm1d(512), 
            nn.ReLU(), 
            nn.Linear(512, 2)
        )
        
    def forward(self, x):
        x = self.flatten(x)
        logits = self.stack(x)
        return logits
    
model = Model()
model

Model(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (stack): Sequential(
    (0): Linear(in_features=2048, out_features=512, bias=True)
    (1): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU()
    (3): Linear(in_features=512, out_features=512, bias=True)
    (4): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (5): ReLU()
    (6): Linear(in_features=512, out_features=2, bias=True)
  )
)

In [71]:
# evaluate the model
loss_fn = nn.CrossEntropyLoss()
learning_rate = 0.01
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

train_and_validate(model, init_train_dl, valid_dl, optimizer, loss_fn)

Epoch [01/20], Train Loss: 0.7381, Train Acc: 0.6101, Valid Loss: 0.5163, Valid Acc: 0.8122
Epoch [02/20], Train Loss: 0.2232, Train Acc: 0.9057, Valid Loss: 0.5114, Valid Acc: 0.8325
Epoch [03/20], Train Loss: 0.0537, Train Acc: 0.9748, Valid Loss: 0.6050, Valid Acc: 0.8528
Epoch [04/20], Train Loss: 0.0260, Train Acc: 1.0000, Valid Loss: 0.6466, Valid Acc: 0.8629
Epoch [05/20], Train Loss: 0.0035, Train Acc: 1.0000, Valid Loss: 0.8130, Valid Acc: 0.8528
Epoch [06/20], Train Loss: 0.0050, Train Acc: 1.0000, Valid Loss: 0.7648, Valid Acc: 0.8680
Epoch [07/20], Train Loss: 0.0030, Train Acc: 1.0000, Valid Loss: 0.8597, Valid Acc: 0.8376
Epoch [08/20], Train Loss: 0.0008, Train Acc: 1.0000, Valid Loss: 0.9343, Valid Acc: 0.8426
Epoch [09/20], Train Loss: 0.0007, Train Acc: 1.0000, Valid Loss: 0.9769, Valid Acc: 0.8579
Epoch [10/20], Train Loss: 0.0006, Train Acc: 1.0000, Valid Loss: 0.9735, Valid Acc: 0.8376
Epoch [11/20], Train Loss: 0.0007, Train Acc: 1.0000, Valid Loss: 0.9790, Valid 

learning_rate = 0.0001

In [72]:
# construct the nn model
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.flatten = nn.Flatten()
        self.stack = nn.Sequential(
            nn.Linear(2048, 512), 
            nn.BatchNorm1d(512), 
            nn.ReLU(), 
            nn.Linear(512, 512), 
            nn.BatchNorm1d(512), 
            nn.ReLU(), 
            nn.Linear(512, 2)
        )
        
    def forward(self, x):
        x = self.flatten(x)
        logits = self.stack(x)
        return logits
    
model = Model()
model

Model(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (stack): Sequential(
    (0): Linear(in_features=2048, out_features=512, bias=True)
    (1): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU()
    (3): Linear(in_features=512, out_features=512, bias=True)
    (4): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (5): ReLU()
    (6): Linear(in_features=512, out_features=2, bias=True)
  )
)

In [73]:
# evaluate the model
loss_fn = nn.CrossEntropyLoss()
learning_rate = 0.0001
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

train_and_validate(model, init_train_dl, valid_dl, optimizer, loss_fn)

Epoch [01/20], Train Loss: 0.5802, Train Acc: 0.7358, Valid Loss: 0.5328, Valid Acc: 0.7614
Epoch [02/20], Train Loss: 0.3654, Train Acc: 0.9119, Valid Loss: 0.4845, Valid Acc: 0.7919
Epoch [03/20], Train Loss: 0.2342, Train Acc: 0.9748, Valid Loss: 0.4554, Valid Acc: 0.8122
Epoch [04/20], Train Loss: 0.1726, Train Acc: 0.9874, Valid Loss: 0.4261, Valid Acc: 0.8274
Epoch [05/20], Train Loss: 0.1239, Train Acc: 0.9937, Valid Loss: 0.4110, Valid Acc: 0.8731
Epoch [06/20], Train Loss: 0.1020, Train Acc: 0.9937, Valid Loss: 0.4006, Valid Acc: 0.8629
Epoch [07/20], Train Loss: 0.0940, Train Acc: 1.0000, Valid Loss: 0.4071, Valid Acc: 0.8528
Epoch [08/20], Train Loss: 0.0612, Train Acc: 1.0000, Valid Loss: 0.3973, Valid Acc: 0.8680
Epoch [09/20], Train Loss: 0.0504, Train Acc: 1.0000, Valid Loss: 0.4059, Valid Acc: 0.8629
Epoch [10/20], Train Loss: 0.0452, Train Acc: 1.0000, Valid Loss: 0.3974, Valid Acc: 0.8528
Epoch [11/20], Train Loss: 0.0349, Train Acc: 1.0000, Valid Loss: 0.3971, Valid 

Model performance on the validation set seems to slightly improve with a lower learning rate.

#### Optimization functions

Adam optimization function

In [68]:
# construct the nn model
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.flatten = nn.Flatten()
        self.stack = nn.Sequential(
            nn.Linear(2048, 512), 
            nn.BatchNorm1d(512), 
            nn.ReLU(), 
            nn.Linear(512, 512), 
            nn.BatchNorm1d(512), 
            nn.ReLU(), 
            nn.Linear(512, 2)
        )
        
    def forward(self, x):
        x = self.flatten(x)
        logits = self.stack(x)
        return logits
    
model = Model()
model

Model(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (stack): Sequential(
    (0): Linear(in_features=2048, out_features=512, bias=True)
    (1): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU()
    (3): Linear(in_features=512, out_features=512, bias=True)
    (4): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (5): ReLU()
    (6): Linear(in_features=512, out_features=2, bias=True)
  )
)

In [69]:
# evaluate the model
loss_fn = nn.CrossEntropyLoss()
learning_rate = 0.001
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

train_and_validate(model, init_train_dl, valid_dl, optimizer, loss_fn)

Epoch [01/20], Train Loss: 0.5204, Train Acc: 0.7610, Valid Loss: 0.3814, Valid Acc: 0.8579
Epoch [02/20], Train Loss: 0.0642, Train Acc: 0.9811, Valid Loss: 0.3901, Valid Acc: 0.8629
Epoch [03/20], Train Loss: 0.0304, Train Acc: 1.0000, Valid Loss: 0.4346, Valid Acc: 0.8629
Epoch [04/20], Train Loss: 0.0102, Train Acc: 1.0000, Valid Loss: 0.4792, Valid Acc: 0.8629
Epoch [05/20], Train Loss: 0.0080, Train Acc: 1.0000, Valid Loss: 0.4913, Valid Acc: 0.8629
Epoch [06/20], Train Loss: 0.0047, Train Acc: 1.0000, Valid Loss: 0.5352, Valid Acc: 0.8528
Epoch [07/20], Train Loss: 0.0037, Train Acc: 1.0000, Valid Loss: 0.5658, Valid Acc: 0.8629
Epoch [08/20], Train Loss: 0.0033, Train Acc: 1.0000, Valid Loss: 0.5584, Valid Acc: 0.8477
Epoch [09/20], Train Loss: 0.0019, Train Acc: 1.0000, Valid Loss: 0.5447, Valid Acc: 0.8629
Epoch [10/20], Train Loss: 0.0025, Train Acc: 1.0000, Valid Loss: 0.6401, Valid Acc: 0.8376
Epoch [11/20], Train Loss: 0.0024, Train Acc: 1.0000, Valid Loss: 0.6022, Valid 

SGD optimization function

In [76]:
# construct the nn model
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.flatten = nn.Flatten()
        self.stack = nn.Sequential(
            nn.Linear(2048, 512), 
            nn.BatchNorm1d(512), 
            nn.ReLU(), 
            nn.Linear(512, 512), 
            nn.BatchNorm1d(512), 
            nn.ReLU(), 
            nn.Linear(512, 2)
        )
        
    def forward(self, x):
        x = self.flatten(x)
        logits = self.stack(x)
        return logits
    
model = Model()
model

Model(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (stack): Sequential(
    (0): Linear(in_features=2048, out_features=512, bias=True)
    (1): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU()
    (3): Linear(in_features=512, out_features=512, bias=True)
    (4): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (5): ReLU()
    (6): Linear(in_features=512, out_features=2, bias=True)
  )
)

In [77]:
# evaluate the model
loss_fn = nn.CrossEntropyLoss()
learning_rate = 0.001
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

train_and_validate(model, init_train_dl, valid_dl, optimizer, loss_fn)

Epoch [01/20], Train Loss: 0.6813, Train Acc: 0.5786, Valid Loss: 0.6762, Valid Acc: 0.5584
Epoch [02/20], Train Loss: 0.6349, Train Acc: 0.6478, Valid Loss: 0.6526, Valid Acc: 0.6091
Epoch [03/20], Train Loss: 0.5967, Train Acc: 0.7673, Valid Loss: 0.6325, Valid Acc: 0.6396
Epoch [04/20], Train Loss: 0.5736, Train Acc: 0.7421, Valid Loss: 0.6188, Valid Acc: 0.6701
Epoch [05/20], Train Loss: 0.5475, Train Acc: 0.7925, Valid Loss: 0.6081, Valid Acc: 0.6802
Epoch [06/20], Train Loss: 0.5154, Train Acc: 0.8491, Valid Loss: 0.5875, Valid Acc: 0.7259
Epoch [07/20], Train Loss: 0.4997, Train Acc: 0.8553, Valid Loss: 0.5727, Valid Acc: 0.7360
Epoch [08/20], Train Loss: 0.4752, Train Acc: 0.8491, Valid Loss: 0.5636, Valid Acc: 0.7310
Epoch [09/20], Train Loss: 0.4653, Train Acc: 0.8679, Valid Loss: 0.5590, Valid Acc: 0.7665
Epoch [10/20], Train Loss: 0.4408, Train Acc: 0.8742, Valid Loss: 0.5441, Valid Acc: 0.7665
Epoch [11/20], Train Loss: 0.4260, Train Acc: 0.8679, Valid Loss: 0.5446, Valid 

Adam optimization leads to better performance of the model on both the training set and the validation set.

#### The best model so far

In [78]:
# construct the nn model
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.flatten = nn.Flatten()
        self.stack = nn.Sequential(
            nn.Linear(2048, 512), 
            nn.BatchNorm1d(512), 
            nn.ReLU(), 
            nn.Linear(512, 512), 
            nn.BatchNorm1d(512), 
            nn.ReLU(), 
            nn.Linear(512, 2)
        )
        
    def forward(self, x):
        x = self.flatten(x)
        logits = self.stack(x)
        return logits
    
model = Model()
model

Model(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (stack): Sequential(
    (0): Linear(in_features=2048, out_features=512, bias=True)
    (1): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU()
    (3): Linear(in_features=512, out_features=512, bias=True)
    (4): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (5): ReLU()
    (6): Linear(in_features=512, out_features=2, bias=True)
  )
)

In [79]:
# evaluate the model
loss_fn = nn.CrossEntropyLoss()
learning_rate = 0.0001
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

train_and_validate(model, init_train_dl, valid_dl, optimizer, loss_fn)

Epoch [01/20], Train Loss: 0.6455, Train Acc: 0.6667, Valid Loss: 0.6006, Valid Acc: 0.6701
Epoch [02/20], Train Loss: 0.4146, Train Acc: 0.9119, Valid Loss: 0.5330, Valid Acc: 0.7868
Epoch [03/20], Train Loss: 0.2695, Train Acc: 0.9937, Valid Loss: 0.4997, Valid Acc: 0.8071
Epoch [04/20], Train Loss: 0.2010, Train Acc: 1.0000, Valid Loss: 0.4629, Valid Acc: 0.8274
Epoch [05/20], Train Loss: 0.1426, Train Acc: 1.0000, Valid Loss: 0.4332, Valid Acc: 0.8274
Epoch [06/20], Train Loss: 0.1130, Train Acc: 1.0000, Valid Loss: 0.4254, Valid Acc: 0.8274
Epoch [07/20], Train Loss: 0.0864, Train Acc: 1.0000, Valid Loss: 0.4145, Valid Acc: 0.8528
Epoch [08/20], Train Loss: 0.0689, Train Acc: 1.0000, Valid Loss: 0.4117, Valid Acc: 0.8325
Epoch [09/20], Train Loss: 0.0570, Train Acc: 1.0000, Valid Loss: 0.3982, Valid Acc: 0.8325
Epoch [10/20], Train Loss: 0.0510, Train Acc: 1.0000, Valid Loss: 0.4002, Valid Acc: 0.8325
Epoch [11/20], Train Loss: 0.0413, Train Acc: 1.0000, Valid Loss: 0.4097, Valid 

## Train the best model on the training set

In [80]:
# train on the training set
train_and_validate(model, train_dl, valid_dl, optimizer, loss_fn)

Epoch [01/20], Train Loss: 0.3614, Train Acc: 0.8594, Valid Loss: 0.3059, Valid Acc: 0.9086
Epoch [02/20], Train Loss: 0.1584, Train Acc: 0.9550, Valid Loss: 0.2903, Valid Acc: 0.8934
Epoch [03/20], Train Loss: 0.0924, Train Acc: 0.9831, Valid Loss: 0.2989, Valid Acc: 0.8934
Epoch [04/20], Train Loss: 0.0610, Train Acc: 0.9923, Valid Loss: 0.3022, Valid Acc: 0.9086
Epoch [05/20], Train Loss: 0.0436, Train Acc: 0.9958, Valid Loss: 0.3293, Valid Acc: 0.9036
Epoch [06/20], Train Loss: 0.0302, Train Acc: 0.9972, Valid Loss: 0.3544, Valid Acc: 0.8883
Epoch [07/20], Train Loss: 0.0233, Train Acc: 0.9972, Valid Loss: 0.3519, Valid Acc: 0.8883
Epoch [08/20], Train Loss: 0.0195, Train Acc: 0.9979, Valid Loss: 0.3453, Valid Acc: 0.8883
Epoch [09/20], Train Loss: 0.0186, Train Acc: 0.9965, Valid Loss: 0.3851, Valid Acc: 0.8934
Epoch [10/20], Train Loss: 0.0158, Train Acc: 0.9986, Valid Loss: 0.3565, Valid Acc: 0.8934
Epoch [11/20], Train Loss: 0.0140, Train Acc: 0.9965, Valid Loss: 0.4080, Valid 

The training accuracy reaches 99.69, and the validation accuracy reaches 89.85%.

## Evaluate the best model on the test set

In [82]:
# evaluate the model on the test set
correct = 0
total = 0

with torch.no_grad():
    for inputs, labels in test_dl:
        outputs = mo87%del(inputs)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy on the test set: {:.2f}%'.format(100 * correct / total))

Accuracy on the test set: 89.34%


## Summary

I learned that it is important to explore many different model architectures and training methods. Similar to my HW8b, in this experiment, the architecture of my final model was very different from what I expected. The final model is very different from my final model in HW8b. I used different numbers of hidden layers, different hidden layer dimensions, and different activation functions. There are probably many reasons for the differences between my two models. HW8b was a multi-class classification problem, while BBBP is a binary classification problem. Additionally, MNIST has much more data available to learn from than BBBP. It seems clear that there is no single optimal model that works for general tasks, and different factors mean different architectures should be explored.

The performance of my BBBP model in HW5c, measured by accuracy, was 87%. My neural network had an accuracy of 89.34%, which is an approxiate 2% increase in performance. This is a big improvement. In addition, my neural network model has much more room for futher tweaking which could potentially improve the model even further.