# 4. Design own Experiment

## A. Develop a plan

Come up with a plan for what you want to explore and the metrics you will use. Determine the range of options in each dimension to explore (e.g. L options in dimension 1, M options in dimension 2, and N options in dimension 3). You don't have to evaluate all L * M * N options unless you want to. Instead, think about using a linear search strategy where you hold two parameters constant and optimize the third, then switch things up, optimizing one parameter at a time in a round-robin or randomized fashion. Overall, plan to evaluate 50-100 network variations (again, automate this process).


ROUGH PLAN:

dimension 1 = number of convolution filters channels. L = 4

dimension 2 = number of epochs of training. M

dimension 3 = number of batch size. N = 4 (multiple of 32)

- Conv. filter channels is chosen since conv is the workhorse of the NN. 
Increasing the number if filter means increaseing the number of features that will be learned

- Number of epochs was proven to increase the accuracy as can be seen in task 1. Need to make sure its not overfitting

- The higher the number of batch size means CPU capacity will be forced to work fully means better optimization


In [1]:
# torch
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import DataLoader
from torch.utils.data.dataloader import default_collate # to reshape

import torchvision
from torchvision import datasets, transforms

# import previous notebook
import nbimporter
import Task1AE as Note1AE
import Task1FG as Note1FG

# for visualizationg
from matplotlib import pyplot as plt
import numpy as np

# for tuning and reshape for GridSearch
from skorch.dataset import Dataset
from skorch import NeuralNetClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import RandomizedSearchCV

In [2]:
def get_train_test_MNIST_data():
    """
    Get MNIST dataset as X and target numpy array as y.
    Return both the train and test data. 
    """
    train_loader = Note1AE.get_loader(is_train = True)

    transform=torchvision.transforms.Compose(
                    [torchvision.transforms.ToTensor(),
                     torchvision.transforms.Normalize(
                        # normalize with mean and std
                        (0.1307,), (0.3801,)
                    )
                    ])
    #dataset
    X_train = torchvision.datasets.MNIST(
                'mnist',
                train=True,
                download=True,
                transform=transform)

    y_train = np.array([y for x, y in iter(X_train)])

    X_test= torchvision.datasets.MNIST(
                'mnist',
                train=False,
                download=True,
                transform=transform)
    y_test = np.array([y for x, y in iter(X_test)])
    
    return X_train, y_train, X_test, y_test

In [3]:
class NeuralNetworkDesign(nn.Module):
    """
    Another neural network for MNIST that takes parameters for 
    the number of channels
    """
    
    def __init__(self, conv1_out_channels):
        # call the parent constructor
        super(NeuralNetworkDesign, self).__init__()
        print("conv1_out_channels:",conv1_out_channels)
        
        # 1. CNN
        # input_pixel = 28 
        # out_channels = [10,..
        # output_pixel = (input_pixel) - 4 /2
        # final output = 10 X 12 X 12
        self.conv1 = nn.Conv2d(in_channels=1, 
                               out_channels=conv1_out_channels, 
                               kernel_size=5) 
        
       
        
        # input_pixel = 12
        # out_channels = [20,..
        # output_pixel = (input_pixel) - 4 /2
        # final output = 20 X 4 X 4 = 320
        self.conv2 = nn.Conv2d(in_channels=conv1_out_channels, 
                               out_channels=conv1_out_channels*2, 
                               kernel_size=5)
        self.conv2_drop = nn.Dropout2d() # default is 0.5 or half
        
        # 2. ANN
        self.in_features = conv1_out_channels * 2 * 4 * 4
        
        self.fc1 = nn.Linear(in_features=self.in_features, out_features=50)
        self.fc2 = nn.Linear(50, 10)
        self.flatten = nn.Flatten()
     

    def forward(self, x):
        # 1. first conv, max pool, relu
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        
        # 2. 2nd conv, droptout layer, max pool, relu
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        
        # 3. reshape tensor . Question: why to -1, 320. this is same as flatten
        x = x.view(-1,self.in_features)
        
        # 4. fully connected, relu
        x = F.relu(self.fc1(x))
        # x = F.dropout(x, training=self.training)
        x = self.fc2(x)
        
        # 5. convert the output from a Linear layer
        # into a categorical probability distribution
        return F.log_softmax(x, -1)
    

In [4]:
# >>>>>> For randomoized search
class SliceDatasetX(Dataset):
    """Helper class that wraps a torch dataset to make it work with sklearn"""
    def __init__(self, dataset, collate_fn=default_collate):
        self.dataset = dataset
        self.collate_fn = collate_fn

        self._indices = list(range(len(self.dataset)))
        
    def __len__(self):
        return len(self.dataset)
        
    @property
    def shape(self):
        return len(self),
    
    def __getitem__(self, i):
        if isinstance(i, (int, np.integer)):
            Xb = self.transform(*self.dataset[i])[0]
            return Xb
        
        if isinstance(i, slice):
            i = self._indices[i]

        Xb = self.collate_fn([self.transform(*self.dataset[j])[0] for j in i])
        return Xb
    

In [5]:
# Question: this doesnt work. state keep changing
torch.manual_seed(42)
DEVICE= 'cuda' if torch.cuda.is_available() else 'cpu'

# 1. get data
X_train, y_train, X_test, y_test = get_train_test_MNIST_data()

# 2. Parameters to iterate
CONV_CHANNELS = [5, 10, 15, 20] # number of channels for 1st conv. layers
BS = [64, 128, 160, 224] # batch sizes
EPOCHS = [5, 8, 10, 12]

# 3. init Skorch NN to plug in randomized search
net = NeuralNetClassifier(
    
    # Question: Do I fill these in if Im doing randomized?
    NeuralNetworkDesign(conv1_out_channels = 10),
    batch_size=64,
    max_epochs=5,
    
    # Question: how come i dont get loss and accuracy with this:
    # optimizer=optim.SGD,
    # criterion=nn.NLLLoss,
    optimizer=optim.Adam,
    criterion=nn.CrossEntropyLoss,
    iterator_train__num_workers=4,
    device=DEVICE
)

net.fit(X_train, y_train)

conv1_out_channels: 10


## B. Predict the results

Before starting your evaluation, come up with a hypothesis for how you expect the network to behave along each dimension. Include these hypotheses in your report and then discuss whether the evaluation supported the hypothesis.


## TODO: 
come up with a hypothesis for how you expect the network to behave along each dimension.
discuss whether the evaluation supported the hypothesis.

In [7]:
# Question: This method returns the mean accuracy on the given data and labels for classifiers
# what score is this ? correct / total?
net.score(X_test, y_test)

0.9823

## C. Execute your plan

Run the evaluation and report on the results.


In [8]:
# 1. convert data to numpy
X_slicable =  SliceDatasetX(X_train)

# 2. set params
params = {
    'module__conv1_out_channels': CONV_CHANNELS,
    'batch_size': BS,
    'max_epochs': EPOCHS,
}


# 3. Create randomized search object
rs = RandomizedSearchCV(
                  net,
                  params,
                  refit=False,
                  cv=3,
                  scoring='accuracy',
                  verbose=2,
                  n_iter=50,
                  random_state=42)


# 4. run and evaluate
rs.fit(X_slicable, y_train)
print("best score: {:.3f}, best params: {}".format(rs.best_score_, rs.best_params_))



Fitting 3 folds for each of 50 candidates, totalling 150 fits
conv1_out_channels: 5
  epoch    train_loss    valid_acc    valid_loss     dur
-------  ------------  -----------  ------------  ------
      1        [36m0.5952[0m       [32m0.9515[0m        [35m0.1491[0m  3.3847
      2        [36m0.2425[0m       [32m0.9630[0m        [35m0.1098[0m  3.2699
      3        [36m0.1931[0m       [32m0.9689[0m        [35m0.0971[0m  3.2749
      4        [36m0.1820[0m       [32m0.9701[0m        [35m0.0932[0m  3.1324
      5        [36m0.1664[0m       [32m0.9714[0m        [35m0.0928[0m  3.1815
      6        [36m0.1572[0m       0.9695        0.0956  3.2179
      7        0.1591       0.9712        [35m0.0893[0m  3.1792
      8        [36m0.1476[0m       [32m0.9758[0m        [35m0.0775[0m  3.1682
[CV] END batch_size=224, max_epochs=8, module__conv1_out_channels=5; total time=  28.4s
conv1_out_channels: 5
  epoch    train_loss    valid_acc    valid_loss     dur

ERROR: Unexpected segmentation fault encountered in worker.
 

[CV] END batch_size=224, max_epochs=10, module__conv1_out_channels=20; total time=  11.6s
conv1_out_channels: 20
  epoch    train_loss    valid_acc    valid_loss     dur
-------  ------------  -----------  ------------  ------
      1        [36m0.5877[0m       [32m0.9574[0m        [35m0.1486[0m  5.2752
      2        [36m0.2235[0m       [32m0.9718[0m        [35m0.1051[0m  4.8477
      3        [36m0.1869[0m       [32m0.9754[0m        [35m0.0852[0m  4.8152
      4        [36m0.1570[0m       [32m0.9766[0m        [35m0.0783[0m  4.7937
      5        [36m0.1409[0m       0.9724        0.0988  4.8050
      6        [36m0.1324[0m       0.9752        0.0896  4.8248
      7        [36m0.1212[0m       [32m0.9776[0m        0.0789  4.8300
      8        [36m0.1132[0m       [32m0.9792[0m        0.0787  4.8138
      9        0.1198       0.9742        0.0939  4.8188
     10        [36m0.1054[0m       0.9779        0.0812  4.8252
[CV] END batch_size=224, max_epo

1 fits failed out of a total of 150.
The score on these train-test partitions for these parameters will be set to nan.
If these failures are not expected, you can try to debug them by setting error_score='raise'.

Below are more details about the failures:
--------------------------------------------------------------------------------
1 fits failed with the following error:
Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/miniforge/base/lib/python3.9/site-packages/sklearn/model_selection/_validation.py", line 686, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "/opt/homebrew/Caskroom/miniforge/base/lib/python3.9/site-packages/skorch/classifier.py", line 141, in fit
    return super(NeuralNetClassifier, self).fit(X, y, **fit_params)
  File "/opt/homebrew/Caskroom/miniforge/base/lib/python3.9/site-packages/skorch/net.py", line 1215, in fit
    self.partial_fit(X, y, **fit_params)
  File "/opt/homebrew/Caskroom/miniforge/base/lib/python3.9/sit