# TRISEP ML tutorial part II: Building your first fully connected network and a CNN 

## Building a simple fully connected network (a Multi-Layer Perceptron)

Let's set up the paths and make a dataset again:

In [1]:
import os,sys
currentdir = os.getcwd()
parentdir = os.path.dirname(currentdir)
sys.path.insert(0,parentdir) 

In [2]:
from utils.data_handling import WCH5Dataset

In [3]:
dset=WCH5Dataset("/data/TRISEP_data/NUPRISM.h5",val_split=0.1,test_split=0.1)

Now let's make our model. We'll talk about:
  - model parameters
  - inputs and the forward method
  - modules containing modules
  - the Sequential module

Let's open [simpleMLP](/edit/models/simpleMLP.py).
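
The contents of `models/simpleMLP.py` are not reproduced in this notebook, so here is a minimal sketch of the first three ideas using a hypothetical `TinyMLP`. The class name and layer sizes are invented for illustration and are not those of `SimpleMLP`.

```python
import torch
import torch.nn as nn


class TinyMLP(nn.Module):
    """Hypothetical two-layer MLP; not the tutorial's SimpleMLP."""

    def __init__(self, num_inputs=8, num_hidden=4, num_classes=3):
        super().__init__()
        # Sub-modules assigned as attributes are registered automatically,
        # so their weights and biases become parameters of TinyMLP.
        self.fc1 = nn.Linear(num_inputs, num_hidden)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(num_hidden, num_classes)

    def forward(self, x):
        # forward defines how an input batch flows through the layers.
        return self.fc2(self.relu(self.fc1(x)))


model = TinyMLP()
batch = torch.randn(5, 8)   # 5 examples with 8 features each
scores = model(batch)       # calling the module runs forward()
param_names = [name for name, _ in model.named_parameters()]
print(scores.shape)         # torch.Size([5, 3])
print(param_names)          # ['fc1.weight', 'fc1.bias', 'fc2.weight', 'fc2.bias']
```

Note that the parameter names are built from the attribute names of the sub-modules, which is the pattern to look for when inspecting the real model.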

In [4]:
from models.simpleMLP import SimpleMLP

In [5]:
model_MLP=SimpleMLP(num_classes=3)

Let's look at the parameters:

In [6]:
for name, param in model_MLP.named_parameters():
    print("name of a parameter: {}, type: {}, parameter requires a gradient?: {}".
          format(name, type(param),param.requires_grad))

name of a parameter: fc1.weight, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parameter: fc1.bias, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parameter: fc2.weight, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parameter: fc2.bias, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parameter: fc3.weight, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parameter: fc3.bias, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parameter: fc4.weight, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parameter: fc4.bias, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parameter: fc5.weight, type: <class 'torch.nn.parameter.Parameter'>, p

As we can see, the parameters have `requires_grad` set by default, i.e. we will be able to obtain the gradient of the loss function with respect to these parameters.
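
To make this concrete, the sketch below backpropagates through a single small `nn.Linear` layer and inspects the `.grad` attribute. The layer sizes, the labels and the choice of cross-entropy as the loss are assumptions made for illustration only; they are not taken from the tutorial's engine.

```python
import torch
import torch.nn as nn

layer = nn.Linear(4, 3)              # weight and bias have requires_grad=True
x = torch.randn(2, 4)                # 2 examples with 4 features
target = torch.tensor([0, 2])        # hypothetical class labels

print(layer.weight.grad)             # None: nothing has been backpropagated yet
loss = nn.functional.cross_entropy(layer(x), target)
loss.backward()                      # fills .grad for parameters that require it
print(layer.weight.grad.shape)       # torch.Size([3, 4])

# Freezing a parameter: autograd no longer computes a gradient for it.
frozen = nn.Linear(4, 3)
frozen.weight.requires_grad_(False)
nn.functional.cross_entropy(frozen(x), target).backward()
print(frozen.weight.grad is None, frozen.bias.grad is not None)  # True True
```

Switching `requires_grad` off like this is the usual way to keep part of a network fixed during training.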

Let's quickly look at the [source](https://pytorch.org/docs/stable/_modules/torch/nn/modules/linear.html#Linear) for the linear module.

The parameters descend from the `Tensor` class. When a `Parameter` object is assigned as an attribute of a `Module`, it is automatically added to that `Module`'s list of parameters. The parameter names and values are captured in the module's 'state dictionary':
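
The automatic registration can be seen in a small sketch. The `Scale` module below is hypothetical and exists only to contrast an `nn.Parameter` attribute with a plain tensor attribute.

```python
import torch
import torch.nn as nn


class Scale(nn.Module):
    """Hypothetical module that rescales and shifts its input."""

    def __init__(self):
        super().__init__()
        self.factor = nn.Parameter(torch.ones(1))   # registered as a parameter
        self.offset = torch.zeros(1)                # plain tensor: not registered

    def forward(self, x):
        return self.factor * x + self.offset


module = Scale()
names = [name for name, _ in module.named_parameters()]
state_keys = list(module.state_dict().keys())
print(names)                                      # ['factor']
print(isinstance(module.factor, torch.Tensor))    # True
print(state_keys)                                 # ['factor']
```

Only `factor` shows up in `named_parameters()` and in the state dictionary; the plain tensor `offset` would not be saved in a checkpoint or updated by an optimizer.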

In [7]:
model_MLP.state_dict()

OrderedDict([('fc1.weight',
              tensor([[ 0.0054, -0.0058, -0.0051,  ...,  0.0046,  0.0031, -0.0048],
                      [-0.0063,  0.0056,  0.0024,  ...,  0.0058,  0.0018, -0.0010],
                      [ 0.0013, -0.0045, -0.0029,  ..., -0.0062,  0.0038, -0.0017],
                      ...,
                      [ 0.0048, -0.0030,  0.0003,  ...,  0.0031,  0.0052,  0.0043],
                      [ 0.0050, -0.0021,  0.0047,  ...,  0.0020, -0.0011,  0.0059],
                      [ 0.0048, -0.0045, -0.0029,  ..., -0.0027, -0.0034, -0.0017]])),
             ('fc1.bias',
              tensor([-0.0030,  0.0004, -0.0036,  ...,  0.0033,  0.0046, -0.0026])),
             ('fc2.weight',
              tensor([[-0.0044,  0.0055,  0.0056,  ..., -0.0034,  0.0097,  0.0081],
                      [ 0.0050,  0.0051, -0.0008,  ...,  0.0020,  0.0031, -0.0077],
                      [-0.0093,  0.0099, -0.0008,  ..., -0.0088,  0.0013, -0.0056],
                      ...,
                    

Now let's look at the sequential version:

In [8]:
from models.simpleMLP import SimpleMLPSEQ
model_MLPSEQ=SimpleMLPSEQ(num_classes=3)

In [9]:
for name, param in model_MLPSEQ.named_parameters():
    print("name of a parameter: {}, type: {}, parameter requires a gradient?: {}".
          format(name, type(param),param.requires_grad))

name of a parameter: _sequence.0.weight, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parameter: _sequence.0.bias, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parameter: _sequence.2.weight, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parameter: _sequence.2.bias, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parameter: _sequence.4.weight, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parameter: _sequence.4.bias, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parameter: _sequence.6.weight, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parameter: _sequence.6.bias, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parame

In [10]:
print(model_MLPSEQ.state_dict())

OrderedDict([('_sequence.0.weight', tensor([[ 0.0032, -0.0050, -0.0031,  ..., -0.0045, -0.0046,  0.0055],
        [ 0.0036, -0.0037,  0.0061,  ...,  0.0063, -0.0015, -0.0021],
        [ 0.0006,  0.0035, -0.0003,  ...,  0.0012, -0.0033,  0.0062],
        ...,
        [ 0.0010, -0.0024, -0.0051,  ...,  0.0025, -0.0047, -0.0034],
        [ 0.0053,  0.0035, -0.0058,  ..., -0.0059,  0.0055,  0.0054],
        [-0.0015, -0.0051, -0.0025,  ...,  0.0026, -0.0023,  0.0035]])), ('_sequence.0.bias', tensor([-6.8265e-05, -1.7591e-03, -3.8521e-03,  ..., -2.1804e-03,
         6.3318e-04,  1.2940e-03])), ('_sequence.2.weight', tensor([[-0.0029, -0.0075,  0.0025,  ..., -0.0008, -0.0069,  0.0002],
        [ 0.0063, -0.0079, -0.0031,  ...,  0.0092,  0.0011,  0.0092],
        [ 0.0032, -0.0101, -0.0040,  ...,  0.0016, -0.0048,  0.0091],
        ...,
        [ 0.0077,  0.0040,  0.0023,  ...,  0.0071, -0.0001, -0.0074],
        [ 0.0011, -0.0078, -0.0038,  ..., -0.0013, -0.0034, -0.0005],
        [ 0.0061, 

As we can see, the parameters look similar but have different names.
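
The naming pattern can be reproduced with a hypothetical `TinySeq` module (layer sizes invented for illustration). Inside an `nn.Sequential`, each layer is named by its position, and layers without parameters still take up an index. The skipped odd indices in the output above are therefore consistent with parameter-free layers, such as activations, sitting between the linear layers. That reading is an assumption, since `simpleMLP.py` is not shown here.

```python
import torch.nn as nn


class TinySeq(nn.Module):
    """Hypothetical stand-in for a Sequential-based MLP."""

    def __init__(self):
        super().__init__()
        # Attribute name chosen to match the `_sequence` prefix in the output above.
        self._sequence = nn.Sequential(
            nn.Linear(8, 4),   # index 0: has weight and bias
            nn.ReLU(),         # index 1: no parameters
            nn.Linear(4, 3),   # index 2: has weight and bias
        )

    def forward(self, x):
        return self._sequence(x)


seq_names = [name for name, _ in TinySeq().named_parameters()]
print(seq_names)
# ['_sequence.0.weight', '_sequence.0.bias', '_sequence.2.weight', '_sequence.2.bias']
```

Because the state dictionary is keyed by these names, a checkpoint saved from one variant of a model cannot be loaded directly into the other without renaming the keys.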

In [11]:
import numpy as np
# np.ravel flattens its input into a 1D array
transform=np.ravel
dset=WCH5Dataset("/fast_scratch/TRISEP_data/NUPRISM.h5",val_split=0.1,test_split=0.1,transform=transform)
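
As a quick illustration of what the `np.ravel` transform does, the sketch below flattens a made-up array. The shape `(4, 5, 38)` is hypothetical: only the 38 channels come from this notebook, and placing them in the last axis is an inference from the transpose used later for the CNN.

```python
import numpy as np

# Hypothetical event shape (height, width, channels); the real dimensions
# of the NUPRISM events are not shown in this section.
event = np.arange(4 * 5 * 38, dtype=np.float32).reshape(4, 5, 38)

flat = np.ravel(event)   # same values, flattened to 1D in row-major order
print(event.shape, "->", flat.shape)   # (4, 5, 38) -> (760,)
```

A fully connected layer expects one feature vector per example, which is why a flattening step of this kind is a natural fit for the MLP.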

In [12]:
from utils.engine import Engine

Let's first create a configuration object; we'll use it to set up our training engine:

In [13]:
class CONFIG:
    pass
config=CONFIG()
config.batch_size_test = 1024
config.batch_size_train = 32
config.batch_size_val = 2048
config.lr=0.001
config.device = 'gpu'
config.num_workers_train=3
config.num_workers_val=2
config.num_workers_test=2
config.dump_path = '../model_state_dumps'
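
An empty class works fine as an attribute container. As an aside, the standard library's `types.SimpleNamespace` offers the same behaviour in a more compact form. The sketch below assumes the engine only reads attributes from the configuration object, which is not verified here.

```python
from types import SimpleNamespace

# Same fields as the CONFIG object, set in one call.
config_ns = SimpleNamespace(
    batch_size_test=1024,
    batch_size_train=32,
    batch_size_val=2048,
    lr=0.001,
    device='gpu',
    num_workers_train=3,
    num_workers_val=2,
    num_workers_test=2,
    dump_path='../model_state_dumps',
)

print(config_ns.lr)        # attribute access works as with CONFIG
print(vars(config_ns))     # dictionary view of all settings
```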


In [14]:
engine=Engine(model_MLP,dset,config)

Requesting a GPU
CUDA is available
Creating a directory for run dump: ../model_state_dumps/20190727_003326/


In [15]:
print(vars(config))

{'batch_size_test': 1024, 'batch_size_train': 32, 'batch_size_val': 2048, 'lr': 0.001, 'device': 'gpu', 'num_workers_train': 3, 'num_workers_val': 2, 'num_workers_test': 2, 'dump_path': '../model_state_dumps'}


In [None]:
%%time
engine.train(epochs=2.5,report_interval=10,valid_interval=100)

Epoch 0 Starting @ 2019-07-27 00:33:26
... Iteration 1 ... Epoch 0.00 ... Loss 4.277 ... Accuracy 0.188
... Iteration 1 ... Epoch 0.00 ... Validation Loss 63.753 ... Validation Accuracy 0.376
Saved checkpoint as: ../model_state_dumps/20190727_003326/SimpleMLP.pth
best validation loss so far!: 63.7532958984375
Saved checkpoint as: ../model_state_dumps/20190727_003326/SimpleMLPBEST.pth
... Iteration 11 ... Epoch 0.00 ... Loss 1.154 ... Accuracy 0.469
... Iteration 21 ... Epoch 0.00 ... Loss 1.012 ... Accuracy 0.500
... Iteration 31 ... Epoch 0.00 ... Loss 1.071 ... Accuracy 0.438
... Iteration 41 ... Epoch 0.00 ... Loss 1.097 ... Accuracy 0.469
... Iteration 51 ... Epoch 0.00 ... Loss 0.841 ... Accuracy 0.562
... Iteration 61 ... Epoch 0.00 ... Loss 0.979 ... Accuracy 0.531
... Iteration 71 ... Epoch 0.00 ... Loss 0.696 ... Accuracy 0.750
... Iteration 81 ... Epoch 0.00 ... Loss 0.851 ... Accuracy 0.625
... Iteration 91 ... Epoch 0.00 ... Loss 0.740 ... Accuracy 0.594
... Iteration 101 .

In [None]:
model_MLP._get_name()

In [None]:
from models.simpleCNN import SimpleCNN
model_CNN=SimpleCNN(num_input_channels=38,num_classes=3)

In [None]:
def rotate_chan(x):
    return np.transpose(x,(2,0,1))
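
PyTorch convolution layers expect the channel axis before the spatial axes, which is presumably why the channels are moved to the front here. The sketch below applies the same transpose to a made-up array. The `(4, 5)` spatial size is hypothetical; only the 38 channels match the `num_input_channels=38` used for the CNN.

```python
import numpy as np


def rotate_chan(x):
    # Move the last axis (channels) to the front: (H, W, C) -> (C, H, W).
    return np.transpose(x, (2, 0, 1))


# Hypothetical event with 38 channels in the last axis.
event = np.zeros((4, 5, 38), dtype=np.float32)
event[1, 2, 7] = 1.0          # mark row 1, column 2, channel 7

rotated = rotate_chan(event)
print(rotated.shape)          # (38, 4, 5)
print(rotated[7, 1, 2])       # 1.0: same element, now indexed channel-first
```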

In [None]:
dset=WCH5Dataset("/fast_scratch/TRISEP_data/NUPRISM.h5",val_split=0.1,test_split=0.1,transform=rotate_chan)

In [None]:
engine=Engine(model_CNN,dset,config)

In [None]:
for name, param in model_CNN.named_parameters():
    print("name of a parameter: {}, type: {}, parameter requires a gradient?: {}".
          format(name, type(param),param.requires_grad))

In [None]:
%%time
engine.train(epochs=5,report_interval=10,valid_interval=100)