# TRISEP ML tutorial part II: Building your first fully connected network and a CNN 

## Building a simple fully connected network (a Multi-Layer Perceptron)

Let's set up the paths and make a dataset again:

In [1]:
import os
import sys

# Put the parent directory on the import path so that `utils` and `models` can be imported
currentdir = os.getcwd()
parentdir = os.path.dirname(currentdir)
sys.path.insert(0, parentdir)

In [2]:
from utils.data_handling import WCH5Dataset

In [3]:
dset=WCH5Dataset("/data/TRISEP_data/NUPRISM.h5",val_split=0.1,test_split=0.1)

Now let's make our model. We'll talk about:
  - model parameters
  - inputs and the forward method
  - modules containing modules
  - the `Sequential` module

Let's open [simpleMLP](/edit/models/simpleMLP.py).
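
The contents of `simpleMLP.py` are not reproduced in this notebook, so the following is only a minimal sketch of the four ideas listed above. The class names `TinyMLP` and `TinyMLPSeq`, the layer sizes, and the input width are all hypothetical and are not taken from the tutorial's model.

```python
import torch
import torch.nn as nn


class TinyMLP(nn.Module):
    """Hypothetical two-layer MLP; all sizes are illustrative only."""

    def __init__(self, num_inputs=8, num_hidden=4, num_classes=3):
        super().__init__()
        # Assigning nn.Linear modules as attributes registers their
        # weights and biases as parameters of this module.
        self.fc1 = nn.Linear(num_inputs, num_hidden)
        self.fc2 = nn.Linear(num_hidden, num_classes)
        self.relu = nn.ReLU()

    def forward(self, x):
        # forward defines how a batch of inputs becomes class scores.
        return self.fc2(self.relu(self.fc1(x)))


class TinyMLPSeq(nn.Module):
    """The same layers expressed with a single nn.Sequential container."""

    def __init__(self, num_inputs=8, num_hidden=4, num_classes=3):
        super().__init__()
        self._sequence = nn.Sequential(
            nn.Linear(num_inputs, num_hidden),
            nn.ReLU(),
            nn.Linear(num_hidden, num_classes),
        )

    def forward(self, x):
        return self._sequence(x)


batch = torch.randn(5, 8)  # 5 hypothetical events with 8 features each
out_a = TinyMLP()(batch)
out_b = TinyMLPSeq()(batch)
print(out_a.shape, out_b.shape)
```

Both variants map a batch of shape `(5, 8)` to class scores of shape `(5, 3)`; they differ only in how the layers are stored and named.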

In [4]:
from models.simpleMLP import SimpleMLP

In [5]:
model_MLP=SimpleMLP(num_classes=3)

Let's look at the parameters:

In [6]:
for name, param in model_MLP.named_parameters():
    print("name of a parameter: {}, type: {}, parameter requires a gradient?: {}".
          format(name, type(param),param.requires_grad))

name of a parameter: fc1.weight, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parameter: fc1.bias, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parameter: fc2.weight, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parameter: fc2.bias, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parameter: fc3.weight, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parameter: fc3.bias, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parameter: fc4.weight, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parameter: fc4.bias, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parameter: fc5.weight, type: <class 'torch.nn.parameter.Parameter'>, p

As we can see, the parameters have `requires_grad` set by default, i.e. we will be able to obtain the gradient of the loss function with respect to these parameters.
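
To make the role of `requires_grad` concrete, here is a small sketch on a single `nn.Linear` layer. The layer sizes, the random inputs, and the labels are made up for illustration and are unrelated to the tutorial's model.

```python
import torch
import torch.nn as nn

layer = nn.Linear(4, 3)          # hypothetical sizes
x = torch.randn(2, 4)            # batch of 2 made-up inputs
target = torch.tensor([0, 2])    # made-up class labels

loss = nn.CrossEntropyLoss()(layer(x), target)

grad_before = layer.weight.grad  # None: nothing has been back-propagated yet
loss.backward()                  # fills .grad for parameters with requires_grad=True
grad_after = layer.weight.grad   # same shape as layer.weight

print(grad_before is None, grad_after.shape)

# A parameter can be frozen by switching the flag off; later backward
# passes will no longer compute a gradient for it.
layer.bias.requires_grad_(False)
```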

Let's quickly look at the [source](https://pytorch.org/docs/stable/_modules/torch/nn/modules/linear.html#Linear) for the `Linear` module.

`Parameter` is a subclass of `Tensor`. When a `Parameter` object is assigned as an attribute of a `Module` object, it is automatically added to that `Module`'s list of parameters. This list, together with the parameter values, is captured in the 'state dictionary' of a module:

In [7]:
model_MLP.state_dict()

OrderedDict([('fc1.weight',
              tensor([[ 0.0037,  0.0020,  0.0032,  ..., -0.0040, -0.0032,  0.0021],
                      [-0.0005,  0.0009, -0.0030,  ...,  0.0037,  0.0004, -0.0043],
                      [ 0.0008,  0.0003,  0.0047,  ...,  0.0038, -0.0048, -0.0016],
                      ...,
                      [ 0.0020,  0.0013,  0.0036,  ..., -0.0047,  0.0015, -0.0063],
                      [ 0.0046,  0.0015, -0.0002,  ..., -0.0033,  0.0026, -0.0042],
                      [-0.0028, -0.0045,  0.0060,  ..., -0.0004, -0.0010, -0.0053]])),
             ('fc1.bias',
              tensor([-0.0026, -0.0056,  0.0012,  ..., -0.0037, -0.0030,  0.0054])),
             ('fc2.weight',
              tensor([[-3.8633e-03,  3.7373e-03,  5.5108e-03,  ..., -6.2778e-03,
                        2.9362e-03,  9.6882e-03],
                      [ 3.6844e-03,  8.5249e-04, -1.9293e-03,  ...,  2.4259e-03,
                       -5.0504e-03, -5.6467e-03],
                      [-3.5439e-03, -
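
The automatic registration can be seen in a toy module. This is a sketch with a hypothetical `Scale` class that is not part of the tutorial code: the attribute wrapped in `nn.Parameter` shows up in `named_parameters()` and `state_dict()`, while the plain tensor attribute does not.

```python
import torch
import torch.nn as nn


class Scale(nn.Module):
    """Hypothetical module that multiplies its input by a learned vector."""

    def __init__(self, size):
        super().__init__()
        # Wrapped in nn.Parameter: registered automatically on assignment.
        self.gain = nn.Parameter(torch.ones(size))
        # Plain tensor attribute: not registered, so it is absent from
        # named_parameters() and state_dict().
        self.offset = torch.zeros(size)

    def forward(self, x):
        return x * self.gain + self.offset


m = Scale(3)
names = [name for name, _ in m.named_parameters()]
keys = list(m.state_dict().keys())
print(names, keys)
```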

Now let's look at the sequential version:

In [8]:
from models.simpleMLP import SimpleMLPSEQ
model_MLPSEQ=SimpleMLPSEQ(num_classes=3)

In [9]:
for name, param in model_MLPSEQ.named_parameters():
    print("name of a parameter: {}, type: {}, parameter requires a gradient?: {}".
          format(name, type(param),param.requires_grad))

name of a parameter: _sequence.0.weight, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parameter: _sequence.0.bias, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parameter: _sequence.2.weight, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parameter: _sequence.2.bias, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parameter: _sequence.4.weight, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parameter: _sequence.4.bias, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parameter: _sequence.6.weight, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parameter: _sequence.6.bias, type: <class 'torch.nn.parameter.Parameter'>, parameter requires a gradient?: True
name of a parame

In [10]:
print(model_MLPSEQ.state_dict())

OrderedDict([('_sequence.0.weight', tensor([[ 0.0048,  0.0032,  0.0023,  ...,  0.0050, -0.0008, -0.0044],
        [-0.0008,  0.0045,  0.0008,  ..., -0.0037, -0.0038,  0.0049],
        [-0.0030,  0.0057, -0.0057,  ...,  0.0014,  0.0024, -0.0004],
        ...,
        [ 0.0011,  0.0037, -0.0052,  ...,  0.0009, -0.0023,  0.0050],
        [ 0.0045, -0.0045,  0.0064,  ...,  0.0040, -0.0055,  0.0055],
        [-0.0016,  0.0045,  0.0035,  ..., -0.0005, -0.0014,  0.0019]])), ('_sequence.0.bias', tensor([ 0.0037, -0.0038, -0.0006,  ...,  0.0022,  0.0045, -0.0027])), ('_sequence.2.weight', tensor([[ 2.3485e-03, -8.5430e-03, -6.2439e-03,  ..., -1.8988e-03,
         -8.5869e-04,  2.6310e-06],
        [-1.8037e-03, -4.6619e-03,  5.9323e-03,  ..., -5.2055e-03,
          4.5018e-03, -8.4762e-04],
        [-2.0174e-04, -1.1191e-03,  4.4874e-03,  ..., -1.0695e-03,
          7.5351e-03, -6.6694e-03],
        ...,
        [ 1.0623e-03, -9.5664e-03,  7.8184e-03,  ...,  4.2625e-03,
          6.6599e-03,  5

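As we can see, the parameters look similar but have different names: the `Sequential` container names its children by position (`_sequence.0`, `_sequence.2`, ...) rather than by attribute name (`fc1`, `fc2`, ...).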
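
One practical consequence of the naming difference is that a state dictionary saved from one variant cannot be loaded directly into the other. The sketch below shows one way to remap the keys, using two small hypothetical modules (`Named` and `Seq`) rather than the tutorial's `SimpleMLP` and `SimpleMLPSEQ`; the `prefix_map` dictionary is likewise an assumption made for this example.

```python
import torch
import torch.nn as nn


class Named(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 3)
        self.fc2 = nn.Linear(3, 2)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))


class Seq(nn.Module):
    def __init__(self):
        super().__init__()
        self._sequence = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))

    def forward(self, x):
        return self._sequence(x)


named, seq = Named(), Seq()

# Hypothetical mapping from attribute names to positions in the Sequential.
prefix_map = {"fc1": "_sequence.0", "fc2": "_sequence.2"}

remapped = {}
for key, value in named.state_dict().items():
    prefix, suffix = key.split(".", 1)  # e.g. "fc1" and "weight"
    remapped[f"{prefix_map[prefix]}.{suffix}"] = value

seq.load_state_dict(remapped)  # raises if a key or shape does not match

x = torch.randn(5, 4)
same = torch.allclose(named(x), seq(x))
print(same)
```

After the remapped weights are loaded, both modules compute the same function, so their outputs agree on the same input.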

In [11]:
import numpy as np
transform=np.ravel
dset=WCH5Dataset("/fast_scratch/TRISEP_data/NUPRISM.h5",val_split=0.1,test_split=0.1,transform=transform)
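
The `transform=np.ravel` argument is presumably there because a fully connected layer takes a flat feature vector per event. As a small illustration of what `np.ravel` does, the sketch below flattens a made-up 3-D array; the shape used is hypothetical, since the real event dimensions are not shown in this notebook.

```python
import numpy as np

# Hypothetical event shape (height, width, channels).
event = np.arange(2 * 3 * 4).reshape(2, 3, 4)

# np.ravel returns a 1-D array in row-major (C) order.
flat = np.ravel(event)
print(event.shape, "->", flat.shape)

# Element [i, j, k] ends up at flat index i*(3*4) + j*4 + k.
i, j, k = 0, 1, 1
flat_index = i * (3 * 4) + j * 4 + k
```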

In [12]:
from utils.engine import Engine

Let's first create a configuration object. We'll use it to set up our training engine:

In [13]:
class CONFIG:
    pass
config=CONFIG()
config.batch_size_test = 1024
config.batch_size_train = 32
config.batch_size_val = 8192
config.lr=0.001
config.device = 'gpu'
config.num_workers_train=3
config.num_workers_val=2
config.num_workers_test=2
config.dump_path = '../model_state_dumps'
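
An empty class used as an attribute bag works fine. As an aside, the standard library's `types.SimpleNamespace` offers the same attribute-style access in a single expression. The sketch below assumes that the consumer only reads attributes from the object; whether `Engine` accepts it has not been checked here, and the name `config_alt` is hypothetical.

```python
from types import SimpleNamespace

# Same settings as above, gathered in one place.
config_alt = SimpleNamespace(
    batch_size_test=1024,
    batch_size_train=32,
    batch_size_val=8192,
    lr=0.001,
    device="gpu",
    num_workers_train=3,
    num_workers_val=2,
    num_workers_test=2,
    dump_path="../model_state_dumps",
)

# Attributes are read the same way, and vars() returns them as a dict.
print(config_alt.lr, vars(config_alt)["device"])
```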


In [14]:
engine=Engine(model_MLP,dset,config)

Requesting a GPU
CUDA is available
Creating a directory for run dump: ../model_state_dumps/20190726_093030/


In [15]:
print(vars(config))

{'batch_size_test': 1024, 'batch_size_train': 32, 'batch_size_val': 8192, 'lr': 0.001, 'device': 'gpu', 'num_workers_train': 3, 'num_workers_val': 2, 'num_workers_test': 2, 'dump_path': '../model_state_dumps'}


In [None]:
%%time
engine.train(epochs=2.5,report_interval=10,valid_interval=100)

Epoch 0 Starting @ 2019-07-26 09:30:30
... Iteration 1 ... Epoch 0.00 ... Loss 2.964 ... Accuracy 0.312
... Iteration 1 ... Epoch 0.00 ... Validation Loss 69.738 ... Validation Accuracy 0.334
Saved checkpoint as: ../model_state_dumps/20190726_093030/SimpleMLP.pth
best validation loss so far!: 69.7379150390625
Saved checkpoint as: ../model_state_dumps/20190726_093030/SimpleMLPBEST.pth
... Iteration 11 ... Epoch 0.00 ... Loss 1.098 ... Accuracy 0.375
... Iteration 21 ... Epoch 0.00 ... Loss 1.210 ... Accuracy 0.375
... Iteration 31 ... Epoch 0.00 ... Loss 1.069 ... Accuracy 0.469
... Iteration 41 ... Epoch 0.00 ... Loss 0.939 ... Accuracy 0.531
... Iteration 51 ... Epoch 0.00 ... Loss 1.034 ... Accuracy 0.469
... Iteration 61 ... Epoch 0.00 ... Loss 0.822 ... Accuracy 0.656
... Iteration 71 ... Epoch 0.00 ... Loss 0.800 ... Accuracy 0.719
... Iteration 81 ... Epoch 0.00 ... Loss 0.896 ... Accuracy 0.500
... Iteration 91 ... Epoch 0.00 ... Loss 0.854 ... Accuracy 0.531
... Iteration 101 .

... Iteration 861 ... Epoch 0.04 ... Loss 0.524 ... Accuracy 0.781
... Iteration 871 ... Epoch 0.04 ... Loss 0.338 ... Accuracy 0.781
... Iteration 881 ... Epoch 0.04 ... Loss 0.474 ... Accuracy 0.750
... Iteration 891 ... Epoch 0.04 ... Loss 0.467 ... Accuracy 0.750
... Iteration 901 ... Epoch 0.04 ... Loss 0.538 ... Accuracy 0.688
... Iteration 901 ... Epoch 0.04 ... Validation Loss 0.505 ... Validation Accuracy 0.701
Saved checkpoint as: ../model_state_dumps/20190726_093030/SimpleMLP.pth
... Iteration 911 ... Epoch 0.04 ... Loss 0.463 ... Accuracy 0.688
... Iteration 921 ... Epoch 0.04 ... Loss 0.539 ... Accuracy 0.625
... Iteration 931 ... Epoch 0.04 ... Loss 0.611 ... Accuracy 0.594
... Iteration 941 ... Epoch 0.04 ... Loss 0.444 ... Accuracy 0.750
... Iteration 951 ... Epoch 0.04 ... Loss 0.442 ... Accuracy 0.719
... Iteration 961 ... Epoch 0.04 ... Loss 0.622 ... Accuracy 0.562
... Iteration 971 ... Epoch 0.04 ... Loss 0.508 ... Accuracy 0.719
... Iteration 981 ... Epoch 0.04 ..

... Iteration 1801 ... Epoch 0.08 ... Loss 0.386 ... Accuracy 0.812
... Iteration 1801 ... Epoch 0.08 ... Validation Loss 0.477 ... Validation Accuracy 0.720
Saved checkpoint as: ../model_state_dumps/20190726_093030/SimpleMLP.pth
... Iteration 1811 ... Epoch 0.08 ... Loss 0.380 ... Accuracy 0.781
... Iteration 1821 ... Epoch 0.08 ... Loss 0.445 ... Accuracy 0.750
... Iteration 1831 ... Epoch 0.08 ... Loss 0.442 ... Accuracy 0.750
... Iteration 1841 ... Epoch 0.08 ... Loss 0.409 ... Accuracy 0.812
... Iteration 1851 ... Epoch 0.08 ... Loss 0.498 ... Accuracy 0.750
... Iteration 1861 ... Epoch 0.08 ... Loss 0.425 ... Accuracy 0.719
... Iteration 1871 ... Epoch 0.08 ... Loss 0.422 ... Accuracy 0.719
... Iteration 1881 ... Epoch 0.08 ... Loss 0.459 ... Accuracy 0.781
... Iteration 1891 ... Epoch 0.08 ... Loss 0.468 ... Accuracy 0.875
... Iteration 1901 ... Epoch 0.08 ... Loss 0.522 ... Accuracy 0.750
... Iteration 1901 ... Epoch 0.08 ... Validation Loss 0.471 ... Validation Accuracy 0.723


... Iteration 2771 ... Epoch 0.12 ... Loss 0.426 ... Accuracy 0.719
... Iteration 2781 ... Epoch 0.12 ... Loss 0.471 ... Accuracy 0.750
... Iteration 2791 ... Epoch 0.12 ... Loss 0.322 ... Accuracy 0.875
... Iteration 2801 ... Epoch 0.12 ... Loss 0.540 ... Accuracy 0.562
... Iteration 2801 ... Epoch 0.12 ... Validation Loss 0.457 ... Validation Accuracy 0.742
Saved checkpoint as: ../model_state_dumps/20190726_093030/SimpleMLP.pth
... Iteration 2811 ... Epoch 0.12 ... Loss 0.364 ... Accuracy 0.812
... Iteration 2821 ... Epoch 0.13 ... Loss 0.465 ... Accuracy 0.688
... Iteration 2831 ... Epoch 0.13 ... Loss 0.432 ... Accuracy 0.812
... Iteration 2841 ... Epoch 0.13 ... Loss 0.561 ... Accuracy 0.562
... Iteration 2851 ... Epoch 0.13 ... Loss 0.401 ... Accuracy 0.781
... Iteration 2861 ... Epoch 0.13 ... Loss 0.322 ... Accuracy 0.844
... Iteration 2871 ... Epoch 0.13 ... Loss 0.551 ... Accuracy 0.688
... Iteration 2881 ... Epoch 0.13 ... Loss 0.505 ... Accuracy 0.656
... Iteration 2891 ...

... Iteration 3721 ... Epoch 0.17 ... Loss 0.391 ... Accuracy 0.844
... Iteration 3731 ... Epoch 0.17 ... Loss 0.324 ... Accuracy 0.812
... Iteration 3741 ... Epoch 0.17 ... Loss 0.561 ... Accuracy 0.625
... Iteration 3751 ... Epoch 0.17 ... Loss 0.494 ... Accuracy 0.656
... Iteration 3761 ... Epoch 0.17 ... Loss 0.435 ... Accuracy 0.719
... Iteration 3771 ... Epoch 0.17 ... Loss 0.340 ... Accuracy 0.906
... Iteration 3781 ... Epoch 0.17 ... Loss 0.467 ... Accuracy 0.719
... Iteration 3791 ... Epoch 0.17 ... Loss 0.439 ... Accuracy 0.750
... Iteration 3801 ... Epoch 0.17 ... Loss 0.357 ... Accuracy 0.844
... Iteration 3801 ... Epoch 0.17 ... Validation Loss 0.463 ... Validation Accuracy 0.726
Saved checkpoint as: ../model_state_dumps/20190726_093030/SimpleMLP.pth
... Iteration 3811 ... Epoch 0.17 ... Loss 0.514 ... Accuracy 0.688
... Iteration 3821 ... Epoch 0.17 ... Loss 0.507 ... Accuracy 0.688
... Iteration 3831 ... Epoch 0.17 ... Loss 0.462 ... Accuracy 0.781
... Iteration 3841 ...

... Iteration 4691 ... Epoch 0.21 ... Loss 0.412 ... Accuracy 0.750
... Iteration 4701 ... Epoch 0.21 ... Loss 0.425 ... Accuracy 0.844
... Iteration 4701 ... Epoch 0.21 ... Validation Loss 0.434 ... Validation Accuracy 0.751
Saved checkpoint as: ../model_state_dumps/20190726_093030/SimpleMLP.pth
best validation loss so far!: 0.43406054377555847
Saved checkpoint as: ../model_state_dumps/20190726_093030/SimpleMLPBEST.pth
... Iteration 4711 ... Epoch 0.21 ... Loss 0.454 ... Accuracy 0.750
... Iteration 4721 ... Epoch 0.21 ... Loss 0.399 ... Accuracy 0.781
... Iteration 4731 ... Epoch 0.21 ... Loss 0.442 ... Accuracy 0.719
... Iteration 4741 ... Epoch 0.21 ... Loss 0.497 ... Accuracy 0.781
... Iteration 4751 ... Epoch 0.21 ... Loss 0.444 ... Accuracy 0.750
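
The source of `Engine` is not shown in this notebook. Judging only from the log above, it validates every `valid_interval` iterations, saves a checkpoint each time, and keeps a separate "BEST" checkpoint whenever the validation loss improves. The sketch below imitates that pattern on synthetic data. It is not the `Engine` implementation: the data, the model, the Adam optimizer, and all variable names are assumptions made for illustration.

```python
import copy

import torch
import torch.nn as nn

torch.manual_seed(0)


def make_data(n):
    """Synthetic stand-in data: 3 separated classes in 2-D (not NUPRISM data)."""
    labels = torch.randint(0, 3, (n,))
    centers = torch.tensor([[0.0, 4.0], [4.0, 0.0], [-4.0, -4.0]])
    return centers[labels] + torch.randn(n, 2), labels


x_train, y_train = make_data(512)
x_val, y_val = make_data(256)

model = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 3))
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

batch_size, valid_interval = 32, 10
best_val_loss, best_state = float("inf"), None
val_history = []

for iteration in range(100):
    # Draw a random mini-batch and take one optimisation step.
    idx = torch.randint(0, len(x_train), (batch_size,))
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(x_train[idx]), y_train[idx])
    loss.backward()
    optimizer.step()

    if iteration % valid_interval == 0:
        # Evaluate on held-out data without tracking gradients.
        model.eval()
        with torch.no_grad():
            val_logits = model(x_val)
            val_loss = criterion(val_logits, y_val).item()
            val_acc = (val_logits.argmax(dim=1) == y_val).float().mean().item()
        val_history.append(val_loss)
        if val_loss < best_val_loss:
            # Keep a copy of the weights with the lowest validation loss so far.
            best_val_loss = val_loss
            best_state = copy.deepcopy(model.state_dict())
        print(f"iter {iteration} val loss {val_loss:.3f} val acc {val_acc:.3f}")
```

Here the best weights are kept in memory with `copy.deepcopy`; writing them to disk with `torch.save(model.state_dict(), path)` would be the file-based equivalent.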


In [None]:
model_MLP._get_name()

In [None]:
from models.simpleCNN import SimpleCNN
model_CNN=SimpleCNN(num_input_channels=38,num_classes=3)

In [None]:
def rotate_chan(x):
    return np.transpose(x,(2,0,1))
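
PyTorch's `nn.Conv2d` expects the channel axis before the two spatial axes, i.e. `(C, H, W)` for a single sample. The transpose in `rotate_chan` suggests that the stored events are channel-last, `(H, W, C)`, although that layout is an inference and is not stated in this notebook. A small sketch of the axis reordering, with a hypothetical spatial size and 38 channels to match `num_input_channels` above:

```python
import numpy as np


def rotate_chan(x):
    # Move the last axis (channels) to the front: (H, W, C) -> (C, H, W).
    return np.transpose(x, (2, 0, 1))


# Hypothetical spatial size of 4 x 5.
event = np.zeros((4, 5, 38))
event[1, 2, 7] = 1.0  # mark one element so it can be tracked

rotated = rotate_chan(event)
print(event.shape, "->", rotated.shape)

# The marked element at (row 1, column 2, channel 7) is now at (7, 1, 2).
marked_value = rotated[7, 1, 2]
```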

In [None]:
dset=WCH5Dataset("/fast_scratch/TRISEP_data/NUPRISM.h5",val_split=0.1,test_split=0.1,transform=rotate_chan)

In [None]:
engine=Engine(model_CNN,dset,config)

In [None]:
for name, param in model_CNN.named_parameters():
    print("name of a parameter: {}, type: {}, parameter requires a gradient?: {}".
          format(name, type(param),param.requires_grad))

In [None]:
engine.train(epochs=5,report_interval=10,valid_interval=100)