# Training the model
After setting up the dataset that turns the CT scans into model input, we have to build the model and define the training loop. We will use a Python script to build our **LunaDataset**, a subclass of the PyTorch Dataset that creates the tensors used as input to our model from the CT scan data. Our goal is twofold. First we want to find out which of the candidates is a nodule and which is not, then we want to classify each nodule according to its nature: benign or malignant. We will need two different models, one for each task. Use the GPU runtime.

In [None]:
!git clone https://github.com/deep-learning-with-pytorch/dlwpt-code.git

Cloning into 'dlwpt-code'...
remote: Enumerating objects: 703, done.
remote: Total 703 (delta 0), reused 0 (delta 0), pack-reused 703
Receiving objects: 100% (703/703), 176.00 MiB | 21.71 MiB/s, done.
Resolving deltas: 100% (309/309), done.
Checking out files: 100% (228/228), done.


The training loop for the project is more complex than the ones we have seen before, so it is defined in a Python script. Training and validation are implemented in the **LunaTrainingApp** Python class in the dlwpt-code/p2ch11/training.py script.

## LunaModel
The model that we will use is also complex enough to be defined in a separate Python script, dlwpt-code/p2ch11/model.py. It consists of an input layer and a Batch Normalization layer, a backbone of four blocks (LunaBlock), and a final fully connected layer followed by a softmax that returns the probabilities of the input being a nodule or not. A LunaBlock contains two 3D convolutional layers with ReLU activation functions, and a max pooling layer. The 3D convolutions have a 3x3x3 kernel. A 3D convolution works like a 2D one; the differences are that more neighboring units contribute to each output value, and that the kernel is shifted along three directions instead of two. A 3x3x3 kernel applied to a 3x3x3 volume outputs one voxel. If enough padding is used (one voxel on each side for a 3x3x3 kernel), the output of a 3D convolution has the same size as its input. The max pooling layers within each LunaBlock reduce an input volume of size 32x48x48 (1 channel) to an output of size 2x3x3 (64 channels) at the end of the backbone. The output of the backbone is flattened to a one-dimensional vector in order to be used as input to the following fully connected layer.
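The size arithmetic in the paragraph above can be checked with a few lines of plain Python. This is only a sketch of the bookkeeping, not the model itself. It assumes a padding of 1 for every 3x3x3 convolution, a 2x2x2 max pooling with stride 2, and 8 channels in the first block that double in each following block; the helper names are made up for illustration.

```python
def conv3d_out(size, kernel=3, padding=1, stride=1):
    """Output length along one axis of a convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def maxpool_out(size, kernel=2, stride=2):
    """Output length along one axis of a max pooling layer."""
    return (size - kernel) // stride + 1


def backbone_shape(shape=(32, 48, 48), first_channels=8, n_blocks=4):
    """Follow a volume through n_blocks of (conv, conv, max pool)."""
    channels = first_channels
    for block in range(n_blocks):
        if block > 0:
            channels *= 2  # assumed: channels double from block to block
        shape = tuple(conv3d_out(s) for s in shape)   # first convolution
        shape = tuple(conv3d_out(s) for s in shape)   # second convolution
        shape = tuple(maxpool_out(s) for s in shape)  # pooling halves each axis
    return channels, shape


# A 3x3x3 kernel on a 3x3x3 volume without padding gives a single voxel.
single_voxel = conv3d_out(3, padding=0)

channels, shape = backbone_shape()
flat_size = channels * shape[0] * shape[1] * shape[2]
print(single_voxel, channels, shape, flat_size)
```

Under these assumptions the backbone ends with 64 channels of size 2x3x3, so the flattened vector fed to the fully connected layer has 1152 elements.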

## Downloading the data

In [None]:
cd dlwpt-code/

/content/dlwpt-code


In [None]:
mkdir data-unversioned

In [None]:
cd data-unversioned

/content/dlwpt-code/data-unversioned


In [None]:
mkdir part2

In [None]:
cd part2

/content/dlwpt-code/data-unversioned/part2


In [None]:
mkdir luna

In [None]:
cd luna

/content/dlwpt-code/data-unversioned/part2/luna
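The chain of `mkdir` and `cd` cells above could also be collapsed into a single call. The following is a minimal sketch using `pathlib`; the function name `make_data_dir` is hypothetical, and the demo runs in a temporary directory rather than in /content/dlwpt-code.

```python
import tempfile
from pathlib import Path


def make_data_dir(root):
    """Create <root>/data-unversioned/part2/luna, including missing parents."""
    target = Path(root) / "data-unversioned" / "part2" / "luna"
    # parents=True creates the intermediate folders; exist_ok=True makes
    # the call safe to repeat when the cell is re-run.
    target.mkdir(parents=True, exist_ok=True)
    return target


with tempfile.TemporaryDirectory() as tmp:
    luna_dir = make_data_dir(tmp)
    make_data_dir(tmp)  # a second call does not raise
    created = luna_dir.is_dir()
    relative = luna_dir.relative_to(tmp).as_posix()

print(created, relative)
```

Unlike a bare `mkdir`, this does not fail when the notebook is executed a second time in the same runtime.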


In [None]:
!wget https://zenodo.org/record/3723295/files/subset0.zip

--2022-11-27 14:15:24--  https://zenodo.org/record/3723295/files/subset0.zip
Resolving zenodo.org (zenodo.org)... 188.185.124.72
Connecting to zenodo.org (zenodo.org)|188.185.124.72|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 6811924508 (6.3G) [application/octet-stream]
Saving to: ‘subset0.zip’


2022-11-27 14:19:38 (25.8 MB/s) - ‘subset0.zip’ saved [6811924508/6811924508]



In [None]:
!7z x subset0.zip


7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,64 bits,2 CPUs Intel(R) Xeon(R) CPU @ 2.20GHz (406F0),ASM,AES-NI)

Scanning the drive for archives:
  0M Scan         1 file, 6811924508 bytes (6497 MiB)

Extracting archive: subset0.zip

ERRORS:
Headers Error

--
Path = subset0.zip
Type = zip
ERRORS:
Headers Error
Physical Size = 6811924508
64-bit = +

  0% 1 - subset0/1.3.6.1.4.1.14519.5.2.1.6 . 105756658031515062000744821260.raw
  0% 3 - subset0/1.3.6.1.4.1.14519.5.2.1.6 . 108197895896446896160048741492.raw
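Because 7-Zip reports a "Headers Error" above, it may be worth checking that the extraction produced a complete set of files. The sketch below assumes that each scan is stored as a pair of files with the same base name, a `.mhd` header and a `.raw` data file (only the `.raw` files are visible in the output above). The function name `find_unpaired` is hypothetical, and the demo uses dummy files in a temporary directory.

```python
import tempfile
from pathlib import Path


def find_unpaired(directory):
    """Return (.mhd files without .raw, .raw files without .mhd) by base name."""
    directory = Path(directory)
    mhd = {p.stem for p in directory.glob("*.mhd")}
    raw = {p.stem for p in directory.glob("*.raw")}
    return sorted(mhd - raw), sorted(raw - mhd)


with tempfile.TemporaryDirectory() as tmp:
    # Dummy files standing in for an extracted subset folder.
    for name in ("scan.1.mhd", "scan.1.raw", "scan.2.raw"):
        (Path(tmp) / name).write_bytes(b"")
    missing_raw, missing_mhd = find_unpaired(tmp)

print(missing_raw, missing_mhd)
```

On the real data the function would be called on the extracted `subset0` folder; two empty lists would suggest that every scan has both of its files.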

In [None]:
!pip install SimpleITK

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting SimpleITK
  Downloading SimpleITK-2.2.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (52.8 MB)
     |████████████████████████████████| 52.8 MB 229 kB/s
Installing collected packages: SimpleITK
Successfully installed SimpleITK-2.2.0


In [None]:
!pip install "diskcache==4.1.0"

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting diskcache==4.1.0
  Downloading diskcache-4.1.0-py2.py3-none-any.whl (44 kB)
     |████████████████████████████████| 44 kB 3.0 MB/s
Installing collected packages: diskcache
Successfully installed diskcache-4.1.0


In [None]:
cd /content/dlwpt-code/

/content/dlwpt-code


## Setting up the LunaDataset
We set up the LunaDataset that will be used to train the model.

In [None]:
from p2ch10.dsets import getCandidateInfoList, getCt, LunaDataset
candidateInfo_list = getCandidateInfoList(requireOnDisk_bool=True)
positiveInfo_list = [x for x in candidateInfo_list if x[0]]
diameter_list = [x[1] for x in positiveInfo_list]

In [None]:
print(len(positiveInfo_list))
print(positiveInfo_list[0])

122
CandidateInfoTuple(isNodule_bool=True, diameter_mm=25.23320204, series_uid='1.3.6.1.4.1.14519.5.2.1.6279.6001.511347030803753100045216493273', center_xyz=(63.4740118048, 73.9174523314, -213.736128767))
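The filtering cell above indexes the tuples by position (`x[0]`, `x[1]`). Since the printed record is a named tuple, the same selection can be written with field names, which is easier to read. The sketch below defines a stand-in `CandidateInfoTuple` with the field names shown in the output; the real class is defined in the p2ch10 code, and the sample values here are invented.

```python
from collections import namedtuple

# Stand-in with the same field names as the record printed above.
CandidateInfoTuple = namedtuple(
    "CandidateInfoTuple",
    "isNodule_bool, diameter_mm, series_uid, center_xyz",
)

# Invented sample records, for illustration only.
candidates = [
    CandidateInfoTuple(True, 25.2, "uid-a", (63.5, 73.9, -213.7)),
    CandidateInfoTuple(False, 0.0, "uid-b", (1.0, 2.0, 3.0)),
    CandidateInfoTuple(True, 4.1, "uid-c", (0.0, 0.0, 0.0)),
]

# Equivalent to `[x for x in candidates if x[0]]` and `[x[1] for x in ...]`.
positives = [c for c in candidates if c.isNodule_bool]
diameters = [c.diameter_mm for c in positives]

print(len(positives), diameters)
```

Positional and named access refer to the same values, so `candidates[0][0]` and `candidates[0].isNodule_bool` are interchangeable.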


In [None]:
for i in range(0, len(diameter_list), 100):
    print('{:4}  {:4.1f} mm'.format(i, diameter_list[i]))

   0  25.2 mm
 100   0.0 mm
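Only two lines are printed because `range(0, len(diameter_list), 100)` visits indices 0 and 100 of the 122 positive samples. The short sketch below reproduces the stride and the format specification on an invented list; the helper name `sample_every` is hypothetical.

```python
def sample_every(values, step):
    """Format every step-th value as '<index>  <value> mm'."""
    return [
        # {:4} pads the index to width 4; {:4.1f} keeps one decimal place.
        '{:4}  {:4.1f} mm'.format(i, values[i])
        for i in range(0, len(values), step)
    ]


# Invented data: one large diameter followed by 121 zeros.
demo_diameters = [25.23320204] + [0.0] * 121
lines = sample_every(demo_diameters, 100)
for line in lines:
    print(line)
```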


In [None]:
for candidateInfo_tup in positiveInfo_list[:10]:
    print(candidateInfo_tup)
for candidateInfo_tup in positiveInfo_list[-10:]:
    print(candidateInfo_tup)
    
for candidateInfo_tup in positiveInfo_list:
    if candidateInfo_tup.series_uid.endswith('565'):
        print(candidateInfo_tup)

CandidateInfoTuple(isNodule_bool=True, diameter_mm=25.23320204, series_uid='1.3.6.1.4.1.14519.5.2.1.6279.6001.511347030803753100045216493273', center_xyz=(63.4740118048, 73.9174523314, -213.736128767))
CandidateInfoTuple(isNodule_bool=True, diameter_mm=21.58311204, series_uid='1.3.6.1.4.1.14519.5.2.1.6279.6001.905371958588660410240398317235', center_xyz=(109.142472723, 49.6356928166, -121.183579092))
CandidateInfoTuple(isNodule_bool=True, diameter_mm=19.65387738, series_uid='1.3.6.1.4.1.14519.5.2.1.6279.6001.752756872840730509471096155114', center_xyz=(56.1226132601, 67.868268695, -65.6269886453))
CandidateInfoTuple(isNodule_bool=True, diameter_mm=18.7832325, series_uid='1.3.6.1.4.1.14519.5.2.1.6279.6001.202811684116768680758082619196', center_xyz=(-82.79150362, -21.43587141, -97.18427459))
CandidateInfoTuple(isNodule_bool=True, diameter_mm=17.75323185, series_uid='1.3.6.1.4.1.14519.5.2.1.6279.6001.187451715205085403623595258748', center_xyz=(94.1132711884, -15.8936132585, -202.8472282

In [None]:
from p2ch10.vis import findPositiveSamples, showCandidate
positiveSample_list = findPositiveSamples()

2022-11-27 14:24:33,369 INFO     pid:75 p2ch10.dsets:173:__init__ <p2ch10.dsets.LunaDataset object at 0x7f0629ca3fd0>: 56938 training samples


0 CandidateInfoTuple(isNodule_bool=True, diameter_mm=25.23320204, series_uid='1.3.6.1.4.1.14519.5.2.1.6279.6001.511347030803753100045216493273', center_xyz=(63.4740118048, 73.9174523314, -213.736128767))
1 CandidateInfoTuple(isNodule_bool=True, diameter_mm=21.58311204, series_uid='1.3.6.1.4.1.14519.5.2.1.6279.6001.905371958588660410240398317235', center_xyz=(109.142472723, 49.6356928166, -121.183579092))
2 CandidateInfoTuple(isNodule_bool=True, diameter_mm=19.65387738, series_uid='1.3.6.1.4.1.14519.5.2.1.6279.6001.752756872840730509471096155114', center_xyz=(56.1226132601, 67.868268695, -65.6269886453))
3 CandidateInfoTuple(isNodule_bool=True, diameter_mm=18.7832325, series_uid='1.3.6.1.4.1.14519.5.2.1.6279.6001.202811684116768680758082619196', center_xyz=(-82.79150362, -21.43587141, -97.18427459))
4 CandidateInfoTuple(isNodule_bool=True, diameter_mm=17.75323185, series_uid='1.3.6.1.4.1.14519.5.2.1.6279.6001.187451715205085403623595258748', center_xyz=(94.1132711884, -15.8936132585, -2

In [None]:
tuple_list = LunaDataset()

2022-11-27 14:24:40,504 INFO     pid:75 p2ch10.dsets:173:__init__ <p2ch10.dsets.LunaDataset object at 0x7f0629ca3e50>: 56938 training samples


In [None]:
pwd

'/content/dlwpt-code'

## Train and run the model
The cells below follow the notebook p2_run_everything.ipynb.

In [None]:
import datetime

from util.util import importstr
from util.logconf import logging
log = logging.getLogger('nb')

In [None]:
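def run(app, *argv):
    argv = list(argv)
    # Number of data-loading worker processes; adjust to the available CPU cores.
    argv.insert(0, '--num-workers=4')
    log.info("Running: {}({!r}).main()".format(app, argv))

    # Split 'package.module.ClassName' into module path and class name,
    # then import the class.
    app_cls = importstr(*app.rsplit('.', 1))
    app_cls(argv).main()

    log.info("Finished: {}({!r}).main()".format(app, argv))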
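The `run` function relies on importing a class from a dotted path given as a string. The following sketch shows the same idea with the standard library only. The helper `import_by_path` is a hypothetical stand-in, not the repository's `importstr`, and it is demonstrated on standard-library names so that it runs anywhere.

```python
import importlib


def import_by_path(dotted_path):
    """Import 'package.module.Name' and return the object called Name."""
    module_name, attr_name = dotted_path.rsplit(".", 1)
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# Resolve a class from its dotted path, then instantiate it as usual.
ordered_dict_cls = import_by_path("collections.OrderedDict")
instance = ordered_dict_cls(a=1)
print(type(instance).__name__, dict(instance))
```

Resolving the application class at call time means that each app module is imported only when `run` is actually invoked for it.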

In [None]:
import os
import shutil

# clean up any old data that might be around.
# We don't call this by default because it's destructive, 
# and would waste a lot of time if it ran when nothing 
# on the application side had changed.
def cleanCache():
    shutil.rmtree('data-unversioned/cache')
    os.mkdir('data-unversioned/cache')

# cleanCache()
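Note that `shutil.rmtree` raises an error when the cache folder does not exist yet, for example on a fresh runtime. A more tolerant variant could look like the sketch below; `reset_dir` is a hypothetical name, and the demo operates on a temporary directory instead of the real cache.

```python
import shutil
import tempfile
from pathlib import Path


def reset_dir(path):
    """Delete `path` if it exists, then recreate it empty."""
    path = Path(path)
    shutil.rmtree(path, ignore_errors=True)  # no error if the folder is missing
    path.mkdir(parents=True, exist_ok=True)
    return path


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "data-unversioned" / "cache"
    created_from_scratch = reset_dir(cache).is_dir()  # folder did not exist
    (cache / "stale.bin").write_bytes(b"old")
    reset_dir(cache)                                  # folder had content
    remaining = list(cache.iterdir())

print(created_from_scratch, remaining)
```

As with `cleanCache`, this is destructive, so it should only be called deliberately.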


In [None]:
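# Longer settings, kept for reference.
training_epochs = 20
experiment_epochs = 10
final_epochs = 50

# Shorter settings for a quick run; these override the values above.
training_epochs = 2
experiment_epochs = 2
final_epochs = 5
seg_epochs = 10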

In [None]:
run('p2ch11.prepcache.LunaPrepCacheApp')

2022-11-27 14:25:07,448 INFO     pid:75 nb:004:run Running: p2ch11.prepcache.LunaPrepCacheApp(['--num-workers=4']).main()
2022-11-27 14:25:09,280 INFO     pid:75 p2ch11.prepcache:043:main Starting LunaPrepCacheApp, Namespace(batch_size=1024, num_workers=4)
2022-11-27 14:25:11,250 INFO     pid:75 p2ch11.dsets:185:__init__ <p2ch11.dsets.LunaDataset object at 0x7f062ea27050>: 56938 training samples
  cpuset_checked))
2022-11-27 14:25:58,492 INFO     pid:75 util.util:241:enumerateWithEstimate Stuffing cache    8/56, done at 2022-11-27 14:29:47, 0:04:13
2022-11-27 14:26:46,715 INFO     pid:75 util.util:241:enumerateWithEstimate Stuffing cache   16/56, done at 2022-11-27 14:30:24, 0:04:50
2022-11-27 14:28:14,732 INFO     pid:75 util.util:241:enumerateWithEstimate Stuffing cache   32/56, done at 2022-11-27 14:30:22, 0:04:47
2022-11-27 14:30:12,163 INFO     pid:75 nb:009:run Finished: p2ch11.prepcache.LunaPrepCacheApp.['--num-workers=4']).main()


In [None]:
run('p2ch11.training.LunaTrainingApp', '--epochs=1')

2022-11-27 14:31:16,768 INFO     pid:75 nb:004:run Running: p2ch11.training.LunaTrainingApp(['--num-workers=4', '--epochs=1']).main()
2022-11-27 14:31:17,795 INFO     pid:75 p2ch11.training:079:initModel Using CUDA; 1 devices.
2022-11-27 14:31:22,008 INFO     pid:75 p2ch11.training:138:main Starting LunaTrainingApp, Namespace(batch_size=32, comment='dwlpt', epochs=1, num_workers=4, tb_prefix='p2ch11')
2022-11-27 14:31:22,047 INFO     pid:75 p2ch11.dsets:185:__init__ <p2ch11.dsets.LunaDataset object at 0x7f062bbfb890>: 51244 training samples
2022-11-27 14:31:22,055 INFO     pid:75 p2ch11.dsets:185:__init__ <p2ch11.dsets.LunaDataset object at 0x7f0634e87210>: 5694 validation samples
2022-11-27 14:31:22,056 INFO     pid:75 p2ch11.training:151:main Epoch 1 of 1, 1602/178 batches of size 32*1
2022-11-27 14:31:31,407 INFO     pid:75 util.util:241:enumerateWithEstimate E1 Training   16/1602, done at 2022-11-27 14:33:43, 0:02:13
2022-11-27 14:31:35,419 INFO     pid:75 util.util:241:enumerateWi
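The sample and batch counts in the logs above can be reproduced with simple arithmetic. The sketch below assumes that every tenth sample is held out for validation (a stride of 10, which is an assumption that happens to match the logged numbers) and that the last, partially filled batch is kept.

```python
import math

total_samples = 56938   # from the LunaDataset log line
val_stride = 10         # assumption: every 10th sample goes to validation
train_batch_size = 32   # batch_size in the LunaTrainingApp log line
cache_batch_size = 1024  # batch_size in the LunaPrepCacheApp log line

val_samples = len(range(0, total_samples, val_stride))
train_samples = total_samples - val_samples

train_batches = math.ceil(train_samples / train_batch_size)
val_batches = math.ceil(val_samples / train_batch_size)
cache_batches = math.ceil(total_samples / cache_batch_size)

print(train_samples, val_samples)          # sample counts per split
print(train_batches, val_batches, cache_batches)  # batch counts
```

These values agree with the log: 51244 training and 5694 validation samples, 1602/178 batches of size 32, and 56 batches while filling the cache.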

In [None]:
ls -lA runs/p2ch11/

total 8
drwxr-xr-x 2 root root 4096 Nov 26 15:31 2022-11-26_15.28.55-trn_cls-dwlpt/
drwxr-xr-x 2 root root 4096 Nov 26 15:31 2022-11-26_15.28.55-val_cls-dwlpt/


## Tensorboard
TensorBoard may not display inside Google Colab. This is a known issue, so do not spend time trying to fix it. One possible workaround is to download the data in the runs/p2ch11/ folder and run TensorBoard locally.
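To move the logs to a local machine, the runs folder can first be packed into a single archive. The following is a minimal sketch with the standard library; `archive_runs` is a hypothetical helper, and the demo uses a temporary folder with a dummy event file in place of the real runs/p2ch11/ directory.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def archive_runs(runs_dir, archive_base):
    """Zip the contents of runs_dir into <archive_base>.zip; return its path."""
    return shutil.make_archive(str(archive_base), "zip", root_dir=str(runs_dir))


with tempfile.TemporaryDirectory() as tmp:
    # Dummy stand-in for runs/p2ch11/<run name>/<event file>.
    runs_dir = Path(tmp) / "runs" / "p2ch11"
    (runs_dir / "demo-trn_cls").mkdir(parents=True)
    (runs_dir / "demo-trn_cls" / "events.out.tfevents.demo").write_bytes(b"")

    # The archive is written outside runs_dir so it does not include itself.
    archive_path = archive_runs(runs_dir, Path(tmp) / "p2ch11_runs")
    with zipfile.ZipFile(archive_path) as zf:
        archived_names = zf.namelist()

print(Path(archive_path).name, archived_names)
```

After downloading and unpacking the archive, TensorBoard can be started locally with `tensorboard --logdir` pointing at the unpacked folder.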

In [None]:
!pip uninstall -y tensorboard-plugin-wit

Found existing installation: tensorboard-plugin-wit 1.8.1
Uninstalling tensorboard-plugin-wit-1.8.1:
  Successfully uninstalled tensorboard-plugin-wit-1.8.1


In [None]:
%reload_ext tensorboard

In [None]:
%tensorboard --logdir runs

Reusing TensorBoard on port 6008 (pid 720), started 0:00:33 ago. (Use '!kill 720' to kill it.)

<IPython.core.display.Javascript object>