# **Deformable Convolution for Lung Cancer Prediction Based on Transcriptomic Data : a Deformable Convolutional Neural Network**
> Author : **Aymen MERROUCHE**. <br>
> In this notebook, we implement a deformable convolutional network for our binary classification task. We first pre-train the network on the non-lung cancer dataset, then fine-tune it on the lung cancer dataset (the final classification layer of the pre-trained network is not kept).

In [32]:
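import datetime
from pathlib import Path

import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
import yaml
from torch.nn import functional as F

# project modules (SummaryWriter, CheckpointState, fit, ... are expected to come from these)
from utils import *
from train import *
from data_utils import *
from modules.deformable_cnn import *
from modules.focal_loss import *
%load_ext autoreload
%autoreload 2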

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [33]:
# device to use, if cuda available then use cuda else use cpu
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print("Working on : ", device)

Working on :  cuda


In [34]:
# load hyperparameters
# data paths args
with open('./configs/data_paths.yaml', 'r') as stream:
    data_paths_args = yaml.load(stream, Loader=yaml.Loader)

# deformable cnn args
with open('./configs/def_cnn.yaml', 'r') as stream:
    basic_cnn_args = yaml.load(stream, Loader=yaml.Loader)
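
The training log further down prints the experiment name, which exposes the hyperparameter names and values read from `def_cnn.yaml`. As a minimal sketch (the dictionary is reconstructed from that log, and `build_experiment_name` is a hypothetical helper that is not part of the notebook's modules), this is the structure the rest of the notebook appears to assume, together with one way to keep logging-only flags such as `no_tensorboard` out of the run name:

```python
# Hyperparameters as they appear in the experiment name printed by the training cell.
# The real values live in ./configs/def_cnn.yaml; this dict is only a reconstruction.
basic_cnn_args = {
    "epochs": 50,
    "batch_size_pt": 8,   # batch size used for pre-training
    "lr_pt": 0.001,       # learning rate used for pre-training
    "batch_size_ft": 8,   # batch size used for fine-tuning
    "lr_ft": 0.001,       # learning rate used for fine-tuning
    "no_tensorboard": True,
}


def build_experiment_name(hparams, ignore_keys=frozenset()):
    """Join key=value pairs into a run name, skipping keys listed in ignore_keys."""
    return "_".join(
        f"{key}={val}" for key, val in hparams.items() if key not in ignore_keys
    )


# Name built from every key, as in the notebook's training cell.
full_name = build_experiment_name(basic_cnn_args)
# Name without the logging flag (hypothetical variant).
short_name = build_experiment_name(basic_cnn_args, ignore_keys={"no_tensorboard"})
print(full_name)
print(short_name)
```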

## **1 - Pre-Training on the Non Lung Dataset :**

### **1 - 1 - Get the Data :**

In [26]:
%%time
# Getting the data
# dataset
non_lung_dataset = TranscriptomicImagesDatasetNonLung(
    data_paths_args["path_to_pan_cancer_hdf5_files"],
    data_paths_args["path_to_treemap_images"],
)
# train / validation data loaders
non_lung_dataloader_train, non_lung_dataloader_validation = get_data_loaders(
    non_lung_dataset,
    batch_size_train=basic_cnn_args["batch_size_pt"],
    batch_size_validation=basic_cnn_args["batch_size_pt"],
)

CPU times: user 4.75 s, sys: 985 ms, total: 5.74 s
Wall time: 2.91 s


### **1 - 2 - Network, Criterion and Training :**
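
The criterion in this section is a focal loss with `gamma=3`. The notebook's `FocalLoss` class comes from the `modules.focal_loss` module and is not shown here, so the following is only a sketch of the standard binary focal loss, FL(p_t) = -(1 - p_t)^gamma * log(p_t), written in plain Python. The function name `binary_focal_loss` and the absence of a class-weighting term are assumptions.

```python
import math


def binary_focal_loss(prob_positive, label, gamma=3.0):
    """Focal loss for one sample, given the predicted probability of the positive class."""
    # p_t is the probability the model assigns to the true class.
    p_t = prob_positive if label == 1 else 1.0 - prob_positive
    # The (1 - p_t)^gamma factor shrinks the loss of well-classified samples.
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# With gamma=0 the focal loss is the usual cross-entropy.
cross_entropy = binary_focal_loss(0.9, 1, gamma=0.0)
# An easy sample (p_t = 0.9) is down-weighted by (0.1)^3 when gamma=3.
easy = binary_focal_loss(0.9, 1, gamma=3.0)
# A hard sample (p_t = 0.1) keeps most of its loss: the factor is (0.9)^3.
hard = binary_focal_loss(0.1, 1, gamma=3.0)
print(cross_entropy, easy, hard)
```

Because of the down-weighting factor, focal loss values tend to be numerically smaller than cross-entropy values on the same predictions, which is worth keeping in mind when reading the losses logged below.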

In [None]:
# network
net = DeformConvNet().to(device).double()

# loss and optimizer  
criterion = FocalLoss(gamma=3).to(device)
optimizer = optim.Adam(net.parameters(), lr=basic_cnn_args['lr_pt'])

# Logging + Experiment

ignore_keys = {'no_tensorboard'}
# get hyperparameters with values in a dict
hparams = {**basic_cnn_args}
# generate a name for the experiment
expe_name = '_'.join([f"{key}={val}" for key, val in hparams.items()])
print("Experimenting with : \n \t"+expe_name)
# path where to save the model
savepath = Path('/tempory/transcriptomic_data/pre_trained_def_cnn_checkpt.pt')
# Tensorboard summary writer
if basic_cnn_args['no_tensorboard']:
    writer = None
else:
    writer = SummaryWriter("runs/runs"+"_"+datetime.datetime.now().strftime("%Y%m%d-%H%M%S")+expe_name)
    
# start the experiment
checkpoint = CheckpointState(net, optimizer, savepath=savepath)
fit(checkpoint, criterion, non_lung_dataloader_train, non_lung_dataloader_validation, basic_cnn_args['epochs'], writer=writer)
if not basic_cnn_args['no_tensorboard']:
    writer.close()

Epoch 1/50:   0%|          | 1/646 [00:00<01:12,  8.92it/s, loss=1.0283e-01]

Experimenting with : 
 	epochs=50_batch_size_pt=8_lr_pt=0.001_batch_size_ft=8_lr_ft=0.001_no_tensorboard=True
Training on GPU 



Epoch 1/50: 100%|██████████| 646/646 [00:58<00:00, 10.99it/s, loss=8.0245e-02]
Epoch 2/50:   0%|          | 2/646 [00:00<01:01, 10.47it/s, loss=9.0169e-02]

Epoch 1/50, Train Loss: 8.5232e-02, Test Loss: 0.0901
Epoch 1/50, Train Accuracy: 55.96%, Test Accuracy: 60.36%
Epoch 1/50, Train AUC: 55.97%, Test AUC: 56.59%
Classification Report on Val Set : 
              precision    recall  f1-score   support

   No Cancer       0.90      0.62      0.73      2233
      Cancer       0.16      0.52      0.24       310

    accuracy                           0.60      2543
   macro avg       0.53      0.57      0.49      2543
weighted avg       0.81      0.60      0.67      2543



Epoch 2/50: 100%|██████████| 646/646 [00:59<00:00, 10.91it/s, loss=1.0464e-01]
Epoch 3/50:   0%|          | 2/646 [00:00<01:01, 10.51it/s, loss=8.1974e-02]

Epoch 2/50, Train Loss: 8.2363e-02, Test Loss: 0.0762
Epoch 2/50, Train Accuracy: 57.47%, Test Accuracy: 75.98%
Epoch 2/50, Train AUC: 57.72%, Test AUC: 55.34%
Classification Report on Val Set : 
              precision    recall  f1-score   support

   No Cancer       0.89      0.83      0.86      2233
      Cancer       0.18      0.28      0.22       310

    accuracy                           0.76      2543
   macro avg       0.54      0.55      0.54      2543
weighted avg       0.81      0.76      0.78      2543



Epoch 3/50: 100%|██████████| 646/646 [00:59<00:00, 10.91it/s, loss=8.7185e-02]
Epoch 4/50:   0%|          | 2/646 [00:00<01:01, 10.47it/s, loss=1.2705e-01]

Epoch 3/50, Train Loss: 7.9354e-02, Test Loss: 0.0779
Epoch 3/50, Train Accuracy: 57.89%, Test Accuracy: 66.81%
Epoch 3/50, Train AUC: 58.17%, Test AUC: 57.21%
Classification Report on Val Set : 
              precision    recall  f1-score   support

   No Cancer       0.90      0.70      0.79      2233
      Cancer       0.17      0.45      0.25       310

    accuracy                           0.67      2543
   macro avg       0.54      0.57      0.52      2543
weighted avg       0.81      0.67      0.72      2543



Epoch 4/50: 100%|██████████| 646/646 [00:59<00:00, 10.90it/s, loss=9.4914e-02]
Epoch 5/50:   0%|          | 2/646 [00:00<01:01, 10.50it/s, loss=8.4482e-02]

Epoch 4/50, Train Loss: 7.5585e-02, Test Loss: 0.0554
Epoch 4/50, Train Accuracy: 61.63%, Test Accuracy: 81.40%
Epoch 4/50, Train AUC: 61.84%, Test AUC: 57.88%
Classification Report on Val Set : 
              precision    recall  f1-score   support

   No Cancer       0.90      0.89      0.89      2233
      Cancer       0.25      0.27      0.26       310

    accuracy                           0.81      2543
   macro avg       0.57      0.58      0.58      2543
weighted avg       0.82      0.81      0.82      2543



Epoch 5/50: 100%|██████████| 646/646 [00:59<00:00, 10.90it/s, loss=1.3326e-01]
Epoch 6/50:   0%|          | 2/646 [00:00<01:01, 10.48it/s, loss=5.1298e-02]

Epoch 5/50, Train Loss: 6.0190e-02, Test Loss: 0.0633
Epoch 5/50, Train Accuracy: 70.20%, Test Accuracy: 70.36%
Epoch 5/50, Train AUC: 70.11%, Test AUC: 61.31%
Classification Report on Val Set : 
              precision    recall  f1-score   support

   No Cancer       0.91      0.73      0.81      2233
      Cancer       0.20      0.49      0.29       310

    accuracy                           0.70      2543
   macro avg       0.56      0.61      0.55      2543
weighted avg       0.83      0.70      0.75      2543



Epoch 6/50: 100%|██████████| 646/646 [01:02<00:00, 10.41it/s, loss=2.1352e-01]
Epoch 7/50:   0%|          | 2/646 [00:00<01:01, 10.49it/s, loss=4.1966e-02]

Epoch 6/50, Train Loss: 5.2916e-02, Test Loss: 0.0530
Epoch 6/50, Train Accuracy: 73.59%, Test Accuracy: 74.05%
Epoch 6/50, Train AUC: 73.53%, Test AUC: 60.08%
Classification Report on Val Set : 
              precision    recall  f1-score   support

   No Cancer       0.91      0.79      0.84      2233
      Cancer       0.21      0.42      0.28       310

    accuracy                           0.74      2543
   macro avg       0.56      0.60      0.56      2543
weighted avg       0.82      0.74      0.77      2543



Epoch 7/50: 100%|██████████| 646/646 [01:01<00:00, 10.45it/s, loss=8.0148e-02]
Epoch 8/50:   0%|          | 2/646 [00:00<01:01, 10.50it/s, loss=4.9956e-02]

Epoch 7/50, Train Loss: 5.1984e-02, Test Loss: 0.0353
Epoch 7/50, Train Accuracy: 71.48%, Test Accuracy: 80.77%
Epoch 7/50, Train AUC: 71.74%, Test AUC: 56.13%
Classification Report on Val Set : 
              precision    recall  f1-score   support

   No Cancer       0.89      0.89      0.89      2233
      Cancer       0.22      0.24      0.23       310

    accuracy                           0.81      2543
   macro avg       0.56      0.56      0.56      2543
weighted avg       0.81      0.81      0.81      2543



Epoch 8/50: 100%|██████████| 646/646 [00:59<00:00, 10.90it/s, loss=3.0353e-02]
Epoch 9/50:   0%|          | 2/646 [00:00<01:01, 10.50it/s, loss=6.0170e-02]

Epoch 8/50, Train Loss: 4.4937e-02, Test Loss: 0.0998
Epoch 8/50, Train Accuracy: 74.94%, Test Accuracy: 56.23%
Epoch 8/50, Train AUC: 74.88%, Test AUC: 61.33%
Classification Report on Val Set : 
              precision    recall  f1-score   support

   No Cancer       0.92      0.55      0.69      2233
      Cancer       0.17      0.68      0.27       310

    accuracy                           0.56      2543
   macro avg       0.55      0.61      0.48      2543
weighted avg       0.83      0.56      0.64      2543



Epoch 9/50: 100%|██████████| 646/646 [01:00<00:00, 10.73it/s, loss=1.1202e-01]
Epoch 10/50:   0%|          | 2/646 [00:00<01:01, 10.49it/s, loss=9.1683e-02]

Epoch 9/50, Train Loss: 5.0095e-02, Test Loss: 0.0289
Epoch 9/50, Train Accuracy: 69.20%, Test Accuracy: 83.21%
Epoch 9/50, Train AUC: 69.45%, Test AUC: 56.41%
Classification Report on Val Set : 
              precision    recall  f1-score   support

   No Cancer       0.89      0.92      0.91      2233
      Cancer       0.26      0.21      0.23       310

    accuracy                           0.83      2543
   macro avg       0.58      0.56      0.57      2543
weighted avg       0.82      0.83      0.82      2543



Epoch 10/50: 100%|██████████| 646/646 [00:59<00:00, 10.91it/s, loss=4.0607e-03]
Epoch 11/50:   0%|          | 2/646 [00:00<01:01, 10.49it/s, loss=1.8917e-02]

Epoch 10/50, Train Loss: 3.4008e-02, Test Loss: 0.0859
Epoch 10/50, Train Accuracy: 78.70%, Test Accuracy: 65.12%
Epoch 10/50, Train AUC: 78.69%, Test AUC: 60.00%
Classification Report on Val Set : 
              precision    recall  f1-score   support

   No Cancer       0.91      0.67      0.77      2233
      Cancer       0.18      0.53      0.27       310

    accuracy                           0.65      2543
   macro avg       0.55      0.60      0.52      2543
weighted avg       0.82      0.65      0.71      2543



Epoch 11/50: 100%|██████████| 646/646 [00:59<00:00, 10.90it/s, loss=6.8228e-03]
Epoch 12/50:   0%|          | 2/646 [00:00<01:01, 10.50it/s, loss=8.5094e-02]

Epoch 11/50, Train Loss: 2.1782e-02, Test Loss: 0.0541
Epoch 11/50, Train Accuracy: 84.13%, Test Accuracy: 73.57%
Epoch 11/50, Train AUC: 84.11%, Test AUC: 59.67%
Classification Report on Val Set : 
              precision    recall  f1-score   support

   No Cancer       0.91      0.78      0.84      2233
      Cancer       0.21      0.41      0.28       310

    accuracy                           0.74      2543
   macro avg       0.56      0.60      0.56      2543
weighted avg       0.82      0.74      0.77      2543



Epoch 12/50: 100%|██████████| 646/646 [00:59<00:00, 10.90it/s, loss=2.2693e-02]
Epoch 13/50:   0%|          | 2/646 [00:00<01:01, 10.50it/s, loss=2.5345e-02]

Epoch 12/50, Train Loss: 2.2438e-02, Test Loss: 0.0651
Epoch 12/50, Train Accuracy: 83.78%, Test Accuracy: 69.64%
Epoch 12/50, Train AUC: 83.71%, Test AUC: 59.10%
Classification Report on Val Set : 
              precision    recall  f1-score   support

   No Cancer       0.91      0.73      0.81      2233
      Cancer       0.19      0.45      0.27       310

    accuracy                           0.70      2543
   macro avg       0.55      0.59      0.54      2543
weighted avg       0.82      0.70      0.74      2543



Epoch 13/50: 100%|██████████| 646/646 [00:59<00:00, 10.90it/s, loss=3.6578e-03]
Epoch 14/50:   0%|          | 2/646 [00:00<01:01, 10.47it/s, loss=5.3339e-03]

Epoch 13/50, Train Loss: 2.6003e-02, Test Loss: 0.1019
Epoch 13/50, Train Accuracy: 83.73%, Test Accuracy: 63.79%
Epoch 13/50, Train AUC: 83.78%, Test AUC: 60.35%
Classification Report on Val Set : 
              precision    recall  f1-score   support

   No Cancer       0.91      0.65      0.76      2233
      Cancer       0.18      0.56      0.27       310

    accuracy                           0.64      2543
   macro avg       0.55      0.60      0.52      2543
weighted avg       0.82      0.64      0.70      2543



Epoch 14/50: 100%|██████████| 646/646 [00:59<00:00, 10.88it/s, loss=1.5305e-03]
Epoch 15/50:   0%|          | 2/646 [00:00<01:01, 10.49it/s, loss=1.6844e-02]

Epoch 14/50, Train Loss: 2.4751e-02, Test Loss: 0.0619
Epoch 14/50, Train Accuracy: 82.37%, Test Accuracy: 72.79%
Epoch 14/50, Train AUC: 82.36%, Test AUC: 59.36%
Classification Report on Val Set : 
              precision    recall  f1-score   support

   No Cancer       0.90      0.77      0.83      2233
      Cancer       0.20      0.42      0.27       310

    accuracy                           0.73      2543
   macro avg       0.55      0.59      0.55      2543
weighted avg       0.82      0.73      0.76      2543



Epoch 15/50: 100%|██████████| 646/646 [00:59<00:00, 10.90it/s, loss=1.9866e-02]
Epoch 16/50:   0%|          | 2/646 [00:00<01:01, 10.51it/s, loss=2.0364e-02]

Epoch 15/50, Train Loss: 1.6671e-02, Test Loss: 0.0331
Epoch 15/50, Train Accuracy: 85.22%, Test Accuracy: 80.42%
Epoch 15/50, Train AUC: 85.31%, Test AUC: 56.49%
Classification Report on Val Set : 
              precision    recall  f1-score   support

   No Cancer       0.89      0.88      0.89      2233
      Cancer       0.23      0.25      0.24       310

    accuracy                           0.80      2543
   macro avg       0.56      0.56      0.56      2543
weighted avg       0.81      0.80      0.81      2543



Epoch 16/50: 100%|██████████| 646/646 [01:08<00:00,  9.49it/s, loss=1.4979e-02]
Epoch 17/50:   0%|          | 2/646 [00:00<01:01, 10.51it/s, loss=4.2240e-02]

Epoch 16/50, Train Loss: 3.5144e-02, Test Loss: 0.1064
Epoch 16/50, Train Accuracy: 79.02%, Test Accuracy: 65.08%
Epoch 16/50, Train AUC: 79.06%, Test AUC: 62.34%
Classification Report on Val Set : 
              precision    recall  f1-score   support

   No Cancer       0.92      0.66      0.77      2233
      Cancer       0.19      0.59      0.29       310

    accuracy                           0.65      2543
   macro avg       0.56      0.62      0.53      2543
weighted avg       0.83      0.65      0.71      2543



Epoch 17/50: 100%|██████████| 646/646 [00:59<00:00, 10.91it/s, loss=2.7165e-02]
Epoch 18/50:   0%|          | 2/646 [00:00<01:01, 10.52it/s, loss=1.0459e-01]

Epoch 17/50, Train Loss: 1.7555e-02, Test Loss: 0.0492
Epoch 17/50, Train Accuracy: 85.33%, Test Accuracy: 75.58%
Epoch 17/50, Train AUC: 85.32%, Test AUC: 58.04%
Classification Report on Val Set : 
              precision    recall  f1-score   support

   No Cancer       0.90      0.81      0.85      2233
      Cancer       0.20      0.35      0.26       310

    accuracy                           0.76      2543
   macro avg       0.55      0.58      0.56      2543
weighted avg       0.82      0.76      0.78      2543



Epoch 18/50: 100%|██████████| 646/646 [00:59<00:00, 10.90it/s, loss=2.6877e-03]
Epoch 19/50:   0%|          | 2/646 [00:00<01:01, 10.52it/s, loss=8.2447e-03]

Epoch 18/50, Train Loss: 2.8713e-02, Test Loss: 0.0683
Epoch 18/50, Train Accuracy: 80.30%, Test Accuracy: 70.98%
Epoch 18/50, Train AUC: 80.25%, Test AUC: 59.03%
Classification Report on Val Set : 
              precision    recall  f1-score   support

   No Cancer       0.90      0.75      0.82      2233
      Cancer       0.19      0.43      0.27       310

    accuracy                           0.71      2543
   macro avg       0.55      0.59      0.54      2543
weighted avg       0.82      0.71      0.75      2543



Epoch 19/50: 100%|██████████| 646/646 [00:59<00:00, 10.80it/s, loss=1.4446e-04]
Epoch 20/50:   0%|          | 2/646 [00:00<01:01, 10.52it/s, loss=9.3817e-03]

Epoch 19/50, Train Loss: 1.4012e-02, Test Loss: 0.0357
Epoch 19/50, Train Accuracy: 88.76%, Test Accuracy: 80.10%
Epoch 19/50, Train AUC: 88.74%, Test AUC: 58.95%
Classification Report on Val Set : 
              precision    recall  f1-score   support

   No Cancer       0.90      0.87      0.88      2233
      Cancer       0.25      0.31      0.28       310

    accuracy                           0.80      2543
   macro avg       0.57      0.59      0.58      2543
weighted avg       0.82      0.80      0.81      2543



Epoch 20/50: 100%|██████████| 646/646 [00:59<00:00, 10.90it/s, loss=1.1034e-01]
Epoch 21/50:   0%|          | 2/646 [00:00<01:01, 10.50it/s, loss=1.4875e-01]

Epoch 20/50, Train Loss: 2.2058e-02, Test Loss: 0.0701
Epoch 20/50, Train Accuracy: 83.09%, Test Accuracy: 71.37%
Epoch 20/50, Train AUC: 83.08%, Test AUC: 54.53%
Classification Report on Val Set : 
              precision    recall  f1-score   support

   No Cancer       0.89      0.77      0.82      2233
      Cancer       0.16      0.32      0.22       310

    accuracy                           0.71      2543
   macro avg       0.53      0.55      0.52      2543
weighted avg       0.80      0.71      0.75      2543



Epoch 21/50: 100%|██████████| 646/646 [00:59<00:00, 10.90it/s, loss=9.5141e-03]
Epoch 22/50:   0%|          | 2/646 [00:00<01:01, 10.51it/s, loss=3.4503e-02]

Epoch 21/50, Train Loss: 1.4336e-02, Test Loss: 0.0498
Epoch 21/50, Train Accuracy: 87.31%, Test Accuracy: 76.71%
Epoch 21/50, Train AUC: 87.33%, Test AUC: 58.69%
Classification Report on Val Set : 
              precision    recall  f1-score   support

   No Cancer       0.90      0.83      0.86      2233
      Cancer       0.22      0.35      0.27       310

    accuracy                           0.77      2543
   macro avg       0.56      0.59      0.56      2543
weighted avg       0.82      0.77      0.79      2543



Epoch 22/50: 100%|██████████| 646/646 [00:59<00:00, 10.90it/s, loss=2.0727e-03]
Epoch 23/50:   0%|          | 2/646 [00:00<01:01, 10.48it/s, loss=4.8833e-02]

Epoch 22/50, Train Loss: 2.3596e-02, Test Loss: 0.0696
Epoch 22/50, Train Accuracy: 82.49%, Test Accuracy: 71.61%
Epoch 22/50, Train AUC: 82.45%, Test AUC: 57.86%
Classification Report on Val Set : 
              precision    recall  f1-score   support

   No Cancer       0.90      0.76      0.82      2233
      Cancer       0.19      0.40      0.25       310

    accuracy                           0.72      2543
   macro avg       0.54      0.58      0.54      2543
weighted avg       0.81      0.72      0.76      2543



Epoch 23/50: 100%|██████████| 646/646 [00:59<00:00, 10.90it/s, loss=1.0776e-03]
Epoch 24/50:   0%|          | 2/646 [00:00<01:01, 10.50it/s, loss=1.7703e-02]

Epoch 23/50, Train Loss: 1.4030e-02, Test Loss: 0.0520
Epoch 23/50, Train Accuracy: 88.41%, Test Accuracy: 75.93%
Epoch 23/50, Train AUC: 88.40%, Test AUC: 57.27%
Classification Report on Val Set : 
              precision    recall  f1-score   support

   No Cancer       0.90      0.82      0.86      2233
      Cancer       0.20      0.33      0.25       310

    accuracy                           0.76      2543
   macro avg       0.55      0.57      0.55      2543
weighted avg       0.81      0.76      0.78      2543



Epoch 24/50:  84%|████████▍ | 542/646 [00:49<00:09, 10.88it/s, loss=2.5638e-03]
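
The validation set is imbalanced (2233 "No Cancer" vs 310 "Cancer" samples), so accuracy alone is hard to read: always predicting "No Cancer" would already score 2233/2543, about 87.8%. The sketch below computes accuracy, per-class recall and balanced accuracy from a binary confusion matrix. The four counts are hypothetical: they were chosen to be consistent with the rounded epoch 9 report above and are not taken from the actual run.

```python
def summarize(tn, fp, fn, tp):
    """Return accuracy, per-class recall and balanced accuracy from binary confusion counts."""
    total = tn + fp + fn + tp
    recall_negative = tn / (tn + fp)  # recall on "No Cancer"
    recall_positive = tp / (tp + fn)  # recall on "Cancer"
    return {
        "accuracy": (tn + tp) / total,
        "recall_negative": recall_negative,
        "recall_positive": recall_positive,
        "balanced_accuracy": (recall_negative + recall_positive) / 2,
        # Accuracy obtained by always predicting the majority class.
        "majority_baseline": max(tn + fp, fn + tp) / total,
    }


# Hypothetical counts with the same class supports as the validation set (2233 / 310).
metrics = summarize(tn=2051, fp=182, fn=245, tp=65)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In the log, the reported Test AUC stays close to the mean of the two recalls. That is what ROC AUC reduces to when it is computed from hard 0/1 predictions instead of scores, so it may be worth checking which of the two the training code passes to the metric. This is an observation about the printed numbers, not a confirmed detail of the training code.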

## **2 - Fine-Tuning on the Lung dataset :**

### **2 - 1 - Load Pre-Trained Model :**

In [12]:
# Load the pre-trained model: the architecture must match the one used for pre-training
net = DeformConvNet().to(device).double()
optimizer = optim.Adam(net.parameters(), lr=basic_cnn_args['lr_ft'])
# path of the best pre-training checkpoint: the save path defined above with a "_best" suffix
savepath = Path('/tempory/transcriptomic_data/pre_trained_def_cnn_checkpt_best.pt')
checkpoint = CheckpointState(net, optimizer, savepath=savepath)
checkpoint.load()
pretrained = checkpoint.model
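
The comment in the cell above says the best model is stored under the pre-training path with a `_best` suffix. How `CheckpointState` decides what counts as "best" is not shown in this notebook; a minimal sketch, assuming the criterion is the lowest validation loss and using a hypothetical `BestTracker` helper, could look like this:

```python
from pathlib import PurePosixPath


def best_path(savepath):
    """Insert '_best' before the file extension: ckpt.pt -> ckpt_best.pt."""
    savepath = PurePosixPath(savepath)
    return savepath.with_name(f"{savepath.stem}_best{savepath.suffix}")


class BestTracker:
    """Remember the lowest validation loss seen so far (assumed selection rule)."""

    def __init__(self):
        self.best_loss = float("inf")
        self.best_epoch = None

    def update(self, epoch, val_loss):
        """Return True when this epoch improves on the best loss, i.e. when to save."""
        if val_loss < self.best_loss:
            self.best_loss, self.best_epoch = val_loss, epoch
            return True
        return False


path = best_path("/tempory/transcriptomic_data/pre_trained_def_cnn_checkpt.pt")
tracker = BestTracker()
# Validation losses of the first three pre-training epochs in the log above.
saved = [tracker.update(epoch, loss) for epoch, loss in [(1, 0.0901), (2, 0.0762), (3, 0.0779)]]
print(path, saved, tracker.best_epoch)
```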

### **2 - 2 - Get the Data :**

In [None]:
%%time
# Getting the data
# dataset
lung_dataset = TranscriptomicVectorsDatasetLung(data_paths_args["path_to_pan_cancer_hdf5_files"])
# train / validation data loaders
lung_dataloader_train, lung_dataloader_validation = get_data_loaders(
    lung_dataset,
    batch_size_train=basic_cnn_args["batch_size_ft"],
    batch_size_validation=basic_cnn_args["batch_size_ft"],
)

### **2 - 3 - Fine Tuning Procedure :**

In [None]:
# Beginning of the transfer learning procedure
net = fine_tune_mlp(pretrained)
net = net.to(device).double()
criterion = FocalLoss().to(device)
optimizer = optim.Adam(net.parameters(), lr=basic_cnn_args['lr_ft'])
savepath = Path('models_finetuned/fine_tuned_mlp_checkpt.pt')
checkpoint = CheckpointState(net, optimizer, savepath=savepath)
fit(checkpoint, criterion, lung_dataloader_train, lung_dataloader_validation, basic_cnn_args['epochs'])
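
`fine_tune_mlp` is defined in the project's modules and is not shown here. As a rough sketch of the idea stated in the introduction (reuse the pre-trained feature extractor and drop its final classification layer), the snippet below swaps the last layer of a small stand-in network for a freshly initialised one. `TinyNet`, its layer sizes and `replace_head` are hypothetical and only illustrate the pattern; whether the notebook's helper also freezes the earlier layers is not known.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Stand-in for a pre-trained network: a feature extractor followed by a classifier."""

    def __init__(self, in_features=16, hidden=8, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(nn.Linear(in_features, hidden), nn.ReLU())
        self.classifier = nn.Linear(hidden, n_classes)

    def forward(self, x):
        return self.classifier(self.features(x))


def replace_head(model, n_classes=2, freeze_features=False):
    """Replace the final layer with a new one; optionally freeze the pre-trained layers."""
    if freeze_features:
        for param in model.features.parameters():
            param.requires_grad = False
    # The new layer is randomly initialised and trainable.
    model.classifier = nn.Linear(model.classifier.in_features, n_classes)
    return model


pretrained = TinyNet()  # in practice, the weights would come from the loaded checkpoint
net = replace_head(pretrained, freeze_features=True).double()
out = net(torch.randn(4, 16, dtype=torch.double))
print(out.shape)
```

When some layers are frozen, a common choice is to pass only the trainable parameters to the optimizer, for example `filter(lambda p: p.requires_grad, net.parameters())`.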