<p style="font-size:20px">Before I say anything, you'll need to go to <a href="https://www.kaggle.com/pestipeti/vinbigdata-fasterrcnn-pytorch-train">Peter Pesti</a>'s notebook and upvote it right away.<br><br>Also, if you upvote this notebook:<br> There is a chance my girlfriend will take me out for a McDonald's date after such a long time.😋
</p>



<p style="font-size:20px">So, are you here for the first time? <br>If yes:</p>
<p style="font-size:27px;">Say a Hi 👋🏽 to me on <a href="https://www.linkedin.com/in/vyom-bhatia-40ba79181/">LinkedIn</a><a href="https://www.linkedin.com/in/vyom-bhatia-40ba79181/"><img src="https://cdn2.iconfinder.com/data/icons/simple-social-media-shadow/512/14-512.png" style="display:inline-block;width:15%;margin-right:-2vw;margin-left:-2vw;"></a>!</p>

<p style="font-size:20px">So, what are we really doing here?</p><br>
<p style="font-size:27px"><b>Object Detection with FasterRCNN 🏃🏽‍♀️👩‍💻</b></p>

# Importing Libraries
<p style="font-size:20px;">Importing the libraries here to get started. BTW, this is my first time working with Object Detection and DICOM format images.</p>

In [None]:

# You can literally be drunk and yet not forget to import these bad boys.

import pandas as pd
import numpy as np
import cv2
import os
import re
import pydicom
import warnings

# PIL for the Images.
from PIL import Image

# Transforms for the Transformations.
import albumentations as A
from albumentations.pytorch.transforms import ToTensorV2

# Py🔦
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.rpn import AnchorGenerator
from torch.utils.data import DataLoader, Dataset
from torch.utils.data.sampler import SequentialSampler

# Matplot for the visualizations.
import matplotlib.pyplot as plt

<p style="font-size:27px;">Paths</p>

In [None]:
# Defining the Path where all our files exist:
dir_input = "../input/vinbigdata-chest-xray-abnormalities-detection"

# Defining the Training Set's Path:
trainpath = f'{dir_input}/train'

# Defining the Test Set's Path:
testpath = f'{dir_input}/test'

# Preprocessing the Data
<p style="font-size:20px;">Some really basic preprocessing to make life easier and to let the FRCNN do its job, though there are a few complications in it. If you really want to dive in and understand it better, <a href ="https://www.youtube.com/watch?v=v5bFVbQvFRk">check this video out</a>.</p>

In [None]:
# Grabbing the Dataframe which has the image ids.
train_df = pd.read_csv(f"{dir_input}/train.csv")

# Rows with nothing abnormal to detect have null (NaN) coordinates, so let's fill them with 0.
train_df.fillna(0, inplace=True)

# Time to fix the other two coordinates of the "No finding" category (class_id 14): a dummy 1x1 box.
train_df.loc[train_df["class_id"] == 14, ["x_max", "y_max"]] = 1.0

# FasterRCNN treats class_id == 0 as the background, so shift every class up by one.
train_df["class_id"] = train_df["class_id"] + 1

# Let's move the "No finding" category (now 15) to 0, the background class.
train_df.loc[train_df["class_id"] == 15, ["class_id"]] = 0

print(f'The total number of classes is {train_df["class_id"].nunique()}.')
train_df.head()
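
<p style="font-size:20px;">To make the relabeling above concrete, here is a minimal pure-Python sketch of the same mapping. The helper name <code>remap_class_id</code> is made up for illustration; the notebook itself does this with pandas.</p>

```python
# "No finding" id in train.csv, as used in the cell above.
NO_FINDING_ORIGINAL = 14


def remap_class_id(class_id: int) -> int:
    """Shift abnormality ids up by one and send "No finding" to 0 (background)."""
    if class_id == NO_FINDING_ORIGINAL:
        return 0
    return class_id + 1


# Original ids 0..14 become 1..14 plus 0 for "No finding".
remapped = [remap_class_id(c) for c in range(15)]
print(remapped)
```

<p style="font-size:20px;">So the 14 abnormalities end up as labels 1 to 14, and label 0 is left for the background.</p>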

# Visualizing the Data


In [None]:
image_ids = train_df['image_id'].unique()

# Let's make the Validation and Training sets:

# Grab everything but the last 3000 ids.
train_ids = image_ids[:-3000]

# Grab the leftover 3000 ids.
valid_ids = image_ids[-3000:]
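
<p style="font-size:20px;">The split above simply takes the last 3000 ids in file order. If the order in train.csv turned out not to be random (I have not checked), a seeded shuffle before slicing would be a safer alternative. A minimal sketch, where <code>split_ids</code> and the seed value are hypothetical and not part of this notebook:</p>

```python
import random


def split_ids(ids, n_valid, seed=42):
    """Return (train_ids, valid_ids) after a seeded shuffle of the ids."""
    if not 0 < n_valid < len(ids):
        raise ValueError("n_valid must be between 1 and len(ids) - 1")
    shuffled = list(ids)
    # A private Random instance keeps the split reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(shuffled)
    return shuffled[:-n_valid], shuffled[-n_valid:]


# Toy ids standing in for the real image ids.
example_ids = [f"img_{i}" for i in range(10)]
train_part, valid_part = split_ids(example_ids, n_valid=3)
print(len(train_part), len(valid_part))
```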


In [None]:
filename = trainpath + "/" + train_ids[5] + ".dicom"

ds = pydicom.dcmread(filename)
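# Label the plot with the findings annotated for this image, not with row 5 of the dataframe.
plt.xlabel(", ".join(train_df.loc[train_df["image_id"] == train_ids[5], "class_name"].unique()))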
plt.imshow(ds.pixel_array, cmap=plt.cm.bone) 

In [None]:
filename = trainpath + "/" + train_ids[2] + ".dicom"

ds = pydicom.dcmread(filename)
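# Label the plot with the findings annotated for this image, not with row 2 of the dataframe.
plt.xlabel(", ".join(train_df.loc[train_df["image_id"] == train_ids[2], "class_name"].unique()))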
plt.imshow(ds.pixel_array, cmap=plt.cm.bone) 

In [None]:
filename = trainpath + "/" + train_ids[1] + ".dicom"

ds = pydicom.dcmread(filename)
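# Label the plot with the findings annotated for this image, not with row 1 of the dataframe.
plt.xlabel(", ".join(train_df.loc[train_df["image_id"] == train_ids[1], "class_name"].unique()))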
plt.imshow(ds.pixel_array, cmap=plt.cm.bone) 

In [None]:
valid_df = train_df[train_df['image_id'].isin(valid_ids)]
train_df = train_df[train_df['image_id'].isin(train_ids)]

<p style="font-size:27px">Dataset</p>
<p style="font-size:20px">Defining the class for the Dataset (heavily adapted, sorry not sorry. Never mind, am I being rude?)</p>

In [None]:
class ObjectDetection(Dataset):
    
    def __init__(self, dataframe, image_dir, transforms=None):
        super().__init__()
        
        self.image_ids = dataframe["image_id"].unique()
        self.df = dataframe
        self.image_dir = image_dir
        self.transforms = transforms
        
    def __getitem__(self, index):
        
        image_id = self.image_ids[index]
        records = self.df[(self.df['image_id'] == image_id)]
        records = records.reset_index(drop=True)

        dicom = pydicom.dcmread(f"{self.image_dir}/{image_id}.dicom")
        
        image = dicom.pixel_array
        
        if "PhotometricInterpretation" in dicom:
            if dicom.PhotometricInterpretation == "MONOCHROME1":
                image = np.amax(image) - image
        
        intercept = dicom.RescaleIntercept if "RescaleIntercept" in dicom else 0.0
        slope = dicom.RescaleSlope if "RescaleSlope" in dicom else 1.0
        
        if slope != 1:
            image = slope * image.astype(np.float64)
            image = image.astype(np.int16)
            
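        # Cast to a wider signed type first so the add works whatever the pixel dtype is.
        image = image.astype(np.int32) + int(intercept)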
        
        # Seems like it's the time to Normalize.
        image = np.stack([image, image, image])
        image = image.astype('float32')
        image = image - image.min()
        image = image / image.max()
        image = image * 255.0
        image = image.transpose(1,2,0)
       
        if records.loc[0, "class_id"] == 0:
            records = records.loc[[0], :]
        
        boxes = records[['x_min', 'y_min', 'x_max', 'y_max']].values
        area = (boxes[:, 3] - boxes[:, 1]) * (boxes[:, 2] - boxes[:, 0])
        area = torch.as_tensor(area, dtype=torch.float32)
        labels = torch.tensor(records["class_id"].values, dtype=torch.int64)

        # suppose all instances are not crowd
        iscrowd = torch.zeros((records.shape[0],), dtype=torch.int64)

        target = {}
        target['boxes'] = boxes
        target['labels'] = labels
        target['image_id'] = torch.tensor([index])
        target['area'] = area
        target['iscrowd'] = iscrowd

        if self.transforms:
            sample = {
                'image': image,
                'bboxes': target['boxes'],
                'labels': labels
            }
            sample = self.transforms(**sample)
            image = sample['image']
            
            target['boxes'] = torch.tensor(sample['bboxes'])

        if target["boxes"].shape[0] == 0:
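            # Albumentations dropped the box (the 1x1 "No finding" box in the corner), so put it back.
            # float32 keeps the dtype consistent with the boxes built from the transform output.
            target["boxes"] = torch.tensor([[0.0, 0.0, 1.0, 1.0]], dtype=torch.float32)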
            target["area"] = torch.tensor([1.0], dtype=torch.float32)
            target["labels"] = torch.tensor([0], dtype=torch.int64)
            
        return image, target
    
    def __len__(self):
        return self.image_ids.shape[0]
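
<p style="font-size:20px">A note on the dummy box: the "No finding" rows get a 1x1 box (0, 0, 1, 1) rather than all zeros because, as far as I know, torchvision's detection models reject boxes with non-positive width or height during training. Here is a small plain-Python sketch of that check. <code>box_area</code> and <code>is_degenerate</code> are illustrative helpers of mine, not torchvision functions, and the third box is made up.</p>

```python
def box_area(box):
    """Area of a pascal_voc box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return (x_max - x_min) * (y_max - y_min)


def is_degenerate(box):
    """True when the box has zero or negative width or height."""
    x_min, y_min, x_max, y_max = box
    return x_max <= x_min or y_max <= y_min


all_zero_box = (0.0, 0.0, 0.0, 0.0)   # what fillna(0) alone would leave behind
dummy_box = (0.0, 0.0, 1.0, 1.0)      # the 1x1 box used for "No finding"
made_up_box = (10.0, 20.0, 30.0, 60.0)

for box in (all_zero_box, dummy_box, made_up_box):
    print(box, "degenerate:", is_degenerate(box), "area:", box_area(box))
```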

# Data Augmentation

<p style="font-size:20px">Starting with the image transformations used for data augmentation. Here the training images get a random flip, and both sets are normalized and converted to tensors.</p>

In [None]:
def get_train_transform():
    
    return A.Compose([
        
        # Pass the probability by keyword; a bare positional 0.3 is not read as p.
        A.Flip(p=0.3),
        
        # Normalizing the Values.
        A.Normalize(mean=(0,0,0), std=(1,1,1),
                    max_pixel_value = 255.0, p=1.0),
        # Converting them to tensors.
        ToTensorV2(p=1.0)
        
    ], bbox_params = {'format': 'pascal_voc', 
                      'label_fields': ['labels']})


def get_valid_transform():
    
    return A.Compose([
        
        # Normalizing the Values.
        A.Normalize(mean=(0,0,0), std=(1,1,1),
                    max_pixel_value = 255.0, p=1.0),
        # Converting them to tensors.
        ToTensorV2(p=1.0)
    
    ], bbox_params = {'format': 'pascal_voc',
                      'label_fields': ['labels']})
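
<p style="font-size:20px">Since a flip has to move the boxes along with the pixels, here is roughly what happens to a pascal_voc box under a horizontal flip. This is a hand-written sketch for intuition, not Albumentations' actual code, and <code>hflip_box</code> is a made-up name. The box and image width are made up too.</p>

```python
def hflip_box(box, image_width):
    """Mirror a pascal_voc box (x_min, y_min, x_max, y_max) left to right."""
    x_min, y_min, x_max, y_max = box
    # The old right edge becomes the new left edge, and vice versa.
    return (image_width - x_max, y_min, image_width - x_min, y_max)


box = (100, 50, 300, 200)
flipped = hflip_box(box, image_width=1000)
print(flipped)
```

<p style="font-size:20px">The width and height of the box stay the same, and flipping twice gives back the original box. This is why the transform needs <code>bbox_params</code>: without them only the image would be flipped.</p>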

# Grabbing the Model

In [None]:
models = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained = True)

In [None]:
# Defining the number of classes.
num_classes = 15

in_features = models.roi_heads.box_predictor.cls_score.in_features

models.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)


In [None]:
def collate_function(batch):
    return tuple(zip(*batch))

# Finally working to collect all the stuff written up there into datasets.

# Train:
train_dataset = ObjectDetection(train_df, trainpath, get_train_transform())

# Valid:
valid_dataset = ObjectDetection(valid_df, trainpath, get_valid_transform())

indices = torch.randperm(len(train_dataset)).tolist()

# And of course the dataloaders.

train_data_loader = DataLoader(train_dataset, batch_size=3,
                               shuffle=True, num_workers=8,
                               collate_fn=collate_function)

valid_data_loader = DataLoader(valid_dataset, batch_size=8,
                               shuffle=False, num_workers=4,
                               collate_fn=collate_function)
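
<p style="font-size:20px">Why the custom collate function? Every image can have a different number of boxes, so the targets cannot be stacked into one tensor. <code>tuple(zip(*batch))</code> just regroups the (image, target) pairs into a tuple of images and a tuple of targets. A toy demo, with strings and lists standing in for the real tensors:</p>

```python
def collate_function(batch):
    # Turn [(img1, tgt1), (img2, tgt2), ...] into ((img1, img2, ...), (tgt1, tgt2, ...)).
    return tuple(zip(*batch))


# Two fake samples with a different number of boxes each.
batch = [
    ("image_a", {"boxes": [[0, 0, 1, 1]]}),
    ("image_b", {"boxes": [[5, 5, 20, 30], [40, 10, 80, 60]]}),
]

images, targets = collate_function(batch)
print(images)
print([len(t["boxes"]) for t in targets])
```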


In [None]:
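# Use the GPU when there is one, otherwise fall back to the (much slower) CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')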

In [None]:
images, targets = next(iter(train_data_loader))

# Training the Model

In [None]:
class Averager:
    def __init__(self):
        self.current_total = 0.0
        self.iterations = 0.0
        
    def send(self, value):
        self.current_total += value
        self.iterations += 1

    @property
    def value(self):
        if self.iterations == 0:
            return 0
        else:
            return 1.0 * self.current_total / self.iterations
        
    def reset(self):
        self.current_total = 0.0
        self.iterations = 0.0

<p style="font-size:27px;">Setting some Hyperparameters</p>

In [None]:
models.to(device)

params = [p for p in models.parameters()] 
optimizer = torch.optim.SGD(params, lr = 0.005, momentum=0.9, weight_decay=0.0004)

lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=4, gamma=0.1)

num_epochs = 10
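
<p style="font-size:20px;">With <code>step_size=4</code> and <code>gamma=0.1</code>, StepLR multiplies the learning rate by 0.1 every 4 epochs. Here is a plain-Python sketch of the schedule that should come out of the settings above, assuming the scheduler is stepped once per epoch. <code>step_lr</code> is a hypothetical helper that mirrors the formula; it is not the scheduler itself.</p>

```python
def step_lr(base_lr, epoch, step_size=4, gamma=0.1):
    """Learning rate used during `epoch` (0-based) under a step decay."""
    return base_lr * gamma ** (epoch // step_size)


# Same numbers as the optimizer and scheduler above.
schedule = [step_lr(0.005, epoch) for epoch in range(10)]
for epoch, lr in enumerate(schedule):
    print(f"epoch {epoch}: lr = {lr:.6f}")
```

<p style="font-size:20px;">So with 10 epochs the learning rate drops twice: after epoch 3 and after epoch 7.</p>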

<p style="font-size:27px;">Training the Models</p>
<p style="font-size:20px;"> Well, let's see how the model does.</p>

In [None]:
loss_hist = Averager()
itr = 1

for epoch in range(num_epochs):
    loss_hist.reset()
    
    for images, targets in train_data_loader:
        
        images = list(image.to(device) for image in images)
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
        
        loss_dict = models(images, targets)
        
        losses = sum(loss for loss in loss_dict.values())
        loss_value = losses.item()
        
        loss_hist.send(loss_value)
        
        optimizer.zero_grad()
        losses.backward()
        optimizer.step()
        
        if itr % 100 == 0:
            # An f-string avoids adding a float and a string, which raises a TypeError.
            print(f"The loss at iteration {itr} is {loss_hist.value:.4f}.")
            
        itr += 1
        
        # break  # Uncomment to stop after one batch per epoch for a quick smoke test.
        
        
    if lr_scheduler is not None:
        lr_scheduler.step()
        
    print("The loss for Epoch", epoch, "is:", loss_hist.value)

<p style="font-size:20px">Well, this seems like the end to this notebook.<br><br> Yes. <br>
    <br>
    I had been sitting on this one for a while now and felt like just posting it.
    So, this does feel like lifting a weight off my chest. Juggling College, Deep Learning, 
    NLP and of course startup ideas that seem totally unfeasible but I'd rather still pursue them.
    Why? Because we are young my friend.</p>
    
    

<p style="font-size:20px">Please Upvote if you found it useful. It will help me edge further towards my goal of becoming a Data Scientist.<br>Please <a href="https://www.kaggle.com/pestipeti/">check out the author</a> from whom I took some assistance and give them a follow!</p>