<div style="border-radius:10px;
            border:#0b0265 solid;
           background-color:#0077be;
           font-size:110%;
           letter-spacing:0.5px;
            text-align: center">

<center><h1 style="padding: 25px 0px; background color:#0077be; font-weight: bold; font-family: Cursive">
Lung Cancer Detection Primary Investigation</h1></center>

</div>

## Lung Cancer Detection

One of the leading causes of death worldwide has been recognised as lung cancer . It is 
one of the most dangerous tumours that can harm a person. Its death rate ranks among 
all tumor deaths and is the leading cause of cancer death in both men and women .
Lung cancer accounts for 13% of all cancer cases with over 1.8 million new cases and
1.6 million deaths annually . A tumour is created when abnormal cells continue to 
proliferate and grow. Lung cancer has the highest mortality rate compared to other 
cancer types. Yet individuals have a greater success rate it will be found in the early 
stages of life. Cancer cells are distributed in blood from the lungs.To Examination and 
treatment of lung disease has become one of the biggest obstacles that humanity faces 
in recent years. To overcome this we used convolutional neural Network (CNN) to 
identify the lung tumors as malignant/benign

In [1]:
import numpy as np
import pandas as pd 

import os
for dirname, _, filenames in os.walk('"C:/Users/91812/Downloads/archive"'):
    for filename in filenames:
        os.path.join(dirname, filename)


## Importing Libraries & Modules

In [2]:
import torch
from torch import Tensor, nn
from torchvision import utils as U
from torchvision.transforms import transforms as T
from torch.utils.data import DataLoader
from PIL import Image
import torch.optim
import torch.nn.functional as F
from pathlib import Path
import matplotlib.pyplot as plt
import keras
from tqdm import tqdm

In [3]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

## Pre-Processing Images

Preprocessing images plays a crucial role in the accurate detection of lung cancer. Through a series of preprocessing steps, such as image resizing, normalization, noise reduction, and contrast enhancement, the quality and suitability of the lung images are improved. These preprocessing techniques help in optimizing the images for subsequent analysis and feature extraction, enabling sophisticated algorithms to identify potential abnormalities, nodules, or tumors indicative of lung cancer. The careful application of preprocessing methods enhances the accuracy and reliability of the overall lung cancer detection system, assisting medical professionals in early diagnosis and timely intervention

In [4]:
dictionary = {0: 'NORMAL', 1: 'PNEUMONIA'}

def map_img(img):
    label = img.parent.name
    img = Image.open(img).convert('L')
    img = img.resize((150, 150))
    y = lambda x: torch.tensor(0, device = device) if x == dictionary[0] else torch.tensor(1, device = device)
    process = T.Compose([T.ConvertImageDtype(dtype = torch.float32)])
    return img, process(y(label))

In [5]:
def load_tensors(path):
    return [map_img(img) for folder in path.iterdir() for img in list(folder.iterdir())]

In [6]:
test = Path("C:/Users/91812/Downloads/archive/chest_xray/test")
val = Path('C:/Users/91812/Downloads/archive/chest_xray/val')

In [7]:
test_ts = load_tensors(test)
val_ts = load_tensors(val)

In [8]:
normal_train = 'C:/Users/91812/Downloads/archive/chest_xray/train/NORMAL'
pneumonia_train = 'C:/Users/91812/Downloads/archive/chest_xray/train/PNEUMONIA'

train_normal = [os.path.join(normal_train, file) for file in os.listdir(normal_train)]
train_pneumonia = [os.path.join(pneumonia_train, file) for file in os.listdir(pneumonia_train)]

np.random.seed(30)

train_normal_sample = np.random.choice(train_normal, size = 1000)
train_normal = list(zip(train_normal_sample, np.zeros(1000)))

train_pneumonia_sample = np.random.choice(train_pneumonia, size = 1000)
train_pneumonia = list(zip(train_pneumonia_sample, np.ones(1000)))

train_dataset = train_normal + train_pneumonia

In [9]:
def map_train(item):
    image, label = item
    image = Image.open(image).convert('L')
    image = image.resize((150, 150))
    label = torch.tensor(label, device = device)
    process = T.Compose([T.ConvertImageDtype(dtype = torch.float32)])
    return image, process(label)

train_ts = [map_train(item) for item in train_dataset]

In [10]:
transform = T.Compose([T.PILToTensor(),T.ConvertImageDtype(dtype = torch.float32), T.RandomRotation(20), T. RandomAffine(10)])


x_train = torch.cat([transform(t[0]).unsqueeze(1) for t in train_ts])
y_train = torch.cat([t[1].unsqueeze(0).unsqueeze(1) for t in train_ts])

x_test = torch.cat([transform(t[0]).unsqueeze(1) for t in test_ts])
y_test = torch.cat([t[1].unsqueeze(0).unsqueeze(1) for t in test_ts])

x_valid = torch.cat([transform(t[0]).unsqueeze(1) for t in val_ts])
y_valid = torch.cat([t[1].unsqueeze(0).unsqueeze(1) for t in val_ts])

In [11]:
train = tuple(zip(x_train, y_train))
test = tuple(zip(x_test, y_test))
valid = tuple(zip(x_valid, y_valid))

## Loading the DataLoaders

In [12]:
train_dl = DataLoader(train, batch_size = 100, shuffle = True)
test_dl = DataLoader(test, batch_size = 100, shuffle = True)
val_dl = DataLoader(valid, batch_size = 100, shuffle = True)

## Defining the Convolutional Neural Net

In [13]:
class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.conv1 = self.conv(1, 32)
        self.conv2 = self.conv(32, 64, dropout = True)
        self.conv3 = self.conv(64, 64)
        self.conv4 = self.conv(64, 128, dropout = True)
        self.conv5 = self.conv(128, 256, dropout = True)
        
        self.relu = nn.ReLU()
        self.flt = nn.Flatten(start_dim = 1)
        self.dropout = nn.Dropout(0.2)
        self.sigmoid = nn.Sigmoid()
        
        self.fc1 = nn.Linear(4096, 1024)
        self.fc2 = nn.Linear(1024, 1)


    def conv(self, in_channels, out_channels, stride = 1, dropout = False):
        layers = [nn.Conv2d(in_channels, out_channels, (3, 3), stride = stride, padding = (1, 1)), nn.ReLU()]
        if dropout == True:
            layers.append(nn.Dropout(0.1))
        layers.append(nn.BatchNorm2d(out_channels))
        layers.append(nn.MaxPool2d((2,2), stride = 2))
        return nn.Sequential(*layers)
    
    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = self.conv3(x)
        x = self.conv4(x)
        x = self.conv5(x)
        x = self.flt(x)
        x = self.relu(self.fc1(x))
        x = self.dropout(x)
        x = self.fc2(x)
        x = self.sigmoid(x)
        return x      

In [14]:
model = NeuralNetwork()
model.to(device)

optimise = torch.optim.Adam(model.parameters(), lr = 1e-4, weight_decay = 0.01)


l = nn.BCELoss()

In [15]:
def run(epochs):
    training_loss = []
    training_accuracy = []
    test_loss = []
    test_accuracy = []
    
    # train
    for i in range(epochs):
        model.train()
        epoch_loss = 0
        epoch_accuracy = 0
        for batch in tqdm(train_dl):
            img, labels = batch
            img = img.to(device)
            labels = labels.to(device)
            
            optimise.zero_grad()
            
            preds = model(img)
            loss = l(preds, labels)
            
            
            loss.backward()
            
            optimise.step()

            epoch_loss += loss.item()
            accuracy = (preds>0.5).float() == labels.float()
            accuracy = accuracy.float().mean()
            epoch_accuracy += accuracy.item()
                                
        
        model.eval()
        img, labels = next(iter(test_dl))
        img = img.to(device)
        labels = labels.to(device)

        preds = model(img)
        eval_loss = l(preds, labels)

        eval_accuracy = (preds>0.5).float() == labels.float()
        eval_accuracy = accuracy.float().mean().item()    
                
        training_loss.append(epoch_loss)
        training_accuracy.append(epoch_accuracy)
        test_loss.append(eval_loss.item())
        test_accuracy.append(eval_accuracy)
        
        print(f"Epoch: {i+1}\t Training Loss: {epoch_loss/len(train_dl)}\t Training Accuracy: {epoch_accuracy/len(train_dl)}\nTest Loss: {eval_loss}, \tAccuracy: {accuracy}")

In [16]:
run(5)

100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [01:44<00:00,  5.21s/it]
  0%|                                                                                           | 0/20 [00:00<?, ?it/s]

Epoch: 1	 Training Loss: 0.3048105966299772	 Training Accuracy: 0.8644999966025353
Test Loss: 3.579519748687744, 	Accuracy: 0.8999999761581421


100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [01:48<00:00,  5.43s/it]
  0%|                                                                                           | 0/20 [00:00<?, ?it/s]

Epoch: 2	 Training Loss: 0.1493612129241228	 Training Accuracy: 0.9435000032186508
Test Loss: 5.083327770233154, 	Accuracy: 0.9300000071525574


100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [01:48<00:00,  5.43s/it]
  0%|                                                                                           | 0/20 [00:00<?, ?it/s]

Epoch: 3	 Training Loss: 0.11020331010222435	 Training Accuracy: 0.959500002861023
Test Loss: 4.252012729644775, 	Accuracy: 0.9599999785423279


100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [01:47<00:00,  5.37s/it]
  0%|                                                                                           | 0/20 [00:00<?, ?it/s]

Epoch: 4	 Training Loss: 0.08108987398445607	 Training Accuracy: 0.9700000077486038
Test Loss: 5.047478199005127, 	Accuracy: 0.9900000095367432


100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [01:46<00:00,  5.34s/it]


Epoch: 5	 Training Loss: 0.05446241702884436	 Training Accuracy: 0.9840000122785568
Test Loss: 3.997859001159668, 	Accuracy: 1.0
