<a href="https://colab.research.google.com/github/Joycechidi/Deep-Learning-/blob/master/CNN.Malaria_Cell_Detection_Using_VGGNet_Transfer_Learning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
from google.colab import drive
drive.mount('/content/gdrive')

Mounted at /content/gdrive


## Transfer Learning
In this notebook, I use VGGNet, pre-trained on the ImageNet dataset, as a feature extractor.

I chose VGGNet because its architecture is simple and it performs well: it was the runner-up in the classification task of the 2014 ImageNet competition (ILSVRC).

All the convolutional layers are kept, but I replace the final fully-connected layer with my own classifier. This way I can use VGGNet as a fixed feature extractor for my images and then train a simple classifier on top of it.

*   Use all but the last fully-connected layer as a fixed feature extractor.
*   Define a new, final classification layer and apply it to a task of our choice!

I use a dataset of infected and uninfected malaria cell images, downloaded from [Kaggle](https://www.kaggle.com/iarunava/cell-images-for-detecting-malaria/download).

### Objective
Build a convolutional neural network that can detect the presence of the malaria parasite in blood cells.

After training on this dataset with VGGNet, I will also train a pre-trained ResNet model and compare the performance of the two models.

In [0]:
import os 
import numpy as np
import pandas as pd


import torch
import torchvision
from torchvision import datasets, models, transforms

import torch.nn as nn
from torch.utils.data import DataLoader
from torch.utils.data.sampler import SubsetRandomSampler
import torch.nn.functional as F

import matplotlib.pyplot as plt


%matplotlib inline

In [3]:
print(os.listdir("/content/gdrive/My Drive/Udacity_AI_Codes/DLND_Works/data/cell_images/cell_images/"))

['Uninfected', 'Parasitized', '.DS_Store']


In [4]:
# check if CUDA is available
train_on_gpu = torch.cuda.is_available()

if not train_on_gpu:
    print('CUDA is not available.  Training on CPU ...')
else:
    print('CUDA is available!  Training on GPU ...') 

CUDA is available!  Training on GPU ...


## Loading the Data

In [0]:
# root directory of the dataset; it contains one sub-folder per class
data_dir = "/content/gdrive/My Drive/Udacity_AI_Codes/DLND_Works/data/cell_images/cell_images/"

# train_dir = os.path.join(data_dir, 'train/')
# test_dir = os.path.join(data_dir, 'test/')

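# classes are folders in the data directory with these names;
# ImageFolder assigns label indices in alphabetical order of the folder names,
# so 'Parasitized' is label 0 and 'Uninfected' is label 1
classes = ['Parasitized', 'Uninfected']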

## Transforming the Data

Since I am training on this dataset with a pre-trained model, I have to bring the input data into the shape that the pre-trained model expects.

VGG16 expects 224x224 square images as input, so I resize each cell image to that shape.

### Data Augmentation


1.   Resize (or randomly crop and resize) every image to the 224x224 input that VGGNet expects
2.   Augment the training images with random horizontal and vertical flips, small rotations, and slight color jitter
3.   Convert the images into PyTorch tensors






In [0]:
# transforms applied when the data is loaded with ImageFolder

# VGG-16 takes 224x224 images as input, so we resize all of them
data_transform = {
    "train_transforms": transforms.Compose([transforms.RandomResizedCrop(224),
                                            transforms.ColorJitter(0.05),
                                            transforms.RandomHorizontalFlip(),
                                            transforms.RandomVerticalFlip(),
                                            transforms.RandomRotation(20),
                                            transforms.ToTensor()]),
    "valid_transforms": transforms.Compose([transforms.Resize(224),
                                            transforms.CenterCrop(224),
                                            transforms.ToTensor()]),
    "test_transforms": transforms.Compose([transforms.Resize(224),
                                           transforms.CenterCrop(224),
                                           transforms.ToTensor()]),
}

In [0]:
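test_size = 0.2

# two views of the same folder: augmented for training, deterministic for testing
train_data = datasets.ImageFolder(data_dir, transform=data_transform["train_transforms"])
test_data = datasets.ImageFolder(data_dir, transform=data_transform["test_transforms"])

# shuffle the image indices and hold out the first 20% for testing
num_train = len(train_data)
indices = list(range(num_train))
np.random.shuffle(indices)

test_split = int(np.floor(test_size * num_train))
test_index, train_index = indices[:test_split], indices[test_split:]

# the samplers draw only from their own index list, so the two sets never overlap
train_sampler = SubsetRandomSampler(train_index)
test_sampler = SubsetRandomSampler(test_index)

train_loader = DataLoader(train_data, sampler=train_sampler, batch_size=104)
test_loader = DataLoader(test_data, sampler=test_sampler, batch_size=58)
print("Images in Test set: {}\nImages in Train set: {}".format(len(test_index), len(train_index)))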
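
To make the index-based split easier to verify, here is a small self-contained sketch on a toy dataset. The helper `split_indices`, the fixed seed, and the 50-item `TensorDataset` are all assumptions introduced for illustration. It checks that the two `SubsetRandomSampler` loaders never return the same item.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.sampler import SubsetRandomSampler


def split_indices(num_items, test_fraction, seed=0):
    """Hypothetical helper: shuffle indices and return (train, test) lists."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(num_items)
    n_test = int(np.floor(test_fraction * num_items))
    return indices[n_test:].tolist(), indices[:n_test].tolist()


# Toy dataset: each "image" is just its own index, so items are easy to track.
toy_data = TensorDataset(torch.arange(50).float().unsqueeze(1), torch.arange(50) % 2)
train_idx, test_idx = split_indices(len(toy_data), test_fraction=0.2)

toy_train_loader = DataLoader(toy_data, batch_size=8, sampler=SubsetRandomSampler(train_idx))
toy_test_loader = DataLoader(toy_data, batch_size=8, sampler=SubsetRandomSampler(test_idx))

# Collect every item each loader yields during one pass.
seen_train = {int(x) for batch, _ in toy_train_loader for x in batch.flatten()}
seen_test = {int(x) for batch, _ in toy_test_loader for x in batch.flatten()}

print(len(seen_train), len(seen_test), seen_train & seen_test)  # 40 10 set()
```

Because each sampler only draws from its own index list, the same underlying dataset can safely back both loaders.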

## **DataLoaders**

In [0]:
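# Define dataloaders parameters
batch_size = 100
num_workers = 0

# prepare dataloaders; the samplers keep the train and test indices disjoint
# (a sampler cannot be combined with shuffle=True, and it already draws in random order)
train_loader = torch.utils.data.DataLoader(train_data, batch_size=batch_size,
                                           num_workers=num_workers, sampler=train_sampler)
test_loader = torch.utils.data.DataLoader(test_data, batch_size=batch_size,
                                          num_workers=num_workers, sampler=test_sampler)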

## **Data Visualization**



In [0]:
# obtain one batch of training images
dataiter = iter(train_loader)
images, labels = next(dataiter)
images = images.numpy() # convert images to numpy for display

# plot up to 100 images from the batch in a 10x10 grid, along with the corresponding labels
fig = plt.figure(figsize=(25, 4))
for idx in np.arange(min(len(images), 100)):
    ax = fig.add_subplot(10, 10, idx+1, xticks=[], yticks=[])
    ax.imshow(np.transpose(images[idx], (1, 2, 0)))
    ax.set_title(classes[labels[idx]])