## Computer Vision (Coding L2)

The aim of this challenge is to assess your skills in applying creative and quantitative computer vision and deep learning techniques to measure the location and diameter of a subject’s pupil and iris from a cropped eye image.

Included here are 500 random images from our Training dataset:

https://www.dropbox.com/sh/wsxpcjcr0f4exg4/AAC8yyF0Q8qFHLmvw07QlAcUa?dl=0


The challenge is specifically as follows:
- Estimate the average diameter (in pixels) of both a) the pupil and b) the iris in each image.
- Alternatively, you can treat these objects as ellipses and estimate the major and minor axes.
- Train a deep learning model with these images and demonstrate the model’s performance on the testing images provided in the link.
- The same approach must be applied to each image.
- You may choose to implement any state-of-the-art model and may also use pre-trained weights to speed up the training process.
- We are interested in seeing how your approach differs from an existing off-the-shelf model and how it caters to our problem.
- We will go through the code to understand your implementation in depth and to get a sense of your coding style.
- Submissions will be assessed on the accuracy of the inferred diameters across the provided testing images.
- Due: Within 7 days from this email.
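
The first two bullets leave the exact definition of "diameter" open. A minimal sketch of two options follows, assuming a binary mask for one structure is already available. The first is the diameter of a circle with the same area. The second gives ellipse axis lengths from the second-order moments of the region, since for a filled ellipse the coordinate variance along an axis equals the squared semi-axis divided by 4. The function names are illustrative and not part of the challenge material.

```python
import numpy as np


def equivalent_diameter(mask: np.ndarray) -> float:
    """Diameter of the circle whose area equals the foreground pixel count."""
    area = float(np.count_nonzero(mask))
    return 2.0 * np.sqrt(area / np.pi)


def ellipse_axes(mask: np.ndarray) -> tuple:
    """Major and minor axis lengths (pixels) from second-order moments."""
    ys, xs = np.nonzero(mask)
    if xs.size < 2:
        return 0.0, 0.0
    # 2 x 2 covariance matrix of the foreground pixel coordinates
    cov = np.cov(np.stack([xs, ys]).astype(np.float64))
    eigvals = np.linalg.eigvalsh(cov)  # ascending order
    # full axis length = 4 * sqrt(variance along that axis)
    minor, major = 4.0 * np.sqrt(np.clip(eigvals, 0.0, None))
    return float(major), float(minor)


# Synthetic filled ellipse with semi-axes 60 and 30,
# i.e. true axis lengths of 120 and 60 pixels.
yy, xx = np.mgrid[:200, :200]
demo = ((xx - 100) / 60.0) ** 2 + ((yy - 100) / 30.0) ** 2 <= 1.0
major_len, minor_len = ellipse_axes(demo)
circle_diameter = equivalent_diameter(demo)
```

The same functions can be applied to ground-truth and predicted masks alike, so both sides of an error metric are measured the same way.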


### Notes:
The provided mask images contain 3 channels. You may select any channel to access the pupil and iris coordinates. In any selected channel, pixels with value '0' correspond to the background class, pixels with value '1' to the pupil class, and pixels with value '2' to the iris class.
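
The label convention above can be turned into per-class binary masks as in the sketch below. It assumes the iris label ('2') marks only the ring around the pupil, so the full iris disc is taken as the union of both classes; that should be checked against the actual masks. `split_mask` is an illustrative helper name.

```python
import numpy as np

BACKGROUND, PUPIL, IRIS = 0, 1, 2


def split_mask(mask: np.ndarray, channel: int = 0):
    """Return boolean masks (pupil, iris_ring, iris_disc) from a label image."""
    # All channels carry the same labels, so any single channel will do
    labels = mask[..., channel] if mask.ndim == 3 else mask
    pupil = labels == PUPIL
    iris_ring = labels == IRIS
    # Assumption: the iris label excludes the pupil, so the full disc is the union
    iris_disc = pupil | iris_ring
    return pupil, iris_ring, iris_disc


# Toy 5 x 5 mask: a 3 x 3 iris block with a single pupil pixel in the centre
toy = np.zeros((5, 5, 3), dtype=np.uint8)
toy[1:4, 1:4, :] = IRIS
toy[2, 2, :] = PUPIL
pupil, iris_ring, iris_disc = split_mask(toy)
```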

### Metrics to implement (6 in total):

• Mean % Pupil diameter error = Absolute(Predicted_pupil_diameter – Groundtruth_pupil_diameter) / Groundtruth_pupil_diameter * 100
[2 in total: using radiusY and radius for the pupil]

• Mean % Iris diameter error = Absolute(Predicted_iris_diameter – Groundtruth_iris_diameter) / Groundtruth_iris_diameter * 100
[2 in total: using radiusY and radius for the iris]

• Mean Pupil IoU (0 to 1)

• Mean Iris IoU (0 to 1)
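
The metrics above can be sketched in NumPy as follows. The function names are illustrative. Returning an IoU of 1.0 when both masks are empty is an assumed convention, not something the task statement specifies.

```python
import numpy as np


def percent_diameter_error(pred, truth) -> float:
    """Mean of |pred - truth| / truth * 100 over all images."""
    pred = np.asarray(pred, dtype=np.float64)
    truth = np.asarray(truth, dtype=np.float64)
    return float(np.mean(np.abs(pred - truth) / truth * 100.0))


def iou(pred_mask, true_mask) -> float:
    """Intersection over union of two binary masks, in [0, 1]."""
    pred_mask = np.asarray(pred_mask, dtype=bool)
    true_mask = np.asarray(true_mask, dtype=bool)
    union = np.logical_or(pred_mask, true_mask).sum()
    if union == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return float(np.logical_and(pred_mask, true_mask).sum() / union)


def mean_iou(pred_masks, true_masks) -> float:
    """Average IoU over paired lists of masks."""
    return float(np.mean([iou(p, t) for p, t in zip(pred_masks, true_masks)]))


# Two images with 10% error each: |110 - 100| / 100 and |45 - 50| / 50
example_error = percent_diameter_error([110, 45], [100, 50])

# Two 8-pixel masks overlapping in 4 pixels: IoU = 4 / 12
a = np.zeros((4, 4), dtype=bool)
b = np.zeros((4, 4), dtype=bool)
a[:2, :] = True
b[:, :2] = True
example_iou = iou(a, b)
```

Calling `percent_diameter_error` once per axis and per structure, plus `mean_iou` for the pupil and iris masks, gives the six values requested.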

An example of a cropped eye image and its corresponding segmentation mask is shown below.

## Submission should include a zip file containing the following:

1. Well-commented Python script file or Jupyter Notebook file (.py or .ipynb)
2. If using just a Python script file (.py), please include a document (.docx or .pdf) briefly explaining your approach
3. Please explain the reasoning behind the decisions you made, either in the code comments or in the attached document
4. You may use any DL framework from Keras/TensorFlow/PyTorch, although we would prefer PyTorch
5. Trained weights of the model architecture you used/implemented
6. Evaluation code to easily test out the trained model on our internal testing dataset
7. Metric results of your code on the provided testing set

PS: If the zip file is larger than the allowable size limit on Dropbox, please upload your submission to a shareable folder and include the link to your code + model in the document attached to the response.

In [None]:
!pip install git+https://github.com/qubvel/segmentation_models.pytorch

In [None]:
# Standard library
import copy  # used by train_model to snapshot the best weights
import os
import time
import warnings
from pathlib import Path

# Third-party
import cv2
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import segmentation_models_pytorch as smp
import torch
from torch.utils import data
from torchvision import transforms, utils

# Ignore warnings
warnings.filterwarnings("ignore")

plt.ion()   # interactive mode
proj_path = Path('/content/drive/MyDrive/Colab Notebooks/Code Tasks/Segmentation')

In [None]:
# create fake test data
x = np.random.randint(0, 256, size=(128, 128, 3), dtype=np.uint8)
# randint's upper bound is exclusive, so this mask contains only class 0 (background)
y = np.random.randint(0, 1, size=(128, 128), dtype=np.uint8)
cv2.imwrite(str(proj_path/"Sample Data"/"img_x.png"), x)
cv2.imwrite(str(proj_path/"Sample Data"/"mask_y.png"), y)

True

In [None]:
class dataset_segmentation(data.Dataset):
    """Image segmentation dataset with caching, pretransforms and multiprocessing. Output is a dict."""

    def __init__(self, inputs: list, targets: list, transform=None, use_cache=False, pre_transform=None):
        self.inputs = inputs
        self.targets = targets
        self.transform = transform
        self.inputs_dtype = torch.float32
        self.targets_dtype = torch.long
        self.use_cache = use_cache
        self.pre_transform = pre_transform

        if self.use_cache:
            from itertools import repeat
            from multiprocessing import Pool

            with Pool() as pool:
                self.cached_data = pool.starmap(
                    self.read_images, zip(inputs, targets, repeat(self.pre_transform)))

    def __len__(self):
        return len(self.inputs)

    def __getitem__(self, index: int):
        if self.use_cache:
            x, y = self.cached_data[index]
        else:
            # Select the sample
            input_ID = self.inputs[index]
            target_ID = self.targets[index]

            # Load input and target
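            x, y = cv2.imread(str(input_ID)), cv2.imread(str(target_ID))
            # cv2.imread returns None instead of raising when a file cannot be read
            if x is None or y is None:
                raise FileNotFoundError(f"Could not read {input_ID} or {target_ID}")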

        # Preprocessing
        if self.transform is not None: x, y = self.transform(x, y)

        # Typecasting
        x, y = torch.from_numpy(x).type(self.inputs_dtype), torch.from_numpy(y).type(
            self.targets_dtype
        )

        return {
            "x": x,
            "y": y,
            "x_name": self.inputs[index].name,
            "y_name": self.targets[index].name,
        }

    @staticmethod
    def read_images(inp, tar, pre_transform):
        inp, tar = cv2.imread(str(inp)), cv2.imread(str(tar))
        if pre_transform:
            inp, tar = pre_transform(inp, tar)
        return inp, tar


In [None]:
inputs = [proj_path/"Sample Data"/"img_x.png"]
targets = [proj_path/"Sample Data"/"mask_y.png"]

training_dataset = dataset_segmentation(inputs=inputs, targets=targets, transform=None)

training_dataloader = data.DataLoader(dataset=training_dataset, batch_size=1, shuffle=True)

example = next(iter(training_dataloader))
x, y = example['x'], example['y']
print(f'x = shape: {x.shape}; type: {x.dtype}')
print(f'x = min: {x.min()}; max: {x.max()}')
print(f'y = shape: {y.shape}; class: {y.unique()}; type: {y.dtype}')

x = shape: torch.Size([1, 128, 128, 3]); type: torch.float32
x = min: 0.0; max: 255.0
y = shape: torch.Size([1, 128, 128, 3]); class: tensor([0]); type: torch.int64


## Data Augmentation

In [None]:
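import torchvision.transforms.functional as TF

# NOTE: these torchvision pipelines take a single PIL image or tensor, whereas
# dataset_segmentation calls transform(x, y) on NumPy arrays and then converts the
# result with torch.from_numpy. Random operations must also be applied identically
# to the image and its mask, so these pipelines need a paired wrapper before they
# can be passed to the dataset.
data_transformations = {
    'train': transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.RandomVerticalFlip(),
        transforms.ToTensor(),
    ]),
    'val': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
    ]),
}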
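
For segmentation, any random geometric augmentation has to be applied identically to the image and its mask. The sketch below shows one way to do that for flips, assuming NumPy arrays in `H x W [x C]` layout as returned by `cv2.imread` and a transform that is called as `transform(x, y)`. `PairedRandomFlip` is a hypothetical helper, not a torchvision class.

```python
import numpy as np


class PairedRandomFlip:
    """Apply the same random horizontal/vertical flip to an image and its mask."""

    def __init__(self, p: float = 0.5, seed=None):
        self.p = p
        self.rng = np.random.default_rng(seed)

    def __call__(self, x: np.ndarray, y: np.ndarray):
        # One random draw per flip direction, shared by image and mask
        if self.rng.random() < self.p:  # horizontal flip (width axis)
            x, y = np.flip(x, axis=1), np.flip(y, axis=1)
        if self.rng.random() < self.p:  # vertical flip (height axis)
            x, y = np.flip(x, axis=0), np.flip(y, axis=0)
        # np.flip returns views with negative strides, which torch.from_numpy
        # rejects, so copy into contiguous arrays before returning
        return np.ascontiguousarray(x), np.ascontiguousarray(y)


# Usage on a toy pair; with p=1.0 both flips are always applied
flip_both = PairedRandomFlip(p=1.0)
img = np.arange(24, dtype=np.uint8).reshape(2, 4, 3)
msk = img[..., 0].copy()
flipped_img, flipped_msk = flip_both(img, msk)
```

Flips leave label values untouched. Resizing or cropping a mask would additionally need nearest-neighbour interpolation so that class labels are not blended.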

In [None]:
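# Assumes inputs_train, targets_train, inputs_valid and targets_valid are lists of
# pathlib.Path objects produced by an earlier train/validation split.
# NOTE: the dataset calls transform(x, y), so each entry must accept an image/mask pair.

# dataset training
dataset_train = dataset_segmentation(inputs=inputs_train,
                                    targets=targets_train,
                                    transform=data_transformations['train'])

# dataset validation
dataset_valid = dataset_segmentation(inputs=inputs_valid,
                                    targets=targets_valid,
                                    transform=data_transformations['val'])

# dataloader training
dataloader_training = data.DataLoader(dataset=dataset_train,
                                 batch_size=2,
                                 shuffle=True)

# dataloader validation (no shuffling needed for evaluation)
dataloader_validation = data.DataLoader(dataset=dataset_valid,
                                   batch_size=2,
                                   shuffle=False)

# lookups used by train_model
dataloaders = {'train': dataloader_training, 'val': dataloader_validation}
dataset_sizes = {'train': len(dataset_train), 'val': len(dataset_valid)}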

## Training

In [None]:
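def train_model(model, criterion, optimizer, scheduler, num_epochs=25):
    """Train `model` and return it with the weights from the best validation epoch.

    Relies on the globals `dataloaders`, `dataset_sizes` and `device`.
    """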
    since = time.time()

    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0

    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch, num_epochs - 1))
        print('-' * 10)

        # Each epoch has a training and validation phase
        for phase in ['train', 'val']:
            if phase == 'train':
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode

            running_loss = 0.0
            running_corrects = 0

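            # Iterate over data.
            for batch in dataloaders[phase]:
                # dataset_segmentation yields dicts; cv2 images arrive as (N, H, W, C)
                inputs = batch['x'].permute(0, 3, 1, 2).to(device)
                # every mask channel holds the same labels, so keep the first one
                labels = batch['y'][..., 0].to(device)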

                # zero the parameter gradients
                optimizer.zero_grad()

                # forward
                # track history if only in train
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    _, preds = torch.max(outputs, 1)
                    loss = criterion(outputs, labels)

                    # backward + optimize only if in training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

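                # statistics
                running_loss += loss.item() * inputs.size(0)
                # mean pixel accuracy of the batch, weighted by batch size, so that
                # dividing by dataset_sizes[phase] below yields a value in [0, 1]
                running_corrects += (preds == labels).float().mean() * inputs.size(0)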
            if phase == 'train':
                scheduler.step()

            epoch_loss = running_loss / dataset_sizes[phase]
            epoch_acc = running_corrects.double() / dataset_sizes[phase]

            print('{} Loss: {:.4f} Acc: {:.4f}'.format(
                phase, epoch_loss, epoch_acc))

            # deep copy the model
            if phase == 'val' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())

        print()

    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(
        time_elapsed // 60, time_elapsed % 60))
    print('Best val Acc: {:.4f}'.format(best_acc))

    # load best model weights
    model.load_state_dict(best_model_wts)
    return model

## Inference

In [None]:
# Imports
import pathlib

import numpy as np
import torch
from skimage.io import imread
from skimage.transform import resize

#from inference import predict
#from unet import UNet

# root directory
root = pathlib.Path.cwd() / 'Carvana' / 'Test'
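def get_filenames_of_path(path: pathlib.Path, ext: str = '*'):
    """Returns a sorted list of files in a directory/path. Uses pathlib."""
    # glob order is arbitrary; sorting keeps inputs and targets paired,
    # assuming both folders use matching file names
    filenames = sorted(file for file in path.glob(ext) if file.is_file())
    return filenames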

# input and target files
images_names = get_filenames_of_path(root / 'Input')
targets_names = get_filenames_of_path(root / 'Target')

# read images and store them in memory
images = [imread(img_name) for img_name in images_names]
targets = [imread(tar_name) for tar_name in targets_names]

# Resize images and targets
images_res = [resize(img, (128, 128, 3)) for img in images]
resize_kwargs = {'order': 0, 'anti_aliasing': False, 'preserve_range': True}
targets_res = [resize(tar, (128, 128), **resize_kwargs) for tar in targets]

# device
if torch.cuda.is_available():
    device = torch.device('cuda')
else:
    device = torch.device('cpu')

# model
model = UNet(in_channels=3,
             out_channels=2,
             n_blocks=4,
             start_filters=32,
             activation='relu',
             normalization='batch',
             conv_mode='same',
             dim=2).to(device)


model_name = 'carvana_model.pt'
# map_location lets weights saved on a GPU load on a CPU-only machine
model_weights = torch.load(pathlib.Path.cwd() / model_name, map_location=device)

model.load_state_dict(model_weights)


# predict the segmentation maps 
output = [predict(img, model, preprocess, postprocess, device) for img in images_res]
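
The `preprocess` and `postprocess` helpers used above are not defined in this notebook. A minimal NumPy sketch of what they might look like is given below. It assumes the input is an `H x W x C` float image already scaled to [0, 1] (as produced by `skimage.transform.resize`), and that `postprocess` receives per-class scores of shape `1 x C x H x W` as a NumPy array. Both function bodies are illustrative assumptions.

```python
import numpy as np


def preprocess(img: np.ndarray) -> np.ndarray:
    """H x W x C image -> 1 x C x H x W float32 batch."""
    img = np.moveaxis(img, source=-1, destination=0)  # channels first
    return img[np.newaxis].astype(np.float32)         # add batch dimension


def postprocess(scores: np.ndarray) -> np.ndarray:
    """1 x C x H x W class scores -> H x W label map (argmax over classes)."""
    labels = np.argmax(scores, axis=1)  # shape 1 x H x W
    return labels[0].astype(np.uint8)


# Toy check with random input and hand-made scores for 3 classes
batch = preprocess(np.random.rand(128, 128, 3))
scores = np.zeros((1, 3, 4, 4), dtype=np.float32)
scores[0, 2, 1, 1] = 5.0  # class 2 wins at pixel (1, 1)
label_map = postprocess(scores)
```

With three classes (background, pupil, iris), the resulting label map follows the same 0/1/2 convention as the ground-truth masks.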

## Analysis

In [None]:

# Load image (as BGR for later drawing the circle)
image = cv2.imread('images/hvFJF.jpg', cv2.IMREAD_COLOR)

# Convert to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Binarize to get rid of possible JPG compression artifacts
_, gray = cv2.threshold(gray, 128, 255, cv2.THRESH_BINARY)

# Downsize image (by factor 4) to speed up morphological operations
gray = cv2.resize(gray, dsize=(0, 0), fx=0.25, fy=0.25)

# Morphological Closing: Get rid of the hole
gray = cv2.morphologyEx(gray, cv2.MORPH_CLOSE, cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5)))

# Morphological opening: Get rid of the stuff at the top of the circle
gray = cv2.morphologyEx(gray, cv2.MORPH_OPEN, cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (121, 121)))

# Resize image to original size
gray = cv2.resize(gray, dsize=(image.shape[1], image.shape[0]))

# Find contours (only most external)
cnts, _ = cv2.findContours(gray, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)

# Draw found contour(s) in input image
image = cv2.drawContours(image, cnts, -1, (0, 0, 255), 2)

cv2.imwrite('images/intermediate.png', gray)
cv2.imwrite('images/result.png', image)