# [Diabetic retinopathy detection](https://www.kaggle.com/c/diabetic-retinopathy-detection/)

## [Launch this notebook in Google CoLab](https://colab.research.google.com/github/rahulremanan/HIMA/blob/master/examples/Notebooks/10_Kaggle_diabetic_retinopathy/Kaggle_diabetic_retinopathy_detection.ipynb)

## [An overview of diabetic retinopathy](https://nei.nih.gov/health/diabetic/retinopathy)
Diabetic retinopathy is a complication of diabetes that affects the retinas of the eyes. The retina is the part of the eye that detects light and converts it into nerve signals, which are then conveyed to the visual cortex in the brain via the optic nerve. Diabetic retinopathy is the most common cause of vision loss among individuals with diabetes.

It is caused by damage to the blood vessels of the light-sensitive tissue at the back of the eye (the retina), due either to fluid leaking from the retinal blood vessels or to a hemorrhage (bleeding) from these blood vessels. At first, diabetic retinopathy may cause no symptoms or only mild vision problems.

## Creating a deep neural network that detects the presence of diabetic retinopathy:

The dataset is a large set of high-resolution retina images taken under a variety of imaging conditions. A left and right field is provided for every subject. Images are labeled with a subject id as well as either left or right (e.g. 1_left.jpeg is the left eye of patient id 1).
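
The naming convention described above can be unpacked with a small helper. This is a minimal sketch: `parse_image_name` and its regular expression are hypothetical, inferred only from the `1_left.jpeg` example, and may need adjusting if the real file names differ.

```python
import re

# Hypothetical pattern inferred from the "1_left.jpeg" example in the text;
# it is an assumption, not an official specification of the file names.
_NAME_PATTERN = re.compile(r"^(\d+)_(left|right)\.jpe?g$", re.IGNORECASE)


def parse_image_name(filename):
    """Return (subject_id, eye) for a name such as '1_left.jpeg'."""
    match = _NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError("Unexpected image file name: {}".format(filename))
    return int(match.group(1)), match.group(2).lower()


print(parse_image_name("1_left.jpeg"))  # (1, 'left')
```

Pairing the left and right images of the same subject then reduces to grouping on the returned subject id.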

A clinician has rated the presence of diabetic retinopathy in each image on a scale of 0 to 4.

This staging has direct relevance to progression of the disease.

Using these images and their corresponding labels, a deep convolutional neural network can be trained to detect features of diabetic retinopathy in retinal images.

## An overview of the diabetic retinopathy staging:

**1) Mild nonproliferative retinopathy:** Small areas of balloon-like swelling in the retina’s tiny blood vessels, called microaneurysms, occur at this earliest stage of the disease. These microaneurysms may leak fluid into the retina.

**2) Moderate nonproliferative retinopathy:** As the disease progresses, blood vessels that nourish the retina may swell and distort. They may also lose their ability to transport blood. Both conditions cause characteristic changes to the appearance of the retina and may contribute to diabetic macular edema (DME).

**3) Severe nonproliferative retinopathy:** Many more blood vessels are blocked, depriving blood supply to areas of the retina. These areas secrete growth factors that signal the retina to grow new blood vessels.

**4) Proliferative diabetic retinopathy (PDR):** At this advanced stage, growth factors secreted by the retina trigger the proliferation of new blood vessels, which grow along the inside surface of the retina and into the vitreous gel, the fluid that fills the eye. The new blood vessels are fragile, which makes them more likely to leak and bleed. Accompanying scar tissue can contract and cause retinal detachment—the pulling away of the retina from underlying tissue, like wallpaper peeling away from a wall. Retinal detachment can lead to permanent vision loss.

## About the dataset:

The images in the dataset come from different models and types of cameras, which can affect the visual appearance of left vs. right. Some images are shown as one would see the retina anatomically (macula on the left, optic nerve on the right for the right eye). Others are shown as one would see through a microscope condensing lens (i.e. inverted, as one sees in a typical live eye exam). There are generally two ways to tell if an image is inverted:

* It is inverted if the macula (the small dark central area) is slightly higher than the midline through the optic nerve. If the macula is lower than the midline of the optic nerve, it is not inverted.
* If there is a notch on the side of the image (square, triangle, or circle), it is not inverted. If there is no notch, it is inverted.
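
The two rules above can be written as small predicates. This is only a sketch: the function names are hypothetical, and the landmark positions and the notch flag are assumed to be supplied by some earlier detection step that is not shown here. It also assumes image coordinates in which y grows downward, so "higher" means a smaller y value.

```python
def is_inverted_by_macula(macula_y, optic_nerve_y):
    """Rule 1: inverted if the macula lies higher than the optic nerve midline.

    Assumes y grows downward (usual image coordinates), so "higher" means a
    smaller y. Both arguments are hypothetical, pre-computed landmark rows.
    The text does not cover the case of equal heights; this sketch treats it
    as not inverted.
    """
    return macula_y < optic_nerve_y


def is_inverted_by_notch(has_notch):
    """Rule 2: a notch on the side of the image means it is not inverted."""
    return not has_notch
```

Because both rules are heuristics, it may be worth flagging images where they disagree for manual review.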

Like any real-world data set, you will encounter noise in both the images and labels. Images may contain artifacts, be out of focus, underexposed, or overexposed. A major aim of this competition is to develop robust algorithms that can function in the presence of noise and variation.
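
One simple way to surface badly exposed images before training is to look at mean brightness. The sketch below is illustrative only: `exposure_flag` is a hypothetical helper, the two thresholds are arbitrary assumptions for 8-bit grayscale values, and it does not address focus or other artifacts.

```python
def exposure_flag(pixels, dark_threshold=40, bright_threshold=215):
    """Label an image by the mean of its 8-bit grayscale pixel values.

    `pixels` is any iterable of intensities in the range 0-255. The
    thresholds are illustrative assumptions, not values from the competition.
    """
    values = list(pixels)
    if not values:
        raise ValueError("No pixel values supplied")
    mean = sum(values) / len(values)
    if mean < dark_threshold:
        return "underexposed"
    if mean > bright_threshold:
        return "overexposed"
    return "ok"
```

For a NumPy grayscale image, the pixel values can be passed as `image.ravel().tolist()` so that the sum is computed with plain Python integers.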

## Data labels:
* 0 - No DR
* 1 - Mild
* 2 - Moderate
* 3 - Severe
* 4 - Proliferative DR
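
The label scheme above maps directly onto a lookup table. The sketch below also shows a one-hot encoding of a grade; treating the grades as five independent classes is an assumption about how the labels would be fed to a classifier, and `DR_GRADES` and `one_hot` are hypothetical names.

```python
# Grade-to-name mapping copied from the label list above.
DR_GRADES = {
    0: "No DR",
    1: "Mild",
    2: "Moderate",
    3: "Severe",
    4: "Proliferative DR",
}


def one_hot(grade, num_classes=len(DR_GRADES)):
    """Encode a grade (0-4) as a one-hot list of floats."""
    if grade not in DR_GRADES:
        raise ValueError("Unknown grade: {}".format(grade))
    return [1.0 if index == grade else 0.0 for index in range(num_classes)]


print(DR_GRADES[3], one_hot(3))  # Severe [0.0, 0.0, 0.0, 1.0, 0.0]
```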

In [0]:
setup = True
fetch_raw_data = True
colab_mode = True

dataset_id = 'kaggle_adiabetic_retinopathy'

In [0]:
import os 
import sys
import subprocess
import gc

In [0]:
def execute_in_shell(command=None, 
                     verbose = False):
    """ 
        command -- keyword argument, takes a list as input
        verbose -- keyword argument, takes a boolean value as input
    
        This is a function that executes shell scripts from within python.
        
        Keyword argument 'command', should be a list of shell commands.
        Keyword argument 'verbose', should be a boolean value to set verbose level.
        
        Example usage: execute_in_shell(command = ['ls ./some/folder/',
                                                   'ls ./some/folder/ -1 | wc -l'],
                                        verbose = True ) 
                                        
        This command returns dictionary with elements: Output and Error.
        
        Output records the console output,
        Error records the console error messages.
                                        
    """
    error = []
    output = []
    
    if isinstance(command, list):
        for i in range(len(command)):
            try:
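                # Capture stderr too; communicate() waits for the process to exit,
                # so a separate wait() (which can deadlock when a pipe fills) is not needed.
                process = subprocess.Popen(command[i], shell=True,
                                           stdout=subprocess.PIPE,
                                           stderr=subprocess.PIPE)
                out, err = process.communicate()
                error.append(err)
                output.append(out)
                if verbose:
                    # A non-zero exit status does not raise, so report it explicitly
                    status = 'Success' if process.returncode == 0 else 'Non-zero exit status'
                    print ('{} running shell command: {}'.format(status, command[i]))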
            except Exception as e:
                print ('Failed running shell command: {}'.format(command[i]))
                if verbose:
                    print(type(e))
                    print(e.args)
                    print(e)
                
    else:
        print ('The argument command takes a list input ...')
    return {'Output': output, 'Error': error }

In [0]:
# mkdir -p does not fail when the directory already exists (e.g. /content/ on CoLab)
command = ['pip3 install -q kaggle PyDrive scikit-optimize >/dev/null 2>&1',
           'mkdir -p /content/',
           'mkdir -p /content/.kaggle/',
           'mkdir -p ./{}/'.format(dataset_id)]

In [0]:
if setup and colab_mode:
  execute_in_shell(command = command, 
                   verbose = True)

In [0]:
if colab_mode:
    from pydrive.auth import GoogleAuth
    from pydrive.drive import GoogleDrive
    from google.colab import auth
    from oauth2client.client import GoogleCredentials
    from googleapiclient.http import MediaIoBaseDownload
    
import io
import glob
import fnmatch
import random

from multiprocessing import Process

import os, sys, math
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from PIL import Image
import cv2
from imgaug import augmenters as iaa
from tqdm import tqdm

import warnings
warnings.filterwarnings("ignore")

In [0]:
import argparse
import os
import random
import time
import sys
import glob
try:
    import h5py
except ImportError:
    print ('Package h5py needed for saving model weights ...')
    sys.exit(1)
import json
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
try:
    import tensorflow
    import keras
except ImportError:
    print ('This code uses tensorflow deep-learning framework and keras api ...')
    print ('Install tensorflow and keras to train the classifier ...')
    sys.exit(1)
import PIL
from collections import defaultdict
from keras.applications.inception_v3 import InceptionV3,    \
                                            preprocess_input as preprocess_input_inceptionv3
from keras.applications.inception_resnet_v2 import InceptionResNetV2,    \
                                            preprocess_input as preprocess_input_inceptionv4
from keras.models import Model,                             \
                         model_from_json,                    \
                         load_model
from keras.layers import Dense,                             \
                         GlobalAveragePooling2D,            \
                         Dropout,                           \
                         BatchNormalization
from keras.layers.merge import concatenate
from keras.preprocessing.image import ImageDataGenerator
from keras.regularizers import l2
from keras.optimizers import SGD,                           \
                             RMSprop,                       \
                             Adagrad,                       \
                             Adadelta,                      \
                             Adam,                          \
                             Adamax,                        \
                             Nadam
from keras.callbacks import EarlyStopping,   \
                            ModelCheckpoint, \
                            ReduceLROnPlateau
                            
from multiprocessing import Process

## Authenticate notebook session to access Kaggle

The notebook session in CoLab authenticates to Kaggle through the Kaggle API. To get a token, open [Kaggle account settings](https://www.kaggle.com/{USERNAME}/account) and, under API, click the **"Create New API Token"** button; this downloads a kaggle.json file to your computer.

Upload that kaggle.json file to this session by executing the cell below.

In [0]:
if setup and fetch_raw_data and colab_mode:
  from google.colab import files
  uploaded = files.upload()
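
Before moving on, it can help to confirm that the uploaded file really is a Kaggle token. The sketch below assumes `uploaded` is the dictionary returned by `files.upload()`, mapping file names to their contents as bytes, and that the token is a JSON object with `username` and `key` fields. `check_kaggle_token` is a hypothetical helper, not part of the Kaggle or CoLab APIs.

```python
import json


def check_kaggle_token(uploaded, filename="kaggle.json"):
    """Validate an uploaded Kaggle token and return the username it holds.

    `uploaded` is assumed to map file names to file contents as bytes.
    """
    if filename not in uploaded:
        raise KeyError("{} was not uploaded".format(filename))
    token = json.loads(uploaded[filename].decode("utf-8"))
    # A usable token needs both fields to be present and non-empty.
    missing = [field for field in ("username", "key") if not token.get(field)]
    if missing:
        raise ValueError("Token is missing: {}".format(", ".join(missing)))
    return token["username"]
```

In the notebook this would be called as `check_kaggle_token(uploaded)` right after the upload cell.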

## Download the data

Using the Kaggle authentication token, the retinal imaging data will be downloaded, via the [Kaggle API](https://github.com/Kaggle/kaggle-api).

`kaggle competitions download -c diabetic-retinopathy-detection`

Before downloading the data, ensure that you have accepted the [Kaggle platform rules for the diabetic retinopathy detection competition](https://www.kaggle.com/c/diabetic-retinopathy-detection/rules).
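
The credential setup that precedes the download (create `~/.kaggle/`, move `kaggle.json` into it, and restrict its permissions to the owner) can also be sketched in pure Python. `install_kaggle_token` is a hypothetical helper written for illustration; the `kaggle_dir` parameter exists only so the function can be tried against a scratch directory.

```python
import os
import shutil
import stat


def install_kaggle_token(token_path="kaggle.json", kaggle_dir=None):
    """Move a Kaggle token into place and make it readable by the owner only."""
    if kaggle_dir is None:
        # Default location mirrors the ~/.kaggle/ path used by the shell commands.
        kaggle_dir = os.path.join(os.path.expanduser("~"), ".kaggle")
    os.makedirs(kaggle_dir, exist_ok=True)
    destination = os.path.join(kaggle_dir, "kaggle.json")
    shutil.move(token_path, destination)
    # Equivalent to `chmod 600`: read and write for the owner, nothing for others.
    os.chmod(destination, stat.S_IRUSR | stat.S_IWUSR)
    return destination
```

Calling `install_kaggle_token()` with no arguments would move `./kaggle.json` into the default location, after which the `kaggle competitions download` command can be run.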

In [0]:
# Use the same ~/.kaggle/ path in every step; mkdir -p tolerates an existing directory
command = ['mkdir -p ~/.kaggle/',
           'mv ./kaggle.json ~/.kaggle/',
           'chmod 600 ~/.kaggle/kaggle.json',
           'kaggle competitions download -c diabetic-retinopathy-detection']

In [0]:
if fetch_raw_data:
  execute_in_shell(command = command, verbose = True)

In [0]:
! kaggle competitions download -c diabetic-retinopathy-detection