**Overview**  
Imagine being able to detect blindness before it happened.

Millions of people suffer from diabetic retinopathy, the leading cause of blindness among working-age adults. Aravind Eye Hospital in India hopes to detect and prevent this disease among people living in rural areas where medical screening is difficult to conduct. Successful entries in this competition will improve the hospital’s ability to identify potential patients. Further, the solutions will be shared with other ophthalmologists through the 4th Asia Pacific Tele-Ophthalmology Society (APTOS) Symposium.

Currently, Aravind technicians travel to these rural areas to capture images and then rely on highly trained doctors to review the images and provide a diagnosis. Their goal is to scale their efforts through technology: to gain the ability to automatically screen images for disease and provide information on how severe the condition may be.

In this synchronous Kernels-only competition, we need to build a machine learning model to speed up disease detection. We will be working with thousands of images collected in rural areas to help identify diabetic retinopathy automatically. If successful, we will not only help to prevent lifelong blindness, but these models may also be used to detect other sorts of diseases in the future, like glaucoma and macular degeneration.  

Competition Page: [Kaggle - Blindness Detection](https://www.kaggle.com/c/aptos2019-blindness-detection/overview)

**About the Dataset**  
We will be provided with a large set of retina images taken using fundus photography under a variety of imaging conditions.

A clinician has rated each image for the severity of diabetic retinopathy on a scale of 0 to 4:  

* 0 - No DR  
* 1 - Mild  
* 2 - Moderate  
* 3 - Severe  
* 4 - Proliferative DR  

Like any real-world data set, we will encounter noise in both the images and labels. Images may contain artifacts, be out of focus, underexposed, or overexposed. The images were gathered from multiple clinics using a variety of cameras over an extended period of time, which will introduce further variation.  

**Files for Analysis & Prediction**

In [None]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

sns.set()

import os
print(os.listdir("../input"))

import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)
warnings.filterwarnings("ignore", category=UserWarning)
warnings.filterwarnings("ignore", category=FutureWarning)

**Training Sample**

In [None]:
train_labels = pd.read_csv("../input/train.csv")
print(train_labels.head())

print("There are {0} samples in the Training dataset".format(train_labels.shape[0]))

**Testing Dataset**

In [None]:
test_labels = pd.read_csv("../input/test.csv")
print("We need to predict the severity of diabetic retinopathy for {0} test samples".format(test_labels.shape[0]))

In [None]:
from pandasql import sqldf
pysqldf = lambda q: sqldf(q, globals())

diag_q = """
SELECT diagnosis, COUNT(DISTINCT id_code) AS cnt
FROM train_labels
GROUP BY diagnosis;
"""

diag_df = pysqldf(diag_q)

# plotly.plotly moved to the separate chart_studio package in plotly 4 and is
# not needed here, since the chart is rendered in offline mode
import plotly.graph_objs as go
from plotly.offline import init_notebook_mode, iplot
init_notebook_mode(connected=True)

fig = {
  "data": [
    {
      "values": diag_df.cnt,
      "labels": diag_df.diagnosis,
      "domain": {"x": [0, .5]},
      "hoverinfo":"label+percent",
      "hole": .2,
      "type": "pie"
    },],
 "layout": {
        "title":"Severity Proportion of Diabetic Retinopathy"
    }
}

iplot(fig)

Severity Proportion of Diabetic Retinopathy Summary
* 0 - No DR - 49.3%
* 2 - Moderate - 27.3%
* 1 - Mild - 10.1%
* 4 - Proliferative DR - 8.06%
* 3 - Severe - 5.27%
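
The same counts and percentages can also be computed directly in pandas, without the SQL step. The following is a minimal sketch, not the notebook's own method: the helper name `severity_proportions` and the toy labels are hypothetical, and it assumes each `id_code` appears only once, so counting rows matches `COUNT(DISTINCT id_code)`.

```python
import pandas as pd

# Severity names taken from the 0-4 rating scale described earlier.
SEVERITY_NAMES = {0: "No DR", 1: "Mild", 2: "Moderate", 3: "Severe", 4: "Proliferative DR"}


def severity_proportions(labels: pd.DataFrame) -> pd.DataFrame:
    """Return the count and percentage of samples for each diagnosis grade."""
    counts = labels["diagnosis"].value_counts().sort_index()
    return pd.DataFrame(
        {
            "severity": [SEVERITY_NAMES[grade] for grade in counts.index],
            "cnt": counts.values,
            "pct": (100 * counts / counts.sum()).round(2).values,
        },
        index=counts.index,
    )


# Toy labels (hypothetical), standing in for train.csv.
toy_labels = pd.DataFrame(
    {
        "id_code": ["a", "b", "c", "d", "e", "f", "g", "h"],
        "diagnosis": [0, 0, 0, 0, 2, 2, 1, 4],
    }
)
summary = severity_proportions(toy_labels)
print(summary)
```

On the real data, `severity_proportions(train_labels)` would take the place of the toy call.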

**Displaying sample original image without resizing**

In [None]:
from IPython.display import Image
from IPython.display import display
im_0 = Image(filename ='../input/train_images/002c21358ce6.png') 
im_1 = Image(filename ='../input/train_images/0024cdab0c1e.png')
im_2 = Image(filename ='../input/train_images/000c1434d8d7.png')
im_3 = Image(filename ='../input/train_images/0104b032c141.png')
im_4 = Image(filename ='../input/train_images/02685f13cefd.png')
display(im_0, im_1, im_2, im_3, im_4)

**Dividing the dataset based on Severity of Diabetic Retinopathy**

In [None]:
DATA_PATH = '../input/'
TRAIN_IMG_PATH = os.path.join(DATA_PATH, 'train_images')
TEST_IMG_PATH = os.path.join(DATA_PATH, 'test_images')
TRAIN_LABEL_PATH = os.path.join(DATA_PATH, 'train.csv')
TEST_LABEL_PATH = os.path.join(DATA_PATH, 'test.csv')

train_df = pd.read_csv(TRAIN_LABEL_PATH)
test_df = pd.read_csv(TEST_LABEL_PATH)

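# One DataFrame per severity grade; drop=True discards the old row index
# instead of keeping it as an extra 'index' column
train_labels_0 = train_df[train_df.diagnosis == 0].reset_index(drop=True)
train_labels_1 = train_df[train_df.diagnosis == 1].reset_index(drop=True)
train_labels_2 = train_df[train_df.diagnosis == 2].reset_index(drop=True)
train_labels_3 = train_df[train_df.diagnosis == 3].reset_index(drop=True)
train_labels_4 = train_df[train_df.diagnosis == 4].reset_index(drop=True)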
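
The five per-grade DataFrames above could also be built in a single pass with `groupby`. This is only a sketch under the assumption that the labels have a `diagnosis` column as in `train.csv`; the helper name `split_by_severity` and the toy data are hypothetical.

```python
import pandas as pd


def split_by_severity(labels: pd.DataFrame) -> dict:
    """Map each diagnosis grade to its own DataFrame with a fresh 0..n-1 index."""
    return {
        int(grade): group.reset_index(drop=True)
        for grade, group in labels.groupby("diagnosis")
    }


# Toy labels (hypothetical), standing in for train.csv.
toy = pd.DataFrame({"id_code": ["a", "b", "c", "d"], "diagnosis": [0, 2, 0, 4]})
by_grade = split_by_severity(toy)
print(by_grade[0]["id_code"].tolist())  # ['a', 'c']
```

With the real labels, `split_by_severity(train_df)[3]` would hold the same rows as `train_labels_3`. Grades that do not occur in the data simply have no key.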

In [None]:
%matplotlib inline
from PIL import Image
import matplotlib.pyplot as plt
f,ax = plt.subplots(1,5, figsize=(15,15))
plt.rcParams["axes.grid"] = False

#0 - No DR
for i in range(5):
    img_path = TRAIN_IMG_PATH+'/'+train_labels_0['id_code'][i]+'.png'
    img = Image.open(img_path)
    img.thumbnail((200,200))
    ax[i].imshow(img)
    ax[i].set_axis_off() 
print("Diabetic Retinopathy of Severity 0 - No DR")
plt.show()

In [None]:
from PIL import Image
f,ax = plt.subplots(1,5, figsize=(15,15))
plt.rcParams["axes.grid"] = False
#1 - Mild DR
for i in range(5):
    img_path = TRAIN_IMG_PATH+'/'+train_labels_1['id_code'][i]+'.png'
    img = Image.open(img_path)
    img.thumbnail((200,200))
    ax[i].imshow(img)
    ax[i].set_axis_off() 
print("Diabetic Retinopathy of Severity 1 - Mild DR")
plt.show()

In [None]:
from PIL import Image
f,ax = plt.subplots(1,5, figsize=(15,15))
plt.rcParams["axes.grid"] = False

#2 - Moderate DR
for i in range(5):
    img_path = TRAIN_IMG_PATH+'/'+train_labels_2['id_code'][i]+'.png'
    img = Image.open(img_path)
    img.thumbnail((200,200))
    ax[i].imshow(img)
    ax[i].set_axis_off() 
print("Diabetic Retinopathy of Severity 2 - Moderate DR")
plt.show()

In [None]:
from PIL import Image
f,ax = plt.subplots(1,5, figsize=(15,15))
plt.rcParams["axes.grid"] = False

#3 - Severe DR
for i in range(5):
    img_path = TRAIN_IMG_PATH+'/'+train_labels_3['id_code'][i]+'.png'
    img = Image.open(img_path)
    img.thumbnail((200,200))
    ax[i].imshow(img)
    ax[i].set_axis_off() 
print("Diabetic Retinopathy of Severity 3 - Severe DR")
plt.show()

In [None]:
from PIL import Image
f,ax = plt.subplots(1,5, figsize=(15,15))
plt.rcParams["axes.grid"] = False

#4 - Proliferative DR
for i in range(5):
    img_path = TRAIN_IMG_PATH+'/'+train_labels_4['id_code'][i]+'.png'
    img = Image.open(img_path)
    img.thumbnail((200,200))
    # Title each subplot with its image id (plt.title would only affect the current axes)
    ax[i].set_title(train_labels_4['id_code'][i])
    ax[i].imshow(img)
    ax[i].set_axis_off() 
print("Diabetic Retinopathy of Severity 4 - Proliferative DR")
plt.show()

**Applying the 'jet' colormap to grayscale Diabetic Retinopathy images**

In [None]:
import cv2
f,ax = plt.subplots(1,5, figsize=(15,15))
plt.rcParams["axes.grid"] = False

#0 - No DR
for i in range(5):
    # Flag 0 reads the image as single-channel grayscale so a colormap can be applied
    img = cv2.imread(TRAIN_IMG_PATH+'/'+train_labels_0['id_code'][i]+'.png',0)
    ax[i].imshow(img, cmap="jet")
    ax[i].set_axis_off() 
print("Diabetic Retinopathy of Severity 0 - No DR")
plt.show()

In [None]:
import cv2
f,ax = plt.subplots(1,5, figsize=(15,15))
plt.rcParams["axes.grid"] = False

#1 - Mild DR
for i in range(5):
    # Flag 0 reads the image as single-channel grayscale so a colormap can be applied
    img = cv2.imread(TRAIN_IMG_PATH+'/'+train_labels_1['id_code'][i]+'.png',0)
    ax[i].imshow(img, cmap="jet")
    ax[i].set_axis_off() 
print("Diabetic Retinopathy of Severity 1 - Mild DR")
plt.show()

In [None]:
import cv2
f,ax = plt.subplots(1,5, figsize=(15,15))
plt.rcParams["axes.grid"] = False

#2 - Moderate DR
for i in range(5):
    # Flag 0 reads the image as single-channel grayscale so a colormap can be applied
    img = cv2.imread(TRAIN_IMG_PATH+'/'+train_labels_2['id_code'][i]+'.png',0)
    ax[i].imshow(img, cmap="jet")
    ax[i].set_axis_off() 
print("Diabetic Retinopathy of Severity 2 - Moderate DR")
plt.show()

In [None]:
import cv2
f,ax = plt.subplots(1,5, figsize=(15,15))
plt.rcParams["axes.grid"] = False

#3 - Severe DR
for i in range(5):
    # Flag 0 reads the image as single-channel grayscale so a colormap can be applied
    img = cv2.imread(TRAIN_IMG_PATH+'/'+train_labels_3['id_code'][i]+'.png',0)
    ax[i].imshow(img, cmap="jet")
    ax[i].set_axis_off() 
print("Diabetic Retinopathy of Severity 3 - Severe DR")
plt.show()

In [None]:
import cv2
f,ax = plt.subplots(1,5, figsize=(15,15))
plt.rcParams["axes.grid"] = False

#4 - Proliferative DR
for i in range(5):
    # Flag 0 reads the image as single-channel grayscale so a colormap can be applied
    img = cv2.imread(TRAIN_IMG_PATH+'/'+train_labels_4['id_code'][i]+'.png',0)
    ax[i].imshow(img, cmap="jet")
    ax[i].set_axis_off()  
print("Diabetic Retinopathy of Severity 4 - Proliferative DR")
plt.show()

**Applying the 'PiYG' colormap to grayscale Diabetic Retinopathy images**

In [None]:
import cv2
f,ax = plt.subplots(1,5, figsize=(15,15))
plt.rcParams["axes.grid"] = False

#0 - No DR
for i in range(5):
    # Flag 0 reads the image as single-channel grayscale so a colormap can be applied
    img = cv2.imread(TRAIN_IMG_PATH+'/'+train_labels_0['id_code'][i]+'.png',0)
    ax[i].imshow(img, cmap="PiYG")
    ax[i].set_axis_off() 
print("Diabetic Retinopathy of Severity 0 - No DR")
plt.show()


In [None]:
import cv2
f,ax = plt.subplots(1,5, figsize=(15,15))
plt.rcParams["axes.grid"] = False

#1 - Mild DR
for i in range(5):
    # Flag 0 reads the image as single-channel grayscale so a colormap can be applied
    img = cv2.imread(TRAIN_IMG_PATH+'/'+train_labels_1['id_code'][i]+'.png',0)
    ax[i].imshow(img, cmap="PiYG")
    ax[i].set_axis_off() 
print("Diabetic Retinopathy of Severity 1 - Mild DR")
plt.show()

In [None]:
import cv2
f,ax = plt.subplots(1,5, figsize=(15,15))
plt.rcParams["axes.grid"] = False

#2 - Moderate DR
for i in range(5):
    # Flag 0 reads the image as single-channel grayscale so a colormap can be applied
    img = cv2.imread(TRAIN_IMG_PATH+'/'+train_labels_2['id_code'][i]+'.png',0)
    ax[i].imshow(img, cmap="PiYG")
    ax[i].set_axis_off() 
print("Diabetic Retinopathy of Severity 2 - Moderate DR")
plt.show()

In [None]:
import cv2
f,ax = plt.subplots(1,5, figsize=(15,15))
plt.rcParams["axes.grid"] = False

#3 - Severe DR
for i in range(5):
    # Flag 0 reads the image as single-channel grayscale so a colormap can be applied
    img = cv2.imread(TRAIN_IMG_PATH+'/'+train_labels_3['id_code'][i]+'.png',0)
    ax[i].imshow(img, cmap="PiYG")
    ax[i].set_axis_off() 
print("Diabetic Retinopathy of Severity 3 - Severe DR")
plt.show()

In [None]:
import cv2
f,ax = plt.subplots(1,5, figsize=(15,15))
plt.rcParams["axes.grid"] = False

#4 - Proliferative DR
for i in range(5):
    # Flag 0 reads the image as single-channel grayscale so a colormap can be applied
    img = cv2.imread(TRAIN_IMG_PATH+'/'+train_labels_4['id_code'][i]+'.png',0)
    ax[i].imshow(img, cmap="PiYG")
    ax[i].set_axis_off()  
print("Diabetic Retinopathy of Severity 4 - Proliferative DR")
plt.show()

**Applying the 'gray' colormap to grayscale Diabetic Retinopathy images**

In [None]:
import cv2
f,ax = plt.subplots(1,5, figsize=(15,15))
plt.rcParams["axes.grid"] = False

#0 - No DR
for i in range(5):
    # Flag 0 reads the image as single-channel grayscale so a colormap can be applied
    img = cv2.imread(TRAIN_IMG_PATH+'/'+train_labels_0['id_code'][i]+'.png',0)
    ax[i].imshow(img, cmap="gray")
    ax[i].set_axis_off() 
print("Diabetic Retinopathy of Severity 0 - No DR")
plt.show()


In [None]:
import cv2
f,ax = plt.subplots(1,5, figsize=(15,15))
plt.rcParams["axes.grid"] = False

#1 - Mild DR
for i in range(5):
    # Flag 0 reads the image as single-channel grayscale so a colormap can be applied
    img = cv2.imread(TRAIN_IMG_PATH+'/'+train_labels_1['id_code'][i]+'.png',0)
    ax[i].imshow(img, cmap="gray")
    ax[i].set_axis_off() 
print("Diabetic Retinopathy of Severity 1 - Mild DR")
plt.show()

In [None]:
import cv2
f,ax = plt.subplots(1,5, figsize=(15,15))
plt.rcParams["axes.grid"] = False

#2 - Moderate DR
for i in range(5):
    # Flag 0 reads the image as single-channel grayscale so a colormap can be applied
    img = cv2.imread(TRAIN_IMG_PATH+'/'+train_labels_2['id_code'][i]+'.png',0)
    ax[i].imshow(img, cmap="gray")
    ax[i].set_axis_off() 
print("Diabetic Retinopathy of Severity 2 - Moderate DR")
plt.show()

In [None]:
import cv2
f,ax = plt.subplots(1,5, figsize=(15,15))
plt.rcParams["axes.grid"] = False

#3 - Severe DR
for i in range(5):
    # Flag 0 reads the image as single-channel grayscale so a colormap can be applied
    img = cv2.imread(TRAIN_IMG_PATH+'/'+train_labels_3['id_code'][i]+'.png',0)
    ax[i].imshow(img, cmap="gray")
    ax[i].set_axis_off() 
print("Diabetic Retinopathy of Severity 3 - Severe DR")
plt.show()

In [None]:
import cv2
f,ax = plt.subplots(1,5, figsize=(15,15))
plt.rcParams["axes.grid"] = False

#4 - Proliferative DR
for i in range(5):
    # Flag 0 reads the image as single-channel grayscale so a colormap can be applied
    img = cv2.imread(TRAIN_IMG_PATH+'/'+train_labels_4['id_code'][i]+'.png',0)
    ax[i].imshow(img, cmap="gray")
    ax[i].set_axis_off()  
print("Diabetic Retinopathy of Severity 4 - Proliferative DR")
plt.show()
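
The colormap cells above differ only in the severity grade and the colormap name, so they could be collapsed into one helper. The sketch below is an illustration rather than the notebook's own code: the function name `show_with_cmap` is hypothetical, and random arrays stand in for the grayscale images so that the block runs without the competition files.

```python
import numpy as np
import matplotlib.pyplot as plt


def show_with_cmap(images, cmap="gray", title=None):
    """Draw each 2-D grayscale array in its own subplot using the given colormap."""
    fig, axes = plt.subplots(
        1, len(images), figsize=(3 * len(images), 3), squeeze=False
    )
    for ax, img in zip(axes[0], images):
        ax.imshow(img, cmap=cmap)  # draw on this subplot only
        ax.set_axis_off()
    if title:
        fig.suptitle(title)
    return fig


# Synthetic stand-ins (hypothetical) for five grayscale fundus images.
rng = np.random.default_rng(0)
fake_images = [rng.integers(0, 256, size=(32, 32), dtype=np.uint8) for _ in range(5)]
fig = show_with_cmap(fake_images, cmap="jet", title="Severity 0 - No DR ('jet')")
```

In the notebook, the list of images would come from `cv2.imread(path, 0)` applied to the first five `id_code` values of the chosen severity DataFrame, and the helper would be called once per colormap.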