<a href="https://colab.research.google.com/github/Vengadore/Notebooks/blob/master/Training_DiabeticRetinopathy_Dataset_on_Efficientnet.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Diabetic Retinopathy Detection

Kaggle has a large competition of Diabetic Retinopathy detection which can be found here:
https://www.kaggle.com/c/diabetic-retinopathy-detection/

Their dataset consists in 35126 images labeled from 0 to 4 according to the degree of Retinopathy.
An analysis of the data is provided in this notebook.

In [1]:
!rm -rf sample_data
!nvidia-smi

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.



In [2]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
!cp -R /content/drive/My\ Drive/PDR/ ./

### Install dependencies

In [3]:
from IPython.display import clear_output

!pip install torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
!pip install efficientnet_pytorch
clear_output(wait=False)

### Load annotations

In [1]:
import pandas as pd
import os

data = pd.read_csv("/content/drive/My Drive/PDR/trainLabels.csv")
# Append path
data['image'] = data['image'].apply(lambda x : os.path.join("/content/drive/My Drive/PDR/train",x+".jpeg"))
data.head()

Unnamed: 0,image,level
0,/content/drive/My Drive/PDR/train/10_left.jpeg,0
1,/content/drive/My Drive/PDR/train/10_right.jpeg,0
2,/content/drive/My Drive/PDR/train/13_left.jpeg,0
3,/content/drive/My Drive/PDR/train/13_right.jpeg,0
4,/content/drive/My Drive/PDR/train/15_left.jpeg,1


In [2]:
data.groupby('level').count()

Unnamed: 0_level_0,image
level,Unnamed: 1_level_1
0,25810
1,2443
2,5292
3,873
4,708


#### Split data

In [3]:
from sklearn.model_selection import train_test_split

#Split data
X_train, X_test, y_train, y_test = train_test_split(data['image'], data['level'], test_size=0.22, random_state=42)

## Definition of the model

In [4]:
from efficientnet_pytorch import EfficientNet
import torch

model = EfficientNet.from_pretrained('efficientnet-b0')
## Change efficientnet final layer
model._fc = torch.nn.Linear(in_features=1280,out_features=5,bias = True)

Loaded pretrained weights for efficientnet-b0


In [5]:
from torchvision.transforms import Resize,ToTensor,Compose,Normalize
from torchvision.transforms import RandomHorizontalFlip,RandomVerticalFlip,RandomRotation
from PIL import Image

transforms = Compose([RandomHorizontalFlip(),RandomVerticalFlip(),RandomRotation(15)]) # Transformations for the training images

composed = Compose([Resize((224,312)), # Resize to a fit size for efficientnet
                    ToTensor(),  # Convert into sensor
                    Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]) # Normalize image

### Training parameters

In [6]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device);
torch.manual_seed(17)
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.0001)

In [7]:
classes = {0:[0,0,0,0,1],
           1:[0,0,0,1,0],
           2:[0,0,1,0,0],
           3:[0,1,0,0,0],
           4:[1,0,0,0,0]}

In [8]:
from tqdm import tqdm
import random

epochs = 10
batch_size = 2

for epoch in range(epochs):

    indexes = [idx for idx in range(len(X_train))]
    pbar = tqdm( range(len(X_train)//batch_size),ncols = 100)
    running_loss = 0.0
    running_acc = 0.0
    t = 0

    for step in pbar:
        # Load data
        idx = random.sample(indexes,batch_size)
        X = X_train.iloc[idx]
        y = y_train.iloc[idx]

        # Remove indexes
        [indexes.remove(i) for i in idx]

        # Load images
        images = [Image.open(File) for File in X]
        # Load y_true
        y_true = torch.LongTensor([c for c in y])
        
        # Convert images to tensor
        x_batch = torch.FloatTensor().to(device)
        for image in images:
            P = transforms(image)
            P = composed(P).unsqueeze(0).to(device)
            x_batch = torch.cat((x_batch,P))

        # zero the parameter gradients
        optimizer.zero_grad()
        # forward + backward + optimize
        outputs = model(x_batch)
        loss = criterion(outputs, y_true)
        loss.backward()
        optimizer.step()
        # print statistics
        running_loss += loss.item()
        t += batch_size

        _, preds = torch.max(outputs, 1)
        running_acc += torch.sum(preds == y_true).cpu().detach().numpy()
        acc = torch.sum(preds == y_true).cpu().detach().numpy()/batch_size;
        pbar.set_description("Epoch: {} Accuracy: {:0.5f} Loss: {:0.5f} ".format(epoch+1,running_acc/t,running_loss/t))
    #Validation
    val_acc = 0.0
    val_loss = 0.0
    t = 0
    for point in range(len(X_test)//batch_size):
        with torch.no_grad():

            X = X_test.iloc[point*batch_size:(point+1)*batch_size]
            y = y_test.iloc[point*batch_size:(point+1)*batch_size]


            # Load images
            images = [Image.open(File) for File in X]
            # Load y_true
            y_true = torch.FloatTensor([classes[c] for c in y]).to(device)
            
            # Convert images to tensor
            x_batch = torch.FloatTensor().to(device)
            for image in images:
                P = composed(image).unsqueeze(0).to(device)
                x_batch = torch.cat((x_batch,P))

            
            outputs = Model(x_batch)
            loss = criterion(outputs, y_batch)
            val_loss += loss.item()
            t += batch_size

            val_acc += torch.sum(y_batch.gt(0.5) == outputs.gt(0.5)).cpu().detach().numpy()
    print("\n Validation -- Accuracy: {:0.5f} Loss: {:0.5f} ".format(val_acc/t,val_loss/t))

Epoch: 1 Accuracy: 0.62963 Loss: 0.69403 :   0%|              | 27/13699 [01:24<11:09:45,  2.94s/it]

KeyboardInterrupt: ignored