<a href="https://colab.research.google.com/github/Vengadore/Notebooks/blob/master/Training_DiabeticRetinopathy_Dataset_on_Efficientnet.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Diabetic Retinopathy Detection

Kaggle has a large competition of Diabetic Retinopathy detection which can be found here:
https://www.kaggle.com/c/diabetic-retinopathy-detection/

Their dataset consists in 35126 images labeled from 0 to 4 according to the degree of Retinopathy.
An analysis of the data is provided in this notebook.

In [1]:
!rm -rf sample_data
!nvidia-smi

Sat Nov  7 06:27:47 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.32.00    Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   52C    P8    10W /  70W |      0MiB / 15079MiB |      0%      Default |
|                               |                      |                 ERR! |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

In [2]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [1]:
!cp -R -v /content/drive/My\ Drive/PDR/ ./

'/content/drive/My Drive/PDR/trainLabels.csv' -> './PDR/trainLabels.csv'
'/content/drive/My Drive/PDR/train/11909_left.jpeg' -> './PDR/train/11909_left.jpeg'
'/content/drive/My Drive/PDR/train/11909_right.jpeg' -> './PDR/train/11909_right.jpeg'
'/content/drive/My Drive/PDR/train/11910_left.jpeg' -> './PDR/train/11910_left.jpeg'
'/content/drive/My Drive/PDR/train/11913_left.jpeg' -> './PDR/train/11913_left.jpeg'
'/content/drive/My Drive/PDR/train/11910_right.jpeg' -> './PDR/train/11910_right.jpeg'
'/content/drive/My Drive/PDR/train/11913_right.jpeg' -> './PDR/train/11913_right.jpeg'
'/content/drive/My Drive/PDR/train/11914_left.jpeg' -> './PDR/train/11914_left.jpeg'
'/content/drive/My Drive/PDR/train/11914_right.jpeg' -> './PDR/train/11914_right.jpeg'
'/content/drive/My Drive/PDR/train/11915_left.jpeg' -> './PDR/train/11915_left.jpeg'
'/content/drive/My Drive/PDR/train/11915_right.jpeg' -> './PDR/train/11915_right.jpeg'
'/content/drive/My Drive/PDR/train/11916_left.jpeg' -> './PDR/train

### Install dependencies

In [4]:
from IPython.display import clear_output

!pip install torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
!pip install efficientnet_pytorch
clear_output(wait=False)

### Load annotations

In [1]:
import pandas as pd
import os

data = pd.read_csv("/content/drive/My Drive/PDR/trainLabels.csv")
# Append path
data['image'] = data['image'].apply(lambda x : os.path.join("/content/drive/My Drive/PDR/train",x+".jpeg"))
data.head()

Unnamed: 0,image,level
0,/content/drive/My Drive/PDR/train/10_left.jpeg,0
1,/content/drive/My Drive/PDR/train/10_right.jpeg,0
2,/content/drive/My Drive/PDR/train/13_left.jpeg,0
3,/content/drive/My Drive/PDR/train/13_right.jpeg,0
4,/content/drive/My Drive/PDR/train/15_left.jpeg,1


In [2]:
data.groupby('level').count()

Unnamed: 0_level_0,image
level,Unnamed: 1_level_1
0,25810
1,2443
2,5292
3,873
4,708


In [3]:
## Sample dataframes to make them even
level0 = data[data['level']==0].sample(708,random_state=42)
level1 = data[data['level']==1].sample(708,random_state=42)
level2 = data[data['level']==2].sample(708,random_state=42)
level3 = data[data['level']==3].sample(708,random_state=42)
level4 = data[data['level']==4].sample(708,random_state=42)

In [4]:
new_data = pd.concat((level0,level1,level2,level3,level4),axis=0).reset_index()
new_data.groupby('level').count()

Unnamed: 0_level_0,index,image
level,Unnamed: 1_level_1,Unnamed: 2_level_1
0,708,708
1,708,708
2,708,708
3,708,708
4,708,708


In [5]:
data = new_data.copy()
data.head()

Unnamed: 0,index,image,level
0,13639,/content/drive/My Drive/PDR/train/17123_right....,0
1,10003,/content/drive/My Drive/PDR/train/12616_right....,0
2,5196,/content/drive/My Drive/PDR/train/6541_left.jpeg,0
3,11487,/content/drive/My Drive/PDR/train/14418_right....,0
4,31342,/content/drive/My Drive/PDR/train/39598_left.jpeg,0


#### Split data

In [6]:
from sklearn.model_selection import train_test_split

#Split data
X_train, X_test, y_train, y_test = train_test_split(data['image'], data['level'], test_size=0.22, random_state=42)

## Definition of the model

In [7]:
from efficientnet_pytorch import EfficientNet
import torch

model = EfficientNet.from_pretrained('efficientnet-b0')
## Change efficientnet final layer
model._fc = torch.nn.Linear(in_features=1280,out_features=5,bias = True)

Downloading: "https://github.com/lukemelas/EfficientNet-PyTorch/releases/download/1.0/efficientnet-b0-355c32eb.pth" to /root/.cache/torch/hub/checkpoints/efficientnet-b0-355c32eb.pth


HBox(children=(FloatProgress(value=0.0, max=21388428.0), HTML(value='')))


Loaded pretrained weights for efficientnet-b0


In [8]:
from torchvision.transforms import Resize,ToTensor,Compose,Normalize
from torchvision.transforms import RandomHorizontalFlip,RandomVerticalFlip,RandomRotation
from PIL import Image

transforms = Compose([RandomHorizontalFlip(),RandomVerticalFlip(),RandomRotation(15)]) # Transformations for the training images

composed = Compose([Resize((224,312)), # Resize to a fit size for efficientnet
                    ToTensor(),  # Convert into sensor
                    Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]) # Normalize image

### Training parameters

In [9]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device);
torch.manual_seed(17)
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.0001)

In [10]:
try:
    model = torch.load('checkpoint.ph').to(device)
except:
    print("No Checkpoint loaded")

No Checkpoint loaded


In [11]:
classes = {0:[0,0,0,0,1],
           1:[0,0,0,1,0],
           2:[0,0,1,0,0],
           3:[0,1,0,0,0],
           4:[1,0,0,0,0]}

In [12]:
from tqdm import tqdm
import random

epochs = 10
batch_size = 64

for epoch in range(epochs):

    indexes = [idx for idx in range(len(X_train))]
    pbar = tqdm( range(len(X_train)//batch_size),ncols = 100)
    running_loss = 0.0
    running_acc = 0.0
    t = 0

    for step in pbar:
        # Load data
        idx = random.sample(indexes,batch_size)
        X = X_train.iloc[idx]
        y = y_train.iloc[idx]

        # Remove indexes
        [indexes.remove(i) for i in idx]

        # Load images
        try:
            images = [Image.open(File) for File in X]
        except:
            continue
        # Load y_true
        y_true = torch.LongTensor([c for c in y]).to(device)
        
        # Convert images to tensor
        x_batch = torch.FloatTensor().to(device)
        for image in images:
            P = transforms(image)
            P = composed(P).unsqueeze(0).to(device)
            x_batch = torch.cat((x_batch,P))

        # zero the parameter gradients
        optimizer.zero_grad()
        # forward + backward + optimize
        outputs = model(x_batch)
        loss = criterion(outputs, y_true)
        loss.backward()
        optimizer.step()
        # print statistics
        running_loss += loss.item()
        t += batch_size

        _, preds = torch.max(outputs, 1)
        running_acc += torch.sum(preds == y_true).cpu().detach().numpy()
        acc = torch.sum(preds == y_true).cpu().detach().numpy()/batch_size;
        pbar.set_description("Epoch: {} Accuracy: {:0.5f} Loss: {:0.5f} ".format(epoch+1,running_acc/t,loss.item()))
    #Validation
    val_acc = 0.0
    val_loss = 0.0
    t = 0
    for point in range(len(X_test)//batch_size):
        with torch.no_grad():

            X = X_test.iloc[point*batch_size:(point+1)*batch_size]
            y = y_test.iloc[point*batch_size:(point+1)*batch_size]


            # Load images
            try:
                images = [Image.open(File) for File in X]
            except:
                continue
            # Load y_true
            y_true = torch.LongTensor([c for c in y]).to(device)
            
            # Convert images to tensor
            x_batch = torch.FloatTensor().to(device)
            for image in images:
                P = composed(image).unsqueeze(0).to(device)
                x_batch = torch.cat((x_batch,P))

            
            outputs = model(x_batch)
            loss = criterion(outputs, y_true)
            val_loss += loss.item()
            t += batch_size
            _, preds = torch.max(outputs, 1)
            val_acc += torch.sum(preds == y_true).cpu().detach().numpy()
    print("\n Validation -- Accuracy: {:0.5f} Loss: {:0.5f} ".format(val_acc/t,loss.item()))
    try:
        torch.save(model,"/content/drive/My Drive/PDR/checkpoint{}.ph".format(epoch))
    except:
        continue

Epoch: 1 Accuracy: 0.33576 Loss: 1.41804 : 100%|████████████████████| 43/43 [23:01<00:00, 32.13s/it]
  0%|                                                                        | 0/43 [00:00<?, ?it/s]


 Validation -- Accuracy: 0.40755 Loss: 1.36848 


Epoch: 2 Accuracy: 0.44150 Loss: 1.11833 : 100%|████████████████████| 43/43 [10:57<00:00, 15.29s/it]
  0%|                                                                        | 0/43 [00:00<?, ?it/s]


 Validation -- Accuracy: 0.47526 Loss: 1.14137 


Epoch: 3 Accuracy: 0.50109 Loss: 1.19212 : 100%|████████████████████| 43/43 [10:54<00:00, 15.23s/it]
  0%|                                                                        | 0/43 [00:00<?, ?it/s]


 Validation -- Accuracy: 0.47656 Loss: 1.08872 


Epoch: 4 Accuracy: 0.52725 Loss: 1.07781 : 100%|████████████████████| 43/43 [10:55<00:00, 15.24s/it]



 Validation -- Accuracy: 0.50130 Loss: 0.98855 


Epoch: 5 Accuracy: 0.56650 Loss: 0.92028 : 100%|████████████████████| 43/43 [10:53<00:00, 15.21s/it]
  0%|                                                                        | 0/43 [00:00<?, ?it/s]


 Validation -- Accuracy: 0.50521 Loss: 0.96066 


Epoch: 6 Accuracy: 0.58685 Loss: 1.08278 : 100%|████████████████████| 43/43 [11:10<00:00, 15.59s/it]
  0%|                                                                        | 0/43 [00:00<?, ?it/s]


 Validation -- Accuracy: 0.51562 Loss: 0.94988 


Epoch: 7 Accuracy: 0.59230 Loss: 0.99861 : 100%|████████████████████| 43/43 [11:06<00:00, 15.50s/it]
  0%|                                                                        | 0/43 [00:00<?, ?it/s]


 Validation -- Accuracy: 0.50911 Loss: 0.95398 


Epoch: 8 Accuracy: 0.63917 Loss: 0.92362 : 100%|████████████████████| 43/43 [11:06<00:00, 15.50s/it]
  0%|                                                                        | 0/43 [00:00<?, ?it/s]


 Validation -- Accuracy: 0.52474 Loss: 0.97034 


Epoch: 9 Accuracy: 0.65262 Loss: 0.86281 : 100%|████████████████████| 43/43 [11:09<00:00, 15.56s/it]
  0%|                                                                        | 0/43 [00:00<?, ?it/s]


 Validation -- Accuracy: 0.52083 Loss: 1.01130 


Epoch: 10 Accuracy: 0.68568 Loss: 0.71870 : 100%|███████████████████| 43/43 [11:08<00:00, 15.55s/it]



 Validation -- Accuracy: 0.52083 Loss: 1.03131 


In [10]:
torch.save(model,"checkpoint.ph")