# Kaggle learnings for the Diabetic Retinopathy competition

**Were you surprised by any of your findings?**

I was surprised by a couple of things. First, that increasing the scale of the images beyond radius=270 pixels did not seem to help. I was expecting the existence of very small features, only visible at higher resolutions, to tip the balance in favor of larger images. Perhaps the increase in processing times for larger images was too great.

I was also surprised by the fact that ensembling (taking multiple views of each image, and combining the results of different networks) did very little to improve accuracy. This is rather different to the case of normal photographs, where ensembling can make a huge difference.
- **from the old competition winner's interview [here](http://blog.kaggle.com/2015/09/09/diabetic-retinopathy-winners-interview-1st-place-ben-graham/)**

**What preprocessing and supervised learning methods did you use?**

For preprocessing, I first scaled the images to a given radius. I then subtracted local average color to reduce differences in lighting.
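The two steps quoted above can be sketched in plain NumPy. This is only an illustration under stated assumptions, not the winner's actual code: the radius heuristic (counting bright columns in the middle row), the box filter used as the "local average", and the `gain` and `offset` values are all choices made here for the example.

```python
import numpy as np

def scale_factor_for_radius(img, target_radius):
    """Estimate the eye's radius from the middle row and return the resize factor."""
    middle_row = img[img.shape[0] // 2].sum(axis=1)            # brightness per column
    radius = (middle_row > middle_row.mean() / 10).sum() / 2   # bright columns ~ eye diameter
    return target_radius / radius

def box_blur(img, k):
    """Mean over each pixel's (2k+1) x (2k+1) neighbourhood, per channel, edge-padded."""
    padded = np.pad(img.astype(np.float64), ((k, k), (k, k), (0, 0)), mode="edge")
    # Integral image with a leading row and column of zeros.
    integral = np.pad(padded.cumsum(axis=0).cumsum(axis=1), ((1, 0), (1, 0), (0, 0)))
    size = 2 * k + 1
    h, w = img.shape[:2]
    total = (integral[size:size + h, size:size + w]
             - integral[:h, size:size + w]
             - integral[size:size + h, :w]
             + integral[:h, :w])
    return total / (size * size)

def subtract_local_average(img, k, gain=4.0, offset=128.0):
    """Remove slow lighting variation: gain * (img - local mean) + offset, as uint8."""
    out = gain * (img.astype(np.float64) - box_blur(img, k)) + offset
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)

# Synthetic stand-in for a fundus photo: a bright band on a black background.
image = np.zeros((100, 200, 3), dtype=np.uint8)
image[:, 50:150] = 200
factor = scale_factor_for_radius(image, target_radius=100)
flattened = subtract_local_average(image, k=10)
```

The actual resize by `factor` is left to an image library. Flat regions end up at the mid-grey `offset`, so only local structure (edges, small lesions) stands out.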



In [4]:
from IPython.display import Image

print("Preprocessing Techniques by Ben Graham[-Old competition winner-]")
# Display the preprocessing figure from the interview
Image(url="https://i2.wp.com/blog.kaggle.com/wp-content/uploads/2015/09/Screen-Shot-2015-09-10-at-11.12.53-AM.png?w=773")

Preprocessing Techniques by Ben Graham[-Old competition winner-]


### Blog post from an old competition winner [here](http://blog.kaggle.com/2015/08/10/detecting-diabetic-retinopathy-in-eye-images/)
- Introduction
- Overview / TL;DR
- The opening (processing and augmenting, kappa metric and first models)
- The middlegame (basic architecture, visual attention)
- The endgame (camera artifacts, pseudo-labeling, decoding, error distribution, ensembling)
- Other (not) tried approaches and papers
- Conclusion
- Code, models and example activations



**Note:**
Not surprisingly, all my models were convolutional networks (convnets) adapted for this task. I recommend reading the [well-written blogpost with clear explanation](http://benanne.github.io/2015/03/17/plankton.html) from the ≋ Deep Sea ≋ team that won the Kaggle National Data Science Bowl competition since there are a lot of similarities in our approaches and they provide some more/better explanation.
                                             
  **-- [old competition winner](http://blog.kaggle.com/2015/08/10/detecting-diabetic-retinopathy-in-eye-images/)**

### Data augmentation techniques from the old competition winner
These augmentations (transformations) were:

1. Cropping with certain probability
2. Color balance adjustment
3. Brightness adjustment
4. Contrast adjustment
5. Flipping images with 50% chance
6. Rotating images by x degrees, with x an integer in [0, 360[
7. Zooming (equal cropping on x and y dimensions)
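A subset of the list above (flip, brightness, contrast) can be sketched with NumPy alone. The function name `augment`, the parameter ranges, and the assumption that images are floats in [0, 1] are illustrative choices, not taken from the blog post. Cropping, zooming and arbitrary-angle rotation need an image library for interpolation and are left out.

```python
import numpy as np

def augment(img, rng, max_brightness=0.1, max_contrast=0.1):
    """Random flip, brightness shift and contrast change for a float image in [0, 1]."""
    out = img.astype(np.float64)
    if rng.random() < 0.5:                                     # flip with 50% chance
        out = out[:, ::-1]
    out = out + rng.uniform(-max_brightness, max_brightness)   # brightness shift
    factor = 1.0 + rng.uniform(-max_contrast, max_contrast)    # contrast scale
    mean = out.mean()
    out = (out - mean) * factor + mean                         # stretch around the mean
    return np.clip(out, 0.0, 1.0)

rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))        # random stand-in for a real photo
augmented = augment(image, rng)
```

Passing the generator in explicitly keeps the augmentation reproducible from a seed.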

### For code, camera artifacts, pseudo-labeling, better decoding, error distribution and ensembling, see [here](http://blog.kaggle.com/2015/08/10/detecting-diabetic-retinopathy-in-eye-images/)

## Know the layers and parameters

### VGG16 layers

In [3]:
import torch
from torchvision import models
from torchsummary import summary

device = torch.device('cpu')
#load model
vgg = models.vgg16().to(device)
#view model or architecture
summary(vgg, (3, 224, 224))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1         [-1, 64, 224, 224]           1,792
              ReLU-2         [-1, 64, 224, 224]               0
            Conv2d-3         [-1, 64, 224, 224]          36,928
              ReLU-4         [-1, 64, 224, 224]               0
         MaxPool2d-5         [-1, 64, 112, 112]               0
            Conv2d-6        [-1, 128, 112, 112]          73,856
              ReLU-7        [-1, 128, 112, 112]               0
            Conv2d-8        [-1, 128, 112, 112]         147,584
              ReLU-9        [-1, 128, 112, 112]               0
        MaxPool2d-10          [-1, 128, 56, 56]               0
           Conv2d-11          [-1, 256, 56, 56]         295,168
             ReLU-12          [-1, 256, 56, 56]               0
           Conv2d-13          [-1, 256, 56, 56]         590,080
             ReLU-14          [-1, 256,
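The `Param #` column above can be checked by hand: a convolution holds `kernel * kernel * in_channels * out_channels` weights plus one bias per output channel. A small sketch (the helper name `conv2d_params` is made up for this note) reproduces the VGG16 rows shown:

```python
def conv2d_params(in_channels, out_channels, kernel_size, bias=True):
    """Number of trainable parameters in a square-kernel 2D convolution."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    return weights + (out_channels if bias else 0)

# (in_channels, out_channels) of the first six 3x3 convolutions in VGG16
vgg_convs = [(3, 64), (64, 64), (64, 128), (128, 128), (128, 256), (256, 256)]
counts = [conv2d_params(c_in, c_out, 3) for c_in, c_out in vgg_convs]
for (c_in, c_out), n in zip(vgg_convs, counts):
    print(f"{c_in:>3} -> {c_out:<3}: {n:,}")
```

ReLU and max-pooling layers have no weights, which is why they show 0 in the summary.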

### DenseNet121


In [4]:
import torch
from torchvision import models
from torchsummary import summary

device = torch.device('cpu')
#load model
dense = models.densenet121().to(device)
#view model or architecture
summary(dense, (3, 224, 224))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1         [-1, 64, 112, 112]           9,408
       BatchNorm2d-2         [-1, 64, 112, 112]             128
              ReLU-3         [-1, 64, 112, 112]               0
         MaxPool2d-4           [-1, 64, 56, 56]               0
       BatchNorm2d-5           [-1, 64, 56, 56]             128
              ReLU-6           [-1, 64, 56, 56]               0
            Conv2d-7          [-1, 128, 56, 56]           8,192
       BatchNorm2d-8          [-1, 128, 56, 56]             256
              ReLU-9          [-1, 128, 56, 56]               0
           Conv2d-10           [-1, 32, 56, 56]          36,864
      BatchNorm2d-11           [-1, 96, 56, 56]             192
             ReLU-12           [-1, 96, 56, 56]               0
           Conv2d-13          [-1, 128, 56, 56]          12,288
      BatchNorm2d-14          [-1, 128,

### ResNet101

In [6]:
import torch
from torchvision import models
from torchsummary import summary

device = torch.device('cpu')
#load model
dense = models.resnet101().to(device)
#view model or architecture
summary(dense, (3, 224, 224))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1         [-1, 64, 112, 112]           9,408
       BatchNorm2d-2         [-1, 64, 112, 112]             128
              ReLU-3         [-1, 64, 112, 112]               0
         MaxPool2d-4           [-1, 64, 56, 56]               0
            Conv2d-5           [-1, 64, 56, 56]           4,096
       BatchNorm2d-6           [-1, 64, 56, 56]             128
              ReLU-7           [-1, 64, 56, 56]               0
            Conv2d-8           [-1, 64, 56, 56]          36,864
       BatchNorm2d-9           [-1, 64, 56, 56]             128
             ReLU-10           [-1, 64, 56, 56]               0
           Conv2d-11          [-1, 256, 56, 56]          16,384
      BatchNorm2d-12          [-1, 256, 56, 56]             512
           Conv2d-13          [-1, 256, 56, 56]          16,384
      BatchNorm2d-14          [-1, 256,

In [8]:
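# `dense` holds the ResNet-101 loaded above. The original Keras-style loop over
# `dense.layers` / `layer.name` raised:
#   AttributeError: 'ResNet' object has no attribute 'layers'
# PyTorch modules expose their sub-layers through named_modules() instead.
for name, module in dense.named_modules():
    if isinstance(module, torch.nn.Conv2d):  # keep only the convolutional layers
        print(name)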

### [Official link to verify sensitivity and specificity](https://www.eyediagnosis.co/)

In [2]:
from IPython.display import Image

print("Understanding sensitivity and specificity")
# Display the sensitivity/specificity explainer figure
Image(url="https://www.healthnewsreview.org/wp-content/uploads/2018/06/sensitivity-2.png")

Understanding sensitivity and specificity


The figure above is from [here](https://www.healthnewsreview.org/toolkit/tips-for-understanding-studies/understanding-medical-tests-sensitivity-specificity-and-positive-predictive-value/).
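As a quick numeric check of the two definitions, sensitivity is TP / (TP + FN) and specificity is TN / (TN + FP). The sketch below assumes binary labels (1 = disease, 0 = healthy); the labels themselves are invented toy data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels where 1 means diseased."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)   # sick, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)   # sick, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)   # healthy, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)   # healthy, flagged
    return tp / (tp + fn), tn / (tn + fp)

y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)  # 0.75 0.75
```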

### [Visualizing DenseNet using PyTorch](http://www.andrewjanowczyk.com/visualizing-densenet-using-pytorch/)

### [Understanding and visualizing DenseNets](https://towardsdatascience.com/understanding-and-visualizing-densenets-7f688092391a)

#### Optimizing predictions: the best threshold coefficients can be found with Nelder-Mead optimization

#### [optimizer for quadratic weighted kappa](https://www.kaggle.com/abhishek/optimizer-for-quadratic-weighted-kappa)

How to do that?

Well, if your regression results range from 0 to infinity, you can say everything below 0.5 is class 0, between 0.5 and 1.5 is class 1, between 1.5 and 2.5 is class 2, between 2.5 and 3.5 is class 3 and everything above 3.5 is class 4.

Thus the coefficients are : [0.5, 1.5, 2.5, 3.5]
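Applying these coefficients is a one-liner with `np.digitize`. In this sketch the function name `to_classes` is made up, and a value that lands exactly on a threshold is assigned to the higher class, which is a choice made here rather than something the source specifies.

```python
import numpy as np

def to_classes(predictions, coefficients=(0.5, 1.5, 2.5, 3.5)):
    """Map continuous regression outputs to integer classes 0..len(coefficients)."""
    return np.digitize(predictions, coefficients)

classes = to_classes([0.2, 0.7, 1.5, 2.49, 3.1, 7.0])
print(classes)  # [0 1 2 2 3 4]
```

An optimizer such as Nelder-Mead then only has to move the four numbers in `coefficients` to maximize the metric on validation data.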

More on the above **[here](https://www.kaggle.com/c/petfinder-adoption-prediction/discussion/76107)**

#### Learnt how to apply QWK from [here](https://www.kaggle.com/venkat555/aptos-pytorch-starter)
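For reference, quadratic weighted kappa (QWK) can be written out in a few lines of NumPy. This is a minimal sketch, not the code from the linked kernel; it assumes integer labels in `0..n_classes-1` and does not guard against the degenerate case where the expected disagreement is zero.

```python
import numpy as np

def quadratic_weighted_kappa(y_true, y_pred, n_classes=5):
    """QWK = 1 - sum(W * O) / sum(W * E) with W[i, j] = (i - j)^2 / (n_classes - 1)^2."""
    observed = np.zeros((n_classes, n_classes))
    for t, p in zip(y_true, y_pred):
        observed[int(t), int(p)] += 1              # confusion matrix O
    # Expected counts E if the two ratings were independent
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / observed.sum()
    idx = np.arange(n_classes)
    weights = (idx[:, None] - idx[None, :]) ** 2 / (n_classes - 1) ** 2
    return 1.0 - (weights * observed).sum() / (weights * expected).sum()

perfect = quadratic_weighted_kappa([0, 1, 2, 3, 4], [0, 1, 2, 3, 4])
opposite = quadratic_weighted_kappa([0, 0, 4, 4], [4, 4, 0, 0])
print(perfect, opposite)  # 1.0 -1.0
```

Because the penalty grows with the square of the distance between grades, predicting grade 4 for a grade 0 eye costs far more than predicting grade 1.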

In [3]:
from IPython.display import Image

print("Models with loss and parameters")
# Display the model comparison figure
Image(url="https://raw.githubusercontent.com/tensorflow/tpu/master/models/official/efficientnet/g3doc/params.png")

Models with loss and parameters


The figure above is from [here](https://www.kaggle.com/carlolepelaars/efficientnetb5-with-keras-aptos-2019).

### Get layer and its weights

In [4]:
import torchvision
import torch

model = torchvision.models.resnet152(pretrained=False)  # no internet to download weights
# model.load_state_dict(torch.load('../input/pytorch-pretrained-models/resnet152-b121ed2d.pth'))  # load weights from a local file

# Print every parameter/buffer name together with its tensor shape
print("Model's state_dict:")
for name, tensor in model.state_dict().items():
    print(name, "\t", tensor.size())


Model's state_dict:
conv1.weight 	 torch.Size([64, 3, 7, 7])
bn1.weight 	 torch.Size([64])
bn1.bias 	 torch.Size([64])
bn1.running_mean 	 torch.Size([64])
bn1.running_var 	 torch.Size([64])
bn1.num_batches_tracked 	 torch.Size([])
layer1.0.conv1.weight 	 torch.Size([64, 64, 1, 1])
layer1.0.bn1.weight 	 torch.Size([64])
layer1.0.bn1.bias 	 torch.Size([64])
layer1.0.bn1.running_mean 	 torch.Size([64])
layer1.0.bn1.running_var 	 torch.Size([64])
layer1.0.bn1.num_batches_tracked 	 torch.Size([])
layer1.0.conv2.weight 	 torch.Size([64, 64, 3, 3])
layer1.0.bn2.weight 	 torch.Size([64])
layer1.0.bn2.bias 	 torch.Size([64])
layer1.0.bn2.running_mean 	 torch.Size([64])
layer1.0.bn2.running_var 	 torch.Size([64])
layer1.0.bn2.num_batches_tracked 	 torch.Size([])
layer1.0.conv3.weight 	 torch.Size([256, 64, 1, 1])
layer1.0.bn3.weight 	 torch.Size([256])
layer1.0.bn3.bias 	 torch.Size([256])
layer1.0.bn3.running_mean 	 torch.Size([256])
layer1.0.bn3.running_var 	 torch.Size([256])
layer1.0.bn3.num

layer2.6.bn3.bias 	 torch.Size([512])
layer2.6.bn3.running_mean 	 torch.Size([512])
layer2.6.bn3.running_var 	 torch.Size([512])
layer2.6.bn3.num_batches_tracked 	 torch.Size([])
layer2.7.conv1.weight 	 torch.Size([128, 512, 1, 1])
layer2.7.bn1.weight 	 torch.Size([128])
layer2.7.bn1.bias 	 torch.Size([128])
layer2.7.bn1.running_mean 	 torch.Size([128])
layer2.7.bn1.running_var 	 torch.Size([128])
layer2.7.bn1.num_batches_tracked 	 torch.Size([])
layer2.7.conv2.weight 	 torch.Size([128, 128, 3, 3])
layer2.7.bn2.weight 	 torch.Size([128])
layer2.7.bn2.bias 	 torch.Size([128])
layer2.7.bn2.running_mean 	 torch.Size([128])
layer2.7.bn2.running_var 	 torch.Size([128])
layer2.7.bn2.num_batches_tracked 	 torch.Size([])
layer2.7.conv3.weight 	 torch.Size([512, 128, 1, 1])
layer2.7.bn3.weight 	 torch.Size([512])
layer2.7.bn3.bias 	 torch.Size([512])
layer2.7.bn3.running_mean 	 torch.Size([512])
layer2.7.bn3.running_var 	 torch.Size([512])
layer2.7.bn3.num_batches_tracked 	 torch.Size([])
layer

layer3.9.bn2.weight 	 torch.Size([256])
layer3.9.bn2.bias 	 torch.Size([256])
layer3.9.bn2.running_mean 	 torch.Size([256])
layer3.9.bn2.running_var 	 torch.Size([256])
layer3.9.bn2.num_batches_tracked 	 torch.Size([])
layer3.9.conv3.weight 	 torch.Size([1024, 256, 1, 1])
layer3.9.bn3.weight 	 torch.Size([1024])
layer3.9.bn3.bias 	 torch.Size([1024])
layer3.9.bn3.running_mean 	 torch.Size([1024])
layer3.9.bn3.running_var 	 torch.Size([1024])
layer3.9.bn3.num_batches_tracked 	 torch.Size([])
layer3.10.conv1.weight 	 torch.Size([256, 1024, 1, 1])
layer3.10.bn1.weight 	 torch.Size([256])
layer3.10.bn1.bias 	 torch.Size([256])
layer3.10.bn1.running_mean 	 torch.Size([256])
layer3.10.bn1.running_var 	 torch.Size([256])
layer3.10.bn1.num_batches_tracked 	 torch.Size([])
layer3.10.conv2.weight 	 torch.Size([256, 256, 3, 3])
layer3.10.bn2.weight 	 torch.Size([256])
layer3.10.bn2.bias 	 torch.Size([256])
layer3.10.bn2.running_mean 	 torch.Size([256])
layer3.10.bn2.running_var 	 torch.Size([256]

layer3.20.conv1.weight 	 torch.Size([256, 1024, 1, 1])
layer3.20.bn1.weight 	 torch.Size([256])
layer3.20.bn1.bias 	 torch.Size([256])
layer3.20.bn1.running_mean 	 torch.Size([256])
layer3.20.bn1.running_var 	 torch.Size([256])
layer3.20.bn1.num_batches_tracked 	 torch.Size([])
layer3.20.conv2.weight 	 torch.Size([256, 256, 3, 3])
layer3.20.bn2.weight 	 torch.Size([256])
layer3.20.bn2.bias 	 torch.Size([256])
layer3.20.bn2.running_mean 	 torch.Size([256])
layer3.20.bn2.running_var 	 torch.Size([256])
layer3.20.bn2.num_batches_tracked 	 torch.Size([])
layer3.20.conv3.weight 	 torch.Size([1024, 256, 1, 1])
layer3.20.bn3.weight 	 torch.Size([1024])
layer3.20.bn3.bias 	 torch.Size([1024])
layer3.20.bn3.running_mean 	 torch.Size([1024])
layer3.20.bn3.running_var 	 torch.Size([1024])
layer3.20.bn3.num_batches_tracked 	 torch.Size([])
layer3.21.conv1.weight 	 torch.Size([256, 1024, 1, 1])
layer3.21.bn1.weight 	 torch.Size([256])
layer3.21.bn1.bias 	 torch.Size([256])
layer3.21.bn1.running_mea

layer3.35.conv3.weight 	 torch.Size([1024, 256, 1, 1])
layer3.35.bn3.weight 	 torch.Size([1024])
layer3.35.bn3.bias 	 torch.Size([1024])
layer3.35.bn3.running_mean 	 torch.Size([1024])
layer3.35.bn3.running_var 	 torch.Size([1024])
layer3.35.bn3.num_batches_tracked 	 torch.Size([])
layer4.0.conv1.weight 	 torch.Size([512, 1024, 1, 1])
layer4.0.bn1.weight 	 torch.Size([512])
layer4.0.bn1.bias 	 torch.Size([512])
layer4.0.bn1.running_mean 	 torch.Size([512])
layer4.0.bn1.running_var 	 torch.Size([512])
layer4.0.bn1.num_batches_tracked 	 torch.Size([])
layer4.0.conv2.weight 	 torch.Size([512, 512, 3, 3])
layer4.0.bn2.weight 	 torch.Size([512])
layer4.0.bn2.bias 	 torch.Size([512])
layer4.0.bn2.running_mean 	 torch.Size([512])
layer4.0.bn2.running_var 	 torch.Size([512])
layer4.0.bn2.num_batches_tracked 	 torch.Size([])
layer4.0.conv3.weight 	 torch.Size([2048, 512, 1, 1])
layer4.0.bn3.weight 	 torch.Size([2048])
layer4.0.bn3.bias 	 torch.Size([2048])
layer4.0.bn3.running_mean 	 torch.Size

### Layer4 parameters

In [5]:
plist = [
         {'params': model.layer4.parameters(), 'lr': 1e-4, 'weight_decay': 0.001},  # weight decay is set with the 'weight_decay' key
         {'params': model.fc.parameters(), 'lr': 1e-3}
         ]

In [7]:
print(model.layer4.parameters)

<bound method Module.parameters of Sequential(
  (0): Bottleneck(
    (conv1): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
    (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu): ReLU(inplace=True)
    (downsample): Sequential(
      (0): Conv2d(1024, 2048, kernel_size=(1, 1), stride=(2, 2), bias=False)
      (1): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (1): Bottleneck(
    (conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, trac