This project aims to detect the centers of the optic disc and fovea in fundus images using only image processing techniques (no machine learning). In addition, the optic disc is segmented to support its localization.
The Indian Diabetic Retinopathy Image Dataset (IDRiD) is used for this project.
The images in this dataset have a resolution of 4288×2848 pixels and are stored in JPG format. The dataset covers three distinct problems: Localization, Segmentation, and Disease Grading. I focused only on the localization and segmentation problems in this study.
*Figure: a) Ground Truth of Fovea Localization; b) Ground Truth of Optic Disc Localization*
The localization subset contains 413 training images and 103 test images, with the center coordinates of both the optic disc and fovea provided in CSV format. Since this study does not involve model training, I used only the test subset.
The segmentation subset contains 54 training and 27 test samples with corresponding segmentation masks. For this problem, U-Net, a well-known deep learning segmentation algorithm, was used as a baseline to compare against the image processing methods.
As an approach, I observed that the brightest parts of the images are mostly the areas where the optic disc is located. Therefore, after converting the image from RGB to grayscale, I applied histogram equalization to increase the contrast and make the optic disc brighter.
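A minimal sketch of this preprocessing step, assuming OpenCV is used (the file name and variable names are illustrative):

```python
import cv2

# Illustrative sketch: load a fundus image, convert to grayscale,
# and equalize the histogram to emphasize the bright optic disc.
img = cv2.imread("fundus.jpg")                # placeholder file name
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # OpenCV loads images as BGR
equalized = cv2.equalizeHist(gray)            # spread intensities across [0, 255]
```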
Since the brightest parts are close to the white value (255), I applied binary thresholding with an initial value of 250 and iterated this process, lowering the threshold until an object of sufficient area was found. Small regions sometimes exceed the threshold and create noise in the segmentation, which I suppressed by applying an erosion operation. The algorithm also uses a heap (priority queue) to keep track of the biggest objects, because the optic disc occupies a large area in the images.
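The search loop could look roughly like the following sketch; the threshold step, kernel size, area limit, and heap size here are illustrative assumptions, not the project's exact values:

```python
import heapq

import cv2
import numpy as np

def locate_optic_disc(equalized, start_thresh=250, min_area=5000, step=5, top_k=3):
    """Iteratively lower the threshold until a sufficiently large bright region appears."""
    thresh = start_thresh
    kernel = np.ones((5, 5), np.uint8)
    while thresh > 0:
        _, binary = cv2.threshold(equalized, thresh, 255, cv2.THRESH_BINARY)
        # Erosion suppresses small bright specks that would otherwise be noise.
        binary = cv2.erode(binary, kernel, iterations=2)
        contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        # Keep only the largest regions (heap-backed selection): the optic disc
        # is expected to be among the biggest bright objects in the image.
        largest = heapq.nlargest(top_k, contours, key=cv2.contourArea)
        if largest and cv2.contourArea(largest[0]) >= min_area:
            m = cv2.moments(largest[0])
            return int(m["m10"] / m["m00"]), int(m["m01"] / m["m00"])  # centroid (x, y)
        thresh -= step  # no object was large enough; relax the threshold
    return None
```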
*Figure: a) Input Image; b) Grayscale Image; c) Histogram Equalized Image; d) Thresholded Image*
Both optic disc and fovea detection require similar processing steps to achieve meaningful results, so I applied mostly the same methods with small adjustments. Because the fovea usually appears as a dark area, it is difficult to distinguish from other parts of the image, even with the human eye. Thus, to turn its dark pixels into bright ones, I first converted each image to its negative.
Analyzing the images showed that the fovea always lies close to the center of the fundus. For this reason, I applied a circular masking operation with a predefined radius to extract the middle part of the image as the region of interest. Since the image is in RGB color space, I converted it to grayscale and applied histogram equalization to sharpen the brightness of the fovea. The remaining manipulations are the same as in the optic disc detection algorithm: thresholding, erosion, storing the largest contours in a heap, and finding the center of the area.
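A minimal sketch of this preprocessing chain, again assuming OpenCV; the radius value and function name are illustrative assumptions:

```python
import cv2
import numpy as np

def preprocess_for_fovea(img, radius=700):
    """Negate the image, mask a central circular ROI, and equalize the result."""
    negative = cv2.bitwise_not(img)            # dark fovea becomes a bright region
    h, w = negative.shape[:2]
    mask = np.zeros((h, w), np.uint8)
    cv2.circle(mask, (w // 2, h // 2), radius, 255, thickness=-1)  # filled circle
    gray = cv2.cvtColor(negative, cv2.COLOR_BGR2GRAY)
    masked = cv2.bitwise_and(gray, gray, mask=mask)  # keep only the central part
    return cv2.equalizeHist(masked)            # sharpen the (now bright) fovea
```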
*Figure: a) Input Image; b) Negative Image; c) Grayscale Masked Image; d) Histogram Equalized Image; e) Thresholded Image; f) Erosion Applied Image*
- I observed that this region is sometimes very challenging to detect, even for humans.
- Therefore, I created a simple application with a user interface that lets human subjects attempt the same task, and I used their performance as a reference baseline for the image processing pipeline.
*Figure: a) Easy Example; b) Hard Example*
I used the F1 score as the evaluation metric, but since this metric is designed for classification problems, I made some adjustments to apply it to localization. I added a distance factor d: a prediction is accepted as true if the distance between the predicted and ground-truth coordinates is equal to or smaller than this factor. I used this factor for both fovea and optic disc localization.
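The following sketch shows one way such a distance-thresholded F1 score can be computed; the exact bookkeeping in the project may differ. `preds` and `gts` are assumed to be lists of (x, y) center coordinates:

```python
import numpy as np

def f1_at_distance(preds, gts, d):
    """F1 with a distance factor: a prediction within d pixels of the
    ground-truth center counts as a true positive; otherwise it is counted
    as a false positive and the missed center as a false negative."""
    tp = fp = fn = 0
    for (px, py), (gx, gy) in zip(preds, gts):
        if np.hypot(px - gx, py - gy) <= d:
            tp += 1
        else:
            fp += 1
            fn += 1
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0
```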
Model | d = 100 | d = 200 | d = 300 | d = 400 | d = 500 |
---|---|---|---|---|---|
Human | 79.61% | 81.52% | 90.61% | 95.47% | 100% |
Ours | 53.46% | 77.67% | 86.41% | 87.38% | 93.20% |
Model | d = 100 | d = 150 | d = 200 | d = 250 | d = 300 |
---|---|---|---|---|---|
Ours | 75.73% | 94.17% | 96.12% | 96.12% | 97.09% |
As I mentioned, I used the U-Net image segmentation algorithm to compare my model's performance against a well-known deep learning model. I trained U-Net on the 54 training samples from the dataset and measured its performance on the 27 test samples using the Dice score. The hyperparameters used for the U-Net model are listed below:
Hyperparameter | Value |
---|---|
Input Size | 512×256 |
Learning Rate | 10⁻³ |
Loss Function | Binary Cross-Entropy |
Optimizer | Adam |
Batch Size | 6 |
Epochs | 150 |
The table below shows the overall performance of my image processing model and the U-Net algorithm. Considering that my model is not trained on any images, these preliminary results are promising.
Model | Dice Score |
---|---|
U-Net | 85.86% |
Ours | 72.55% |
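For reference, the Dice score over two binary masks follows the standard definition Dice = 2·|A ∩ B| / (|A| + |B|); this is a generic sketch, not the project's exact evaluation code:

```python
import numpy as np

def dice_score(pred_mask, gt_mask):
    """Dice = 2|A ∩ B| / (|A| + |B|) over two binary masks of equal shape."""
    pred = pred_mask.astype(bool)
    gt = gt_mask.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    denom = pred.sum() + gt.sum()
    return 2.0 * intersection / denom if denom else 1.0
```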
*Figure: qualitative segmentation results, two examples (columns: Ours, U-Net, Ground Truth)*