# RETINAL FUNDUS MULTI-DISEASE IMAGE DATASET (RFMID)

Source: *https://ieee-dataport.org/open-access/retinal-fundus-multi-disease-image-dataset-rfmid#files*


In [1]:
# imports
import matplotlib.pyplot as plt
import numpy as np
import os
import shutil
import pandas as pd
import os.path as path

In [4]:
df = pd.read_csv("Labels.csv")
pd.set_option('display.max_columns', None)

In [5]:
df.head()

Unnamed: 0,ID,Disease_Risk,DR,ARMD,MH,DN,MYA,BRVO,TSLN,ERM,LS,MS,CSR,ODC,CRVO,TV,AH,ODP,ODE,ST,AION,PT,RT,RS,CRS,EDN,RPEC,MHL,RP,CWS,CB,ODPM,PRH,MNF,HR,CRAO,TD,CME,PTCR,CF,VH,MCA,VS,BRAO,PLQ,HPED,CL
0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,2,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,3,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,4,1,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,5,1,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


The dataset is a label matrix: each row is an image (`ID`) and each disease column holds 1 if the image belongs to that disease and 0 otherwise. If an image has at least one disease, `Disease_Risk` is equal to 1.
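As a quick sanity check of this layout, the sketch below counts images per disease and lists rows where `Disease_Risk` disagrees with the disease columns. It runs on a tiny hand-made table (`toy`, made-up rows, not real RFMiD data), and the helper name `summarize_labels` is our own; on the real labels you would pass `df` instead.

```python
import pandas as pd

def summarize_labels(labels: pd.DataFrame):
    """Return per-disease image counts and the rows where Disease_Risk
    disagrees with the disease columns (hypothetical helper)."""
    # Every column except ID and Disease_Risk is treated as a disease flag.
    disease_cols = labels.columns.drop(["ID", "Disease_Risk"])
    counts = labels[disease_cols].sum().sort_values(ascending=False)
    # 1 if at least one disease column is set in the row, else 0.
    any_disease = labels[disease_cols].any(axis=1).astype(int)
    mismatched = labels[labels["Disease_Risk"] != any_disease]
    return counts, mismatched

# Toy stand-in for Labels.csv: made-up rows, three disease columns only.
toy = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Disease_Risk": [1, 1, 0, 1],
    "DR": [1, 0, 0, 1],
    "MYA": [0, 1, 0, 0],
    "ODC": [0, 1, 0, 0],
})
counts, mismatched = summarize_labels(toy)
print(counts)
print(len(mismatched), "inconsistent rows")
```

A non-empty `mismatched` frame on the real data would be worth inspecting before sorting any images.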

- Diabetic retinopathy (DR) is a microvascular complication of diabetes mellitus and is a leading cause of vision loss in the elderly and working population. The image is labeled as DR if it shows any of the following clinical findings: microaneurysms, retinal dot and blot hemorrhage, hard exudates or cotton wool spots (see Figure 2a) [17].

- Myopia (MYA) is characterized by degenerative changes in the choroid, sclera, and RPE (see Figure 3b) [21]. Vision loss due to myopia can be progressive and irreversible.

### We are only interested in the image files that have a true value for Diabetic Retinopathy or Myopia

### 1. Diabetic Retinopathy (DR)

We are going to create a new dataframe that holds all the IDs that have Diabetic Retinopathy.

In [7]:
dr = df[df['DR'] == 1]

In [8]:
dr.head()

Unnamed: 0,ID,Disease_Risk,DR,ARMD,MH,DN,MYA,BRVO,TSLN,ERM,LS,MS,CSR,ODC,CRVO,TV,AH,ODP,ODE,ST,AION,PT,RT,RS,CRS,EDN,RPEC,MHL,RP,CWS,CB,ODPM,PRH,MNF,HR,CRAO,TD,CME,PTCR,CF,VH,MCA,VS,BRAO,PLQ,HPED,CL
0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,2,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,3,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,5,1,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
17,18,1,1,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [10]:
df.shape

(1920, 47)

### We are going to move all the pictures from this dataframe to a new folder named dr

This loop first checks with the `os` library that each source file exists, and then moves the picture into the `dr` folder.

In [24]:
for i in dr['ID']:
    src = f"./Training_3/{i}.png"
    if os.path.exists(src):  # skip IDs whose image file is missing
        shutil.move(src, f"./dr/{i}.png")

Let's check how many elements we moved

In [25]:
_, _, files = next(os.walk("./dr"))  # discard root and dirs; avoids shadowing the `path` import
file_count = len(files)
file_count

376

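We did not print `dr.shape` above (the earlier cell shows `df.shape`, the full table), so compare this count with `len(dr)` to see whether any files were left behind.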
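To see exactly which IDs are absent from a destination folder, one option is to compare the ID list with the file names on disk. The sketch below is a minimal version of that idea; `missing_ids` is a hypothetical helper, and the demo uses a throwaway temporary directory with empty placeholder files rather than the real images.

```python
import os
import tempfile

def missing_ids(ids, folder, ext=".png"):
    """Return the IDs that have no '<id><ext>' file in folder (hypothetical helper)."""
    present = {
        os.path.splitext(name)[0]
        for name in os.listdir(folder)
        if name.endswith(ext)
    }
    return [i for i in ids if str(i) not in present]

# Demo: only images 1, 2 and 5 exist in the folder.
with tempfile.TemporaryDirectory() as tmp:
    for i in (1, 2, 5):
        open(os.path.join(tmp, f"{i}.png"), "wb").close()
    not_moved = missing_ids([1, 2, 3, 5, 18], tmp)

print(not_moved)
```

On the notebook's data the call would look like `missing_ids(dr['ID'], "./dr")`; an empty list means every DR image arrived.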

### 2. Myopia (MYA)

We are going to create a new dataframe that holds all the IDs that have Myopia.

In [18]:
mya = df[df['MYA'] == 1]

In [19]:
mya.head()

Unnamed: 0,ID,Disease_Risk,DR,ARMD,MH,DN,MYA,BRVO,TSLN,ERM,LS,MS,CSR,ODC,CRVO,TV,AH,ODP,ODE,ST,AION,PT,RT,RS,CRS,EDN,RPEC,MHL,RP,CWS,CB,ODPM,PRH,MNF,HR,CRAO,TD,CME,PTCR,CF,VH,MCA,VS,BRAO,PLQ,HPED,CL
5,6,1,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
6,7,1,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
7,8,1,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
11,12,1,0,1,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
46,47,1,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [20]:
mya.shape

(101, 47)
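Because one image can carry several labels, an ID may appear in both the DR and the MYA selection. The sketch below, on made-up rows that only reuse the column names from `Labels.csv`, counts each class and lists the overlapping IDs.

```python
import pandas as pd

# Toy label table: made-up rows, not real RFMiD data.
toy = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "DR": [1, 0, 1, 0, 1],
    "MYA": [0, 1, 1, 0, 0],
})

# Plain Python ints, so the sets print and compare cleanly.
dr_ids = set(toy.loc[toy["DR"] == 1, "ID"].tolist())
mya_ids = set(toy.loc[toy["MYA"] == 1, "ID"].tolist())
both = sorted(dr_ids & mya_ids)

print(len(dr_ids), len(mya_ids), both)
```

This notebook reads DR images from `./Training_3` and MYA images from `./Training_2`, so each class has its own source folder. With a single shared source folder, an image in both sets could only be moved once.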

### We are going to move all the pictures from this dataframe to a new folder named mya

This loop first checks with the `os` library that each source file exists, and then moves the picture into the `mya` folder.

In [22]:
for i in mya['ID']:
    src = f"./Training_2/{i}.png"
    if os.path.exists(src):  # skip IDs whose image file is missing
        shutil.move(src, f"./mya/{i}.png")

Let's check that all the files were moved

In [23]:
_, _, files = next(os.walk("./mya"))  # discard root and dirs; avoids shadowing the `path` import
file_count = len(files)
file_count

101

All the files were moved: the 101 files in the folder match the 101 rows of `mya`.
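The DR and MYA steps follow the same pattern, so they could be factored into one helper. The following is a sketch under our own assumptions, not the notebook's code: `move_images` is a hypothetical name, it creates the destination folder if needed, and it reports which IDs had no source file. The demo runs in a temporary directory with empty placeholder files.

```python
import os
import shutil
import tempfile

def move_images(ids, src_dir, dst_dir, ext=".png"):
    """Move '<id><ext>' from src_dir to dst_dir; return (moved, missing) ID lists."""
    os.makedirs(dst_dir, exist_ok=True)  # make sure the destination exists
    moved, missing = [], []
    for i in ids:
        src = os.path.join(src_dir, f"{i}{ext}")
        if os.path.isfile(src):
            shutil.move(src, os.path.join(dst_dir, f"{i}{ext}"))
            moved.append(i)
        else:
            missing.append(i)  # no such image in the source folder
    return moved, missing

# Demo with placeholder files 6, 7 and 8; ID 12 has no file.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "Training_demo")
    dst = os.path.join(tmp, "mya_demo")
    os.makedirs(src)
    for i in (6, 7, 8):
        open(os.path.join(src, f"{i}.png"), "wb").close()
    moved, missing = move_images([6, 7, 12], src, dst)
    remaining = sorted(os.listdir(src))

print(moved, missing, remaining)
```

Returning the `missing` list makes it possible to tell a complete move from a partial one without counting files afterwards.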

### After sorting all the images, we need to delete the unnecessary folders filled with unclassified images

In [26]:
shutil.rmtree("./Training_2/")

In [27]:
shutil.rmtree("./Training_3/")

In [28]:
shutil.rmtree("./Training/")
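Note that `shutil.rmtree` deletes a folder permanently and raises an error if the folder is already gone, for instance when a cell is re-run. A slightly more defensive variant, sketched here with a hypothetical helper `remove_folders` and demonstrated on a temporary directory, skips folders that do not exist.

```python
import os
import shutil
import tempfile

def remove_folders(folders):
    """Delete each existing folder in folders; return the ones actually removed."""
    removed = []
    for folder in folders:
        if os.path.isdir(folder):  # skip folders that are already gone
            shutil.rmtree(folder)
            removed.append(folder)
    return removed

# Demo: one real folder with a placeholder file, one path that does not exist.
with tempfile.TemporaryDirectory() as tmp:
    existing = os.path.join(tmp, "Training_demo")
    os.makedirs(existing)
    open(os.path.join(existing, "9.png"), "wb").close()
    absent = os.path.join(tmp, "does_not_exist")
    removed = remove_folders([existing, absent])
    still_there = os.path.isdir(existing)

print(len(removed), still_there)
```

In the notebook this would replace the three separate cells with a single call such as `remove_folders(["./Training_2/", "./Training_3/", "./Training/"])`.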