# Image Data Segmentation from CSV

### This notebook sorts images from a single shared folder into one folder per class, using the class labels given in a CSV/Excel annotation file.
#### >> TensorFlow provides an API called ImageDataGenerator that creates an automatic flow of image data to the neural network during training, validation, and testing.
#### >> ImageDataGenerator infers an image's class from the name of the folder that contains it. Sorting images into class folders up front is therefore simpler than reading a separate annotation for each image during training.
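As a rough illustration of the folder-per-class convention described above, the sketch below reads class names from subfolder names using only the standard library. It is not TensorFlow's own code (there, `ImageDataGenerator.flow_from_directory` does the equivalent work). The helper `classes_from_folders` and the demo file names are hypothetical and exist only for this example.

```python
import os
import tempfile

def classes_from_folders(root):
    """Return {class_name: [file names]} using subfolder names as labels."""
    layout = {}
    for name in sorted(os.listdir(root)):
        folder = os.path.join(root, name)
        if os.path.isdir(folder):
            layout[name] = sorted(os.listdir(folder))
    return layout

# Build a tiny throwaway layout to show the convention (file names are made up).
demo_root = tempfile.mkdtemp()
for class_name, files in {"Mild": ["a.png"], "No_DR": ["b.png", "c.png"]}.items():
    os.makedirs(os.path.join(demo_root, class_name))
    for f in files:
        # Empty placeholder files stand in for real images.
        open(os.path.join(demo_root, class_name, f), "w").close()

demo_layout = classes_from_folders(demo_root)
print(demo_layout)
```

Each key of the returned dictionary is a class label, which is why the rest of this notebook copies every image into a folder named after its class.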

In [1]:
import os
import random
import cv2
import matplotlib.pyplot as plt
import shutil

### Here 'aptos_2019' is a single directory containing all 3662 images.

In [2]:
path = "aptos_2019/"

In [3]:
array = os.listdir(path)
len(array)

3662

In [4]:
# random.sample draws 662 distinct images at random for the test set (shuffling beforehand is optional)

random.shuffle(array)
test_array = random.sample(array, 662)
len(test_array)

662

In [5]:
# We will remove the randomly sampled images from the training data so that no image appears in both sets

temp = []
for i in test_array:
    if i not in temp:
        temp.append(i)
    else:
        pass
len(temp)

662

In [12]:
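# Build a new list instead of calling array.remove() inside the loop,
# which can skip elements while the list is being iterated
array = [i for i in array if i not in temp]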

In [13]:
len(array)

3000
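The split performed in the cells above can also be sketched as one small function. This is a minimal sketch, not the notebook's original method: the name `split_train_test` and the `seed` parameter are assumptions added here for reproducibility. It assumes the file names are unique, and it builds a new training list because removing items from a list while iterating over it can skip elements.

```python
import random

def split_train_test(filenames, test_size, seed=None):
    """Return (train, test) lists with no file name in both."""
    rng = random.Random(seed)  # a seeded generator makes the split repeatable
    test = rng.sample(filenames, test_size)  # sampled without replacement
    test_set = set(test)  # set membership checks are fast
    train = [f for f in filenames if f not in test_set]
    return train, test

# Hypothetical file names, used only to demonstrate the function.
demo_names = [f"img_{i}.png" for i in range(10)]
demo_train, demo_test = split_train_test(demo_names, test_size=3, seed=0)
print(len(demo_train), len(demo_test))  # prints: 7 3
```

With the real data, `split_train_test(os.listdir(path), 662)` would give the same 3000/662 split sizes as above.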

### We have 5 classes, so we will make 5 separate folders for each of the training and testing sets

In [19]:
arr = ["No_DR","Mild","Moderate","Severe","Proliferative"]

In [16]:
# os.mkdir("2019/train/")
# os.mkdir("2019/test/")

In [23]:
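for i in arr:
    # makedirs also creates the parent "2019/train/" and "2019/test/" folders if they are missing
    os.makedirs("2019/train/"+i, exist_ok=True)
    os.makedirs("2019/test/"+i, exist_ok=True)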

In [20]:
path_train = "2019/train/"
path_test = "2019/test/"

### The code below reads the .csv annotation/label file to get each image's retinopathy grade.
### It then copies the image from the source folder to the destination folder for that grade, using our folder array

In [24]:
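import csv

train_files = set(array)  # faster membership checks than scanning the list
with open("train_aptos.csv", newline="") as file:
    reader = csv.reader(file)
    next(reader)  # skip the header row
    for row in reader:
        if not row:
            continue  # ignore blank lines
        img = row[0] + ".png"
        label = arr[int(row[1])]  # the grade (0-4) indexes into the folder array
        # Images still in `array` are training images; the sampled ones go to the test set
        dest = path_train if img in train_files else path_test
        shutil.copy(path + img, dest + label)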
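After the copy step it can help to confirm how many files ended up in each class folder. The helper below is a hypothetical sanity check, not part of the original notebook; `count_per_class` and the demo folder contents are assumptions made for illustration.

```python
import os
import tempfile

def count_per_class(root, class_names):
    """Return {class_name: number of files in root/class_name}."""
    counts = {}
    for name in class_names:
        folder = os.path.join(root, name)
        # A missing folder is reported as 0 rather than raising an error.
        counts[name] = len(os.listdir(folder)) if os.path.isdir(folder) else 0
    return counts

# Throwaway demo folder with made-up contents.
check_root = tempfile.mkdtemp()
os.makedirs(os.path.join(check_root, "No_DR"))
for fname in ["a.png", "b.png"]:
    open(os.path.join(check_root, "No_DR", fname), "w").close()

demo_counts = count_per_class(check_root, ["No_DR", "Mild"])
print(demo_counts)  # prints: {'No_DR': 2, 'Mild': 0}
```

In this notebook, `count_per_class(path_train, arr)` and `count_per_class(path_test, arr)` should sum to 3000 and 662 respectively, assuming the CSV lists every image exactly once.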