# Peter Moss Acute Myeloid & Lymphoblastic Leukemia AI Research Project

## Acute Myeloid & Lymphoblastic Leukemia Detection System

![Peter Moss Acute Myeloid & Lymphoblastic Leukemia AI Research Project](https://www.PeterMossAmlAllResearch.com/media/images/repositories/banner.png)

### Data Augmentation

The AML/ALL Classifier Data Augmentation program applies filters to datasets and increases the amount of training / test data available to use. The program is part of the computer vision research and development for the [Peter Moss Acute Myeloid/Lymphoblastic (AML/ALL) Leukemia AI Research Project](https://www.facebook.com/AMLResearchProject/).  

Before you start the tutorial below, you should complete the steps in the [Augmentation README](https://github.com/AMLResearchProject/AML-Detection-System/tree/master/Augmentation/).  


![Peter Moss Acute Myeloid & Lymphoblastic Leukemia Research Project](https://www.PeterMossAmlAllResearch.com/media/images/repositories/ALL_IDB1_Augmentation_Banner-Lg.png)

# Research papers followed
The papers that this part of the project is based on were provided by project team member, Ho Leung, Associate Professor of Biochemistry & Molecular Biophysics at Kansas State University. 

## Leukemia Blood Cell Image Classification Using Convolutional Neural Network
T. T. P. Thanh, Caleb Vununu, Sukhrob Atoev, Suk-Hwan Lee, and Ki-Ryong Kwon 
http://www.ijcte.org/vol10/1198-H0012.pdf

# Dataset  
The [Acute Lymphoblastic Leukemia Image Database for Image Processing](https://homes.di.unimi.it/scotti/all/) dataset is used for this project. The dataset was created by [Fabio Scotti, Associate Professor Dipartimento di Informatica, Università degli Studi di Milano](https://homes.di.unimi.it/scotti/). Big thanks to Fabio for the research and time he put into creating the dataset and documentation; it is one of his personal projects. You will need to follow the steps outlined [here](https://homes.di.unimi.it/scotti/all/#download) to gain access to the dataset.

## Data augmentation

![AML & ALL Data Augmentation](https://www.PeterMossAmlAllResearch.com/media/images/repositories/ALL_IDB1_Augmented_Slides.png)

I decided to use some of the augmentation proposals outlined in Leukemia Blood Cell Image Classification Using Convolutional Neural Network by T. T. P. Thanh, Caleb Vununu, Sukhrob Atoev, Suk-Hwan Lee, and Ki-Ryong Kwon. The augmentations I chose to start with were grayscaling, histogram equalization, horizontal and vertical reflection, and Gaussian blur. Using these techniques I have so far been able to increase a dataset from 49 positive and 49 negative images to 683 positive and 683 negative, with more augmentations to experiment with. 

The full Python class that holds the functions mentioned below can be found in [Classes/Data.py](Classes/Data.py). The Data class is a wrapper class around related functions provided in popular computer vision libraries, including OpenCV and SciPy.

### Grayscaling

In general, grayscaled images are not as complex as color images and result in a less complex model. In the paper the authors described using grayscaling to create more data easily. To create a grayscale copy of each image I wrapped the built-in OpenCV function, [cv2.cvtColor()](https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_colorspaces/py_colorspaces.html). The created images will be saved to the relevant directories in the default configuration.

    def grayScale(self, image, grayPath, show = False):
        
        ###############################################################
        #
        # Writes a grayscale copy of the image to the filepath provided. 
        #
        ###############################################################
        
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        self.writeImage(grayPath, gray)
        self.filesMade += 1
        print("Grayscaled image written to: " + grayPath)
        
        if show is True:
            plt.imshow(gray)
            plt.show()
            
        return image, gray
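
To make the conversion concrete, the sketch below reproduces the channel weighting that OpenCV documents for `cv2.COLOR_BGR2GRAY` (Y = 0.299 R + 0.587 G + 0.114 B) using NumPy only. The function name `bgr_to_gray` and the sample pixels are assumptions made for illustration; they are not part of the Data class, and the result may differ from OpenCV's output by rounding.

```python
import numpy as np

def bgr_to_gray(image):
    """Approximate cv2.COLOR_BGR2GRAY with a weighted channel sum (illustrative)."""
    # OpenCV stores channels in B, G, R order, so the weights are listed that way.
    weights = np.array([0.114, 0.587, 0.299])
    gray = image.astype(np.float64) @ weights
    # Round and clip back to the 8-bit range used by the image files.
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)

# A 1x3 test image: pure blue, pure green and pure red pixels (BGR order).
sample = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255]]], dtype=np.uint8)
gray = bgr_to_gray(sample)
print(gray.shape)  # (1, 3): the channel axis is gone
```

The output has a single channel, which is why the grayscale copy can be passed straight to the histogram equalization step.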

### Histogram Equalization

Histogram equalization spreads the most frequent intensity values across the full range of the histogram, which increases the global contrast of the image. The paper describes using histogram equalization to enhance contrast. 

In the case of this dataset, it makes both the white and red blood cells more distinguishable. The created images will be saved to the relevant directories in the default configuration.

    def equalizeHist(self, gray, histPath, show = False):
        
        ###############################################################
        #
        # Writes histogram equalized copy of the image to the filepath 
        # provided. 
        #
        ###############################################################
        
        hist = cv2.equalizeHist(gray)
        self.writeImage(histPath, hist)
        self.filesMade += 1
        print("Histogram equalized image written to: " + histPath)
        
        if show is True:
            plt.imshow(hist)
            plt.show()
            
        return hist
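
The idea behind equalization can be sketched with NumPy alone. The code below is a simplified version of the standard cumulative-histogram method, written for illustration; it is not a claim about how `cv2.equalizeHist` is implemented internally, and the name `equalize_hist` and the sample values are assumptions.

```python
import numpy as np

def equalize_hist(gray):
    """Histogram-equalize an 8-bit grayscale image via its cumulative histogram."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]          # count at the darkest intensity present
    total = gray.size
    if total == cdf_min:               # constant image: nothing to stretch
        return gray.copy()
    # Map each intensity so the cumulative distribution becomes roughly linear.
    lut = np.rint((cdf - cdf_min) / (total - cdf_min) * 255)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[gray]

# Low-contrast example: values squeezed into the range 100..103.
low_contrast = np.array([[100, 101], [102, 103]], dtype=np.uint8)
equalized = equalize_hist(low_contrast)
print(equalized.min(), equalized.max())  # 0 255
```

Four nearly identical gray levels end up spread over the whole 0 to 255 range, which is the contrast stretch described above.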

### Reflection

Reflection increases the size of your dataset by creating one copy of each image that is flipped around its X axis and one copy that is flipped around its Y axis. The reflection function below uses the built-in OpenCV function, cv2.flip, with flip code 0 (around the X axis) and flip code 1 (around the Y axis). The created images will be saved to the relevant directories in the default configuration.

    def reflection(self, image, horPath, verPath, show = False):
        
        ###############################################################
        #
        # Writes horizontally and vertically reflected copies of the 
        # image to the filepaths provided. 
        #
        ###############################################################
        
        horImg = cv2.flip(image, 0)
        self.writeImage(horPath, horImg)
        self.filesMade += 1
        print("Horizontally reflected image written to: " + horPath)
        
        if show is True:
            plt.imshow(horImg)
            plt.show()
            
        verImg = cv2.flip(image, 1)
        self.writeImage(verPath, verImg)
        self.filesMade += 1
        print("Vertical reflected image written to: " + verPath)
        
        if show is True:
            plt.imshow(verImg)
            plt.show()
            
        return horImg, verImg
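
The two flip codes are easy to mix up, so here is a small NumPy sketch of what each one does to the pixel grid. It uses array slicing as a stand-in for `cv2.flip`; the array values are made up for illustration.

```python
import numpy as np

# A tiny 2x3 "image" whose values make the flips easy to follow.
image = np.array([[1, 2, 3],
                  [4, 5, 6]])

# Equivalent of cv2.flip(image, 0): reverse the row order (flip around the x-axis).
flipped_rows = image[::-1, :]

# Equivalent of cv2.flip(image, 1): reverse the column order (flip around the y-axis).
flipped_cols = image[:, ::-1]

print(flipped_rows.tolist())  # [[4, 5, 6], [1, 2, 3]]
print(flipped_cols.tolist())  # [[3, 2, 1], [6, 5, 4]]
```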

### Gaussian Blur

Gaussian blur smooths an image by convolving it with a Gaussian kernel and is a popular technique in the computer vision world. The function below uses SciPy's ndimage.gaussian_filter function with a sigma of 5.11. The created images will be saved to the relevant directories in the default configuration.

    def gaussian(self, filePath, gaussianPath, show = False):
        
        ###############################################################
        #
        # Writes gaussian blurred copy of the image to the filepath 
        # provided. 
        #
        ###############################################################
        
        gaussianBlur = ndimage.gaussian_filter(plt.imread(filePath), sigma=5.11)
        self.writeImage(gaussianPath, gaussianBlur)
        self.filesMade += 1
        print("Gaussian image written to: " + gaussianPath)

        if show is True:
            plt.imshow(gaussianBlur)
            plt.show()
            
        return gaussianBlur
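
One detail of `ndimage.gaussian_filter` is worth knowing when the input is a color image: a scalar `sigma` is applied along every axis of the array, including the channel axis. The sketch below, which assumes a 3-channel array like the one `plt.imread` typically returns for a color JPEG, compares a scalar sigma with a per-axis sigma that blurs only rows and columns. The synthetic image is an assumption for illustration, and this is an observation about the SciPy function rather than a statement about the results reported here.

```python
import numpy as np
from scipy import ndimage

# Synthetic 32x32 RGB image: random values in the first channel, zeros elsewhere.
rng = np.random.default_rng(0)
image = np.zeros((32, 32, 3), dtype=np.float64)
image[..., 0] = rng.random((32, 32))

# Scalar sigma: every axis is smoothed, including the channel axis.
blur_all_axes = ndimage.gaussian_filter(image, sigma=5.11)

# Per-axis sigma: smooth rows and columns only, leave channels untouched.
blur_spatial = ndimage.gaussian_filter(image, sigma=(5.11, 5.11, 0))

print(blur_all_axes[..., 1].max() > 0)   # True: values leaked into channel 1
print(blur_spatial[..., 1].max() == 0)   # True: channel 1 is still empty
```

If mixing the color channels is not wanted, passing a tuple for `sigma` is one way to keep the blur purely spatial.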

### Rotation

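Rotation creates additional copies of each image turned by a random angle. The function below opens the image with Image.open and, in each of 10 iterations, picks a random whole-number angle between -180 and 180 degrees, rotates the image with expand=True, resizes the result to the square size set in Settings->ImgDims and saves it. The created images will be saved to the relevant directories in the default configuration.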
        
    def rotation(self, path, filePath, filename, show = False): 
        
        ###############################################################
        #
        # Writes rotated copies of the image to the filepath 
        # provided. 
        #
        ###############################################################
        
        img = Image.open(filePath)

        for i in range(0, 10):
            randDeg = random.randint(-180, 180)
            fullPath = os.path.join(path, str(randDeg) + '-' + str(i) + '-' + filename)

            try:
                # save() returns None, so keep the image in a variable instead of chaining show()
                rotated = img.rotate(randDeg, expand=True).resize((self.confs["Settings"]["ImgDims"], self.confs["Settings"]["ImgDims"]))
                rotated.save(fullPath)
                self.filesMade += 1
                print("Rotated image written to: " + fullPath)
                if show is True:
                    rotated.show()
            except Exception:
                print("File was not written! "+filename)

            time.sleep(1)
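
Because the angles are drawn at random, each run produces different files. The sketch below isolates the angle and file name logic from the function above and adds an optional seed so a run can be repeated. The helper `rotation_targets` and its `seed` parameter are hypothetical additions for illustration and do not exist in the Data class.

```python
import os
import random

def rotation_targets(path, filename, copies=10, seed=None):
    """Return (angle, output path) pairs using the naming scheme shown above."""
    rng = random.Random(seed)  # a private generator; seed=None keeps it random
    targets = []
    for i in range(copies):
        angle = rng.randint(-180, 180)  # inclusive on both ends
        name = str(angle) + '-' + str(i) + '-' + filename
        targets.append((angle, os.path.join(path, name)))
    return targets

first = rotation_targets("Model/Augmented/1", "Im029_1.jpg", seed=42)
second = rotation_targets("Model/Augmented/1", "Im029_1.jpg", seed=42)
print(first == second)             # True: same seed, same angles
print(len({p for _, p in first}))  # 10: the index keeps every path unique
```

The same angle can be drawn more than once for one image, so it is the loop index in the file name that stops one rotated copy from overwriting another.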

# Augment your data
You can use the code below to augment your data. The full Python class that holds the core data functions can be found in [Classes/Data.py](Classes/Data.py). The AMLDnnData class below is a wrapper class around the Data class that allows you to run the code in this Notebook. Follow the steps below and execute the code blocks using __SHIFT + ENTER__.

## Sort your dataset
The ALL IDB_1 dataset is the one used in this tutorial. The dataset contains 49 negative and 59 positive images. To balance the classes I removed 10 images from the positive set. I then removed a further 10 images per class to use for testing later in the tutorial and for demos. In my case I ended up with 20 test images (10 positive / 10 negative) and 39 images per class ready for augmentation. Place the original images that you wish to augment into the __Model/Data/0__ and __Model/Data/1__ directories. Using this program I was able to create a dataset of __624__ positive and __624__ negative augmented images.
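
The figure of 624 images per class follows from the counts in this section. As a quick check, the snippet below multiplies the 39 originals per class by the number of files written for each one, assuming every write succeeds: six single copies (resized, grayscale, histogram equalized, two reflections, Gaussian blur) plus ten rotations, as the augmentation log shows.

```python
# Counts taken from the text above; the per-image breakdown is read off the log.
originals_per_class = 49 - 10      # 10 images per class held back for testing
single_copies = 6                  # resized, gray, hist, two reflections, blur
rotated_copies = 10                # one file per loop iteration in rotation()
files_per_image = single_copies + rotated_copies

augmented_per_class = originals_per_class * files_per_image
print(files_per_image, augmented_per_class)  # 16 624
```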


## Import required libraries
We are going to run the above code in the browser. To do this, you first need to run the following code. Execute the code block by placing your cursor inside it and pressing __SHIFT + ENTER__.

    import matplotlib.pyplot as plt

    %matplotlib inline
    plt.rcParams['figure.figsize'] = (5.0, 4.0)
    plt.rcParams['image.interpolation'] = 'nearest'
    plt.rcParams['image.cmap'] = 'gray'

    from Classes.Data import Data

## Initiate the data wrapper
The following Python class is a wrapper for the [Data (Classes/Data.py)](Classes/Data.py) class.

    class AMLDnnData():

        def __init__(self):

            ###############################################################
            #
            # Sets up all default requirements and placeholders 
            # needed for the Acute Myeloid Leukemia Classifier. 
            #
            ###############################################################

            self.Data = Data()

        def processDataset(self):

            ###############################################################
            #
            # Make sure you have equal amounts of positive and negative
            # samples in the Model/Data directories.
            # 
            # Only run this function once! It will continually make copies 
            # of all images in the Settings->TrainDir directory specified 
            # in Required/confs.json        
            #
            ###############################################################

            self.Data.processDataset() 

    AMLDnnData = AMLDnnData()

## Augment the dataset

Make sure you have equal amounts of positive and negative samples in the Model/Data directories. In my program I use the directories 0 and 1, but you can name them what you like as long as they are integers. 

__Only run this function once! It will continually make copies of all images in the Settings->TrainDir directory specified in Required/confs.json.__   
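
One way to protect against an accidental second run is to check whether the output directory already contains files before calling the function. The sketch below is a minimal example of such a check. The helper `already_augmented` is hypothetical and not part of the project, and the directory to pass in is an assumption that depends on your configuration (the log output in this tutorial shows files being written under Model/Augmented).

```python
import os
import tempfile

def already_augmented(augmented_dir):
    """Return True if any file exists below augmented_dir (hypothetical helper)."""
    for _root, _dirs, files in os.walk(augmented_dir):
        if files:
            return True
    return False

# Demonstration on a throwaway directory instead of the real Model/ tree.
with tempfile.TemporaryDirectory() as tmp:
    class_dir = os.path.join(tmp, "1")
    os.makedirs(class_dir)
    empty_before = already_augmented(tmp)      # no files yet
    open(os.path.join(class_dir, "Im029_1.jpg"), "w").close()
    filled_after = already_augmented(tmp)      # one file now exists

print(empty_before, filled_after)  # False True
```

A missing directory is treated the same as an empty one, so the check also works before the first run.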

Execute the following code block to process the dataset by running all of the functions shown above. You may see slides that are a strange color or not square; you can ignore this as the actual saved image is correct. 

    # Uncomment the code, only run once!
    AMLDnnData.processDataset() 

Resized image written to: Model/Augmented/1/Im029_1.jpg
Grayscaled image written to: Model/Augmented/1/Gray-Im029_1.jpg
Histogram equalized image written to: Model/Augmented/1/Hist-Im029_1.jpg
Horizontally reflected image written to: Model/Augmented/1/Hor-Im029_1.jpg
Vertical reflected image written to: Model/Augmented/1/Ver-Im029_1.jpg
Gaussian image written to: Model/Augmented/1/Gaus-Im029_1.jpg
Rotated image written to: Model/Augmented/1/-168-0-Im029_1.jpg
Rotated image written to: Model/Augmented/1/-174-1-Im029_1.jpg
Rotated image written to: Model/Augmented/1/-144-2-Im029_1.jpg
Rotated image written to: Model/Augmented/1/-99-3-Im029_1.jpg
Rotated image written to: Model/Augmented/1/-139-4-Im029_1.jpg
Rotated image written to: Model/Augmented/1/39-5-Im029_1.jpg
Rotated image written to: Model/Augmented/1/-7-6-Im029_1.jpg
Rotated image written to: Model/Augmented/1/178-7-Im029_1.jpg
Rotated image written to: Model/Augmented/1/112-8-Im029_1.jpg
Rotated image written to: Model/Augment

Rotated image written to: Model/Augmented/1/166-6-Im018_1.jpg
Rotated image written to: Model/Augmented/1/-104-7-Im018_1.jpg
Rotated image written to: Model/Augmented/1/-81-8-Im018_1.jpg
Rotated image written to: Model/Augmented/1/-155-9-Im018_1.jpg
Total augmented files created so far 128

Resized image written to: Model/Augmented/1/Im063_1.jpg
Grayscaled image written to: Model/Augmented/1/Gray-Im063_1.jpg
Histogram equalized image written to: Model/Augmented/1/Hist-Im063_1.jpg
Horizontally reflected image written to: Model/Augmented/1/Hor-Im063_1.jpg
Vertical reflected image written to: Model/Augmented/1/Ver-Im063_1.jpg
Gaussian image written to: Model/Augmented/1/Gaus-Im063_1.jpg
Rotated image written to: Model/Augmented/1/171-0-Im063_1.jpg
Rotated image written to: Model/Augmented/1/68-1-Im063_1.jpg
Rotated image written to: Model/Augmented/1/168-2-Im063_1.jpg
Rotated image written to: Model/Augmented/1/37-3-Im063_1.jpg
Rotated image written to: Model/Augmented/1/82-4-Im063_1.jpg


Rotated image written to: Model/Augmented/1/-144-2-Im060_1.jpg
Rotated image written to: Model/Augmented/1/-105-3-Im060_1.jpg
Rotated image written to: Model/Augmented/1/-159-4-Im060_1.jpg
Rotated image written to: Model/Augmented/1/10-5-Im060_1.jpg
Rotated image written to: Model/Augmented/1/36-6-Im060_1.jpg
Rotated image written to: Model/Augmented/1/88-7-Im060_1.jpg
Rotated image written to: Model/Augmented/1/-126-8-Im060_1.jpg
Rotated image written to: Model/Augmented/1/76-9-Im060_1.jpg
Total augmented files created so far 256

Resized image written to: Model/Augmented/1/Im020_1.jpg
Grayscaled image written to: Model/Augmented/1/Gray-Im020_1.jpg
Histogram equalized image written to: Model/Augmented/1/Hist-Im020_1.jpg
Horizontally reflected image written to: Model/Augmented/1/Hor-Im020_1.jpg
Vertical reflected image written to: Model/Augmented/1/Ver-Im020_1.jpg
Gaussian image written to: Model/Augmented/1/Gaus-Im020_1.jpg
Rotated image written to: Model/Augmented/1/102-0-Im020_1.jpg

Gaussian image written to: Model/Augmented/1/Gaus-Im059_1.jpg
Rotated image written to: Model/Augmented/1/54-0-Im059_1.jpg
Rotated image written to: Model/Augmented/1/29-1-Im059_1.jpg
Rotated image written to: Model/Augmented/1/-98-2-Im059_1.jpg
Rotated image written to: Model/Augmented/1/135-3-Im059_1.jpg
Rotated image written to: Model/Augmented/1/-106-4-Im059_1.jpg
Rotated image written to: Model/Augmented/1/-8-5-Im059_1.jpg
Rotated image written to: Model/Augmented/1/131-6-Im059_1.jpg
Rotated image written to: Model/Augmented/1/-55-7-Im059_1.jpg
Rotated image written to: Model/Augmented/1/-85-8-Im059_1.jpg
Rotated image written to: Model/Augmented/1/-47-9-Im059_1.jpg
Total augmented files created so far 384

Resized image written to: Model/Augmented/1/Im016_1.jpg
Grayscaled image written to: Model/Augmented/1/Gray-Im016_1.jpg
Histogram equalized image written to: Model/Augmented/1/Hist-Im016_1.jpg
Horizontally reflected image written to: Model/Augmented/1/Hor-Im016_1.jpg
Vertical r

Gaussian image written to: Model/Augmented/1/Gaus-Im006_1.jpg
Rotated image written to: Model/Augmented/1/-143-0-Im006_1.jpg
Rotated image written to: Model/Augmented/1/28-1-Im006_1.jpg
Rotated image written to: Model/Augmented/1/-85-2-Im006_1.jpg
Rotated image written to: Model/Augmented/1/-75-3-Im006_1.jpg
Rotated image written to: Model/Augmented/1/18-4-Im006_1.jpg
Rotated image written to: Model/Augmented/1/24-5-Im006_1.jpg
Rotated image written to: Model/Augmented/1/152-6-Im006_1.jpg
Rotated image written to: Model/Augmented/1/-56-7-Im006_1.jpg
Rotated image written to: Model/Augmented/1/-125-8-Im006_1.jpg
Rotated image written to: Model/Augmented/1/107-9-Im006_1.jpg
Total augmented files created so far 512

Resized image written to: Model/Augmented/1/Im027_1.jpg
Grayscaled image written to: Model/Augmented/1/Gray-Im027_1.jpg
Histogram equalized image written to: Model/Augmented/1/Hist-Im027_1.jpg
Horizontally reflected image written to: Model/Augmented/1/Hor-Im027_1.jpg
Vertical 

Gaussian image written to: Model/Augmented/0/Gaus-Im090_0.jpg
Rotated image written to: Model/Augmented/0/-19-0-Im090_0.jpg
Rotated image written to: Model/Augmented/0/111-1-Im090_0.jpg
Rotated image written to: Model/Augmented/0/152-2-Im090_0.jpg
Rotated image written to: Model/Augmented/0/141-3-Im090_0.jpg
Rotated image written to: Model/Augmented/0/46-4-Im090_0.jpg
Rotated image written to: Model/Augmented/0/25-5-Im090_0.jpg
Rotated image written to: Model/Augmented/0/-19-6-Im090_0.jpg
Rotated image written to: Model/Augmented/0/92-7-Im090_0.jpg
Rotated image written to: Model/Augmented/0/3-8-Im090_0.jpg
Rotated image written to: Model/Augmented/0/111-9-Im090_0.jpg
Total augmented files created so far 16

Resized image written to: Model/Augmented/0/Im082_0.jpg
Grayscaled image written to: Model/Augmented/0/Gray-Im082_0.jpg
Histogram equalized image written to: Model/Augmented/0/Hist-Im082_0.jpg
Horizontally reflected image written to: Model/Augmented/0/Hor-Im082_0.jpg
Vertical refle

Vertical reflected image written to: Model/Augmented/0/Ver-Im037_0.jpg
Gaussian image written to: Model/Augmented/0/Gaus-Im037_0.jpg
Rotated image written to: Model/Augmented/0/-63-0-Im037_0.jpg
Rotated image written to: Model/Augmented/0/38-1-Im037_0.jpg
Rotated image written to: Model/Augmented/0/-18-2-Im037_0.jpg
Rotated image written to: Model/Augmented/0/107-3-Im037_0.jpg
Rotated image written to: Model/Augmented/0/-61-4-Im037_0.jpg
Rotated image written to: Model/Augmented/0/176-5-Im037_0.jpg
Rotated image written to: Model/Augmented/0/65-6-Im037_0.jpg
Rotated image written to: Model/Augmented/0/133-7-Im037_0.jpg
Rotated image written to: Model/Augmented/0/-128-8-Im037_0.jpg
Rotated image written to: Model/Augmented/0/-18-9-Im037_0.jpg
Total augmented files created so far 144

Resized image written to: Model/Augmented/0/Im071_0.jpg
Grayscaled image written to: Model/Augmented/0/Gray-Im071_0.jpg
Histogram equalized image written to: Model/Augmented/0/Hist-Im071_0.jpg
Horizontally 

Gaussian image written to: Model/Augmented/0/Gaus-Im074_0.jpg
Rotated image written to: Model/Augmented/0/-149-0-Im074_0.jpg
Rotated image written to: Model/Augmented/0/-124-1-Im074_0.jpg
Rotated image written to: Model/Augmented/0/-140-2-Im074_0.jpg
Rotated image written to: Model/Augmented/0/-53-3-Im074_0.jpg
Rotated image written to: Model/Augmented/0/108-4-Im074_0.jpg
Rotated image written to: Model/Augmented/0/179-5-Im074_0.jpg
Rotated image written to: Model/Augmented/0/-57-6-Im074_0.jpg
Rotated image written to: Model/Augmented/0/-122-7-Im074_0.jpg
Rotated image written to: Model/Augmented/0/-72-8-Im074_0.jpg
Rotated image written to: Model/Augmented/0/101-9-Im074_0.jpg
Total augmented files created so far 272

Resized image written to: Model/Augmented/0/Im038_0.jpg
Grayscaled image written to: Model/Augmented/0/Gray-Im038_0.jpg
Histogram equalized image written to: Model/Augmented/0/Hist-Im038_0.jpg
Horizontally reflected image written to: Model/Augmented/0/Hor-Im038_0.jpg
Vert

Gaussian image written to: Model/Augmented/0/Gaus-Im046_0.jpg
Rotated image written to: Model/Augmented/0/47-0-Im046_0.jpg
Rotated image written to: Model/Augmented/0/-26-1-Im046_0.jpg
Rotated image written to: Model/Augmented/0/161-2-Im046_0.jpg
Rotated image written to: Model/Augmented/0/56-3-Im046_0.jpg
Rotated image written to: Model/Augmented/0/1-4-Im046_0.jpg
Rotated image written to: Model/Augmented/0/12-5-Im046_0.jpg
Rotated image written to: Model/Augmented/0/-76-6-Im046_0.jpg
Rotated image written to: Model/Augmented/0/160-7-Im046_0.jpg
Rotated image written to: Model/Augmented/0/-63-8-Im046_0.jpg
Rotated image written to: Model/Augmented/0/-127-9-Im046_0.jpg
Total augmented files created so far 400

Resized image written to: Model/Augmented/0/Im041_0.jpg
Grayscaled image written to: Model/Augmented/0/Gray-Im041_0.jpg
Histogram equalized image written to: Model/Augmented/0/Hist-Im041_0.jpg
Horizontally reflected image written to: Model/Augmented/0/Hor-Im041_0.jpg
Vertical ref

Gaussian image written to: Model/Augmented/0/Gaus-Im043_0.jpg
Rotated image written to: Model/Augmented/0/-41-0-Im043_0.jpg
Rotated image written to: Model/Augmented/0/156-1-Im043_0.jpg
Rotated image written to: Model/Augmented/0/-11-2-Im043_0.jpg
Rotated image written to: Model/Augmented/0/-126-3-Im043_0.jpg
Rotated image written to: Model/Augmented/0/85-4-Im043_0.jpg
Rotated image written to: Model/Augmented/0/105-5-Im043_0.jpg
Rotated image written to: Model/Augmented/0/-1-6-Im043_0.jpg
Rotated image written to: Model/Augmented/0/143-7-Im043_0.jpg
Rotated image written to: Model/Augmented/0/32-8-Im043_0.jpg
Rotated image written to: Model/Augmented/0/165-9-Im043_0.jpg
Total augmented files created so far 528

Resized image written to: Model/Augmented/0/Im078_0.jpg
Grayscaled image written to: Model/Augmented/0/Gray-Im078_0.jpg
Histogram equalized image written to: Model/Augmented/0/Hist-Im078_0.jpg
Horizontally reflected image written to: Model/Augmented/0/Hor-Im078_0.jpg
Vertical r

# Your augmented dataset
If you head to your __Model/Data/__ directory you will notice the augmented directory. Inside the augmented directory you will find 0 (negative) and 1 (positive) directories containing resized copies of the originals along with grayscaled, histogram equalized, reflected, Gaussian blurred and rotated copies.

Using data augmentation I was able to increase the dataset from 49 images per class to 580 per class. This dataset will be used in the [AML/ALL Movidius NCS Classifier](https://github.com/AMLResearchProject/AML-Detection-System/tree/master/Classifiers/Movidius/NCS).
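
Before training, it can be useful to confirm that both classes really contain the same number of files. The sketch below counts the files in each class sub-directory. The helper name `count_per_class` is an assumption, and the demonstration runs on a temporary directory rather than the real augmented data.

```python
import os
import tempfile

def count_per_class(augmented_dir):
    """Map each class sub-directory name to the number of files it contains."""
    counts = {}
    for name in sorted(os.listdir(augmented_dir)):
        class_dir = os.path.join(augmented_dir, name)
        if os.path.isdir(class_dir):
            counts[name] = sum(
                os.path.isfile(os.path.join(class_dir, f))
                for f in os.listdir(class_dir)
            )
    return counts

# Demonstration on a temporary tree with two files per class.
with tempfile.TemporaryDirectory() as tmp:
    for label in ("0", "1"):
        os.makedirs(os.path.join(tmp, label))
        for stem in ("Im001", "Gray-Im001"):
            open(os.path.join(tmp, label, stem + ".jpg"), "w").close()
    counts = count_per_class(tmp)

print(counts)  # {'0': 2, '1': 2}
```

Pointing the function at your own augmented directory gives the per-class totals to compare against the figures quoted in this tutorial.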

# About the author

[Adam Milton-Barker](https://www.petermossamlallresearch.com/team/adam-milton-barker/profile "Adam Milton-Barker") is a [Bigfinite](https://www.bigfinite.com "Bigfinite") IoT Network Engineer, part of the team that works on the core IoT software. In his spare time he is an [Intel Software Innovator](https://software.intel.com/en-us/intel-software-innovators/overview "Intel Software Innovator") in the fields of Internet of Things, Artificial Intelligence and Virtual Reality.