# Blood Cell Classification
#### Naren Rachapalli
Using AI, we can train a program to perform a task without explicitly programming it; instead, the program learns the task from labeled examples. The program below uses a convolutional neural network (CNN) to learn to classify which type of white blood cell appears in an image. A CNN imitates the structure of the neurons that process images in the brain. It uses techniques that reduce the neuron count while maintaining positional relationships in the data by processing the data through multiple layers.

The four types of white blood cell that the program learns to classify are Eosinophil, Lymphocyte, Monocyte, and Neutrophil. Eosinophils make up 2 to 4 percent of white blood cells (WBCs) and excrete toxic compounds to combat parasites. Lymphocytes make up 20 to 30 percent of WBCs and migrate in and out of the blood. Monocytes make up 2 to 8 percent of WBCs and enter peripheral tissues to become tissue macrophages, which can engulf large particles and pathogens. Neutrophils make up 50 to 70 percent of WBCs; their cytoplasm is packed with pale granules containing lysosomal enzymes and bacteria-killing compounds. In the future, a program like this could help speed up mandatory blood tests.
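To make the idea of "reducing neuron count while keeping positional relationships" concrete, here is a minimal NumPy sketch of the two building blocks a convolutional network stacks: a small filter slid across the image, followed by 2x2 max pooling. The toy image and the edge filter are illustrative assumptions; a trained network learns its own filters, and this is not the code MobileNet runs internally.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products.

    This is cross-correlation, which is what deep learning libraries
    usually call "convolution".
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

def max_pool2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 block."""
    h, w = feature_map.shape
    h, w = h - h % 2, w - w % 2  # drop an odd trailing row or column
    blocks = feature_map[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

# Toy 6x6 "image": dark on the left, bright on the right.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hypothetical 3x3 vertical-edge filter, chosen by hand for illustration.
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)

features = conv2d_valid(image, kernel)  # 4x4 map, strongest where the edge is
pooled = max_pool2x2(features)          # 2x2 map: fewer values, same layout
print(features.shape, pooled.shape)
```

The same nine filter weights are reused at every position, so the layer needs far fewer parameters than a fully connected one, and pooling shrinks the map while the edge response stays in the same relative location.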

In [1]:
%matplotlib inline


In [2]:
import numpy as np 
from keras.models import Sequential, load_model
from keras.layers import Dense, Dropout, Activation, Flatten, Input
from keras.layers import Conv2D, GlobalAveragePooling2D, LeakyReLU
from keras.utils import np_utils
from keras.optimizers import SGD
from keras.applications import MobileNet
from string import ascii_uppercase
import matplotlib.pyplot as plt
from pandas_ml import ConfusionMatrix
from sklearn.model_selection import train_test_split
from sklearn.utils import shuffle
from PIL import Image
from glob import glob
import cv2


# Data Preparation

*Loading and formatting images for training and testing the convolutional neural network*


In [None]:
# Collect the class names from the subfolder names under data/train/
classes = []
for x in glob("data/train/*"):
    classes.append(x[11:])
num_classes = len(classes)
print(classes)

In [None]:
def read_img(path):
    """Read an image, convert OpenCV's BGR order to RGB, and resize to 224x224."""
    img = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2RGB)
    return cv2.resize(img, (224, 224))


def load_split(split):
    """Load every image under data/<split>/<class>/, labelled by its index in `classes`."""
    images, labels = [], []
    for label, name in enumerate(classes):
        for img in glob("data/%s/%s/*" % (split, name)):
            images.append(read_img(img))
            labels.append(label)
    return np.asarray(images), np.asarray(labels)


# Indexing both splits by `classes` keeps train and test labels consistent,
# which separate glob() calls over the two folders do not guarantee.
x_train, y_train = load_split("train")
x_test, y_test = load_split("test")
x_train, y_train = shuffle(x_train, y_train)

# Converts labels for train and test set to one hot encodings
y_train = np_utils.to_categorical(y_train, num_classes)
y_test = np_utils.to_categorical(y_test, num_classes)
x_train.shape, y_train.shape, x_test.shape, y_test.shape
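As a rough illustration of what the one-hot conversion above produces, the following NumPy sketch builds the same kind of encoding by hand. The helper name `one_hot` and the example labels are assumptions made for this sketch; the notebook itself relies on Keras's `to_categorical`.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) array with a single 1.0 per row."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

# Hypothetical integer labels for four images, one per class.
example_labels = [0, 2, 1, 3]
encoded = one_hot(example_labels, num_classes=4)
print(encoded)

# argmax along the class axis recovers the original integer labels,
# which is how the later cells map predictions back to class names.
recovered = encoded.argmax(axis=1)
print(recovered)
```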

# A Collection of White Blood Cell Images

In [None]:
# Show the first 20 training images in a 5x4 grid
fig = plt.figure(figsize=(8, 8))
columns = 4
rows = 5
for i in range(columns * rows):
    fig.add_subplot(rows, columns, i + 1)
    plt.imshow(x_train[i])

In [None]:
# Checkpoint file name; set load_checkpoint to True to resume from a saved model
model_name = "Blood CNN.h5"
load_checkpoint = False

In [None]:
#Load existing model
if load_checkpoint:
    model = load_model(model_name)

#Create new model
else:
    model_base = MobileNet(include_top=False,input_shape=x_train.shape[1:])
    model = Sequential()
    model.add(model_base)
    model.add(GlobalAveragePooling2D())
    model.add(Dropout(0.5))
    model.add(Dense(num_classes,activation='softmax'))
model.summary()
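The classification head added on top of MobileNet is small enough to sketch in NumPy. The example below is only an illustration of what global average pooling followed by a softmax dense layer computes; the feature-map size, the random weights, and the function names are assumptions for the sketch, and dropout is omitted because it is inactive at prediction time.

```python
import numpy as np

def global_average_pool(feature_maps):
    """Average each channel over height and width: (N, H, W, C) -> (N, C)."""
    return feature_maps.mean(axis=(1, 2))

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)

# Toy stand-in for the base network's output: 2 images, 7x7 grid, 8 channels.
feature_maps = rng.normal(size=(2, 7, 7, 8))

# Hypothetical dense-layer parameters mapping 8 channels to 4 classes.
weights = rng.normal(size=(8, 4))
bias = np.zeros(4)

pooled = global_average_pool(feature_maps)  # shape (2, 8)
probs = softmax(pooled @ weights + bias)    # shape (2, 4), each row sums to 1
print(pooled.shape, probs.shape)
```

Averaging over the spatial grid leaves one number per channel, so the final dense layer needs only `channels x num_classes` weights instead of one weight per pixel position.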

In [None]:
# Compile with plain SGD and categorical cross-entropy, which matches the one-hot labels
opt = SGD(lr=0.01)
model.compile(loss='categorical_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])

# Training the Program

In [None]:
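# Train one epoch at a time and save a checkpoint after each epoch.
# This loop never exits on its own: interrupt the kernel to stop training
# before running the evaluation cells below.
while True:
    model.fit(x_train, y_train,
              batch_size=16, epochs=1, verbose=1)
    model.save(model_name)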

# Evaluation and Testing

# Loss and Accuracy

In [None]:
score = model.evaluate(x_test, y_test, verbose=0)
"Loss: %s, Accuracy: %s" % (score[0], score[1])

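For readers who want to see what the two reported numbers mean, here is a small NumPy sketch of categorical cross-entropy and accuracy computed from one-hot labels and predicted probabilities. The three example predictions are made up for illustration, and the clipping constant `eps` is an assumption of this sketch rather than a value taken from the notebook.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean negative log-probability assigned to the correct class."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

def accuracy(y_true, y_pred):
    """Fraction of rows where the most probable class is the correct one."""
    return float(np.mean(y_true.argmax(axis=1) == y_pred.argmax(axis=1)))

# Hypothetical one-hot labels and predicted probabilities for three images.
y_true = np.array([[1, 0, 0, 0],
                   [0, 1, 0, 0],
                   [0, 0, 1, 0]], dtype=float)
y_pred = np.array([[0.7, 0.1, 0.1, 0.1],   # correct and confident
                   [0.2, 0.5, 0.2, 0.1],   # correct but less confident
                   [0.4, 0.3, 0.2, 0.1]])  # wrong: class 0 is ranked highest

loss = categorical_crossentropy(y_true, y_pred)
acc = accuracy(y_true, y_pred)
print("Loss: %s, Accuracy: %s" % (loss, acc))
```

Loss penalizes low confidence in the right answer even when the prediction is correct, so it can keep improving after accuracy has levelled off.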
# Proof of Concept

In [None]:
i = 5
plt.imshow(x_test[i])
prediction = model.predict(np.expand_dims(x_test[i], axis=0))
"Expected: %s, Predicted: %s" % (classes[y_test[i].argmax()], classes[prediction.argmax()])

# The Confusion Matrix
*Where is the program making the most mistakes in its classification?*

In [None]:
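# Convert one-hot rows back to class names, then plot actual vs. predicted
actual = [classes[one_hot.argmax()] for one_hot in y_test]
predicted = [classes[pred.argmax()] for pred in model.predict(x_test)]
ConfusionMatrix(actual, predicted).plot()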
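If `pandas_ml` is not available, the counts behind the plot can be computed directly. The following is a minimal NumPy sketch, not the library's implementation; the function name and the example label indices are assumptions for illustration. Rows are the actual class and columns are the predicted class.

```python
import numpy as np

def confusion_matrix(true_idx, pred_idx, num_classes):
    """Count how often each actual class (row) was predicted as each class (column)."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(matrix, (np.asarray(true_idx), np.asarray(pred_idx)), 1)
    return matrix

class_names = ["Eosinophil", "Lymphocyte", "Monocyte", "Neutrophil"]

# Hypothetical class indices for six test images.
true_idx = [0, 0, 1, 2, 3, 3]
pred_idx = [0, 1, 1, 2, 3, 2]

matrix = confusion_matrix(true_idx, pred_idx, num_classes=len(class_names))
print(matrix)

# Off-diagonal cells are the mistakes; list each one by name.
for actual, predicted in zip(*np.nonzero(matrix - np.diag(np.diag(matrix)))):
    print("%s predicted as %s: %d" % (
        class_names[actual], class_names[predicted], matrix[actual, predicted]))
```

With the notebook's own arrays, `true_idx` would be `y_test.argmax(axis=1)` and `pred_idx` would be `model.predict(x_test).argmax(axis=1)`.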