<h1 style="text-align:center;color:red"> Malaria Disease Detection </h1>

# Malaria:
+ Malaria is a life-threatening disease caused by parasites that are transmitted to people through the bites of infected female Anopheles mosquitoes. It is preventable and curable.
+ In 2018, there were an estimated 228 million cases of malaria worldwide.
+ The estimated number of malaria deaths stood at 405 000 in 2018.
+ Children aged under 5 years are the most vulnerable group affected by malaria; in 2018, they accounted for 67% (272 000) of all malaria deaths worldwide.
+ The WHO African Region carries a disproportionately high share of the global malaria burden. In 2018, the region was home to 93% of malaria cases and 94% of malaria deaths.
+ Total funding for malaria control and elimination reached an estimated USD 2.7 billion in 2018. Contributions from governments of endemic countries amounted to USD 900 million, representing 30% of total funding.

> Credit: https://www.who.int/news-room/fact-sheets/detail/malaria

# Problem Statement
Given an image of a person's blood cells, we have to classify whether the person is infected with malaria or not. We have a dataset containing thousands of blood cell images of both classes (malaria infected and uninfected).

I will be using a pretrained model (VGG19) that was trained on the ImageNet dataset. <br>
<img src="vgg19_image.png" alt="VGG19 Architecture">
<br>
VGG19 was trained on ImageNet, which is a 1000-class classification problem. I have used transfer learning to adapt it to binary classification.


# Dataset
The dataset can be downloaded from Kaggle: https://www.kaggle.com/iarunava/cell-images-for-detecting-malaria <br>
The dataset has 2 classes:
> 1. Parasitized (malaria infected)
> 2. Uninfected (malaria uninfected)

Each class has 13,780 different images. <br>
I have used just 1,000 images in the training set (500 images of each class) and 400 images in the test set.
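
A small subset like the one described above could be built with a script along these lines. This is only a sketch: the helper name `make_subset`, the source folder `cell_images_full`, the `.png` extension, the fixed seed, and an even 200-per-class test split are all assumptions, not details taken from this notebook.

```python
import random
import shutil
from pathlib import Path


def make_subset(src_dir, dst_dir, n_per_class, seed=0, exclude=()):
    """Copy n_per_class randomly chosen .png images from each class folder.

    Hypothetical helper: src_dir must contain one sub-folder per class.
    Returns the set of copied source paths so a later call can exclude them.
    """
    rng = random.Random(seed)
    copied = set()
    for class_dir in sorted(p for p in Path(src_dir).iterdir() if p.is_dir()):
        # Skip images that were already used for another split.
        images = sorted(p for p in class_dir.glob("*.png") if p not in exclude)
        chosen = rng.sample(images, n_per_class)
        target = Path(dst_dir) / class_dir.name
        target.mkdir(parents=True, exist_ok=True)
        for image in chosen:
            shutil.copy(image, target / image.name)
        copied.update(chosen)
    return copied


# Example usage (paths are assumptions):
# train_files = make_subset("cell_images_full", "cell_images/Train", 500)
# make_subset("cell_images_full", "cell_images/Test", 200, exclude=train_files)
```

Passing the training files as `exclude` keeps the two splits disjoint, so no test image is seen during training.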


In [1]:
# importing libraries
from keras.layers import Input, Dense, Flatten
from keras.applications.vgg19 import VGG19
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Model
from glob import glob
import matplotlib.pyplot as plt

Using TensorFlow backend.


In [2]:
# we have 2 folders in cell_images
train_path = 'cell_images/Train'
test_path = 'cell_images/Test'
# VGG19 takes 224x224x3 image as input
IMAGE_SIZE = [224,224]

In [3]:
# Initialize the VGG19 model with ImageNet weights, without the fully connected top layers
vgg = VGG19(input_shape=IMAGE_SIZE+[3],weights='imagenet',include_top=False)

In [4]:
# don't train existing weights of VGG19 model
for layer in vgg.layers:
    layer.trainable = False

In [5]:
number_of_classes_in_output = 2

In [6]:
# our layers - you can add more if you want, for example an extra hidden layer after Flatten:
# x = Dense(1000, activation='relu')(x)
x = Flatten()(vgg.output)
prediction = Dense(number_of_classes_in_output, activation='softmax')(x)

In [7]:
# create a model object
model = Model(inputs=vgg.input, outputs = prediction)

In [8]:
model.summary()

Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         (None, 224, 224, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0   

We have 20,024,384 non-trainable parameters because the weights of the pretrained VGG19 model are frozen.
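
The parameter counts can be checked by hand. The sketch below recomputes them in plain Python, assuming the standard VGG19 layout of 16 convolutional layers with 3x3 kernels; the function names are only illustrative.

```python
# Output channels of the 16 convolutional layers in VGG19, grouped by block.
VGG19_BLOCKS = [
    [64, 64],
    [128, 128],
    [256, 256, 256, 256],
    [512, 512, 512, 512],
    [512, 512, 512, 512],
]


def conv_params(in_channels, out_channels, kernel=3):
    """Weights plus biases of one Conv2D layer with a kernel x kernel filter."""
    return kernel * kernel * in_channels * out_channels + out_channels


def vgg19_base_params(input_channels=3):
    """Total parameters of the convolutional base (pooling layers have none)."""
    total, in_channels = 0, input_channels
    for block in VGG19_BLOCKS:
        for out_channels in block:
            total += conv_params(in_channels, out_channels)
            in_channels = out_channels
    return total


frozen = vgg19_base_params()

# Five 2x2 max-pooling layers shrink 224x224 to 7x7, with 512 channels.
flattened = (224 // 2 ** 5) ** 2 * 512
# Dense head: one weight per (input, class) pair plus one bias per class.
trainable = flattened * 2 + 2

print(frozen)     # 20024384
print(flattened)  # 25088
print(trainable)  # 50178
```

So only the 50,178 parameters of the final Dense layer are updated during training, while the convolutional base stays fixed.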

In [9]:
model.compile(
    loss='categorical_crossentropy',
    optimizer='adam',
    metrics=['accuracy'])
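
To show what the softmax output and the `categorical_crossentropy` loss compute for a single image, here is a minimal plain-Python sketch. It is not the Keras implementation; the input scores and the `eps` guard are illustrative assumptions.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(one_hot, probs, eps=1e-7):
    """Negative log-probability assigned to the true class."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))


probs = softmax([2.0, 0.0])  # hypothetical scores for one image
loss_if_class_0 = categorical_crossentropy([1, 0], probs)
loss_if_class_1 = categorical_crossentropy([0, 1], probs)
```

The loss is small when the model puts most of its probability on the true class (`loss_if_class_0`) and large when it does not (`loss_if_class_1`).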

# Data augmentation
We can create artificial training data from the given images (for example by shearing, zooming, and flipping them), so that the model becomes more invariant to such changes.

In [10]:
# Generating training data: rescale pixels to [0, 1] and apply random augmentations
train_datagen = ImageDataGenerator(rescale=1./255,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True)

# Test data is only rescaled, never augmented
test_datagen = ImageDataGenerator(rescale=1./255)

training_set = train_datagen.flow_from_directory(train_path,
                                                 target_size=tuple(IMAGE_SIZE),
                                                 batch_size=64,
                                                 class_mode='categorical')

test_set = test_datagen.flow_from_directory(test_path,
                                            target_size=tuple(IMAGE_SIZE),
                                            batch_size=64,
                                            class_mode='categorical')

Found 1000 images belonging to 2 classes.
Found 400 images belonging to 2 classes.
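
To make two of the augmentation settings above concrete, here is a toy sketch of rescaling and random horizontal flipping on a tiny nested-list "image". It is not how `ImageDataGenerator` is implemented, it leaves out shear and zoom, and the 2x2 image is made up for illustration.

```python
import random


def rescale(image, factor=1.0 / 255):
    """Scale every pixel value, like rescale=1./255."""
    return [[pixel * factor for pixel in row] for row in image]


def horizontal_flip(image):
    """Mirror the image left to right by reversing each row."""
    return [row[::-1] for row in image]


def augment(image, rng):
    """Flip with probability 0.5 and rescale the pixel values."""
    out = rescale(image)
    if rng.random() < 0.5:
        out = horizontal_flip(out)
    return out


tiny = [[0, 255],
        [51, 102]]               # hypothetical 2x2 grayscale "image"
flipped = horizontal_flip(tiny)  # [[255, 0], [102, 51]]
scaled = rescale(tiny)
sample = augment(tiny, random.Random(0))
```

Because the flip is random, the same source image can yield different training samples in different epochs, which is what lets a small dataset go further.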


In [None]:
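# fit the model (in newer TensorFlow/Keras versions fit_generator is deprecated
# and model.fit accepts the generator directly)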
r = model.fit_generator(training_set,
                       validation_data=test_set,
                       epochs=5,
                       steps_per_epoch=len(training_set),
                       validation_steps=len(test_set))


Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
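
The values passed as `steps_per_epoch` and `validation_steps` follow from the image counts and the batch size of 64. A quick sketch of the arithmetic, where `steps_for` is just an illustrative helper:

```python
import math


def steps_for(n_images, batch_size):
    """Number of batches needed to cover every image once."""
    return math.ceil(n_images / batch_size)


train_steps = steps_for(1000, 64)  # 16 batches; the last one holds 40 images
val_steps = steps_for(400, 64)     # 7 batches; the last one holds 16 images
```

With these values, each epoch passes over all 1,000 training images once and validates on all 400 test images.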

In [None]:
# loss
plt.plot(r.history['loss'], label='train loss')
plt.plot(r.history['val_loss'], label='val loss')
plt.legend()
plt.show()

In [None]:
r.history.keys()

In [None]:

# accuracies
plt.plot(r.history['accuracy'], label='train acc')
plt.plot(r.history['val_accuracy'], label='val acc')
plt.legend()
plt.show()

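In [None]:
from keras.models import load_model

# save the trained model (architecture and weights) to a single HDF5 file;
# it can be restored later with load_model('model_vgg19.h5')
model.save('model_vgg19.h5')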
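
Once the model is saved, it can be reloaded with `load_model` and its two softmax outputs turned back into a class name. The sketch below covers only that last step in plain Python. The helper `decode_prediction`, the example probabilities, and the folder-to-index mapping are assumptions; in the notebook, the real mapping is available as `training_set.class_indices`.

```python
def decode_prediction(probs, class_indices):
    """Return (label, probability) for the most likely class.

    class_indices maps folder name -> output index, in the same form as the
    class_indices attribute of a flow_from_directory iterator.
    """
    index_to_label = {index: label for label, index in class_indices.items()}
    best = max(range(len(probs)), key=lambda i: probs[i])
    return index_to_label[best], probs[best]


# Assumed mapping: class folders are indexed in alphanumeric order.
class_indices = {"Parasitized": 0, "Uninfected": 1}
label, confidence = decode_prediction([0.08, 0.92], class_indices)  # made-up scores
print(label, confidence)  # Uninfected 0.92
```

Any image passed to the model for prediction should be resized to 224x224 and rescaled by 1/255, the same preprocessing used for the training and test generators.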