# MALARIA DETECTION

### Let's take a look at how we will build our model

#### Step 1:  Loading and Splitting of the Dataset

- The first step is to load the images and label them: 0 for Parasitized and 1 for Uninfected.
- Each image is resized to 50 x 50.
- The images and labels are converted to single NumPy arrays and shuffled before the train-test split.
- The data is split into a training set and a validation set.
- X_train and X_valid are converted to float32.
- Finally, y is one-hot encoded.
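
To make the shuffle-then-split idea concrete, here is a minimal NumPy-only sketch on synthetic arrays. The tiny "images", the fixed seed, and the manual 80/20 slice are assumptions for illustration; the notebook itself relies on `train_test_split` for the split.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the example is reproducible

# Synthetic stand-ins (assumption): 10 "images" whose pixels all equal their index
n_samples = 10
data = np.stack([np.full((50, 50, 3), i, dtype=np.uint8) for i in range(n_samples)])
labels = np.arange(n_samples) % 2  # 0 = Parasitized, 1 = Uninfected

# Shuffle images and labels with the same permutation so pairs stay aligned
order = rng.permutation(n_samples)
data, labels = data[order], labels[order]

# Hold out the last 20% of the shuffled data for validation
n_valid = int(round(0.2 * n_samples))
X_train, X_valid = data[:-n_valid], data[-n_valid:]
y_train, y_valid = labels[:-n_valid], labels[-n_valid:]

print(X_train.shape, X_valid.shape)
```

Indexing both arrays with the same `order` is what keeps every image attached to its own label after shuffling.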


#### Step 2:  Building the CNN model
- A CNN is an efficient neural network architecture for image classification. We will use tf.keras to build the CNN model.
- We will build a Sequential CNN model.
- Each convolutional block is a convolution layer followed by a MaxPooling layer, then BatchNormalization to normalize the previous layer's output, and Dropout for regularization. After the convolutional blocks we Flatten the outputs and pass them through a fully connected layer. The last layer, which uses the softmax function, is the output layer.
- Finally we compile the CNN model, using the Adam optimizer, categorical_crossentropy as the loss function, and accuracy as the evaluation metric.
- The next step is to call the fit function to train the CNN on X_train and y_train. We train for 15 epochs, that is, 15 full passes over the training data, with a batch size of 120.

####  Step 3 : Predictions and Testing of the Model
- After training, we will make predictions with the trained model and evaluate them.
- The last step is to test our model on the holdout dataset and make predictions.

### Importing Libraries

In [1]:
# importing the libraries for loading data and visualisation
import os
import cv2
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline 
from PIL import Image
import seaborn as sns

# import for train-test-split
from sklearn.model_selection import train_test_split

# import for One Hot Encoding
from tensorflow.keras.utils import to_categorical

# importing libraries for Model
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras.layers import Dense, Flatten, Dropout, BatchNormalization

# importing libraries for evaluating the model
from sklearn.metrics import accuracy_score
from sklearn.metrics import confusion_matrix

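ModuleNotFoundError: No module named 'cv2'

- The cv2 module is provided by the opencv-python package. Install it (for example with `pip install opencv-python`) and rerun the cell above.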

## Loading Data and Train-Test-Split 

In [None]:
# loading the images and setting their labels (0 = Parasitized, 1 = Uninfected)
data = []
labels = []

base_dir = "../input/cell-images-for-detecting-malaria/cell_images/"

for label, folder in enumerate(["Parasitized", "Uninfected"]):
    for file_name in os.listdir(os.path.join(base_dir, folder)):
        image = cv2.imread(os.path.join(base_dir, folder, file_name))
        if image is None:
            # cv2.imread returns None for files it cannot decode (e.g. non-image files); skip them
            continue
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)   # OpenCV loads images in BGR order
        size_image = Image.fromarray(image).resize((50, 50))
        data.append(np.array(size_image))
        labels.append(label)

# Creating single numpy array of all the images and labels
data = np.array(data)
labels = np.array(labels)

print('Cells : {} and labels : {}'.format(data.shape , labels.shape))

# let's shuffle the data and labels before splitting them into training and validation sets
n = np.arange(data.shape[0])
np.random.shuffle(n)
data = data[n]
labels = labels[n]

In [None]:
# Splitting the dataset into the training set and the validation set

X_train, X_valid, y_train, y_valid = train_test_split(data, labels, test_size = 0.2, random_state = 0)

print('Train data shape {}, validation data shape {}'.format(X_train.shape, X_valid.shape))

In [None]:
X_train = X_train.astype('float32')  
X_valid = X_valid.astype('float32')
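
The cast above changes only the dtype; pixel values still range from 0 to 255. A common optional follow-up, which this notebook does not apply, is to divide by 255 so inputs fall in [0, 1]. A minimal sketch on a random stand-in batch (the array `X` is an assumption, not the real data):

```python
import numpy as np

# Stand-in batch (assumption): two 50x50 RGB images with random 8-bit pixels
rng = np.random.default_rng(0)
X = rng.integers(0, 256, size=(2, 50, 50, 3), dtype=np.uint8)

X = X.astype("float32")   # same cast as in the notebook
X_scaled = X / 255.0      # optional extra step: map [0, 255] to [0, 1]

print(X_scaled.dtype, X_scaled.min(), X_scaled.max())
```

If this step were added, the same scaling would have to be applied to any image passed to the saved model later.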

In [None]:
# One Hot Encoding 
y_train = to_categorical(y_train)
y_valid = to_categorical(y_valid)
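
For intuition, one-hot encoding turns each integer label into a row with a single 1 in the column of its class. The sketch below reproduces that layout with plain NumPy on a made-up label vector; it stands in for `to_categorical` and is not a replacement for it.

```python
import numpy as np

y = np.array([0, 1, 1, 0])    # made-up labels: 0 = Parasitized, 1 = Uninfected
num_classes = 2

one_hot = np.eye(num_classes)[y]         # one row per sample, one column per class
recovered = np.argmax(one_hot, axis=1)   # argmax undoes the encoding

print(one_hot)
print(recovered)
```

The `argmax` round trip is the same operation used later in the notebook to turn softmax outputs back into class indices.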

## Building Model

In [None]:
# Defining Model
classifier = Sequential()

# CNN layers
classifier.add(Conv2D(32, kernel_size=(3, 3), input_shape = (50, 50, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))
classifier.add(BatchNormalization(axis = -1))
classifier.add(Dropout(0.5))   # Dropout helps reduce overfitting

classifier.add(Conv2D(32, kernel_size=(3, 3), activation = 'relu'))   # input shape is inferred from the previous layer
classifier.add(MaxPooling2D(pool_size = (2, 2)))
classifier.add(BatchNormalization(axis = -1))
classifier.add(Dropout(0.5))

classifier.add(Flatten())

classifier.add(Dense(units=128, activation='relu'))
classifier.add(BatchNormalization(axis = -1))
classifier.add(Dropout(0.5))

classifier.add(Dense(units=2, activation='softmax')) 
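
To see how the 50 x 50 input shrinks on its way to the Flatten layer, the following sketch does the shape arithmetic by hand. It assumes the Keras defaults used above: no padding in Conv2D, and a pooling stride equal to the pool size. The helper names `conv_out` and `pool_out` are illustrative only.

```python
def conv_out(size, kernel=3, stride=1):
    """Output width of a convolution without padding."""
    return (size - kernel) // stride + 1

def pool_out(size, pool=2):
    """Output width of max pooling with stride equal to the pool size."""
    return size // pool

size = 50
shapes = []
for _ in range(2):                 # two Conv2D + MaxPooling2D blocks
    size = conv_out(size)
    shapes.append(("conv", size))
    size = pool_out(size)
    shapes.append(("pool", size))

flat_features = size * size * 32   # 32 filters in the last Conv2D layer
print(shapes)          # [('conv', 48), ('pool', 24), ('conv', 22), ('pool', 11)]
print(flat_features)   # 3872
```

BatchNormalization and Dropout do not change the shape, so they are left out of the calculation. The result can be checked against the output of `classifier.summary()`.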

In [None]:
classifier.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])

In [None]:
history = classifier.fit(X_train, y_train, batch_size=120, epochs=15, verbose=1, validation_data=(X_valid, y_valid))
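
As a rough guide to what `batch_size` and `epochs` mean for training, the sketch below counts weight updates. The training-set size `n_train` is a hypothetical number chosen for illustration; use `X_train.shape[0]` for the real value.

```python
import math

n_train = 20_000   # hypothetical training-set size, not taken from the notebook
batch_size = 120   # as in the fit call
epochs = 15        # as in the fit call

steps_per_epoch = math.ceil(n_train / batch_size)   # the last batch may be smaller
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, total_updates)   # 167 2505
```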

In [None]:
print("Test_Accuracy: {:.2f}%".format(classifier.evaluate(X_valid, y_valid)[1]*100))

- Summary of the model

In [None]:
classifier.summary()

## Prediction, Evaluation and Testing of the Model

- Let's make our first predictions: call predict on X_valid and store the result in the y_pred variable

In [None]:
y_pred = classifier.predict(X_valid)

In [None]:
# Convert predicted probabilities and one-hot labels back to class indices
y_pred = np.argmax(y_pred, axis=1)
y_valid = np.argmax(y_valid, axis=1)

In [None]:
print('Accuracy Score: ', accuracy_score(y_valid, y_pred))

- Evaluation of the CNN model by Plotting Confusion Matrix 

In [None]:
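# Plotting the Confusion Matrix (rows: true labels, columns: predicted labels)
conf = confusion_matrix(y_valid, y_pred)
sns.heatmap(conf, annot=True, fmt='d')   # fmt='d' shows the counts as integers
plt.xlabel('Predicted label')
plt.ylabel('True label')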
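
Beyond the heatmap, a confusion matrix can be summarized with a few ratios. The sketch below uses made-up counts, not results from this model, and follows the scikit-learn layout (rows are true classes, columns are predicted classes) with index 0 = Parasitized and index 1 = Uninfected, as in the loading step.

```python
import numpy as np

# Made-up counts for illustration only (not results from this notebook)
conf_example = np.array([[90, 10],
                         [ 5, 95]])

accuracy = np.trace(conf_example) / conf_example.sum()
sensitivity = conf_example[0, 0] / conf_example[0].sum()   # parasitized cells correctly flagged
specificity = conf_example[1, 1] / conf_example[1].sum()   # uninfected cells correctly cleared

print(accuracy, sensitivity, specificity)
```

For a screening task such as malaria detection, sensitivity is often the number to watch, since a missed parasitized cell is usually costlier than a false alarm.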

In [None]:
classifier.save("malaria-model.h5")