### Convolutional Neural Network: Example and Exercise with Keras

- An example that classifies MNIST data into 10 classes is provided for you
- You are then given another dataset (malaria-parasitized vs. uninfected cells); your tasks are described after this example
- Pay attention to how Keras works: try changing the network or building your own, and observe the effect on your results

In [14]:
from __future__ import print_function
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import BatchNormalization, Activation, GlobalAveragePooling2D
from keras import backend as K

import numpy as np

batch_size = 256
num_classes = 10
epochs = 12

# input image dimensions
img_rows, img_cols = 28, 28
data = np.load('datasets/MNIST_data/mnist.npz')
x_train = data['x_train']
y_train = data['y_train']
x_test = data['x_test']
y_test = data['y_test']

if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)



x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
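
The "binary class matrices" produced by `keras.utils.to_categorical` are one-hot encodings of the integer labels. As a rough illustration, the same idea can be sketched in plain NumPy. The helper name `one_hot` is hypothetical and is not part of Keras; it only mimics the behaviour for non-negative integer labels.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) matrix with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    # Row i gets a 1 in the column given by labels[i]
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

example = one_hot([3, 0, 9], num_classes=10)
print(example.shape)           # (3, 10)
print(example.argmax(axis=1))  # [3 0 9]
```

Taking `argmax` along the class axis recovers the original integer labels, which is handy when comparing predictions with ground truth.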


In [15]:
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), input_shape=input_shape))
model.add(BatchNormalization(momentum=0.95))
model.add(Activation('relu'))

model.add(Conv2D(64, kernel_size=(3, 3)))
model.add(BatchNormalization(momentum=0.95))
model.add(Activation('relu'))

model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(128, (3, 3)))   
model.add(BatchNormalization(momentum=0.95))
model.add(Activation('relu'))

model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(256, (3, 3)))
model.add(BatchNormalization(momentum=0.95))
model.add(Activation('relu'))
          
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
# model.add(Flatten()) # we can flatten or use a global pooling scheme.
model.add(GlobalAveragePooling2D())
          
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
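
To see why the network above can swap `Flatten` for `GlobalAveragePooling2D`, it can help to trace the spatial size of the feature maps by hand. The sketch below assumes the Keras defaults used in the cell above: `padding='valid'` and stride 1 for `Conv2D`, and a stride equal to `pool_size` for `MaxPooling2D`. The helper names are hypothetical.

```python
def conv_valid(size, kernel=3):
    """Output size of a stride-1 convolution without padding."""
    return size - kernel + 1

def max_pool(size, pool=2):
    """Output size of non-overlapping max pooling (floor division)."""
    return size // pool

# Layer order of the model above, ignoring BatchNormalization/Activation/Dropout,
# which do not change the spatial size.
layers = ["conv", "conv", "pool", "conv", "pool", "conv", "pool"]

size = 28  # MNIST images are 28x28
trace = [size]
for layer in layers:
    size = conv_valid(size) if layer == "conv" else max_pool(size)
    trace.append(size)

print(trace)  # [28, 26, 24, 12, 10, 5, 3, 1]
```

Under these assumptions the last feature map is 1x1 with 256 channels, so global average pooling and flattening would both yield a 256-element vector for this input size. With larger inputs, such as the 64x64 malaria images, the two choices differ.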

In [None]:
model.compile(loss=keras.losses.categorical_crossentropy, optimizer=keras.optimizers.Adadelta(), metrics=['accuracy'])

# fit your model
model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, verbose=1, validation_data=(x_test, y_test))

# evaluate your model on test data (validation set)
#score = model.evaluate(x_test, y_test, verbose=0)

# print('Test loss:', score[0])
# print('Test accuracy:', score[1])

Train on 60000 samples, validate on 10000 samples
Epoch 1/12
Epoch 2/12
Epoch 3/12
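
`model.fit` returns a `History` object whose `history` attribute is a dictionary of per-epoch metrics. A minimal sketch for tabulating it follows. The helper `summarize_history` is hypothetical, and the numbers in `placeholder_history` are made up purely to exercise the function; they are not results from the run above.

```python
def summarize_history(history_dict):
    """Return one formatted line per epoch from a Keras-style history dict."""
    # Older Keras releases log 'acc'/'val_acc'; newer ones log 'accuracy'/'val_accuracy'.
    acc_key = "accuracy" if "accuracy" in history_dict else "acc"
    template = "epoch {:2d}  loss {:.4f}  acc {:.4f}  val_loss {:.4f}  val_acc {:.4f}"
    lines = []
    for i, loss in enumerate(history_dict["loss"]):
        lines.append(template.format(
            i + 1,
            loss,
            history_dict[acc_key][i],
            history_dict["val_loss"][i],
            history_dict["val_" + acc_key][i],
        ))
    return lines

# Placeholder values for illustration only; they are NOT results from the run above.
placeholder_history = {
    "loss": [1.0, 0.5],
    "acc": [0.6, 0.8],
    "val_loss": [0.9, 0.6],
    "val_acc": [0.65, 0.78],
}
for line in summarize_history(placeholder_history):
    print(line)
```

In the notebook this would be used as `history = model.fit(...)` followed by `summarize_history(history.history)`, provided validation data is passed to `fit` so that the `val_` keys exist.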

### EXERCISE: Use another dataset

- Now you will use a malaria dataset released by Dr. Stefan Jaeger: https://ceb.nlm.nih.gov/repositories/malaria-datasets/
- The dataset contains a total of 27,558 cell images, with equal numbers of parasitized and uninfected cells
- The dataset is provided here as compact 64x64x3 images in compressed h5 files, and the short script below will help you get started
- Details of your task are explained below

In [5]:
import numpy as np
import h5py
import keras  # needed further down for keras.utils.to_categorical

# these are small helpers to get you started; please adapt them to your task
# training set positive samples
train_dataset_pos = h5py.File('datasets/malaria/data_train_affected.h5', "r")
train_pos_data = []
train_neg_data = []
# the full dataset has 27,558/2 = 13,779 images per class; this loop loads only X1..X19
for i in range(1, 20):
    train_pos_data.append(train_dataset_pos['X' + str(i)])

# training set negative samples
train_dataset_neg = h5py.File('datasets/malaria/data_train_unaffected.h5', "r")
for i in range(1, 20):
    train_neg_data.append(train_dataset_neg['X' + str(i)])

print(train_neg_data[0].shape)
print(train_pos_data[0].shape)

(64, 64, 3)
(64, 64, 3)


In [6]:
# split your training data into train-validation and test data

In [7]:
# Create labels for pos-neg samples
y_train = np.array([1] * len(train_pos_data) + [0] * len(train_neg_data))

In [8]:
print(y_train)

[1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0]


In [9]:
x_train = np.concatenate([train_pos_data, train_neg_data])

In [10]:
print(x_train.shape)

(38, 64, 64, 3)
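
Once the images and labels are stacked into arrays like these, the train/validation/test split can be sketched with a shuffled index array. This is only one possible approach: the helper `split_indices`, the 80/19/1 fractions as defaults, and the fixed seed are all assumptions for illustration, and the placeholder arrays below merely imitate the layout of `x_train` and `y_train`.

```python
import numpy as np

def split_indices(n_samples, train_frac=0.80, val_frac=0.19, seed=0):
    """Shuffle sample indices and cut them into train/validation/test parts."""
    rng = np.random.RandomState(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(train_frac * n_samples))
    n_val = int(round(val_frac * n_samples))
    # Whatever remains after train and validation becomes the test set
    return order[:n_train], order[n_train:n_train + n_val], order[n_train + n_val:]

# Placeholder data with the same layout as the arrays above (N, 64, 64, 3)
x = np.zeros((100, 64, 64, 3), dtype="float32")
y = np.array([1] * 50 + [0] * 50)

train_idx, val_idx, test_idx = split_indices(len(x))
x_tr, y_tr = x[train_idx], y[train_idx]
x_val, y_val = x[val_idx], y[val_idx]
x_te, y_te = x[test_idx], y[test_idx]
print(len(train_idx), len(val_idx), len(test_idx))
```

Shuffling before cutting matters here because the labels are ordered (all positives first, then all negatives). With only the 38 samples loaded above, a 1% test share amounts to a single image, so loading more samples first is advisable.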


In [11]:
batch_size = 128
num_classes = 2
epochs = 100
n_channels = 3

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)

In [None]:
#print(y_train)

### Assignment

- Spend some time planning your data: e.g., split it into training (80%), validation (19%) and test (1%) samples
- Train a convolutional neural network on your data (feel free to use any number of layers) and observe the implications of the different architectures you try
- Run your model and display your training and validation loss and accuracy
- Check your model in TensorBoard
- Save a checkpoint only for the best loss
- Test using your held-out test images
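
The TensorBoard and checkpoint items can be handled with Keras callbacks. The sketch below is one way to set them up, not a required solution. The helper `make_callbacks`, the file name `best_model.h5`, and the `logs` directory are assumptions chosen for illustration.

```python
from keras.callbacks import ModelCheckpoint, TensorBoard

def make_callbacks(checkpoint_path="best_model.h5", log_dir="logs"):
    """Build callbacks for best-loss checkpointing and TensorBoard logging."""
    # save_best_only=True overwrites the file only when val_loss improves
    checkpoint = ModelCheckpoint(
        checkpoint_path,
        monitor="val_loss",
        save_best_only=True,
        verbose=1,
    )
    # Writes event files that TensorBoard can read from log_dir
    tensorboard = TensorBoard(log_dir=log_dir)
    return [checkpoint, tensorboard]

callbacks = make_callbacks()

# Hypothetical usage, assuming a compiled `model` and your own data splits:
# history = model.fit(x_tr, y_tr,
#                     validation_data=(x_val, y_val),
#                     batch_size=batch_size, epochs=epochs,
#                     callbacks=callbacks)
```

Monitoring `val_loss` requires validation data to be passed to `fit`. After training, running `tensorboard --logdir logs` from a terminal should let you inspect the logged curves.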