# WTB

Automatic classification of birds for this project: http://www.cabane-oiseaux.org/

## Data cooking
First, prepare the data: we need a training dataset and a validation dataset.
We use the Keras ImageDataGenerator, which can build a training set from directories containing images.
Each subdirectory must contain the images of one category of bird: Keras associates each subdirectory with one class in the model.
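
As a rough illustration of the directory convention, the sketch below rebuilds the subdirectory-to-index mapping with the standard library only. It assumes the alphanumeric ordering that Keras documents for `flow_from_directory`. The helper `class_indices_from_directory` and the category `moineau` are hypothetical and exist only for this example.

```python
import os
import tempfile


def class_indices_from_directory(root):
    """Map each subdirectory name to an integer index, in sorted order."""
    classes = sorted(
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
    )
    return {name: index for index, name in enumerate(classes)}


# Hypothetical layout built in a temporary directory (not the real dataset)
with tempfile.TemporaryDirectory() as root:
    for category in ("rouge gorge", "mesange bleu", "moineau"):
        os.makedirs(os.path.join(root, "train", category))
    indices = class_indices_from_directory(os.path.join(root, "train"))

print(indices)
```

The real mapping is the one exposed by `train_iterator.class_indices` once the iterator has been created.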

In [61]:
IMG_ROWS = 150
IMG_COLS = 150
EPOCHS = 10
BATCH_SIZE = 32
NUM_OF_TRAIN_SAMPLES = 3000
NUM_OF_TEST_SAMPLES = 600
DIR = '/home/kvjw3322/Documents/Prez/WTB/images'

In [56]:
from PIL import Image
import keras
from keras.preprocessing.image import ImageDataGenerator, load_img, img_to_array, array_to_img
from keras.callbacks import ModelCheckpoint
import os
import numpy as np

# Generator for train
train_image_generator = ImageDataGenerator()
train_iterator = train_image_generator.flow_from_directory(
    os.path.join(DIR, 'train'), # Root directory
    target_size=(IMG_ROWS, IMG_COLS), # Images will be processed to this size
    batch_size=BATCH_SIZE, # How many images are processed at the same time ?
    class_mode='categorical') # Each subdir is a category

# Generator for validation
valid_image_generator = ImageDataGenerator()
valid_iterator = valid_image_generator.flow_from_directory(
    os.path.join(DIR, 'validation'),
    target_size=(IMG_ROWS, IMG_COLS), # Images will be processed to this size
    batch_size=BATCH_SIZE, # How many images are processed at the same time ?
    class_mode='categorical')

# Number of classes
NUM_OF_CLASSES = len(train_iterator.class_indices)

Found 2728 images belonging to 6 classes.
Found 304 images belonging to 6 classes.


## Prepare human-readable predictions

In [57]:
# Map a prediction index back to its label
mapper = {v: k for k, v in train_iterator.class_indices.items()}

def get_human_prediction(model, input_picture_array):
    # A softmax output is rarely exactly 1.0, so take the index of the highest probability
    probabilities = model.predict(input_picture_array)[0]
    return mapper.get(int(np.argmax(probabilities)))
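
To make the index-to-label step concrete, here is a minimal sketch in plain Python. The probability vector and the two-class mapping are made-up values, and `label_from_probabilities` is a hypothetical helper, not part of Keras. It picks the index of the highest probability, which still works when no score is exactly 1.0.

```python
def label_from_probabilities(probabilities, index_to_label):
    """Return the label whose index holds the highest probability."""
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return index_to_label[best_index]


# Hypothetical mapping and softmax output, for illustration only
index_to_label = {0: "mesange bleu", 1: "rouge gorge"}
probabilities = [0.30, 0.70]

label = label_from_probabilities(probabilities, index_to_label)
print(label)
```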

## CNN: layer by layer
We build the CNN (convolutional neural network), adding layers one by one.

- Conv2D: the input_shape is (150, 150, 3). The first two dimensions match the image size, and the last one is 3 because each pixel is represented by 3 values (RGB). For grayscale images it would be 1 (a single value for the grey level). We use 32 filters (or feature maps): all the filters are applied to the source image at the same time. The convolution layer learns the values of each filter, just as a Dense hidden layer learns its synaptic weights. The kernel_size is the size of each filter matrix (here 3x3).
- MaxPooling2D: the subsampling layer keeps the highest value of each block of pixels. Each block is a sub-matrix of size pool_size (here 2x2).
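
The two layers above can be sketched in plain Python to show how the sizes evolve. This assumes the Keras defaults used by the cell below: a stride of 1 with 'valid' padding for Conv2D, and non-overlapping blocks for MaxPooling2D. The helpers `conv_output_size` and `max_pool_2d` are hypothetical names for this illustration.

```python
def conv_output_size(input_size, kernel_size):
    """Output width (or height) of a convolution with stride 1 and no padding."""
    return input_size - kernel_size + 1


def max_pool_2d(matrix, pool_size=2):
    """Keep the maximum of each non-overlapping pool_size x pool_size block."""
    rows = len(matrix) // pool_size
    cols = len(matrix[0]) // pool_size
    return [
        [
            max(
                matrix[r * pool_size + i][c * pool_size + j]
                for i in range(pool_size)
                for j in range(pool_size)
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Size of one feature map after the 3x3 convolution, then after 2x2 pooling
after_conv = conv_output_size(150, 3)   # 148
after_pool = after_conv // 2            # 74
flattened = after_pool * after_pool * 32  # values fed to the Dense layer

# Toy 4x4 feature map to show what pooling keeps
feature_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
pooled = max_pool_2d(feature_map)
print(after_conv, after_pool, flattened, pooled)
```

Under those assumptions, the Flatten layer would hand 74 x 74 x 32 values to the first Dense layer.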

In [58]:
from keras.layers import Conv2D, Convolution2D, MaxPooling2D, Flatten, Dropout, Dense
from keras.models import Model, Sequential

model = Sequential() 
model.add(Conv2D(filters=32, input_shape=(IMG_ROWS, IMG_COLS, 3), kernel_size=(3,3)))
model.add(MaxPooling2D(pool_size=(2,2)))

model.add(Dropout(0.25))

# Transform the output matrix into a vector for the Dense layer
model.add(Flatten())
model.add(Dense(1024, activation="relu"))
model.add(Dense(NUM_OF_CLASSES, activation="softmax"))

## Let's learn!
It's time to make our network learn: we compile it (as Keras requires) and train it on our data.

In [59]:
# This Keras callback saves the best model according to the validation accuracy (val_acc)
filepath="data/bestmodel-{epoch:02d}-{val_acc:.2f}.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')

In [60]:
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit_generator(generator=train_iterator, 
                    steps_per_epoch=NUM_OF_TRAIN_SAMPLES // BATCH_SIZE,
                    epochs=EPOCHS,
                    validation_data=valid_iterator,
                    validation_steps=NUM_OF_TEST_SAMPLES // BATCH_SIZE,
                    callbacks=[checkpoint])

Epoch 1/1

Epoch 00001: val_acc improved from -inf to 0.50893, saving model to data/bestmodel-01-0.51.hdf5


<keras.callbacks.History at 0x7fa0e04b3c88>
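
The training call derives its step counts from NUM_OF_TRAIN_SAMPLES and NUM_OF_TEST_SAMPLES, while the generators reported 2728 and 304 images. The sketch below compares the two. It assumes the iterators keep yielding batches once every image has been seen, so the extra steps simply revisit some images. `steps_for_one_pass` is a hypothetical helper.

```python
import math


def steps_for_one_pass(num_samples, batch_size):
    """Number of batches needed to see every sample once (last batch may be partial)."""
    return math.ceil(num_samples / batch_size)


BATCH_SIZE = 32

# Steps needed to cover the images actually found on disk
train_steps = steps_for_one_pass(2728, BATCH_SIZE)
valid_steps = steps_for_one_pass(304, BATCH_SIZE)

# Steps requested by the notebook constants
configured_train_steps = 3000 // BATCH_SIZE
configured_valid_steps = 600 // BATCH_SIZE

print(train_steps, configured_train_steps)
print(valid_steps, configured_valid_steps)
```

Deriving the counts from `train_iterator.samples` and `valid_iterator.samples` would keep them in sync with the directories.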

## Model unit validation
Let's get a model prediction for a given picture.

In [6]:
# Load model from disk
# model.load_weights("/home/kvjw3322/Documents/Prez/WTB/data/bestmodel-25-0.55.hdf5")

In [62]:
picture_path = '/home/kvjw3322/Documents/Prez/WTB/images/train/mesange bleu/01-20160116_092159-00.jpg'
#picture_path = '/home/kvjw3322/Documents/Prez/WTB/images/rouge gorge/01-20160116_094045-00.jpg'

picture = Image.open(picture_path)
picture = picture.resize(size=(150,150))
picture_array = img_to_array(img=picture)
picture_array = np.expand_dims(picture_array, axis=0)
get_human_prediction(model, picture_array)

'rouge gorge'

In [63]:
from sklearn.metrics import confusion_matrix, classification_report

In [71]:
# Y_pred = model.predict_generator(valid_iterator, NUM_OF_TEST_SAMPLES // BATCH_SIZE+1)
# y_pred = np.argmax(Y_pred, axis=1)
# print('Confusion Matrix')
# print(confusion_matrix(valid_iterator.classes, y_pred))
# print('Classification Report')
# target_names = train_iterator.class_indices.keys()
# print(classification_report(valid_iterator.classes, y_pred, target_names=target_names))
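
The commented-out cell above compares `valid_iterator.classes` with the predictions. Since `flow_from_directory` shuffles by default, that comparison is only meaningful if the validation iterator is created with `shuffle=False`, so that predictions come out in the same order as the labels. As a sketch of what the confusion matrix counts, here is a plain-Python version with made-up labels. `confusion_matrix_counts` is a hypothetical helper that follows the scikit-learn convention of rows for true classes and columns for predicted classes.

```python
def confusion_matrix_counts(true_labels, predicted_labels, num_classes):
    """Count how often each true class (row) is predicted as each class (column)."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_index, predicted_index in zip(true_labels, predicted_labels):
        matrix[true_index][predicted_index] += 1
    return matrix


# Made-up labels for three classes, for illustration only
true_labels = [0, 0, 1, 1, 2, 2]
predicted_labels = [0, 1, 1, 1, 2, 0]

matrix = confusion_matrix_counts(true_labels, predicted_labels, num_classes=3)
for row in matrix:
    print(row)
```

The diagonal holds the correct predictions. Every other cell shows which class was mistaken for which.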