# Using Inception to identify fish species
The Nature Conservancy ran a Kaggle competition to classify fish species from images of fish caught on fishing boats [https://www.kaggle.com/c/the-nature-conservancy-fisheries-monitoring].  Convolutional neural networks are formidable tools for image classification problems.  In this approach, we take a pre-trained ConvNet (Google's Inception, https://arxiv.org/pdf/1512.00567v3.pdf) and fine-tune its top layers on the competition's training data.

## Create a validation set
But first, we have to split the training data so there is a validation set.  Although it's best to use multiple validation/training sets (cross-validation), we will not do so here due to computation restrictions (I could really use a GPU).

In [1]:
import os
import numpy as np
import shutil

np.random.seed(21)

root_dir = '../data/train'
train_dir = '../data/train_split'
val_dir = '../data/val_split'

FishNames = ['ALB', 'BET', 'DOL', 'LAG', 'NoF', 'OTHER', 'SHARK', 'YFT']

# initialize counters for training and validation samples (used by keras)
nbr_train_samples = 0
nbr_val_samples = 0

# choose training proportion
split_proportion = 0.8

for fish in FishNames:
    # create the class folder (and train_dir itself) if it does not exist yet
    os.makedirs(os.path.join(train_dir, fish), exist_ok=True)
        
    # list of image files
    total_images = os.listdir(os.path.join(root_dir, fish))
    
    # number of images that go to the training split for this class
    nbr_train = int(len(total_images) * split_proportion)
    
    # shuffle list of images
    np.random.shuffle(total_images)
    
    # split into train and validation sets
    train_images = total_images[:nbr_train]
    val_images = total_images[nbr_train:]
    
    for img in train_images: # loop over all the images
        source = os.path.join(root_dir, fish, img)
        target = os.path.join(train_dir, fish, img)
        shutil.copy(source, target)
        nbr_train_samples = nbr_train_samples + 1
        
    # create the class folder (and val_dir itself) if it does not exist yet
    os.makedirs(os.path.join(val_dir, fish), exist_ok=True)
        
    for img in val_images:
        source = os.path.join(root_dir, fish, img)
        target = os.path.join(val_dir, fish, img)
        shutil.copy(source, target)
        nbr_val_samples = nbr_val_samples + 1
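
The cross-validation mentioned above was not run for this notebook. As a rough sketch of what it would involve, the file list of one class can be cut into k folds, each serving once as the validation set. The function name `kfold_splits` and the choice of `k = 5` are illustrative assumptions, not part of the original pipeline.

```python
import random


def kfold_splits(filenames, k=5, seed=21):
    """Yield (train, val) filename lists for each of k folds.

    Hypothetical helper: every file lands in exactly one validation fold.
    """
    files = sorted(filenames)                 # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(files)
    folds = [files[i::k] for i in range(k)]   # k interleaved, near-equal folds
    for i in range(k):
        val = folds[i]
        train = [f for j, fold in enumerate(folds) if j != i for f in fold]
        yield train, val


# toy file names standing in for the images of one species folder
images = ['img_{:05d}.jpg'.format(n) for n in range(10)]
splits = list(kfold_splits(images, k=5))
for fold_idx, (train, val) in enumerate(splits):
    print(fold_idx, len(train), len(val))
```

Each fold would then need its own copy (or list) of training and validation files, and the model would be trained once per fold, which is why this is costly without a GPU.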

## Load the Inception model as the base
Fortunately, Inception is built into keras and can easily be loaded.

In [2]:
import numpy as np
from keras.applications.inception_v3 import InceptionV3
from keras.layers import Flatten, Dense, Dropout
from keras.layers import AveragePooling2D, GlobalAveragePooling2D
from keras.models import Model
from keras.optimizers import RMSprop, SGD
from keras.callbacks import ModelCheckpoint
from keras.preprocessing.image import ImageDataGenerator
from keras.regularizers import l1l2

# input_shape explained in https://keras.io/layers/convolutional/
img_width = 299 # cols
img_height = 299 # rows
nbr_epochs = 1
batch_size = 32

InceptionV3_base = InceptionV3(include_top=False, 
                                weights='imagenet',
                                input_tensor=None,
                                input_shape=(img_height, img_width, 3))


Using TensorFlow backend.


## Add top block of neural nets and compile
Pool together the output of the convolution blocks from Inception.  To prevent overfitting, include dropout on the top layers.  Using $L1$ and $L2$ regularization is also possible, though I didn't find any performance benefits from doing so.

In [3]:
output = InceptionV3_base.get_layer(index=-1).output
output = GlobalAveragePooling2D(name='avg_pool')(output)
output = Dense(512, activation='relu', W_regularizer=l1l2(1e-4,1e-4),
               name='topFC1')(output)
output = Dropout(0.5)(output)
output = Dense(64, activation='relu', W_regularizer=l1l2(1e-4,1e-4),
               name='topFC2')(output)
output = Dropout(0.5)(output)
output = Dense(8, activation='softmax', name='predictions')(output)
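
For reference, the penalty that a combined L1/L2 regularizer adds to the loss is a weighted sum of absolute and squared weights. The plain-Python sketch below illustrates that formula for the two coefficients passed to `l1l2(1e-4, 1e-4)`; the function name `l1l2_penalty` and the toy weights are made up for the example.

```python
def l1l2_penalty(weights, l1=1e-4, l2=1e-4):
    """Return l1 * sum(|w|) + l2 * sum(w^2) over a flat list of weights."""
    l1_term = l1 * sum(abs(w) for w in weights)
    l2_term = l2 * sum(w * w for w in weights)
    return l1_term + l2_term


toy_weights = [0.5, -1.0, 2.0]   # stand-in for one layer's weights, flattened
penalty = l1l2_penalty(toy_weights)
print(penalty)
```

With coefficients this small the penalty stays tiny next to the classification loss, so it only nudges the weights toward smaller values.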

## Freeze the base layers
Since the weights of the newly added top layers are randomly initialized, we first train these layers before updating the Inception base.

In [4]:
for layer in InceptionV3_base.layers:
    layer.trainable = False
    
InceptionV3_model = Model(input=InceptionV3_base.input, output=output)
# InceptionV3_model.summary()
    
InceptionV3_model.compile(loss = 'categorical_crossentropy',
                         optimizer = 'rmsprop',
                         metrics = ['accuracy'])
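
The `categorical_crossentropy` loss is the negative log of the probability the model assigns to the true class, averaged over samples. A minimal sketch in plain Python follows; the helper name and the small `eps` clip (added to avoid `log(0)`) are assumptions made for illustration.

```python
import math


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean cross-entropy between one-hot labels and predicted probabilities."""
    total = 0.0
    for truth, probs in zip(y_true, y_pred):
        for t, p in zip(truth, probs):
            p = min(max(p, eps), 1.0 - eps)   # clip to keep log() finite
            total -= t * math.log(p)
    return total / len(y_true)


# two samples, three classes (the notebook has eight)
y_true = [[1, 0, 0], [0, 1, 0]]
confident = [[0.9, 0.05, 0.05], [0.1, 0.8, 0.1]]
unsure = [[0.4, 0.3, 0.3], [0.3, 0.4, 0.3]]
loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)
print(loss_confident, loss_unsure)
```

Both sets of predictions pick the right class, yet the less confident one is charged a higher loss, which is why accuracy and cross-entropy can move differently.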

## Organize and augment input data to keras
Rather than loading the entire training set into memory, we can read images from disk in batches using the `flow_from_directory` method.  The training data can also easily be augmented with random shears, zooms, rotations, shifts, and flips.  Note that the pixel values must be rescaled before the images are passed to Inception; here they are scaled to the range [0, 1].

In [5]:
## set data generators
# augmentation configuration for training:
train_datagen = ImageDataGenerator(rescale=1./255,
                                   shear_range=0.1,
                                   zoom_range=0.1,
                                   rotation_range=10.0,
                                   width_shift_range=0.1,
                                   height_shift_range=0.1,
                                   horizontal_flip=True)

# validation set will only be rescaled
val_datagen = ImageDataGenerator(rescale=1./255)

## set up input from directory to convnet
train_generator = train_datagen.flow_from_directory(
                    train_dir,
                    target_size = (img_height, img_width),  # keras expects (height, width)
                    batch_size = batch_size,
                    shuffle = True,
                    # save_to_dir = '../data/visualization',
                    # save_prefix = 'aug',
                    classes = FishNames,
                    class_mode = 'categorical')

val_generator = val_datagen.flow_from_directory(
                    val_dir,
                    target_size = (img_height, img_width),  # keras expects (height, width)
                    shuffle = True,
                    # save_to_dir = '../data/visualization',
                    # save_prefix = 'aug',
                    classes = FishNames,
                    class_mode = 'categorical')

Found 3019 images belonging to 8 classes.
Found 758 images belonging to 8 classes.
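
Conceptually, a directory iterator just walks a list of file paths in fixed-size batches and loops forever, so only one batch of images needs to be in memory at a time. The generator below is a simplified stand-in for that behaviour, not Keras's implementation; it yields file names rather than decoded images, and `batch_filenames` is a hypothetical name.

```python
import itertools
import math
import random


def batch_filenames(filenames, batch_size=32, shuffle=True, seed=21):
    """Endlessly yield lists of at most batch_size file names."""
    rng = random.Random(seed)
    files = list(filenames)
    while True:                      # one pass of this loop is one epoch
        if shuffle:
            rng.shuffle(files)
        for start in range(0, len(files), batch_size):
            yield files[start:start + batch_size]


files = ['img_{:05d}.jpg'.format(n) for n in range(100)]
batches_per_epoch = math.ceil(len(files) / 32)
first_epoch = list(itertools.islice(batch_filenames(files), batches_per_epoch))
print(batches_per_epoch, [len(b) for b in first_epoch])
```

Because the generator never ends, the training loop has to be told how many samples make up an epoch, which is the role of the sample counters collected during the split.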


## Fit the top layer
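Only the newly added top layers are updated in this step, since the Inception base is frozen.  The model was compiled with RMSprop above; Stochastic Gradient Descent with a small learning rate is used later, when the top convolution blocks of the base are fine-tuned.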

In [6]:
InceptionV3_model.fit_generator(train_generator,
                               samples_per_epoch = nbr_train_samples,
                               nb_epoch = nbr_epochs,
                               validation_data = val_generator,
                               nb_val_samples = nbr_val_samples)

Epoch 1/1


<keras.callbacks.History at 0x11f786b00>

## Unfreeze top blocks in the base
Now that the top layers are trained, we go back to the Inception base and fine-tune its top two convolution blocks.  To find where the blocks end, look for the `keras.engine.topology.Merge` objects in the layer listing.

In [7]:
for i, layer in enumerate(InceptionV3_base.layers):
    print(i, layer)

0 <keras.engine.topology.InputLayer object at 0x1081eccc0>
1 <keras.layers.convolutional.Convolution2D object at 0x11693b240>
2 <keras.layers.normalization.BatchNormalization object at 0x116c52978>
3 <keras.layers.convolutional.Convolution2D object at 0x116cd1ac8>
4 <keras.layers.normalization.BatchNormalization object at 0x116d6feb8>
5 <keras.layers.convolutional.Convolution2D object at 0x116d8bef0>
6 <keras.layers.normalization.BatchNormalization object at 0x116f25898>
7 <keras.layers.pooling.MaxPooling2D object at 0x116f7ada0>
8 <keras.layers.convolutional.Convolution2D object at 0x116fbef98>
9 <keras.layers.normalization.BatchNormalization object at 0x116fe7eb8>
10 <keras.layers.convolutional.Convolution2D object at 0x11701de80>
11 <keras.layers.normalization.BatchNormalization object at 0x116cbdcf8>
12 <keras.layers.pooling.MaxPooling2D object at 0x1170f0e80>
13 <keras.layers.convolutional.Convolution2D object at 0x117306e80>
14 <keras.layers.normalization.BatchNormalization objec
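
Instead of reading the printed list by eye, the indices of the merge layers can be collected by class name. The sketch below uses dummy classes in place of real Keras layers so that it runs on its own; `layer_indices` is a hypothetical helper, and with the real model one would pass `InceptionV3_base.layers` and the class name that appears in the listing.

```python
def layer_indices(layers, class_name):
    """Return the positions of all layers whose class is named class_name."""
    return [i for i, layer in enumerate(layers)
            if type(layer).__name__ == class_name]


# dummy stand-ins for Keras layer classes, for illustration only
class Convolution2D(object):
    pass


class Merge(object):
    pass


toy_layers = [Convolution2D(), Convolution2D(), Merge(),
              Convolution2D(), Merge()]
merge_positions = layer_indices(toy_layers, 'Merge')
print(merge_positions)
```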

To free the top two convolution blocks for further training, we keep layers 0 to 171 frozen and unfreeze layers 172 and onwards, with the cut-off chosen from the Merge objects that mark the block boundaries.

In [8]:
for layer in InceptionV3_model.layers[:172]:
    layer.trainable = False
for layer in InceptionV3_model.layers[172:]:
    layer.trainable = True
    
# recompile so the modifications take effect
optimizer = SGD(lr=1e-4, momentum=0.9, decay=0.0, nesterov=True)
InceptionV3_model.compile(optimizer = optimizer,
                         loss = 'categorical_crossentropy',
                         metrics = ['accuracy'])
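
To make the optimizer settings concrete, here is a sketch of a single SGD step with Nesterov momentum on one scalar parameter. It follows a commonly used formulation (velocity update, then a look-ahead step), which is assumed here to match what Keras does; treat it as an illustration rather than a reference implementation. The function name and the toy loss are made up.

```python
def sgd_nesterov_step(param, grad, velocity, lr=1e-4, momentum=0.9):
    """Return (new_param, new_velocity) after one Nesterov-momentum update."""
    new_velocity = momentum * velocity - lr * grad
    # look-ahead: apply the momentum term once more on top of the gradient step
    new_param = param + momentum * new_velocity - lr * grad
    return new_param, new_velocity


param, velocity = 1.0, 0.0
for step in range(3):
    grad = 2.0 * param            # gradient of the toy loss f(p) = p^2
    param, velocity = sgd_nesterov_step(param, grad, velocity, lr=0.1)
    print(step, param)
```

The toy run uses a large learning rate so the movement is visible; the notebook's `lr=1e-4` keeps each update small so the pre-trained weights are only adjusted gently.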

## Fit the model
Fit the model and save the best performer in terms of validation accuracy over the epochs.

In [9]:
best_model_file = 'best_weights.h5'
best_model = ModelCheckpoint(best_model_file, monitor='val_acc', 
                             verbose=1, save_best_only=True)

InceptionV3_model.fit_generator(train_generator,
                               samples_per_epoch = nbr_train_samples,
                               nb_epoch = nbr_epochs,
                               validation_data = val_generator,
                               nb_val_samples = nbr_val_samples,
                               callbacks = [best_model])

Epoch 1/1


<keras.callbacks.History at 0x12c717f98>
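
The `save_best_only=True` behaviour amounts to remembering the best monitored value so far and writing weights only when it improves. A toy version of that bookkeeping, with a hypothetical function name and made-up accuracies, might look like this:

```python
def epochs_that_save(val_accuracies):
    """Return the epoch indices at which a best-only checkpoint would be written."""
    best = float('-inf')
    saved = []
    for epoch, acc in enumerate(val_accuracies):
        if acc > best:            # higher accuracy is better
            best = acc
            saved.append(epoch)   # a real callback would save the weights here
    return saved


history = [0.71, 0.80, 0.78, 0.85, 0.85]   # made-up validation accuracies
saved_epochs = epochs_that_save(history)
print(saved_epochs)
```

Note that a tie does not trigger a save in this sketch, so the file always holds the earliest epoch that reached the best score.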

## Make predictions
Although the validation set was scored without any image alteration, we will apply augmentation averaging to the test data.  The idea is to alter the test images with augmentation methods before predicting their class with the model.  By averaging the predictions over several augmented passes, the result can outperform a straightforward single-pass prediction.

In [15]:
# build the test generator
nbr_test_samples = 1000
test_data_dir = '../data/test_stg1/'

test_datagen = ImageDataGenerator(rescale=1./255,
                                 shear_range=0.1,
                                 zoom_range=0.1,
                                 width_shift_range=0.1,
                                 height_shift_range=0.1,
                                 horizontal_flip=True)

In [16]:
nbr_augmentation = 5
for idx in range(nbr_augmentation):
    if idx == 0: 
        print('{}st augmentation for testing ...'.format(idx+1))
    else:
        print('{}th augmentation for testing ...'.format(idx+1))
    random_seed = 21 + idx  # a different seed per pass, so each pass draws different augmentations
    
    test_generator = test_datagen.flow_from_directory(
                    test_data_dir,
                    target_size = (img_height, img_width),  # keras expects (height, width)
                    batch_size = batch_size,
                    shuffle = False,
                    seed = random_seed,
                    classes = None,
                    class_mode = None)
    
    test_image_list = test_generator.filenames

    if idx == 0:
        predictions = InceptionV3_model.predict_generator(test_generator, 
                                                          nbr_test_samples)
    else:
        predictions += InceptionV3_model.predict_generator(test_generator,
                                                          nbr_test_samples)    
        
predictions /= nbr_augmentation

1st augmentation for testing ...
Found 1000 images belonging to 1 classes.
2th augmentation for testing ...
Found 1000 images belonging to 1 classes.
3th augmentation for testing ...
Found 1000 images belonging to 1 classes.
4th augmentation for testing ...
Found 1000 images belonging to 1 classes.
5th augmentation for testing ...
Found 1000 images belonging to 1 classes.
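
The averaging step itself is simple: each augmented pass produces one probability row per test image, and the rows are averaged element-wise across passes. The sketch below does this with nested lists and invented numbers; `average_predictions` is a hypothetical helper, not part of the notebook.

```python
def average_predictions(passes):
    """Element-wise mean over a list of prediction matrices of equal shape."""
    n_passes = len(passes)
    n_rows = len(passes[0])
    n_cols = len(passes[0][0])
    return [[sum(p[r][c] for p in passes) / n_passes for c in range(n_cols)]
            for r in range(n_rows)]


# three augmented passes over two images and three classes (made-up values)
passes = [
    [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3]],
    [[0.5, 0.3, 0.2], [0.2, 0.7, 0.1]],
    [[0.6, 0.1, 0.3], [0.3, 0.5, 0.2]],
]
averaged = average_predictions(passes)
for row in averaged:
    print(['%.3f' % p for p in row])
```

Since every pass yields rows that sum to one, the averaged rows do too, so they can be written to the submission file without renormalizing. The averaging only makes sense if the image order is identical in every pass, which is why the test generator is built with `shuffle = False`.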


## Write the submission file
Following the Kaggle submission guidelines:

In [17]:
# write one row per test image: file name followed by the eight class probabilities
with open('submit_InceptionBasedFishing.csv', 'w') as f_submit:
    f_submit.write('image,ALB,BET,DOL,LAG,NoF,OTHER,SHARK,YFT\n')
    for i, image_name in enumerate(test_image_list):
        pred = ['%.6f' % p for p in predictions[i, :]]
        f_submit.write('%s,%s\n' % (os.path.basename(image_name),
                                    ','.join(pred)))

## Results
After training for 40 epochs, the validation loss decreased to ~0.20 and the accuracy improved to ~97%.  On Kaggle's test data, however, the model scored a categorical cross-entropy loss of 0.99 (top 9% as of this writing).  This gap points to overfitting on the training data, most likely because the photos in the test set were taken on different boats.  The best way to avoid this issue is to first implement a fish detection system and then send the cropped fish through the neural net for training and prediction.

## References
* Fine-tuning Inception on keras: <br>
    https://keras.io/applications/#fine-tune-inceptionv3-on-a-new-set-of-classes
        
* Pre-processing data: <br>
    https://www.kaggle.com/c/the-nature-conservancy-fisheries-monitoring/forums/t/26202
    
* Google's Inception model: <br>
    https://arxiv.org/pdf/1512.00567v3.pdf