# The Nature Conservancy Fisheries Monitoring

In this notebook, I try to solve [The Nature Conservancy Fisheries Monitoring](https://www.kaggle.com/c/the-nature-conservancy-fisheries-monitoring) challenge using a simple fine-tuning approach.

Download the challenge data, put it in the same folder as this notebook, then run all the cells.

In [1]:
import os
import shutil
import hashlib
import zipfile

from keras import layers

Using TensorFlow backend.


In [2]:
data_base_dir = 'data'

train_path = 'train.zip'
test_stg1_path = 'test_stg1.zip'
test_stg2_path = 'test_stg2.7z'

In [3]:
if not os.path.exists(data_base_dir):
    os.makedirs(data_base_dir)

In [4]:
def extract_zip(zip_path, out_dir):
    # Assumes the archive contains a single top-level folder named after the zip file.
    name = os.path.basename(zip_path).split('.')[0]
    # The context manager closes the archive even if extraction fails.
    with zipfile.ZipFile(zip_path, 'r') as zip_ref:
        zip_ref.extractall(out_dir)
    return os.path.join(out_dir, name)

In [5]:
#extract training data
orig_train_dir = extract_zip(train_path, data_base_dir)

In [6]:
#extract stage_1 test data
test_stg1_dir  = extract_zip(test_stg1_path, os.path.join(data_base_dir, 'test'))

In [7]:
%%time
import subprocess
try:
    # Requires 7-Zip installed at its default Windows location.
    subprocess.call(r'"C:\Program Files\7-Zip\7z.exe" x ' + test_stg2_path + ' -o' + data_base_dir)
except OSError:
    print("""
    Something went wrong; maybe you are not using Windows, or 7-Zip is not installed on your machine.
    You can skip this cell and extract the ----> {} <---- file with dedicated software,\
    then copy the images (not the folder) into ----> {} <----.
    Or, you can uncomment the next cell and run it (not recommended, it takes forever). In this case, \
    extra packages must be installed: py7zlib and tqdm.
    Sorry for the inconvenience.
    """.format(test_stg2_path, test_stg1_dir))

Wall time: 1min 32s
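For readers who are not on Windows, a more portable variant of the cell above might look up the 7-Zip executable on the `PATH` instead of hard-coding its location. The sketch below is only an illustration: the helper name `build_7z_command` and the candidate executable names `7z`, `7za` and `7zr` are assumptions, not part of the original notebook.

```python
import shutil

def build_7z_command(archive_path, out_dir, exe=None):
    """Return the argument list for '7z x <archive> -o<out_dir>', or None if 7-Zip is not found."""
    if exe is None:
        # Hypothetical lookup order: common names of the 7-Zip command-line tool.
        for candidate in ('7z', '7za', '7zr'):
            exe = shutil.which(candidate)
            if exe:
                break
    if not exe:
        return None
    # 'x' extracts with full paths; '-o' sets the output directory (no space before the path).
    return [exe, 'x', archive_path, '-o' + out_dir]

cmd = build_7z_command('test_stg2.7z', 'data', exe='7z')
print(cmd)
```

When the result is not `None`, the list can be passed directly to `subprocess.call(cmd)`.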


In [8]:
# extract stage_2 test data (pure-Python fallback)
# this takes forever to run; if you are in a hurry, extract it using the 7-Zip software
"""
from tqdm import tqdm_notebook
import py7zlib

class SevenZFile(object):
    
    def __init__(self, filepath):
        fp = open(filepath, 'rb')
        self.archive = py7zlib.Archive7z(fp)
        
    @classmethod
    def is_7zfile(cls, filepath):
        is7z = False
        fp = None
        try:
            fp = open(filepath, 'rb')
            archive = py7zlib.Archive7z(fp)
            n = len(archive.getnames())
            is7z = True
        finally:
            if fp:
                fp.close()
        return is7z

    def extractall(self, path):
        for name in tqdm_notebook(self.archive.getnames()):
            outfilename = os.path.join(path, name)
            outdir = os.path.dirname(outfilename)
            if not os.path.exists(outdir):
                os.makedirs(outdir)
            outfile = open(outfilename, 'wb')
            outfile.write(self.archive.getmember(name).read())
            outfile.close()
            
SevenZFile(test_stg2_path).extractall(data_base_dir)
"""

In [9]:
test_stg2_dir = os.path.join(data_base_dir, 'test_stg2')

for img in os.listdir(test_stg2_dir):
    shutil.move(os.path.join(test_stg2_dir, img), test_stg1_dir)

## Train/validation split

In [10]:
training_dir = os.path.join(data_base_dir, 'train_val_split', 'training')
validation_dir = os.path.join(data_base_dir, 'train_val_split', 'validation')

In [11]:
classes = [class_ for class_ in os.listdir(orig_train_dir) if os.path.isdir(os.path.join(orig_train_dir, class_))]

In [12]:
classes

['ALB', 'BET', 'DOL', 'LAG', 'NoF', 'OTHER', 'SHARK', 'YFT']

In [13]:
for class_ in classes:
    
    class_orig_dir = os.path.join(orig_train_dir, class_)
    class_training_dir = os.path.join(training_dir, class_)
    class_validation_dir = os.path.join(validation_dir, class_)
    
    if not os.path.exists(class_training_dir):
        os.makedirs(class_training_dir)
        
    if not os.path.exists(class_validation_dir):
        os.makedirs(class_validation_dir)

    img_list = os.listdir(class_orig_dir)

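    for img in img_list:
        # Hash the file name so that the split is deterministic across runs:
        # roughly 90% of the images go to training, the rest to validation.
        hash_name = hashlib.sha1(img.encode('ascii'))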
        if int(hash_name.hexdigest(), 16) % 1000 > 100:
            shutil.copy(os.path.join(class_orig_dir, img), class_training_dir)
        else:
            shutil.copy(os.path.join(class_orig_dir, img), class_validation_dir)
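The hash-based rule used in the loop above can be tried in isolation. The following is a minimal sketch, assuming the same SHA-1 rule as the cell; the helper name `is_validation` and the synthetic file names are made up for illustration. Because 101 of the 1000 possible remainders fall in the validation bucket, the expected validation share is about 10%.

```python
import hashlib

def is_validation(filename, threshold=100):
    """Return True if the file falls in the validation bucket (hash mod 1000 <= threshold)."""
    digest = hashlib.sha1(filename.encode('ascii')).hexdigest()
    return int(digest, 16) % 1000 <= threshold

# Hypothetical file names, used only to estimate the split ratio.
names = ['img_{:05d}.jpg'.format(i) for i in range(10000)]
val_fraction = sum(is_validation(n) for n in names) / len(names)
print(round(val_fraction, 3))
```

Since the decision depends only on the file name, an image always lands in the same subset, however many times the split is rerun.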

## Fine-tuning InceptionResNetV2 (pretrained on ImageNet)

Convolutional neural networks are very powerful at learning complex vision tasks, but they often require huge amounts of data and may take days or weeks to converge. Unfortunately, data is expensive and difficult to acquire, and most new applications have neither a large dataset nor the required training time. For example, our 8-class dataset contains fewer than 4,000 images, which is clearly insufficient for a deep neural network to learn enough patterns to solve the task from scratch.

An elegant solution to this issue is transfer learning. The idea is to start with a network that was previously trained on a large dataset, typically on a large-scale image-classification task, and then use it either as an initialization or as a fixed feature extractor for the task of interest.
* CNN as a feature extractor: we take the pretrained network, remove the last classification layer, replace it with one that has the proper number of classes, and train only this final layer on our dataset.
* Fine-tuning the neural network: the second strategy is to not only replace and retrain the classifier on top of the network, but also to fine-tune the weights of the pretrained network by continuing backpropagation. It is possible to fine-tune all the layers of the network, or to keep some of the earlier layers fixed (due to overfitting concerns) and fine-tune only some higher-level portion of the network.

This is motivated by the observation that the earlier layers of a CNN contain more generic features (e.g. edge detectors or color blob detectors) that should be useful for many tasks, while later layers become progressively more specific to the details of the classes contained in the original dataset. ImageNet, for example, contains many sea creature and fish classes, so a significant portion of the representational power of the CNN may be devoted to features that are specific to differentiating between fish species [1].

The strategy I adopted is to take a model pretrained on the ImageNet dataset and fine-tune part of it on our dataset. The model I have chosen is *__InceptionResNetV2__*. This architecture, like most CNNs, comprises two parts: a series of convolutional and pooling layers called the **convolutional base**, followed by a global average pooling layer and a densely connected layer that act as a **classifier**.
First, we get rid of the **classification head** that was intended to classify the image into one of the 1000 ImageNet classes, and we replace it with a global average pooling layer and an 8-node dense layer. We are going to fine-tune some of the last layers, but at the beginning we need to freeze the whole convolutional base and train only the dense layer. The reason is that the weights of the new layer are initialized randomly, so the error at the beginning is very large, and backpropagation would send back very large gradients that would destroy the already-trained layers. Once the classifier on top has been trained, we can unfreeze some of the top convolutional layers and train both these layers and the layer we added. This process is summarized in the list below and illustrated in the figures that follow it (reproduced from [2]).
  1. Add custom layers on top of an already-trained base network.
  2. Freeze the base network.
  3. Train the layers we added.
  4. Unfreeze some layers in the base network.
  5. Jointly train both these layers and the layer we added.
<img src="figures/transfer_learning_1.png" width="400" height="200" />
<img src="figures/transfer_learning_2.png" width="600" height="400" />


In [14]:
from keras.applications.inception_resnet_v2 import InceptionResNetV2

In [15]:
#from keras.applications.vgg16 import VGG16

In [16]:
conv_base = InceptionResNetV2(weights='imagenet', include_top=False)  # ImageNet weights, no classification head

In [17]:
from keras import models, optimizers

In [18]:
model = models.Sequential()

In [19]:
model.add(conv_base)
model.add(layers.GlobalAveragePooling2D())
model.add(layers.Dense(8, activation='softmax'))

In [20]:
conv_base.trainable = False

In [21]:
model.compile(loss='categorical_crossentropy', optimizer=optimizers.Adam(lr=1e-4), metrics=['accuracy'])

In [22]:
from keras.callbacks import ModelCheckpoint, EarlyStopping

checkpointer = ModelCheckpoint('quicksign_inception_resnet_512.h5', monitor='val_loss', save_best_only=True, verbose=1)
earlystopper = EarlyStopping(monitor='val_loss', patience=2)
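To make the role of `patience` concrete, here is a simplified pure-Python model of early stopping on a monitored loss. It is a sketch, not the Keras implementation: the function name `early_stopping_epoch` and the loss values are invented for illustration, and details such as a minimum improvement threshold are ignored.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None if it runs to the end."""
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            # New best value: remember it and reset the counter.
            best = loss
            wait = 0
        else:
            # No improvement: stop once this has happened `patience` epochs in a row.
            wait += 1
            if wait >= patience:
                return epoch
    return None

print(early_stopping_epoch([1.00, 0.90, 0.95, 0.97, 0.80], patience=2))  # stops at epoch 4
```

With `patience=2`, the run ends after two consecutive epochs without improvement, so the later drop to 0.80 is never reached; a larger patience trades training time for a chance to get past such plateaus.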


### Data augmentation

Data augmentation is a technique to get around the lack of training data by creating synthetic samples from the existing ones. It has been particularly effective for image classification. With very simple transformations, we can create thousands of new, validly labeled images that make the trained model more robust.
The transformations used here are:
* rotation
* vertical and horizontal shift
* zoom
* shear
* horizontal flip

The Keras *ImageDataGenerator* class generates batches of tensor image data with real-time data augmentation.
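To give an intuition for what two of these transformations do to the pixels, here is a toy sketch on a tiny image stored as a list of lists. The helper names `horizontal_flip` and `shift_right` are hypothetical, and padding with a constant value is a simplification chosen for readability rather than a description of what Keras does internally.

```python
def horizontal_flip(image):
    """Mirror each row of a 2-D image given as a list of lists."""
    return [row[::-1] for row in image]

def shift_right(image, pixels, fill=0):
    """Shift the image to the right by `pixels` columns, padding the left with `fill`."""
    width = len(image[0])
    return [([fill] * pixels + row)[:width] for row in image]

image = [[1, 2, 3],
         [4, 5, 6]]
print(horizontal_flip(image))   # [[3, 2, 1], [6, 5, 4]]
print(shift_right(image, 1))    # [[0, 1, 2], [0, 4, 5]]
```

In both cases the content of the picture, and therefore its label, is unchanged, which is what makes the transformed images usable as extra training samples.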

In [23]:
from keras.preprocessing.image import ImageDataGenerator

In [24]:
train_data_gen = ImageDataGenerator(rescale=1./255,
                                   rotation_range=40,
                                   width_shift_range=0.2,
                                   height_shift_range=0.2,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True)
validation_data_gen = ImageDataGenerator(rescale=1./255)

In [25]:
train_generator = train_data_gen.flow_from_directory(training_dir,
                                                    target_size=(512, 512),
                                                    batch_size=16,
                                                    class_mode='categorical')
validation_generator = validation_data_gen.flow_from_directory(validation_dir,
                                                    target_size=(512, 512),
                                                    batch_size=32,
                                                    class_mode='categorical')

Found 3393 images belonging to 8 classes.
Found 384 images belonging to 8 classes.
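With these image counts and batch sizes, the number of batches per epoch can be worked out by hand. The sketch below assumes that one pass over a generator yields `ceil(n / batch_size)` batches, the last one possibly being smaller; the helper name `steps_per_epoch` is introduced here only for illustration.

```python
import math

def steps_per_epoch(num_images, batch_size):
    """Number of batches needed to see every image once."""
    return math.ceil(num_images / batch_size)

train_steps = steps_per_epoch(3393, 16)
val_steps = steps_per_epoch(384, 32)
print(train_steps, val_steps)  # 213 12
```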


In [26]:
model.fit_generator(train_generator, epochs=20, validation_data=validation_generator, verbose=2,
                    callbacks=[checkpointer, earlystopper])

Epoch 1/20
 - 277s - loss: 1.6380 - acc: 0.4366 - val_loss: 1.6365 - val_acc: 0.4661

Epoch 00001: val_loss improved from inf to 1.63651, saving model to quicksign_inception_resnet_512.h5
Epoch 2/20
 - 260s - loss: 1.5654 - acc: 0.4574 - val_loss: 1.6286 - val_acc: 0.4792

Epoch 00002: val_loss improved from 1.63651 to 1.62862, saving model to quicksign_inception_resnet_512.h5
Epoch 3/20
 - 256s - loss: 1.5393 - acc: 0.4554 - val_loss: 1.5856 - val_acc: 0.4896

Epoch 00003: val_loss improved from 1.62862 to 1.58559, saving model to quicksign_inception_resnet_512.h5
Epoch 4/20
 - 261s - loss: 1.5076 - acc: 0.4627 - val_loss: 1.5803 - val_acc: 0.4948

Epoch 00004: val_loss improved from 1.58559 to 1.58032, saving model to quicksign_inception_resnet_512.h5
Epoch 5/20
 - 259s - loss: 1.4808 - acc: 0.4695 - val_loss: 1.5296 - val_acc: 0.5339

Epoch 00005: val_loss improved from 1.58032 to 1.52963, saving model to quicksign_inception_resnet_512.h5
Epoch 6/20
 - 258s - loss: 1.4586 - acc: 0.4

<keras.callbacks.History at 0x2538dc7c6a0>

In [27]:
model.load_weights('quicksign_inception_resnet_512.h5')

In [28]:
earlystopper = EarlyStopping(monitor='val_loss', patience=5)



In [29]:
conv_base.trainable = True
set_trainable = False
# Keep every layer frozen until 'conv_7b' is reached, then unfreeze it and all layers after it.
for layer in conv_base.layers:
    if 'conv_7b' in layer.name:
        set_trainable = True
    layer.trainable = set_trainable
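The selective unfreezing above can be checked without loading any weights. This is a minimal sketch using stand-in objects: the function name `unfreeze_from` and the layer names in the list are hypothetical, and the only assumption is that each layer exposes `name` and `trainable` attributes, as Keras layers do.

```python
from types import SimpleNamespace

def unfreeze_from(layers, marker):
    """Freeze layers before the first one whose name contains `marker`; unfreeze the rest."""
    trainable = False
    for layer in layers:
        if marker in layer.name:
            trainable = True
        layer.trainable = trainable
    return [layer.name for layer in layers if layer.trainable]

# Stand-in layer objects with made-up names.
layers = [SimpleNamespace(name=n, trainable=True)
          for n in ('conv_a', 'conv_b', 'conv_7b', 'conv_7b_bn')]
print(unfreeze_from(layers, 'conv_7b'))  # ['conv_7b', 'conv_7b_bn']
```

Printing the names of the trainable layers in this way is a quick sanity check that only the intended top portion of the network will be updated.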
    

In [30]:
model.compile(loss='categorical_crossentropy', optimizer=optimizers.Adam(lr=1e-5), metrics=['accuracy'])

In [31]:
model.fit_generator(train_generator, epochs=100, validation_data=validation_generator, verbose=2,
                    callbacks=[checkpointer, earlystopper])

Epoch 1/100
 - 275s - loss: 1.3416 - acc: 0.5364 - val_loss: 1.3056 - val_acc: 0.5599

Epoch 00001: val_loss improved from 1.49580 to 1.30556, saving model to quicksign_inception_resnet_512.h5
Epoch 2/100
 - 267s - loss: 1.2747 - acc: 0.5678 - val_loss: 1.2482 - val_acc: 0.5599

Epoch 00002: val_loss improved from 1.30556 to 1.24817, saving model to quicksign_inception_resnet_512.h5
Epoch 3/100
 - 272s - loss: 1.2328 - acc: 0.5748 - val_loss: 1.2010 - val_acc: 0.5677

Epoch 00003: val_loss improved from 1.24817 to 1.20097, saving model to quicksign_inception_resnet_512.h5
Epoch 4/100
 - 261s - loss: 1.2006 - acc: 0.5872 - val_loss: 1.1665 - val_acc: 0.5781

Epoch 00004: val_loss improved from 1.20097 to 1.16649, saving model to quicksign_inception_resnet_512.h5
Epoch 5/100
 - 254s - loss: 1.1354 - acc: 0.6042 - val_loss: 1.1372 - val_acc: 0.5833

Epoch 00005: val_loss improved from 1.16649 to 1.13716, saving model to quicksign_inception_resnet_512.h5
Epoch 6/100
 - 258s - loss: 1.1210 

Epoch 45/100
 - 252s - loss: 0.6555 - acc: 0.7878 - val_loss: 0.6936 - val_acc: 0.7839

Epoch 00045: val_loss did not improve from 0.69211
Epoch 46/100
 - 252s - loss: 0.6457 - acc: 0.7984 - val_loss: 0.6879 - val_acc: 0.7786

Epoch 00046: val_loss improved from 0.69211 to 0.68785, saving model to quicksign_inception_resnet_512.h5
Epoch 47/100
 - 253s - loss: 0.6442 - acc: 0.7946 - val_loss: 0.6833 - val_acc: 0.7839

Epoch 00047: val_loss improved from 0.68785 to 0.68334, saving model to quicksign_inception_resnet_512.h5
Epoch 48/100
 - 251s - loss: 0.6354 - acc: 0.8002 - val_loss: 0.6803 - val_acc: 0.7917

Epoch 00048: val_loss improved from 0.68334 to 0.68025, saving model to quicksign_inception_resnet_512.h5
Epoch 49/100
 - 253s - loss: 0.6569 - acc: 0.7905 - val_loss: 0.6772 - val_acc: 0.7943

Epoch 00049: val_loss improved from 0.68025 to 0.67716, saving model to quicksign_inception_resnet_512.h5
Epoch 50/100
 - 254s - loss: 0.6333 - acc: 0.7961 - val_loss: 0.6750 - val_acc: 0.794

 - 258s - loss: 0.4691 - acc: 0.8512 - val_loss: 0.5354 - val_acc: 0.8438

Epoch 00092: val_loss improved from 0.54059 to 0.53543, saving model to quicksign_inception_resnet_512.h5
Epoch 93/100
 - 256s - loss: 0.4665 - acc: 0.8562 - val_loss: 0.5299 - val_acc: 0.8359

Epoch 00093: val_loss improved from 0.53543 to 0.52991, saving model to quicksign_inception_resnet_512.h5
Epoch 94/100
 - 253s - loss: 0.4814 - acc: 0.8460 - val_loss: 0.5303 - val_acc: 0.8333

Epoch 00094: val_loss did not improve from 0.52991
Epoch 95/100
 - 257s - loss: 0.4882 - acc: 0.8392 - val_loss: 0.5330 - val_acc: 0.8464

Epoch 00095: val_loss did not improve from 0.52991
Epoch 96/100
 - 260s - loss: 0.4842 - acc: 0.8501 - val_loss: 0.5267 - val_acc: 0.8438

Epoch 00096: val_loss improved from 0.52991 to 0.52672, saving model to quicksign_inception_resnet_512.h5
Epoch 97/100
 - 257s - loss: 0.4781 - acc: 0.8551 - val_loss: 0.5245 - val_acc: 0.8438

Epoch 00097: val_loss improved from 0.52672 to 0.52447, saving mo

<keras.callbacks.History at 0x2538dbb87b8>

### Retrain on the whole dataset

### Predictions and submission

In [32]:
test_data_gen = ImageDataGenerator(rescale=1./255)

test_generator = test_data_gen.flow_from_directory('data/test/',
                                                   target_size=(512, 512),
                                                   batch_size=64,
                                                   class_mode=None,  # test images have no labels
                                                   shuffle=False)

Found 13153 images belonging to 1 classes.


In [33]:
preds = model.predict_generator(test_generator, verbose=1)



In [34]:
preds.shape

(13153, 8)

In [35]:
# Use the generator's own file list so that the names line up with the rows of preds.
im_names = [os.path.basename(name) for name in test_generator.filenames]

In [36]:
# Prefix stage-2 images (named image_*.jpg) with their folder name.
im_names = ['test_stg2/' + name if 'image' in name else name for name in im_names]

In [37]:
import pandas as pd

In [38]:
df_names = pd.DataFrame({'image': im_names})

In [39]:
df_preds = pd.DataFrame(data=preds, columns=['ALB','BET','DOL','LAG','NoF','OTHER','SHARK','YFT'])

In [40]:
df_submission = pd.concat([df_names, df_preds], axis=1)

In [41]:
df_submission.to_csv('submission.csv', index=False)

In [42]:
df_submission.head()

|   | image | ALB | BET | DOL | LAG | NoF | OTHER | SHARK | YFT |
|---|---|---|---|---|---|---|---|---|---|
| 0 | test_stg2/image_00001.jpg | 0.449768 | 0.045104 | 0.000157 | 0.000516 | 0.255778 | 0.015385 | 0.034865 | 0.198427 |
| 1 | test_stg2/image_00002.jpg | 0.425014 | 0.00437 | 0.004676 | 0.000242 | 0.50453 | 0.013302 | 0.0183 | 0.029566 |
| 2 | test_stg2/image_00003.jpg | 0.836657 | 0.001974 | 0.001573 | 8.7e-05 | 0.138526 | 0.015086 | 0.0003 | 0.005797 |
| 3 | test_stg2/image_00004.jpg | 0.355983 | 0.052375 | 0.116718 | 0.05472 | 0.056654 | 0.063279 | 0.056551 | 0.24372 |
| 4 | test_stg2/image_00005.jpg | 0.936777 | 0.010723 | 0.002031 | 5.6e-05 | 0.004347 | 0.011373 | 0.015417 | 0.019274 |


## References

[1]: [CS231n: Convolutional Neural Networks for Visual Recognition](http://cs231n.github.io/transfer-learning/)

[2]: Chollet, François. *Deep Learning with Python*. Manning Publications, 2017.
