The model is a custom CNN with 21 convolution layers and 3,421,258 parameters.
It has 4 blocks of 4 convolution layers each, connected in a DenseNet-like way,
with a transition block (the bottleneck) made of a convolution with a kernel size of 1
that divides the number of filters by 2, followed by a MaxPooling2D.
I also added a convolution, with kernel_size 1 and stride 2, that is connected
to the next block, as in Xception.
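
As a sanity check on the architecture described above, the parameter count can be reproduced by hand. The sketch below is plain Python, not part of the notebook; the helper names (conv_params, bn_params, conv_bn, count_parameters) are made up for illustration. It assumes every convolution has a bias and every batch normalization layer holds 4 values per channel, which is what Keras reports in its total.

```python
def conv_params(k, c_in, c_out):
    """Weights plus biases of a k x k convolution."""
    return k * k * c_in * c_out + c_out


def bn_params(channels):
    """Gamma, beta, moving mean and moving variance."""
    return 4 * channels


def conv_bn(k, c_in, c_out):
    """One convolution followed by batch normalization."""
    return conv_params(k, c_in, c_out) + bn_params(c_out)


def count_parameters(in_channels=3, stem_filters=64, n_blocks=4, n_classes=10):
    total = conv_bn(3, in_channels, stem_filters)   # initial 3x3 convolution
    channels = stem_filters
    n_convs = 1
    for _ in range(n_blocks):
        filters = channels // 2                     # transition halves the filters
        total += conv_bn(1, channels, filters)      # transition 1x1 convolution
        total += 4 * conv_bn(3, filters, filters)   # four 3x3 convolutions
        total += conv_bn(1, filters, filters)       # strided 1x1 skip convolution
        n_convs += 6
        channels = 4 * filters                      # four tensors are concatenated
    total += channels * n_classes + n_classes       # final softmax Dense layer
    return total, n_convs, channels


total, n_convs, channels = count_parameters()
print(total, n_convs, channels)
```

This gives 3421258 parameters and 1024 channels before the global pooling. It counts 25 Conv2D layers in total; the figure of 21 presumably leaves out the four strided 1x1 skip convolutions.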

To prevent overfitting, I use:
- Data augmentation
- Batch normalization
- Saving the model with the best validation accuracy
- Global average pooling instead of a dense layer, which is prone to overfitting
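
To illustrate the last point, here is a rough comparison of the size of the classification head with global average pooling and with a flattened feature map. It is only an arithmetic sketch: the 8 x 12 x 1024 shape is what the four 2x2 poolings give for a 128 x 192 input, and the flatten variant is a hypothetical alternative, not something the notebook trains.

```python
# Final feature map for a 128x192 input after four 2x2 poolings.
height, width, channels, n_classes = 8, 12, 1024, 10

# Global average pooling keeps one value per channel.
gap_head = channels * n_classes + n_classes

# Flattening keeps every spatial position of every channel.
flatten_head = height * width * channels * n_classes + n_classes

print(gap_head)      # 10250
print(flatten_head)  # 983050
```

The pooled head has about 96 times fewer weights to fit, which is one reason it is less prone to overfitting.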

Data augmentation is also applied to the validation set to increase the number of validation samples (4059 * 3), so
the best validation accuracy should be a good indicator of the accuracy on the test set.
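
The arithmetic behind the "4059 * 3" figure can be sketched as follows. It uses the same formula as the validation_steps argument in the training cell, and assumes that the generator yields one smaller final batch per pass and then wraps around; the samples_seen helper is hypothetical.

```python
import math

n_val = 4059       # validation images
batch_size = 128
passes = 3

# Same formula as the validation_steps argument of fit_generator.
validation_steps = (n_val // batch_size) * passes

# A full pass needs one extra, smaller batch for the remaining images.
batches_per_pass = math.ceil(n_val / batch_size)


def samples_seen(steps):
    """Images drawn after `steps` batches, wrapping around the dataset."""
    full_passes, extra = divmod(steps, batches_per_pass)
    # `extra` is always smaller than batches_per_pass, so these are full batches.
    return full_passes * n_val + extra * batch_size


print(validation_steps)                # 93
print(samples_seen(validation_steps))  # 11830
```

Under these assumptions each evaluation sees 11830 augmented images, slightly fewer than 3 full passes (12177).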

A callback saves the weights with the best validation accuracy.

The model mitigates vanishing gradients by using:
- Skip connections
- ReLU as the activation function
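
A back-of-the-envelope illustration of why the activation matters: the derivative of a sigmoid is at most 0.25, so the activation alone can shrink a gradient by that factor at every layer, while an active ReLU passes it through unchanged. The sketch below ignores the weights, batch normalization and skip connections, so it is only a bound on the activation terms, not a simulation of this network.

```python
import math


def sigmoid_grad(x):
    """Derivative of the logistic sigmoid at x."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)


def relu_grad(x):
    """Derivative of ReLU at x (taken as 0 at x == 0)."""
    return 1.0 if x > 0 else 0.0


depth = 21  # number of stacked convolutions

# 0.25 is the largest value the sigmoid derivative can take (at x = 0).
sigmoid_bound = sigmoid_grad(0.0) ** depth
relu_active = relu_grad(1.0) ** depth

print(sigmoid_bound)  # about 2.3e-13
print(relu_active)    # 1.0
```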

I tried adding intermediate output layers, as in the Inception model, but it did not give
better results, maybe because the neural network is quite small.

The optimizer is Adam with an initial learning rate of 1e-3. The learning rate is lowered
in later training sessions (to 1e-4, then 1e-5), once the weights are already well trained.
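
The learning-rate schedule used in the training loop can be summarised as a small function. This is a sketch that mirrors the recompile steps in the training cell (session 2 switches to 1e-4, session 3 to 1e-5); learning_rate_for_session is a hypothetical helper, and the first two sessions are assumed to use the rate set when the model is first compiled.

```python
def learning_rate_for_session(session, initial_lr=1e-3):
    """Learning rate used during training session number `session` (0-based)."""
    if session < 2:
        return initial_lr  # sessions 0 and 1 keep the initial rate
    if session == 2:
        return 1e-4        # first reduction
    return 1e-5            # session 3 and later keep the lowest rate


print([learning_rate_for_session(i) for i in range(5)])
# [0.001, 0.001, 0.0001, 1e-05, 1e-05]
```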

After 30 epochs, the model achieves 92% accuracy on the validation set.

After 60 epochs, the model achieves 93.29% accuracy on the validation set.

After 90 epochs, the model achieves 94.56% accuracy on the validation set.

After 120 epochs, the model achieves 95.48% accuracy on the validation set.

Every 30 epochs, I helped the network train by loading the weights with the best validation accuracy
from the previous 30 epochs and restarting a training session (3 times).

To train from scratch, comment out the load_weights call in the second cell and change the learning rate to 1e-3;
then, in the cell that calls fit_generator, set nb_training to 5 and epochs to 40.

In [1]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import os
from keras.models import load_model
from keras import backend as K

Using TensorFlow backend.


In [2]:
from keras.models import Model
from keras.layers import (Input, Dense, Conv2D, MaxPooling2D, GlobalAveragePooling2D,
                          BatchNormalization, concatenate)
from keras.optimizers import Adam

input_shape = (128, 192, 3)

def conv_normalization(filters, kernel_size, layer, strides=1):
    # Convolution with ReLU activation, followed by batch normalization.
    x = Conv2D(filters, kernel_size, activation='relu', padding='same', strides=strides, kernel_initializer='he_normal')(layer)
    x = BatchNormalization()(x)
    return x

def cnn_model(input_shape, len_out, max_pool = True):
    channel = 64
    inputs = Input(shape=input_shape)
    conv = conv_normalization(channel, 3, inputs)
    
    filters = channel
    for i in range(4):  # 4 dense blocks
        filters = filters // 2  # the transition block halves the number of filters
        print(filters)
        conv = conv_normalization(filters, 1, conv)  # transition block: 1x1 convolution
        conv_prev = conv  # kept for the strided skip connection
        conv = MaxPooling2D(pool_size=(2, 2))(conv)

        conv_inter = conv_normalization(filters, 3, conv)
        conv_inter2 = conv_normalization(filters, 3, conv_inter)
        conv_inter3 = conv_normalization(filters, 3, conv_inter2)
        # Concatenate the 3x3 convolution outputs with a strided 1x1 convolution
        # of the pre-pooling tensor (Xception-like skip connection).
        conv = concatenate([conv_normalization(filters, 3, conv_inter3), conv_inter, conv_inter2,
                            conv_normalization(filters, 1, conv_prev, 2)], axis=3)
        filters = 4 * filters  # number of channels after the concatenation
        
    x = GlobalAveragePooling2D()(conv)
    outputs = Dense(len_out, activation='softmax')(x)

    model = Model(inputs, outputs)

    return model, filters

model, filters = cnn_model(input_shape, 10)
model.load_weights('../input/convo-21/clement_fang_cnn21.h5')

model.compile(optimizer = Adam(lr = 1e-4),
        loss='categorical_crossentropy',
        metrics=['accuracy'])

Instructions for updating:
Colocations handled automatically by placer.
32
64
128
256


In [3]:
model.summary()

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            (None, 128, 192, 3)  0                                            
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, 128, 192, 64) 1792        input_1[0][0]                    
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, 128, 192, 64) 256         conv2d_1[0][0]                   
__________________________________________________________________________________________________
conv2d_2 (Conv2D)               (None, 128, 192, 32) 2080        batch_normalization_1[0][0]      
__________________________________________________________________________________________________
batch_norm

In [4]:
from keras.preprocessing.image import ImageDataGenerator
batch_size = 128
input_size = (128, 192)

train_datagen = ImageDataGenerator(
            rescale=1. / 255,
            shear_range=0.2,
            zoom_range=0.2,
            validation_split=0.1,
            horizontal_flip=True)

train_generator = train_datagen.flow_from_directory(
            directory="../input/training-ships/ships_train2/ships_train",
            target_size=input_size,
            color_mode="rgb",
            batch_size=batch_size,
            class_mode="categorical",
            subset = "training",
            shuffle=True)

val_generator = train_datagen.flow_from_directory(
            directory="../input/training-ships/ships_train2/ships_train",
            target_size=input_size,
            color_mode="rgb",
            batch_size=batch_size,
            class_mode="categorical",
            subset = "validation",
            shuffle=False)

Found 36568 images belonging to 10 classes.
Found 4059 images belonging to 10 classes.


In [5]:
from keras.utils import np_utils

def compute_score(res, Y_test):
    s = 0
    for i in range (len(res)):
        if res[i] == Y_test[i]:
            s += 1
    return s / (len(res))

ships = np.load('../input/reco-nav-2/ships_test.npz')
X_test = ships['X']
Y_test = ships['Y']
Y_cat = np_utils.to_categorical(Y_test)
del ships

X_test = X_test.astype('float32')
X_test /= 255

In [6]:
import matplotlib.pyplot as plt
from keras.callbacks import ModelCheckpoint

nb_training = 0  # 5 for long session
epochs = 40

weight_path = 'network.hdf5'
model_checkpoint = ModelCheckpoint(weight_path, monitor='val_acc', save_best_only=True, save_weights_only=True, verbose = 1)

for i in range (nb_training):
    if i == 2:
        model.compile(optimizer = Adam(lr = 1e-4), # change the lr from 1e-3 to 1e-4
        loss='categorical_crossentropy',
        metrics=['accuracy'])
    elif i == 3:
        model.compile(optimizer = Adam(lr = 1e-5), # change the lr from 1e-4 to 1e-5
        loss='categorical_crossentropy',
        metrics=['accuracy'])
    
    history = model.fit_generator(
                train_generator,
                validation_data = val_generator,
                validation_steps=(4059 // batch_size) * 3,
                callbacks=[model_checkpoint],
                steps_per_epoch = 36568 // batch_size,
                epochs=epochs)
    plt.plot(history.history['acc'])
    plt.plot(history.history['val_acc'])
    plt.title('Model accuracy')
        
    model.load_weights(weight_path)

In [7]:
if nb_training:
    plt.plot(history.history['acc'])
    plt.plot(history.history['val_acc'])
    plt.title('Model accuracy')

In [8]:
from sklearn.metrics import classification_report, confusion_matrix

model.save("clement_fang_21cnn.h5")
res = model.predict(X_test).argmax(axis=1)
confu = confusion_matrix(Y_cat.argmax(axis=1), res)
print ("score: ", compute_score(res, Y_test))
types = ['containership', 'cruiser', 'destroyer','coastguard', 'smallfish', 'methanier', 'cv', 'corvette', 'submarine', 'tug']
pd.DataFrame({types[i][:3]:confu[:,i] for i in range(len(types))}, index=types)

score:  0.9560439560439561


               con  cru  des  coa  sma  met   cv  cor  sub  tug
containership  275    1    0    0    0    0    0    0    0    0
cruiser          0  258    0    0    0    0    1    0    0    0
destroyer        0    0  273    2    0    0    3    6    0    0
coastguard       0    1    2  136    3    0    0    2    1    4
smallfish        1    0    0    2  127    0    0    0    0    0
methanier        2    0    0    0    3  158    0    1    0    0
cv               0    1    3    0    1    0   80    1    0    0
corvette         0    0   15    2    2    0    5  113    1    0
submarine        0    0    0    0    1    0    1    1   97    3
tug              0    0    1    2    1    0    0    0    0  136
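
The confusion matrix above can be broken down per class. In it, rows are the true classes and columns the predicted ones, as returned by sklearn's confusion_matrix. The following sketch is an extra analysis, not part of the original notebook: it copies the values from the table and computes accuracy, recall and precision in plain Python.

```python
types = ['containership', 'cruiser', 'destroyer', 'coastguard', 'smallfish',
         'methanier', 'cv', 'corvette', 'submarine', 'tug']

# Rows: true class, columns: predicted class (values copied from the table).
confusion = [
    [275, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 258, 0, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 273, 2, 0, 0, 3, 6, 0, 0],
    [0, 1, 2, 136, 3, 0, 0, 2, 1, 4],
    [1, 0, 0, 2, 127, 0, 0, 0, 0, 0],
    [2, 0, 0, 0, 3, 158, 0, 1, 0, 0],
    [0, 1, 3, 0, 1, 0, 80, 1, 0, 0],
    [0, 0, 15, 2, 2, 0, 5, 113, 1, 0],
    [0, 0, 0, 0, 1, 0, 1, 1, 97, 3],
    [0, 0, 1, 2, 1, 0, 0, 0, 0, 136],
]

total = sum(sum(row) for row in confusion)
correct = sum(confusion[i][i] for i in range(len(types)))
accuracy = correct / total

# Recall: correct predictions divided by the number of true samples (row sum).
recall = {name: confusion[i][i] / sum(confusion[i])
          for i, name in enumerate(types)}
# Precision: correct predictions divided by the number of predictions (column sum).
precision = {name: confusion[i][i] / sum(row[i] for row in confusion)
             for i, name in enumerate(types)}

worst = min(recall, key=recall.get)
print(correct, total, accuracy)
for name in types:
    print(f"{name:14s} recall={recall[name]:.3f} precision={precision[name]:.3f}")
print("lowest recall:", worst)
```

The accuracy matches the score printed above (1653 correct out of 1729). The class with the lowest recall is corvette (113 out of 138), mostly because 15 corvettes are predicted as destroyers.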

