## Updated TL;DR

I am just using the pretrained weights from [@martinpiotte](https://kaggle.com/martinpiotte). Thanks to **@suicaokhoailang** for creating the updated kernel. I think the important steps to improve the score to 0.9 are:
- Get rid of `lapjv` dependency. It really slows down training/trying different ideas.
- Load images as RGB (and retrain). I can't find where, but the team currently in first place wrote that it helps by ~0.1.
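
One possible way to drop the `lapjv` dependency is to pick non-matching pairs greedily instead of solving a full assignment problem. The sketch below is only an assumption about how that could look, not something any of the kernels mentioned here do. `greedy_hard_negatives` and its parameters are hypothetical names, and unlike the LAP solution the same image can be chosen as a partner more than once.

```python
import numpy as np


def greedy_hard_negatives(score, same_whale, noise=0.0, seed=0):
    """
    Hypothetical replacement for the LAP step (illustration only).
    score:      (n, n) similarity matrix, higher means "looks more alike".
    same_whale: (n, n) boolean mask, True where i and j must not be paired
                (same whale, including i == j).
    Returns an array neg where neg[i] is the non-matching partner chosen for i.
    """
    rng = np.random.default_rng(seed)
    # Optional random component, similar in spirit to the 'ampl' noise in this kernel
    s = score + noise * rng.random(score.shape)
    # Forbidden partners can never win the argmax
    s = np.where(same_whale, -np.inf, s)
    return s.argmax(axis=1)


score = np.array([[0.0, 0.9, 0.2],
                  [0.9, 0.0, 0.4],
                  [0.2, 0.4, 0.0]])
same = np.eye(3, dtype=bool)
neg = greedy_hard_negatives(score, same)
print(neg)  # → [1 0 1]
```

Image 1 is picked twice here, which a one-to-one assignment would not allow. Whether that hurts training is something you would have to test.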

### Interesting:
- The `mpiotte-bootstrap-model` only scored `0.697`, although it was better on the playground competition.

## TL;DR

I tried to refactor [@martinpiotte](https://kaggle.com/martinpiotte)'s original kernel [here](https://www.kaggle.com/martinpiotte/whale-recognition-model-with-score-0-78563).

I changed almost nothing besides commenting out the last 380 epochs, since they can't fit into a kernel. I also generated the new bounding boxes in my kernel [here](https://www.kaggle.com/suicaokhoailang/generating-whale-bounding-boxes) and saved them as a **.csv** instead of a **pickle** for readability.

A few things to point out:

- Training longer will probably improve your score, possibly for as many as 500 epochs. We only train for 20 epochs in this kernel.

- You may be able to reduce your training time by applying this technique (thanks **Brian**): https://www.kaggle.com/c/humpback-whale-identification/discussion/74402#444476

- Consider using one or more pretrained models; they are good for blending.

In [1]:
#!pip install lap
import gzip
import pickle
import platform
import random
import sys
from lap import lapjv
from math import sqrt
from os.path import isfile

import keras
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from PIL import Image as pil_image
from imagehash import phash
from keras import backend as K
from keras import regularizers
from keras.engine.topology import Input
from keras.layers import Activation, Add, BatchNormalization, Concatenate, Conv2D, Dense, Flatten, GlobalMaxPooling2D, \
    Lambda, MaxPooling2D, Reshape
from keras.models import Model
from keras.optimizers import Adam
from keras.preprocessing.image import img_to_array
from keras.utils import Sequence
from pandas import read_csv
from scipy.ndimage import affine_transform
from tqdm import tqdm_notebook as tqdm
import time

Using TensorFlow backend.


In [3]:
TRAIN_DF = './data/train.csv'
SUB_Df = './data/sample_submission.csv'
TRAIN = './data/train/'
TEST = '../data/test/'
P2H = './data/metadata/p2h.pickle'
P2SIZE = './data/metadata/p2size.pickle'
BB_DF = "./data/metadata/bounding_boxes.csv"
tagged = dict([(p, w) for _, p, w in read_csv(TRAIN_DF).to_records()])
submit = [p for _, p, _ in read_csv(SUB_Df).to_records()]
join = list(tagged.keys()) + submit

In [4]:
def expand_path(p):
    if isfile(TRAIN + p):
        return TRAIN + p
    if isfile(TEST + p):
        return TEST + p
    return p

## Duplicate image identification

This part comes from the original kernel. It seems that duplicated images were a real issue in the playground competition. I don't know whether that is the case in this one, but I took one for the team and generated the results anyway. I'm such a nice chap.

In [7]:
if isfile(P2SIZE):
    print("P2SIZE exists.")
    with open(P2SIZE, 'rb') as f:
        p2size = pickle.load(f)
else:
    p2size = {}
    for p in tqdm(join):
        try:
            size = pil_image.open(expand_path(p)).size
            p2size[p] = size
        except Exception:
            # Print the name of every image that could not be opened
            print(p)

00028a005.jpg
000dcf7d8.jpg
000e7c7df.jpg
...
[output truncated: the cell prints one filename for each image that could not be opened]

In [None]:
def match(h1, h2):
    for p1 in h2ps[h1]:
        for p2 in h2ps[h2]:
            i1 = pil_image.open(expand_path(p1))
            i2 = pil_image.open(expand_path(p2))
            if i1.mode != i2.mode or i1.size != i2.size: return False
            a1 = np.array(i1)
            a1 = a1 - a1.mean()
            a1 = a1 / sqrt((a1 ** 2).mean())
            a2 = np.array(i2)
            a2 = a2 - a2.mean()
            a2 = a2 / sqrt((a2 ** 2).mean())
            a = ((a1 - a2) ** 2).mean()
            if a > 0.1: return False
    return True


if isfile(P2H):
    print("P2H exists.")
    with open(P2H, 'rb') as f:
        p2h = pickle.load(f)
else:
    # Compute phash for each image in the training and test set.
    p2h = {}
    for p in tqdm(join):
        img = pil_image.open(expand_path(p))
        h = phash(img)
        p2h[p] = h

    # Find all images associated with a given phash value.
    h2ps = {}
    for p, h in p2h.items():
        if h not in h2ps: h2ps[h] = []
        if p not in h2ps[h]: h2ps[h].append(p)

    # Find all distinct phash values
    hs = list(h2ps.keys())

    # If the images are close enough, associate the two phash values (this is the slow part: n^2 algorithm)
    h2h = {}
    for i, h1 in enumerate(tqdm(hs)):
        for h2 in hs[:i]:
            if h1 - h2 <= 6 and match(h1, h2):
                s1 = str(h1)
                s2 = str(h2)
                if s1 < s2: s1, s2 = s2, s1
                h2h[s1] = s2

    # Group together images with equivalent phash, and replace by string format of phash (faster and more readable)
    for p, h in p2h.items():
        h = str(h)
        if h in h2h: h = h2h[h]
        p2h[p] = h
#     with open(P2H, 'wb') as f:
#         pickle.dump(p2h, f)
# For each image id, determine the list of pictures
h2ps = {}
for p, h in p2h.items():
    if h not in h2ps: h2ps[h] = []
    if p not in h2ps[h]: h2ps[h].append(p)
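
The duplicate check above works in two stages. First, `h1 - h2` on two `imagehash` values gives the number of differing bits, so `h1 - h2 <= 6` keeps only hashes that are nearly identical. Second, `match` compares the pixels after removing mean and contrast. Here is a minimal sketch of both ideas on plain integers and arrays. `hamming` and `normalized_mse` are illustrative names that are not part of the kernel.

```python
import numpy as np


def hamming(h1: int, h2: int) -> int:
    """Number of differing bits between two integer hashes."""
    return bin(h1 ^ h2).count("1")


def normalized_mse(a1, a2):
    """Mean squared difference after zero-mean, unit-RMS normalization."""
    a1 = a1 - a1.mean()
    a1 = a1 / np.sqrt((a1 ** 2).mean())
    a2 = a2 - a2.mean()
    a2 = a2 / np.sqrt((a2 ** 2).mean())
    return ((a1 - a2) ** 2).mean()


img = np.arange(12, dtype=float).reshape(3, 4)
brighter = 2.0 * img + 5.0          # same picture, different brightness and contrast

print(hamming(0b1011, 0b0010))      # → 2
print(normalized_mse(img, brighter) < 0.1)  # → True
```

Because of the normalization, a copy of an image that only differs in brightness or contrast still falls under the `0.1` threshold.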

In [None]:
def show_whale(imgs, per_row=2):
    n = len(imgs)
    rows = (n + per_row - 1) // per_row
    cols = min(per_row, n)
    fig, axes = plt.subplots(rows, cols, figsize=(24 // per_row * cols, 24 // per_row * rows))
    for ax in axes.flatten(): ax.axis('off')
    for i, (img, ax) in enumerate(zip(imgs, axes.flatten())): ax.imshow(img.convert('RGB'))
        

def read_raw_image(p):
    img = pil_image.open(expand_path(p))
    return img

In [None]:
# For each image id, select the preferred image
def prefer(ps):
    if len(ps) == 1: return ps[0]
    best_p = ps[0]
    best_s = p2size[best_p]
    for i in range(1, len(ps)):
        p = ps[i]
        s = p2size[p]
        if s[0] * s[1] > best_s[0] * best_s[1]:  # Select the image with highest resolution
            best_p = p
            best_s = s
    return best_p

h2p = {}
for h, ps in h2ps.items():
    h2p[h] = prefer(ps)
len(h2p), list(h2p.items())[:5]

In [None]:
# Read the bounding box data from the bounding box kernel (see reference above)
p2bb = pd.read_csv(BB_DF).set_index("Image")


img_shape = (384, 384, 1)  # The image shape used by the model
anisotropy = 2.15  # The horizontal compression ratio
crop_margin = 0.05  # The margin added around the bounding box to compensate for bounding box inaccuracy

In [None]:
def build_transform(rotation, shear, height_zoom, width_zoom, height_shift, width_shift):
    """
    Build a transformation matrix with the specified characteristics.
    """
    rotation = np.deg2rad(rotation)
    shear = np.deg2rad(shear)
    rotation_matrix = np.array(
        [[np.cos(rotation), np.sin(rotation), 0], [-np.sin(rotation), np.cos(rotation), 0], [0, 0, 1]])
    shear_matrix = np.array([[1, np.sin(shear), 0], [0, np.cos(shear), 0], [0, 0, 1]])
    zoom_matrix = np.array([[1.0 / height_zoom, 0, 0], [0, 1.0 / width_zoom, 0], [0, 0, 1]])
    shift_matrix = np.array([[1, 0, -height_shift], [0, 1, -width_shift], [0, 0, 1]])
    return np.dot(np.dot(rotation_matrix, shear_matrix), np.dot(zoom_matrix, shift_matrix))

In [None]:
def read_cropped_image(p, augment):
    """
    @param p : the name of the picture to read
    @param augment: True/False if data augmentation should be performed
    @return a numpy array with the transformed image
    """
    # If an image id was given, convert to filename
    if p in h2p:
        p = h2p[p]
    size_x, size_y = p2size[p]

    # Determine the region of the original image we want to capture based on the bounding box.
    row = p2bb.loc[p]
    x0, y0, x1, y1 = row['x0'], row['y0'], row['x1'], row['y1']
    dx = x1 - x0
    dy = y1 - y0
    x0 -= dx * crop_margin
    x1 += dx * crop_margin + 1
    y0 -= dy * crop_margin
    y1 += dy * crop_margin + 1
    if x0 < 0:
        x0 = 0
    if x1 > size_x:
        x1 = size_x
    if y0 < 0:
        y0 = 0
    if y1 > size_y:
        y1 = size_y
    dx = x1 - x0
    dy = y1 - y0
    if dx > dy * anisotropy:
        dy = 0.5 * (dx / anisotropy - dy)
        y0 -= dy
        y1 += dy
    else:
        dx = 0.5 * (dy * anisotropy - dx)
        x0 -= dx
        x1 += dx

    # Generate the transformation matrix
    trans = np.array([[1, 0, -0.5 * img_shape[0]], [0, 1, -0.5 * img_shape[1]], [0, 0, 1]])
    trans = np.dot(np.array([[(y1 - y0) / img_shape[0], 0, 0], [0, (x1 - x0) / img_shape[1], 0], [0, 0, 1]]), trans)
    if augment:
        trans = np.dot(build_transform(
            random.uniform(-5, 5),
            random.uniform(-5, 5),
            random.uniform(0.8, 1.0),
            random.uniform(0.8, 1.0),
            random.uniform(-0.05 * (y1 - y0), 0.05 * (y1 - y0)),
            random.uniform(-0.05 * (x1 - x0), 0.05 * (x1 - x0))
        ), trans)
    trans = np.dot(np.array([[1, 0, 0.5 * (y1 + y0)], [0, 1, 0.5 * (x1 + x0)], [0, 0, 1]]), trans)

    # Read the image, transform to black and white and convert to numpy array
    img = read_raw_image(p).convert('L')
    img = img_to_array(img)

    # Apply affine transformation
    matrix = trans[:2, :2]
    offset = trans[:2, 2]
    img = img.reshape(img.shape[:-1])
    img = affine_transform(img, matrix, offset, output_shape=img_shape[:-1], order=1, mode='constant',
                           cval=np.average(img))
    img = img.reshape(img_shape)

    # Normalize to zero mean and unit variance
    img -= np.mean(img, keepdims=True)
    img /= np.std(img, keepdims=True) + K.epsilon()
    return img

def read_for_training(p):
    """
    Read and preprocess an image with data augmentation (random transform).
    """
    return read_cropped_image(p, True)


def read_for_validation(p):
    """
    Read and preprocess an image without data augmentation (use for testing).
    """
    return read_cropped_image(p, False)


p = list(tagged.keys())[312]
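
The transformation matrix in `read_cropped_image` maps output pixel coordinates (row, column) back to coordinates in the original image. The sketch below rebuilds the non-augmented case with made-up bounding box numbers to show where the corners and the center of the output land. `crop_transform` is a hypothetical helper written for this illustration only.

```python
import numpy as np


def crop_transform(x0, y0, x1, y1, out_h, out_w):
    """Homogeneous matrix mapping output (row, col, 1) to source image coordinates."""
    # Move the origin to the center of the output image
    to_center = np.array([[1, 0, -0.5 * out_h], [0, 1, -0.5 * out_w], [0, 0, 1]])
    # Scale the output size to the bounding box size
    scale = np.array([[(y1 - y0) / out_h, 0, 0], [0, (x1 - x0) / out_w, 0], [0, 0, 1]])
    # Move the origin to the center of the bounding box
    to_box = np.array([[1, 0, 0.5 * (y1 + y0)], [0, 1, 0.5 * (x1 + x0)], [0, 0, 1]])
    return to_box @ scale @ to_center


# Made-up bounding box, output size as in img_shape
trans = crop_transform(x0=20, y0=10, x1=236, y1=110, out_h=384, out_w=384)
top_left = trans @ np.array([0, 0, 1])
center = trans @ np.array([192, 192, 1])
print(top_left[:2])  # approximately (y0, x0) = (10, 20)
print(center[:2])    # approximately the box center (60, 128)
```

The upper-left 2x2 part of this matrix and its last column are what the kernel passes to `affine_transform` as `matrix` and `offset`.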

In [None]:
def subblock(x, filter, **kwargs):
    x = BatchNormalization()(x)
    y = x
    y = Conv2D(filter, (1, 1), activation='relu', **kwargs)(y)  # Reduce the number of features to 'filter'
    y = BatchNormalization()(y)
    y = Conv2D(filter, (3, 3), activation='relu', **kwargs)(y)  # Extend the feature field
    y = BatchNormalization()(y)
    y = Conv2D(K.int_shape(x)[-1], (1, 1), **kwargs)(y)  # no activation # Restore the number of original features
    y = Add()([x, y])  # Add the bypass connection
    y = Activation('relu')(y)
    return y


def build_model(lr, l2, activation='sigmoid'):
    ##############
    # BRANCH MODEL
    ##############
    regul = regularizers.l2(l2)
    optim = Adam(lr=lr)
    kwargs = {'padding': 'same', 'kernel_regularizer': regul}

    inp = Input(shape=img_shape)  # 384x384x1
    x = Conv2D(64, (9, 9), strides=2, activation='relu', **kwargs)(inp)

    x = MaxPooling2D((2, 2), strides=(2, 2))(x)  # 96x96x64
    for _ in range(2):
        x = BatchNormalization()(x)
        x = Conv2D(64, (3, 3), activation='relu', **kwargs)(x)

    x = MaxPooling2D((2, 2), strides=(2, 2))(x)  # 48x48x64
    x = BatchNormalization()(x)
    x = Conv2D(128, (1, 1), activation='relu', **kwargs)(x)  # 48x48x128
    for _ in range(4):
        x = subblock(x, 64, **kwargs)

    x = MaxPooling2D((2, 2), strides=(2, 2))(x)  # 24x24x128
    x = BatchNormalization()(x)
    x = Conv2D(256, (1, 1), activation='relu', **kwargs)(x)  # 24x24x256
    for _ in range(4):
        x = subblock(x, 64, **kwargs)

    x = MaxPooling2D((2, 2), strides=(2, 2))(x)  # 12x12x256
    x = BatchNormalization()(x)
    x = Conv2D(384, (1, 1), activation='relu', **kwargs)(x)  # 12x12x384
    for _ in range(4):
        x = subblock(x, 96, **kwargs)

    x = MaxPooling2D((2, 2), strides=(2, 2))(x)  # 6x6x384
    x = BatchNormalization()(x)
    x = Conv2D(512, (1, 1), activation='relu', **kwargs)(x)  # 6x6x512
    for _ in range(4):
        x = subblock(x, 128, **kwargs)

    x = GlobalMaxPooling2D()(x)  # 512
    branch_model = Model(inp, x)

    ############
    # HEAD MODEL
    ############
    mid = 32
    xa_inp = Input(shape=branch_model.output_shape[1:])
    xb_inp = Input(shape=branch_model.output_shape[1:])
    x1 = Lambda(lambda x: x[0] * x[1])([xa_inp, xb_inp])
    x2 = Lambda(lambda x: x[0] + x[1])([xa_inp, xb_inp])
    x3 = Lambda(lambda x: K.abs(x[0] - x[1]))([xa_inp, xb_inp])
    x4 = Lambda(lambda x: K.square(x))(x3)
    x = Concatenate()([x1, x2, x3, x4])
    x = Reshape((4, branch_model.output_shape[1], 1), name='reshape1')(x)

    # Per feature NN with shared weight is implemented using CONV2D with appropriate stride.
    x = Conv2D(mid, (4, 1), activation='relu', padding='valid')(x)
    x = Reshape((branch_model.output_shape[1], mid, 1))(x)
    x = Conv2D(1, (1, mid), activation='linear', padding='valid')(x)
    x = Flatten(name='flatten')(x)

    # Weighted sum implemented as a Dense layer.
    x = Dense(1, use_bias=True, activation=activation, name='weighted-average')(x)
    head_model = Model([xa_inp, xb_inp], x, name='head')

    ########################
    # SIAMESE NEURAL NETWORK
    ########################
    # Complete model is constructed by calling the branch model on each input image,
    # and then the head model on the resulting 512-vectors.
    img_a = Input(shape=img_shape)
    img_b = Input(shape=img_shape)
    xa = branch_model(img_a)
    xb = branch_model(img_b)
    x = head_model([xa, xb])
    model = Model([img_a, img_b], x)
    model.compile(optim, loss='binary_crossentropy', metrics=['binary_crossentropy', 'acc'])
    return model, branch_model, head_model


model, branch_model, head_model = build_model(64e-5, 0)
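
The head model starts from four per-feature combinations of the two branch outputs: product, sum, absolute difference and squared difference. All four are symmetric in their arguments, so swapping the two images does not change the input to the head. A small numpy sketch of that first step, where `pair_features` is an illustrative name and the vectors are random stand-ins for the 512 branch features:

```python
import numpy as np


def pair_features(xa, xb):
    """Stack the four symmetric combinations, one row per combination."""
    return np.stack([xa * xb, xa + xb, np.abs(xa - xb), (xa - xb) ** 2])


rng = np.random.default_rng(0)
xa = rng.random(512)
xb = rng.random(512)

f_ab = pair_features(xa, xb)
f_ba = pair_features(xb, xa)
print(f_ab.shape)               # → (4, 512)
print(np.allclose(f_ab, f_ba))  # → True
```

This symmetry is why scoring only the pairs above the diagonal is enough when every image is compared against every other image.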

In [None]:
h2ws = {}
new_whale = 'new_whale'
for p, w in tagged.items():
    if w != new_whale:  # Use only identified whales
        h = p2h[p]
        if h not in h2ws: h2ws[h] = []
        if w not in h2ws[h]: h2ws[h].append(w)
for h, ws in h2ws.items():
    if len(ws) > 1:
        h2ws[h] = sorted(ws)

# For each whale, find the unambiguous images ids.
w2hs = {}
for h, ws in h2ws.items():
    if len(ws) == 1:  # Use only unambiguous pictures
        w = ws[0]
        if w not in w2hs: w2hs[w] = []
        if h not in w2hs[w]: w2hs[w].append(h)
for w, hs in w2hs.items():
    if len(hs) > 1:
        w2hs[w] = sorted(hs)

In [None]:
train = []  # A list of training image ids
for hs in w2hs.values():
    if len(hs) > 1:
        train += hs
random.shuffle(train)
train_set = set(train)

w2ts = {}  # Associate the image ids from train to each whale id.
for w, hs in w2hs.items():
    for h in hs:
        if h in train_set:
            if w not in w2ts:
                w2ts[w] = []
            if h not in w2ts[w]:
                w2ts[w].append(h)
for w, ts in w2ts.items():
    w2ts[w] = np.array(ts)

t2i = {}  # The position in train of each training image id
for i, t in enumerate(train):
    t2i[t] = i

In [None]:
class TrainingData(Sequence):
    def __init__(self, score, steps=1000, batch_size=32):
        """
        @param score the cost matrix for the picture matching
        @param steps the number of epochs we are planning with this score matrix
        """
        super(TrainingData, self).__init__()
        self.score = -score  # Maximizing the score is the same as minimizing -score.
        self.steps = steps
        self.batch_size = batch_size
        for ts in w2ts.values():
            idxs = [t2i[t] for t in ts]
            for i in idxs:
                for j in idxs:
                    self.score[
                        i, j] = 10000.0  # Set a large value for matching whales -- eliminates this potential pairing
        self.on_epoch_end()

    def __getitem__(self, index):
        start = self.batch_size * index
        end = min(start + self.batch_size, len(self.match) + len(self.unmatch))
        size = end - start
        assert size > 0
        a = np.zeros((size,) + img_shape, dtype=K.floatx())
        b = np.zeros((size,) + img_shape, dtype=K.floatx())
        c = np.zeros((size, 1), dtype=K.floatx())
        j = start // 2
        for i in range(0, size, 2):
            a[i, :, :, :] = read_for_training(self.match[j][0])
            b[i, :, :, :] = read_for_training(self.match[j][1])
            c[i, 0] = 1  # This is a match
            a[i + 1, :, :, :] = read_for_training(self.unmatch[j][0])
            b[i + 1, :, :, :] = read_for_training(self.unmatch[j][1])
            c[i + 1, 0] = 0  # Different whales
            j += 1
        return [a, b], c

    def on_epoch_end(self):
        if self.steps <= 0: return  # Skip this on the last epoch.
        self.steps -= 1
        self.match = []
        self.unmatch = []
        _, _, x = lapjv(self.score)  # Solve the linear assignment problem
        y = np.arange(len(x), dtype=np.int32)

        # Compute a derangement for matching whales
        for ts in w2ts.values():
            d = ts.copy()
            while True:
                random.shuffle(d)
                if not np.any(ts == d): break
            for ab in zip(ts, d): self.match.append(ab)

        # Construct unmatched whale pairs from the LAP solution.
        for i, j in zip(x, y):
            if i == j:
                print(self.score)
                print(x)
                print(y)
                print(i, j)
            assert i != j
            self.unmatch.append((train[i], train[j]))

        # Force a different choice for an eventual next epoch.
        self.score[x, y] = 10000.0
        self.score[y, x] = 10000.0
        random.shuffle(self.match)
        random.shuffle(self.unmatch)
        # print(len(self.match), len(train), len(self.unmatch), len(train))
        assert len(self.match) == len(train) and len(self.unmatch) == len(train)

    def __len__(self):
        return (len(self.match) + len(self.unmatch) + self.batch_size - 1) // self.batch_size


# Test on a batch of 32 with random costs.
score = np.random.random_sample(size=(len(train), len(train)))
data = TrainingData(score)
(a, b), c = data[0]
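
For the matching pairs, `on_epoch_end` shuffles each whale's images until no image sits in its original position (a derangement), so every image is paired with a different image of the same whale. A minimal standalone sketch of that retry loop follows. `derangement` is an illustrative name and the image ids are made up.

```python
import numpy as np


def derangement(items, rng):
    """Return a random permutation of distinct items with no fixed point."""
    items = np.asarray(items)
    if len(items) < 2:
        raise ValueError("a derangement needs at least two items")
    while True:
        d = rng.permutation(items)
        if not np.any(d == items):  # retry until nothing stays in place
            return d


ts = np.array(["img_a", "img_b", "img_c", "img_d"])  # hypothetical image ids
d = derangement(ts, np.random.default_rng(0))
pairs = list(zip(ts, d))
print(pairs)
```

This is also why the training list only keeps whales with at least two images: a single image cannot be paired with a different image of the same whale.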

In [None]:
# A Keras generator to evaluate only the BRANCH MODEL
class FeatureGen(Sequence):
    def __init__(self, data, batch_size=64, verbose=1):
        super(FeatureGen, self).__init__()
        self.data = data
        self.batch_size = batch_size
        self.verbose = verbose
        if self.verbose > 0: self.progress = tqdm(total=len(self), desc='Features')

    def __getitem__(self, index):
        start = self.batch_size * index
        size = min(len(self.data) - start, self.batch_size)
        a = np.zeros((size,) + img_shape, dtype=K.floatx())
        for i in range(size): a[i, :, :, :] = read_for_validation(self.data[start + i])
        if self.verbose > 0:
            self.progress.update()
            if self.progress.n >= len(self): self.progress.close()
        return a

    def __len__(self):
        return (len(self.data) + self.batch_size - 1) // self.batch_size


class ScoreGen(Sequence):
    def __init__(self, x, y=None, batch_size=2048, verbose=1):
        super(ScoreGen, self).__init__()
        self.x = x
        self.y = y
        self.batch_size = batch_size
        self.verbose = verbose
        if y is None:
            self.y = self.x
            self.ix, self.iy = np.triu_indices(x.shape[0], 1)
        else:
            self.iy, self.ix = np.indices((y.shape[0], x.shape[0]))
            self.ix = self.ix.reshape((self.ix.size,))
            self.iy = self.iy.reshape((self.iy.size,))
        self.subbatch = (len(self.x) + self.batch_size - 1) // self.batch_size
        if self.verbose > 0:
            self.progress = tqdm(total=len(self), desc='Scores')

    def __getitem__(self, index):
        start = index * self.batch_size
        end = min(start + self.batch_size, len(self.ix))
        a = self.y[self.iy[start:end], :]
        b = self.x[self.ix[start:end], :]
        if self.verbose > 0:
            self.progress.update()
            if self.progress.n >= len(self): self.progress.close()
        return [a, b]

    def __len__(self):
        return (len(self.ix) + self.batch_size - 1) // self.batch_size
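
When `ScoreGen` gets only one feature matrix, it scores each unordered pair once by walking the indices from `np.triu_indices`. The resulting flat vector is later unpacked into a full symmetric matrix. A tiny numpy illustration with made-up scores:

```python
import numpy as np

n = 4
iu = np.triu_indices(n, 1)  # index pairs (i, j) with i < j
packed = np.arange(1, len(iu[0]) + 1, dtype=float)  # one made-up score per pair

m = np.zeros((n, n))
m[iu] = packed  # fill the upper triangle
m += m.T        # mirror it to the lower triangle
print(m)
```

For `n` images this needs `n * (n - 1) / 2` head evaluations instead of `n * n`, and the diagonal stays at zero.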


In [None]:
def set_lr(model, lr):
    K.set_value(model.optimizer.lr, float(lr))


def get_lr(model):
    return K.get_value(model.optimizer.lr)


def score_reshape(score, x, y=None):
    """
    Transform the packed matrix 'score' into a square matrix.
    @param score the packed matrix
    @param x the first image feature tensor
    @param y the second image feature tensor if different from x
    @result the square matrix
    """
    if y is None:
        # When y is None, score is a packed upper triangular matrix.
        # Unpack, and transpose to form the symmetrical lower triangular matrix.
        m = np.zeros((x.shape[0], x.shape[0]), dtype=K.floatx())
        m[np.triu_indices(x.shape[0], 1)] = score.squeeze()
        m += m.transpose()
    else:
        m = np.zeros((y.shape[0], x.shape[0]), dtype=K.floatx())
        iy, ix = np.indices((y.shape[0], x.shape[0]))
        ix = ix.reshape((ix.size,))
        iy = iy.reshape((iy.size,))
        m[iy, ix] = score.squeeze()
    return m


def compute_score(verbose=1):
    """
    Compute the score matrix by scoring every picture from the training set against every other picture, O(n^2).
    """
    features = branch_model.predict_generator(FeatureGen(train, verbose=verbose), max_queue_size=12, workers=6,
                                              verbose=0)
    score = head_model.predict_generator(ScoreGen(features, verbose=verbose), max_queue_size=12, workers=6, verbose=0)
    score = score_reshape(score, features)
    return features, score


def make_steps(step, ampl):
    """
    Perform training epochs
    @param step Number of epochs to perform
    @param ampl the K, the randomized component of the score matrix.
    """
    global w2ts, t2i, steps, features, score, histories

    # shuffle the training pictures
    random.shuffle(train)

    # Map whale id to the list of associated training picture hash value
    w2ts = {}
    for w, hs in w2hs.items():
        for h in hs:
            if h in train_set:
                if w not in w2ts: w2ts[w] = []
                if h not in w2ts[w]: w2ts[w].append(h)
    for w, ts in w2ts.items(): w2ts[w] = np.array(ts)

    # Map training picture hash value to index in 'train' array    
    t2i = {}
    for i, t in enumerate(train): t2i[t] = i

    # Compute the match score for each picture pair
    features, score = compute_score()

    # Train the model for 'step' epochs
    history = model.fit_generator(
        TrainingData(score + ampl * np.random.random_sample(size=score.shape), steps=step, batch_size=32),
        initial_epoch=steps, epochs=steps + step, max_queue_size=12, workers=6, verbose=1).history
    steps += step

    # Collect history data
    history['epochs'] = steps
    history['ms'] = np.mean(score)
    history['lr'] = get_lr(model)
    print(history['epochs'], history['lr'], history['ms'])
    histories.append(history)

In [None]:
histories = []
steps = 0

if isfile('../input/piotte/mpiotte-standard.model'):
    tmp = keras.models.load_model('../input/piotte/mpiotte-standard.model')
    model.set_weights(tmp.get_weights())
else:
    # epoch -> 10
    make_steps(10, 1000)
    ampl = 100.0
    for _ in range(2):
        print('noise ampl.  = ', ampl)
        make_steps(5, ampl)
        ampl = max(1.0, 100 ** -0.1 * ampl)
#     # epoch -> 150
#     for _ in range(18): make_steps(5, 1.0)
#     # epoch -> 200
#     set_lr(model, 16e-5)
#     for _ in range(10): make_steps(5, 0.5)
#     # epoch -> 240
#     set_lr(model, 4e-5)
#     for _ in range(8): make_steps(5, 0.25)
#     # epoch -> 250
#     set_lr(model, 1e-5)
#     for _ in range(2): make_steps(5, 0.25)
#     # epoch -> 300
#     weights = model.get_weights()
#     model, branch_model, head_model = build_model(64e-5, 0.0002)
#     model.set_weights(weights)
#     for _ in range(10): make_steps(5, 1.0)
#     # epoch -> 350
#     set_lr(model, 16e-5)
#     for _ in range(10): make_steps(5, 0.5)
#     # epoch -> 390
#     set_lr(model, 4e-5)
#     for _ in range(8): make_steps(5, 0.25)
#     # epoch -> 400
#     set_lr(model, 1e-5)
#     for _ in range(2): make_steps(5, 0.25)
#     model.save('standard.model')
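
The noise amplitude passed to `make_steps` shrinks by a constant factor of `100 ** -0.1` (about 0.63) after each round and never drops below 1.0. The sketch below replays that update rule without any training. `noise_schedule` and its arguments are illustrative names, and the 10-round call is only an example of a longer run.

```python
def noise_schedule(start=100.0, rounds=2, floor=1.0):
    """Amplitudes used in successive rounds, following the update rule above."""
    ampl = start
    used = []
    for _ in range(rounds):
        used.append(ampl)
        ampl = max(floor, 100 ** -0.1 * ampl)
    return used


short = noise_schedule()            # the two rounds run in this kernel
longer = noise_schedule(rounds=10)  # hypothetical longer run
print([round(a, 2) for a in short])
print([round(a, 2) for a in longer])
```

With two rounds the amplitudes are 100 and roughly 63.1, so the random component still dominates the score matrix in this shortened run.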

In [None]:
model.summary()

In [None]:
def prepare_submission(threshold, filename):
    """
    Generate a Kaggle submission file.
    @param threshold the score given to 'new_whale'
    @param filename the submission file name
    """
    vtop = 0
    vhigh = 0
    pos = [0, 0, 0, 0, 0, 0]
    with open(filename, 'wt', newline='\n') as f:
        f.write('Image,Id\n')
        for i, p in enumerate(tqdm(submit)):
            t = []
            s = set()
            a = score[i, :]
            for j in list(reversed(np.argsort(a))):
                h = known[j]
                if a[j] < threshold and new_whale not in s:
                    pos[len(t)] += 1
                    s.add(new_whale)
                    t.append(new_whale)
                    if len(t) == 5: break
                for w in h2ws[h]:
                    assert w != new_whale
                    if w not in s:
                        if a[j] > 1.0:
                            vtop += 1
                        elif a[j] >= threshold:
                            vhigh += 1
                        s.add(w)
                        t.append(w)
                        if len(t) == 5: break
                if len(t) == 5: break
            if new_whale not in s: pos[5] += 1
            assert len(t) == 5 and len(s) == 5
            f.write(p + ',' + ' '.join(t[:5]) + '\n')
    return vtop, vhigh, pos
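
The core of `prepare_submission` is how `threshold` decides where `new_whale` enters the five predictions: candidates are visited from best to worst score, and `new_whale` is inserted just before the first candidate whose score falls below the threshold. Here is a simplified sketch with one label per candidate and made-up scores. `top5_with_new_whale` is an illustrative name, and the counters from the kernel are left out.

```python
def top5_with_new_whale(scores, labels, threshold, new_whale="new_whale"):
    """Pick up to five distinct labels, inserting new_whale at the threshold."""
    picks = []
    order = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    for j in order:
        if scores[j] < threshold and new_whale not in picks:
            picks.append(new_whale)
            if len(picks) == 5:
                break
        if labels[j] not in picks:
            picks.append(labels[j])
            if len(picks) == 5:
                break
    return picks


scores = [0.999, 0.995, 0.5, 0.4, 0.3, 0.2]         # made-up scores
labels = ["w_a", "w_b", "w_c", "w_d", "w_e", "w_f"]  # made-up whale ids
picks = top5_with_new_whale(scores, labels, threshold=0.99)
print(picks)  # → ['w_a', 'w_b', 'new_whale', 'w_c', 'w_d']
```

A higher threshold moves `new_whale` towards the front of the list, a lower one pushes it back or out of the top five.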

In [None]:
# Find elements from training sets not 'new_whale'
tic = time.time()
h2ws = {}
for p, w in tagged.items():
    if w != new_whale:  # Use only identified whales
        h = p2h[p]
        if h not in h2ws: h2ws[h] = []
        if w not in h2ws[h]: h2ws[h].append(w)
known = sorted(list(h2ws.keys()))

# Dictionary of picture indices
h2i = {}
for i, h in enumerate(known): h2i[h] = i

# Evaluate the model.
fknown = branch_model.predict_generator(FeatureGen(known), max_queue_size=20, workers=10, verbose=0)
fsubmit = branch_model.predict_generator(FeatureGen(submit), max_queue_size=20, workers=10, verbose=0)
score = head_model.predict_generator(ScoreGen(fknown, fsubmit), max_queue_size=20, workers=10, verbose=0)
score = score_reshape(score, fknown, fsubmit)

# Generate the submission file.
prepare_submission(0.99, 'submission.csv')
toc = time.time()
print("Submission time: ", (toc - tic) / 60.)