## Problem
Build a deep learning classification model that takes the class hierarchy into consideration.
Take any 4 categories from the dataset (100 images each) such that there is a hierarchical relationship between them, e.g.:

Animals
- Dog
- Cat

~~Flowers~~ Fruits
- ~~Rose~~ Grapes
- ~~Sunflower~~ Pear

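Build a classification model for the above 4 categories such that the penalty for an inter-category misprediction (e.g. a dog predicted as a pear) is higher than for an intra-category one (e.g. a dog predicted as a cat).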

## Import stuff

In [None]:
import os
import numpy as np
import sys
from PIL import Image
import PIL.ImageOps

from textwrap import wrap
from matplotlib import pyplot as plt
import matplotlib.cm as cm
import random

import keras
from keras import layers as L
from keras import optimizers as opt
from keras.preprocessing.image import ImageDataGenerator

# for reproducibility
from numpy.random import seed
seed(41)
from tensorflow import set_random_seed
set_random_seed(41)

## Extract and save the image files (28x28) from .npy files

Do not run the next __two__ code cells. The data has already been prepared and saved in folders.

### Split the npy file and save images

In [None]:
# WARNING: this cell needs to run only once, after downloading the npy files

# used code from: https://github.com/C-Aniruddh/RapidDraw/blob/in-dev/processing/process_all.py
number_images = 100  # number of images in each category
img_width, img_height = 28, 28


npy_dir = '../data_dump/'
out_dir = './data/'
npy_files = [f for f in os.listdir(npy_dir) if os.path.isfile(os.path.join(npy_dir, f))]
print('Available classes:')
print(npy_files)

categories = []

for x in npy_files:
    category_split = x.split('.')
    category = category_split[0]
    categories.append(category)
    
print('Data from following classes will be unpacked:')
print(categories)

for y in categories:
    if not os.path.exists(os.path.join(out_dir, y)):
        os.makedirs(os.path.join(out_dir, y))

index_cat = 0
for z in npy_files:
    print('Processing file', z)
    images = np.load(os.path.join(npy_dir, z))
    print('Saving in', categories[index_cat])
    number_imgs = range(0, number_images, 1)
    for a in number_imgs:
        print('\t Processing Image', a+1)
        file_name = '%s.jpg' % (a+1)
        file_path = os.path.join(out_dir, categories[index_cat], file_name)
        img = images[a].reshape(img_width, img_height)
        f_img = Image.fromarray(img)
        inverted_image = PIL.ImageOps.invert(f_img)
        inverted_image.save(file_path, 'JPEG')
    index_cat = index_cat + 1

### Have a look at a few training images

In [None]:
# randomly picks 4 images per class
r, c = 2, 2
for d in os.listdir(out_dir):
    print(d)
    fig, axs = plt.subplots(r, c)
    for i in range(r):
        for j in range(c):
            # files are named 1.jpg to 100.jpg; randint is inclusive on both ends
            img = plt.imread(os.path.join(out_dir, d, str(random.randint(1, 100)) + '.jpg'))
            axs[i, j].imshow(img, cmap=cm.gray)
            axs[i, j].axis('off')
    plt.show()
    plt.close()

## High-level idea

I chose to use a neural network to build the classifier. The reason is two-fold:
- CNN-based neural networks have a strong track record on image classification and identification tasks.
- The test is on Deep Learning! ;-)

My idea is to build a multi-task model. It first classifies an image as $animal$ or $fruit$, and then classifies it further into one of the four classes ($dog$, $cat$, $grapes$, $pear$).
The reasons for using a multi-task model are:
- I can use different loss weights for different tasks: a higher weight for the animal vs fruit classification loss and a comparatively lower weight for the 4-class classification loss.
- This particular model construction captures the hierarchical nature of the problem nicely.
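
To make the loss-weighting argument concrete, here is a minimal NumPy sketch (not the Keras training code itself). The helper names `categorical_crossentropy` and `combined_loss` and the example probabilities are made up for illustration; the weights `(2, 1)` and the class order cat, dog, grapes, pear are the ones used later in this notebook.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy of one one-hot target against predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0)
    return float(-np.sum(y_true * np.log(y_pred)))

def combined_loss(y_af_true, y_af_pred, y_4c_true, y_4c_pred, loss_weights=(2.0, 1.0)):
    """Weighted sum of the animal-vs-fruit loss and the 4-class loss."""
    w_af, w_4c = loss_weights
    return (w_af * categorical_crossentropy(y_af_true, y_af_pred)
            + w_4c * categorical_crossentropy(y_4c_true, y_4c_pred))

# True class: cat (an animal). Assumed class order: cat, dog, grapes, pear.
y_af_true = np.array([1.0, 0.0])
y_4c_true = np.array([1.0, 0.0, 0.0, 0.0])

# Intra-category mistake: predicts "dog" but is still confident it is an animal.
intra = combined_loss(y_af_true, np.array([0.9, 0.1]),
                      y_4c_true, np.array([0.1, 0.8, 0.05, 0.05]))

# Inter-category mistake: predicts "grapes" and believes it is a fruit.
inter = combined_loss(y_af_true, np.array([0.1, 0.9]),
                      y_4c_true, np.array([0.1, 0.05, 0.8, 0.05]))

print('intra-category loss:', intra)
print('inter-category loss:', inter)
```

Both mistakes pay the same 4-class loss here, but the inter-category one also pays the doubled animal vs fruit loss, so it ends up with the larger total.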


### Data description

I pulled out the first 100 samples from each class for training and the next 100 from each class for validation/testing.
The test data is never seen during training.
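
The split described above can be sketched as follows. This is only an illustration: `split_train_val` is a hypothetical helper, and `fake_images` is a synthetic stand-in for one loaded `.npy` file (one flattened 28x28 drawing per row), not the real data.

```python
import numpy as np

def split_train_val(images, n_train=100, n_val=100):
    """Return the first n_train rows for training and the next n_val rows for validation."""
    if len(images) < n_train + n_val:
        raise ValueError('not enough samples for the requested split')
    return images[:n_train], images[n_train:n_train + n_val]

# Synthetic stand-in for one category: 250 flattened 28x28 drawings.
fake_images = (np.arange(250 * 784) % 256).reshape(250, 784).astype(np.uint8)

train_imgs, val_imgs = split_train_val(fake_images)
print(train_imgs.shape, val_imgs.shape)
```

Because the two slices do not overlap, no validation image can leak into the training folder.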

## Create and verify the data generator

Why do I need a generator when $4 \times 100 \times (28 \times 28)$ = 313,600 pixel values (about 1.2 MB as float32) would fit comfortably in memory?
- It is good to prepare a pipeline that can handle large datasets.
- To show off!
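
A quick check of the memory estimate, assuming single-channel images and either one byte (uint8) or four bytes (float32) per pixel. The variable names are only for this sketch.

```python
n_classes, n_per_class = 4, 100
height, width = 28, 28

n_values = n_classes * n_per_class * height * width  # grayscale pixel values
bytes_uint8 = n_values * 1    # one byte per pixel, as stored on disk
bytes_float32 = n_values * 4  # four bytes per pixel after rescaling to float32

print(n_values, 'values')
print(bytes_uint8 / 1e6, 'MB as uint8')
print(bytes_float32 / 1e6, 'MB as float32')
```

The generators below load the images with `n_channel = 3`, so the in-memory batches are three times larger per image, which is still tiny.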

In [None]:
# few constants
data_path = './data/'
val_data_path = './val_data/'
img_width, img_height, n_channel = 28, 28, 3
img_shape = (img_width, img_height)
batch_size = 32

In [None]:
# create a generator
datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True)
train_generator = datagen.flow_from_directory(data_path, target_size=(img_width, img_height)
                    , class_mode='categorical'
                    , batch_size=batch_size, interpolation='nearest'
                    , shuffle=True
                   )
val_generator = ImageDataGenerator(
    rescale=1./255).flow_from_directory(val_data_path, target_size=(img_width, img_height)
                    , class_mode='categorical'
                    , batch_size=batch_size, interpolation='nearest'
                    , shuffle=True
                   )

Why do I need the generator below?
- The $train\_generator$ and $val\_generator$ created above produce targets for the 4-class classification problem only, but the animal vs fruit classification needs a different target for the same batch. See the inline comments in the function below for details about each type of target.
- The wrapper therefore yields X_batch together with both the animal vs fruit target and the 4-class classification target.
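
The target conversion itself can also be written without a Python loop. The sketch below is an alternative to the per-row comparison, not the code used in this notebook; `four_class_to_parent` and `parent_of_class` are hypothetical names, and the class order cat, dog, grapes, pear is assumed to match the generator's folder order.

```python
import numpy as np

def four_class_to_parent(y_4c, parent_of_class=(0, 0, 1, 1)):
    """Map one-hot 4-class targets to one-hot parent (animal/fruit) targets.

    parent_of_class[k] is the parent index of class k:
    cat -> 0 (animal), dog -> 0 (animal), grapes -> 1 (fruit), pear -> 1 (fruit).
    """
    parent_of_class = np.asarray(parent_of_class)
    class_idx = np.argmax(y_4c, axis=1)       # position of the 1 in each row
    parent_idx = parent_of_class[class_idx]   # look up the parent of each class
    n_parents = parent_of_class.max() + 1
    return np.eye(n_parents)[parent_idx]      # back to one-hot rows

y_4c = np.array([[1, 0, 0, 0],   # cat
                 [0, 1, 0, 0],   # dog
                 [0, 0, 1, 0],   # grapes
                 [0, 0, 0, 1]],  # pear
                dtype=float)
y_a_f = four_class_to_parent(y_4c)
print(y_a_f)
```

Keeping the hierarchy in one lookup table means a different set of categories only needs a different `parent_of_class`.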

In [None]:
# use mode='train' for both training and validation; any other mode yields images without targets
def combined_generator(datagen, mode='train'):
    y_4c = None
    while True:
        if mode == 'train':
            X_batch, y_4c = next(datagen) # produces 4 class classification target
        else:
            X_batch = next(datagen)
        if y_4c is not None:
            # class and how the target looks for y_4c
            # cat     [1, 0, 0, 0]
            # dog     [0, 1, 0, 0]
            # grapes  [0, 0, 1, 0]
            # pear    [0, 0, 0, 1]

            # for a_vs_f model we need target of shape (batch_size, 2)
            # class cat and dog -> [1, 0]
            # class grapes and pear -> [0, 1]
            # how class and the target should look for a_vs_f model
            # cat     [1, 0]
            # dog     [1, 0]
            # grapes  [0, 1]
            # pear    [0, 1]

            y_a_f = np.zeros((y_4c.shape[0], 2))
            for idx, y in enumerate(y_4c):
                if np.all(y == [1, 0, 0, 0]) or np.all(y == [0, 1, 0, 0]):
                    y_a_f[idx] = [1, 0]
                else:
                    y_a_f[idx] = [0, 1]
            # produce animal vs fruit and 4 class classification targets for the same set of training images
            yield X_batch, [y_a_f, y_4c]
        else:
            yield X_batch
    

In [None]:
# test how my generator is doing
X, y = next(combined_generator(train_generator)) # ask for a batch

print('Shape of one batch of data:')
print('X shape: ', X.shape)
print('y_a_f shape: ', y[0].shape)
print('y_4c shape: ', y[1].shape)


r, c = 4, 4 # let's not show the entire batch, check rxc images
imgs = X[:r*c]
label1 = y[0][:r*c] # animal vs fruit target. see the generator output order
label2 = y[1][:r*c] # 4 class classification target. see the generator output order

cnt = 0
fig, axs = plt.subplots(r, c)
fig.tight_layout()
for i in range(r):
    for j in range(c):
        img = imgs[cnt]
        axs[i, j].imshow(img, cmap=cm.gray)
        axs[i, j].axis('off')
        axs[i, j].set_title('\n'.join(wrap(str(label1[cnt])+str(label2[cnt]),60))) # took some time to figure this out!
        cnt += 1
plt.show()
plt.close()


## Build models

In [None]:
# some model related constants
latent_dim = 128
epochs = 50
learning_rate = 1E-4 # with only 100 samples per class, slow cooking (small batch_size, small lr) seems good.

In [None]:
# build the animal vs fruit model
def build_a_vs_f_model(input_dim=(img_width, img_height, n_channel), n_classes=2):
    input_ = L.Input(shape=input_dim)
    x = L.Conv2D(32, kernel_size=(3, 3), activation='relu')(input_)
    x = L.Conv2D(64, kernel_size=(3, 3), activation='relu')(x)
    x = L.MaxPooling2D(pool_size=(2, 2))(x)
    x = L.Dropout(0.25)(x)
    x = L.Flatten()(x)
    feat = L.Dense(latent_dim, activation='relu')(x)
    pred = L.Dense(n_classes, activation='softmax')(feat)

    # this is to classify animal vs fruit
    af_model = keras.models.Model(inputs=input_, outputs=pred, name='af_model')
    # idea is to use this layer output as a input feature to the 4 class classification problem
    af_feature_model = keras.models.Model(inputs=input_, outputs=feat, name='feat_model')
    
    af_model.summary()
    keras.utils.vis_utils.plot_model(af_model, to_file='a_vs_f_model_plot.png', show_shapes=True, show_layer_names=True)

    return af_model, af_feature_model

In [None]:
# build the 4 class classification model
def build_4class_model(input_dim=(img_width, img_height, n_channel), n_classes=4, latent_dim=128):
    input_img = L.Input(shape=input_dim)
    x = L.Conv2D(32, kernel_size=(3, 3), activation='relu')(input_img)
    x = L.Conv2D(64, kernel_size=(3, 3), activation='relu')(x)
    x = L.MaxPooling2D(pool_size=(2, 2))(x)
    x = L.Dropout(0.25)(x)
    x = L.Flatten()(x)
    x = L.Dense(128, activation='relu')(x)
    
    input_feat = L.Input(shape=(latent_dim,))
    y = L.Concatenate()([x, input_feat])
    y = L.Dense(128)(y)
    y = L.Dropout(0.25)(y)
    y = L.Dense(64)(y)
    y = L.Dropout(0.25)(y)
    y = L.Dense(32)(y)
    y = L.Dropout(0.25)(y)
    pred = L.Dense(n_classes, activation='softmax')(y)
    model = keras.models.Model(inputs=[input_img, input_feat], outputs=pred, name='class4_model')
    model.summary()
    keras.utils.vis_utils.plot_model(model, to_file='class4_model_plot.png', show_shapes=True, show_layer_names=True)

    return model

In [None]:
def build_combined_model(a_vs_f_model, feat_model, class4_model):
    inputs = L.Input(shape=(img_width, img_height, n_channel))
    a_f = a_vs_f_model(inputs)
    feat = feat_model(inputs)
    pred = class4_model([inputs, feat])
    model = keras.models.Model(inputs=inputs, outputs=[a_f, pred], name='comb_model')
    model.summary()
    keras.utils.vis_utils.plot_model(model, to_file='comb_model_plot.png', show_shapes=True, show_layer_names=True)
    return model

In [None]:
# now, time to train the model
def train_model():
    
    # create models
    a_vs_f_model, feat_model = build_a_vs_f_model()
    class_4_model = build_4class_model()
    combined_model = build_combined_model(a_vs_f_model, feat_model, class_4_model)
    
    # get opt
    comb_opt = opt.Adam(lr=learning_rate, beta_1=0.9, beta_2=0.999, epsilon=1e-08)

    loss_weights = [2, 1] # animal vs fruit classification is given more weight
    combined_model.compile(optimizer=comb_opt
                           , loss=['categorical_crossentropy', 'categorical_crossentropy']
                           , loss_weights=loss_weights, metrics=['acc'])
    
    # set up callbacks
    model_filepath = 'model.hdf5'
    chkpoint = keras.callbacks.ModelCheckpoint(model_filepath, monitor='val_loss', verbose=1
                    , save_best_only=True, save_weights_only=False, mode='auto', period=1)
    callback_list=[keras.callbacks.History(), chkpoint]
    
    history = combined_model.fit_generator(
            generator=combined_generator(train_generator),
            steps_per_epoch=1+train_generator.n//train_generator.batch_size,
            validation_data=combined_generator(val_generator),
            validation_steps=1+val_generator.n//val_generator.batch_size,
            callbacks=callback_list,
            epochs=epochs
        )
        
    return combined_model, history
    

In [None]:
combined_model, history = train_model()

In [None]:
# let's have a look at the loss and acc of the model for train and val dataset
def plot_model_perf(history):
    # keys are hard to remember
    print(history.history.keys())
    
    # summarize history for different accuracies
    plt.plot(history.history['af_model_acc'])
    plt.plot(history.history['val_af_model_acc'])
    plt.title('animal vs fruit model accuracy')
    plt.ylabel('accuracy')
    plt.xlabel('epoch')
    plt.legend(['train', 'test'], loc='upper left')
    plt.show()

    plt.plot(history.history['class4_model_acc'])
    plt.plot(history.history['val_class4_model_acc'])
    plt.title('4 class model accuracy')
    plt.ylabel('accuracy')
    plt.xlabel('epoch')
    plt.legend(['train', 'test'], loc='upper left')
    plt.show()
    
    # summarize history for losses
    plt.plot(history.history['af_model_loss'])
    plt.plot(history.history['val_af_model_loss'])
    plt.title('animal vs fruit model loss')
    plt.ylabel('loss')
    plt.xlabel('epoch')
    plt.legend(['train', 'test'], loc='upper left')
    plt.show()
    
    plt.plot(history.history['class4_model_loss'])
    plt.plot(history.history['val_class4_model_loss'])
    plt.title('4 class model loss')
    plt.ylabel('loss')
    plt.xlabel('epoch')
    plt.legend(['train', 'test'], loc='upper left')
    plt.show()

In [None]:
plot_model_perf(history)

## Bonus

### Model architecture

#### animal vs fruit model

<img src="files/a_vs_f_model_plot.png">

#### 4 class classification model

<img src="files/class4_model_plot.png">

#### Combined model

<img src="files/comb_model_plot.png">

## Next steps

1. Try training the animal vs fruit model first, then freeze it and train the combined model.
2. Play with learning rate scheduling. Suppose 1E-4 is my base lr. Linearly increase the lr from 1E-6 to 1E-4 over a period of 10 epochs (the idea is to avoid taking any catastrophic step while the model weights are still random), then hold the base rate for 30 epochs, and finally bring it linearly down to 0 (the idea is to take as small steps as possible once the model has stabilised).
3. Add more data.
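
The schedule in step 2 could look roughly like the sketch below. The name `lr_schedule` is hypothetical, and the length of the final decay is not stated above: 10 epochs is an assumption chosen so that warm-up, hold and decay add up to the 50 epochs used in this notebook.

```python
def lr_schedule(epoch, *, base_lr=1e-4, start_lr=1e-6,
                warmup_epochs=10, hold_epochs=30, decay_epochs=10):
    """Piecewise-linear learning rate: warm up, hold, then decay towards zero."""
    if epoch < warmup_epochs:
        # linear ramp from start_lr (epoch 0) up to base_lr (epoch == warmup_epochs)
        frac = epoch / float(warmup_epochs)
        return start_lr + frac * (base_lr - start_lr)
    if epoch < warmup_epochs + hold_epochs:
        return base_lr
    # linear decay from base_lr down to 0; decay_epochs is an assumed value
    done = epoch - warmup_epochs - hold_epochs
    frac = min(done / float(decay_epochs), 1.0)
    return base_lr * (1.0 - frac)

rates = [lr_schedule(e) for e in range(50)]
print(rates[0], rates[10], rates[40], rates[49])
```

A function like this could be handed to `keras.callbacks.LearningRateScheduler`. The extra parameters are keyword-only so that a second positional argument (some Keras versions pass the current learning rate) cannot silently overwrite `base_lr`.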