# Fine-tuning pretrained networks

The experiments in this notebook follow the guidance in this [blog post](https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html).

The basic idea is to:
1. take a previously trained network and remove its top layer
2. run our dataset through the resulting network to obtain its output features
3. design a simple model whose input is the output of step 2, and train it

Apparently, this gives good results with very little computation.
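
The three steps can be illustrated without Keras. The following is only a toy sketch under loud assumptions: the "pretrained base" is a fixed random projection (not a real network), the data is synthetic, and all names (`base_predict`, `w_top`, and so on) are hypothetical. It shows the workflow of freezing a base, extracting features once, and training only a small classifier on top of them.

```python
import numpy as np

rng = np.random.RandomState(0)

# Synthetic stand-in data: 3 classes, 20 samples each, 20 raw values per sample
n_classes, n_per_class, n_raw = 3, 20, 20
centers = rng.randn(n_classes, n_raw) * 3.0
x = np.concatenate([c + rng.randn(n_per_class, n_raw) for c in centers])
y = np.repeat(np.arange(n_classes), n_per_class)

# Step 1: a frozen "base" (assumption: a fixed random projection followed by ReLU)
w_base = rng.randn(n_raw, 32)

def base_predict(batch):
    return np.maximum(batch.dot(w_base), 0.0)

# Step 2: run the dataset through the base once and keep the features
features = base_predict(x)
features /= features.std()
one_hot = np.eye(n_classes)[y]

# Step 3: train a small softmax classifier on the precomputed features
w_top = np.zeros((features.shape[1], n_classes))
for _ in range(500):
    logits = features.dot(w_top)
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    w_top -= 0.01 * features.T.dot(probs - one_hot) / len(features)

accuracy = (features.dot(w_top).argmax(axis=1) == y).mean()
print('training accuracy: {:.2f}'.format(accuracy))
```

Only `w_top` is updated during training; the base is evaluated exactly once per sample, which is where the savings in computation come from.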

In the following experiments I will try this approach with the pretrained networks that ship with Keras by default, to see which of them gives the best results.

Then, once one of them has been selected, I will try to optimize the design of the top model.

## Common parameters for all experiments

In [1]:
import os

import numpy as np
from keras import applications
from keras.layers import Dropout, Flatten, Dense
from keras.models import Sequential
from keras.preprocessing.image import ImageDataGenerator


train_data_dir = '../data/train'
validation_data_dir = '../data/validation'

train_features_path = '{}_train_features.npy'
train_labels_path = '{}_train_labels.npy'
validation_features_path = '{}_validation_features.npy'
validation_labels_path = '{}_validation_labels.npy'
weights_path = '{}_top_model.h5'

# TODO: set properly
width, height = 200, 200
train_samples = 1152
validation_samples = 288
categories = 21
batch_size = 4
epochs = 20

Using TensorFlow backend.


## Data generation

We define some functions that, given a pretrained model, translate our data into features and labels to be used by the top model.

First, for the training data:

In [11]:
def is_train_data_generated(name):
    # True only if both the cached features and the cached labels exist on disk
    return os.path.isfile(train_features_path.format(name)) \
       and os.path.isfile(train_labels_path.format(name))

def generate_train_data(name, model):
    # Only rescale pixel values (no augmentation); shuffle=False keeps the
    # sample order deterministic
    naive_datagen = ImageDataGenerator(rescale=1. / 255)
    dataflow = naive_datagen.flow_from_directory(train_data_dir,
                                                 batch_size=batch_size,
                                                 class_mode='categorical',
                                                 target_size=(width, height),
                                                 shuffle=False)

    features = []
    labels = []
    rounds = train_samples // batch_size
    print('running {} rounds'.format(rounds))
    for i in range(rounds):
        # Report progress every 50 rounds
        if i % 50 == 0:
            print('{} / {}'.format(i, rounds))
        batch_images, batch_labels = next(dataflow)
        features.append(model.predict(batch_images))
        labels.append(batch_labels)

    # Concatenate once at the end instead of growing an array on every round;
    # passing the path lets NumPy open the file in binary mode itself
    np.save(train_features_path.format(name), np.concatenate(features, axis=0))
    np.save(train_labels_path.format(name), np.concatenate(labels, axis=0))
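
The number of rounds in the loop above follows directly from the shared parameters. A small check, using the values defined earlier in this notebook, shows how many batches are processed and whether the integer division would leave any samples out (the `*_leftover` names are only illustrative):

```python
train_samples = 1152
validation_samples = 288
batch_size = 4

# Number of full batches drawn from each generator
train_rounds = train_samples // batch_size
validation_rounds = validation_samples // batch_size

# Samples that would be skipped if the counts were not multiples of batch_size
train_leftover = train_samples % batch_size
validation_leftover = validation_samples % batch_size

print('{} train rounds, {} left over'.format(train_rounds, train_leftover))
print('{} validation rounds, {} left over'.format(validation_rounds, validation_leftover))
```

With these values both sample counts are exact multiples of the batch size, so every sample contributes to the saved features.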

And now for the validation data:

In [12]:
def is_validation_data_generated(name):
    # True only if both the cached features and the cached labels exist on disk
    return os.path.isfile(validation_features_path.format(name)) \
       and os.path.isfile(validation_labels_path.format(name))

def generate_validation_data(name, model):
    # Same preprocessing as for the training data: rescaling only, no shuffling
    naive_datagen = ImageDataGenerator(rescale=1. / 255)
    dataflow = naive_datagen.flow_from_directory(validation_data_dir,
                                                 batch_size=batch_size,
                                                 class_mode='categorical',
                                                 target_size=(width, height),
                                                 shuffle=False)

    features = []
    labels = []
    rounds = validation_samples // batch_size
    print('running {} rounds'.format(rounds))
    for i in range(rounds):
        # Report progress every 50 rounds
        if i % 50 == 0:
            print('{} / {}'.format(i, rounds))
        batch_images, batch_labels = next(dataflow)
        features.append(model.predict(batch_images))
        labels.append(batch_labels)

    # Concatenate once at the end instead of growing an array on every round;
    # passing the path lets NumPy open the file in binary mode itself
    np.save(validation_features_path.format(name), np.concatenate(features, axis=0))
    np.save(validation_labels_path.format(name), np.concatenate(labels, axis=0))

A function that avoids regenerating data that has already been generated:

In [14]:
def generate_data_if_needed(name, model):
    if not is_train_data_generated(name):
        generate_train_data(name,model)
        
    if not is_validation_data_generated(name):
        generate_validation_data(name,model)
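
The same "compute once, reuse afterwards" pattern can be sketched in isolation. This is a minimal illustration, not the notebook's own code: `load_or_compute` and `expensive_features` are hypothetical names, and a temporary directory stands in for the real cache location.

```python
import os
import tempfile

import numpy as np

def load_or_compute(path, compute):
    # Hypothetical helper: reuse the cached array if present,
    # otherwise compute it and save it for next time
    if os.path.isfile(path):
        return np.load(path)
    result = compute()
    np.save(path, result)
    return result

calls = []

def expensive_features():
    calls.append(1)  # count how many times the expensive step really runs
    return np.arange(12, dtype=np.float32).reshape(3, 4)

cache_dir = tempfile.mkdtemp()
cache_path = os.path.join(cache_dir, 'demo_train_features.npy')

first = load_or_compute(cache_path, expensive_features)   # computes and saves
second = load_or_compute(cache_path, expensive_features)  # loads from disk
print('expensive step ran {} time(s)'.format(len(calls)))
```

Because the cache is keyed only by file name, stale files have to be deleted by hand whenever the image size, the dataset, or the base model changes.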

## Top model shared by all experiments

In [15]:
def common_top_model(input_shape):
    model = Sequential()
    model.add(Flatten(input_shape=input_shape))
    model.add(Dense(256, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(categories, activation='softmax'))
    return model
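
To get a feeling for the size of this top model, its parameter count can be worked out by hand. The sketch below assumes the `(6, 6, 512)` feature shape that the VGG16 experiment in this notebook reports; other base networks would give a different input shape. The variable names are illustrative.

```python
# Assumed input shape: the VGG16 feature shape reported in this notebook
input_shape = (6, 6, 512)
categories = 21
hidden_units = 256

# Flatten has no parameters; it only reshapes the input into one vector
flattened = 1
for dim in input_shape:
    flattened *= dim

# Each Dense layer has one weight per (input, unit) pair plus one bias per unit;
# Dropout has no parameters
dense_hidden_params = flattened * hidden_units + hidden_units
dense_output_params = hidden_units * categories + categories
total_params = dense_hidden_params + dense_output_params

print('flattened size: {}'.format(flattened))
print('total trainable parameters: {}'.format(total_params))
```

Almost all of the parameters sit in the first `Dense` layer, so the flattened feature size is the main lever when redesigning the top model.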

## Experiment runner

In [19]:
def run_experiment(name, model):

    print('generating data if needed')
    generate_data_if_needed(name, model)

    print('loading train/validation data')
    # Passing the path lets NumPy open each file in binary mode itself
    train_features = np.load(train_features_path.format(name))
    train_labels = np.load(train_labels_path.format(name))
    validation_features = np.load(validation_features_path.format(name))
    validation_labels = np.load(validation_labels_path.format(name))

    print('loaded shapes')
    print('\t{}'.format(train_features.shape))
    print('\t{}'.format(train_labels.shape))
    print('\t{}'.format(validation_features.shape))
    print('\t{}'.format(validation_labels.shape))

    print('training top model')
    # The top model's input shape is the feature shape without the sample axis
    top_model = common_top_model(train_features.shape[1:])
    top_model.compile(
        optimizer='rmsprop',
        loss='categorical_crossentropy',
        metrics=['accuracy'])
    top_model.fit(train_features,
                  train_labels,
                  batch_size=batch_size,
                  epochs=epochs,  # Keras 2 name; Keras 1 called this nb_epoch
                  validation_data=(validation_features, validation_labels))

    print('saving top model weights')
    top_model.save_weights(weights_path.format(name))
    print('done')

## VGG16

In [18]:
def VGG16_exp1():    
    name = 'VGG16_exp1'       
    VGG16 = applications.VGG16(include_top=False, weights='imagenet')
    run_experiment(name, VGG16)
    
# VGG16_exp1()
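
The shape of the features that VGG16 produces without its top can be estimated by hand. VGG16's convolutional part contains five 2x2 max-pooling stages with stride 2 and ends with 512 channels; the sketch below (with the hypothetical helper `vgg16_feature_side`) applies that to the 200x200 input used in this notebook.

```python
def vgg16_feature_side(input_side, pooling_stages=5):
    # Each 2x2 max-pooling stage with stride 2 halves the side, rounding down
    side = input_side
    for _ in range(pooling_stages):
        side //= 2
    return side

side = vgg16_feature_side(200)   # 200 -> 100 -> 50 -> 25 -> 12 -> 6
feature_shape = (side, side, 512)
print(feature_shape)
```

This agrees with the per-sample shape of the features loaded in the experiment.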

The output of this experiment was:
```
Using TensorFlow backend.
training the top model
(1152, 6, 6, 512)
(1152, 21)
(288, 6, 6, 512)
(288, 21)
Train on 1152 samples, validate on 288 samples
Epoch 1/20
1152/1152 [==============================] - 17s - loss: 3.7184 - acc: 0.2309 - val_loss: 1.5696 - val_acc: 0.5069
Epoch 2/20
1152/1152 [==============================] - 17s - loss: 1.7444 - acc: 0.4306 - val_loss: 1.1125 - val_acc: 0.6250
Epoch 3/20
1152/1152 [==============================] - 18s - loss: 1.4550 - acc: 0.5226 - val_loss: 0.8640 - val_acc: 0.6875
Epoch 4/20
1152/1152 [==============================] - 20s - loss: 1.1497 - acc: 0.6085 - val_loss: 0.9907 - val_acc: 0.7083
Epoch 5/20
1152/1152 [==============================] - 21s - loss: 1.0092 - acc: 0.6589 - val_loss: 0.6152 - val_acc: 0.7847
Epoch 6/20
1152/1152 [==============================] - 21s - loss: 0.9943 - acc: 0.6953 - val_loss: 0.5913 - val_acc: 0.7847
Epoch 7/20
1152/1152 [==============================] - 21s - loss: 0.8675 - acc: 0.7075 - val_loss: 0.7292 - val_acc: 0.7917
Epoch 8/20
1152/1152 [==============================] - 21s - loss: 0.8566 - acc: 0.7283 - val_loss: 0.5850 - val_acc: 0.8333
Epoch 9/20
1152/1152 [==============================] - 21s - loss: 0.7185 - acc: 0.7509 - val_loss: 1.2415 - val_acc: 0.7083
Epoch 10/20
1152/1152 [==============================] - 21s - loss: 0.6836 - acc: 0.7682 - val_loss: 0.9080 - val_acc: 0.7708
Epoch 11/20
1152/1152 [==============================] - 22s - loss: 0.7027 - acc: 0.7604 - val_loss: 0.7720 - val_acc: 0.7986
Epoch 12/20
1152/1152 [==============================] - 22s - loss: 0.6242 - acc: 0.8012 - val_loss: 0.6787 - val_acc: 0.8333
Epoch 13/20
1152/1152 [==============================] - 22s - loss: 0.5720 - acc: 0.8238 - val_loss: 0.7979 - val_acc: 0.7847
Epoch 14/20
1152/1152 [==============================] - 21s - loss: 0.5702 - acc: 0.8186 - val_loss: 1.0029 - val_acc: 0.7778
Epoch 15/20
1152/1152 [==============================] - 21s - loss: 0.5734 - acc: 0.8212 - val_loss: 0.6922 - val_acc: 0.8403
Epoch 16/20
1152/1152 [==============================] - 21s - loss: 0.4771 - acc: 0.8351 - val_loss: 1.0405 - val_acc: 0.7986
Epoch 17/20
1152/1152 [==============================] - 22s - loss: 0.5103 - acc: 0.8455 - val_loss: 0.9226 - val_acc: 0.7847
Epoch 18/20
1152/1152 [==============================] - 21s - loss: 0.4617 - acc: 0.8585 - val_loss: 0.8290 - val_acc: 0.8333
Epoch 19/20
1152/1152 [==============================] - 22s - loss: 0.4432 - acc: 0.8594 - val_loss: 0.7482 - val_acc: 0.8264
Epoch 20/20
1152/1152 [==============================] - 22s - loss: 0.5049 - acc: 0.8498 - val_loss: 0.9646 - val_acc: 0.7986
```

In other words, after about 5 minutes of training (epoch 15) the model reaches its best **val_acc: 0.8403**, which I find very promising.
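
The best epoch can be picked out of the log programmatically. The values below are copied by hand from the `val_acc` column of the output above; the variable names are illustrative.

```python
import numpy as np

# val_acc per epoch, copied from the training log above
val_acc = np.array([0.5069, 0.6250, 0.6875, 0.7083, 0.7847, 0.7847, 0.7917,
                    0.8333, 0.7083, 0.7708, 0.7986, 0.8333, 0.7847, 0.7778,
                    0.8403, 0.7986, 0.7847, 0.8333, 0.8264, 0.7986])

best_epoch = int(np.argmax(val_acc)) + 1      # epochs are 1-indexed in the log
best_val_acc = float(val_acc[best_epoch - 1])
final_val_acc = float(val_acc[-1])

print('best epoch: {}, val_acc: {}'.format(best_epoch, best_val_acc))
print('final epoch val_acc: {}'.format(final_val_acc))
```

Note that `run_experiment` saves the weights after the last epoch, so the saved top model corresponds to the final validation accuracy rather than the best one.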

Next steps: try other pretrained networks, keep the one that gives the best result, and then work on its top model to optimize it.