# Brief summary of the experiments in this folder

This folder contains experiments on modifying and fine-tuning a pretrained 
InceptionV3 network [available in the Keras framework](https://keras.io/applications/#inceptionv3).

All experiments were run on an Amazon p2.xlarge instance with 4 CPUs and 1 NVIDIA GPU. 
On average, ten training epochs took about 40 minutes.

Ultimately, the model tested in notebooks [Inceptionv3_21](Inceptionv3_21.ipynb) and 
[Inceptionv3_21_weight_test](Inceptionv3_21_weight_test.ipynb) was the one used 
in the [Car-Match website](https://github.com/trnoriega/car-website).

### Initial RMSprop experiments
Experiments 1-3 and 5-8 focused on finding hyperparameters for the RMSprop optimization algorithm 
that allowed for the fastest training with the highest accuracy.

Main conclusions:

- Learning rate is extremely important: even small deviations from 0.001 degraded performance.
- Without fine-tuning deeper layers, very little could be done to push 
__top-3 training accuracy past 0.6598__.
- Even with an optimized learning rate, good performance still needed about 30-50 training epochs.
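
Top-3 accuracy is the metric quoted throughout this summary. As a reminder of what it measures, here is a minimal pure-Python sketch; it is not the code used in the notebooks, and the function name `top_k_accuracy` and the scores below are made up for illustration only.

```python
def top_k_accuracy(scores, labels, k=3):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("need at least one sample")
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores in this row.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(scores)


# Illustrative scores for 4 samples over 5 classes (NOT experiment data).
scores = [
    [0.10, 0.50, 0.20, 0.15, 0.05],  # true class 1 is ranked first
    [0.30, 0.10, 0.25, 0.20, 0.15],  # true class 3 is ranked third
    [0.05, 0.05, 0.10, 0.20, 0.60],  # true class 0 is outside the top 3
    [0.40, 0.30, 0.15, 0.10, 0.05],  # true class 2 is ranked third
]
labels = [1, 3, 0, 2]

top_1 = top_k_accuracy(scores, labels, k=1)
top_3 = top_k_accuracy(scores, labels, k=3)
```

In this toy example only one sample has its true class ranked first, while three of the four have it within the top three, which is why top-3 numbers are always at least as high as plain accuracy.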

### Testing other optimization algorithms

Experiments 4 and 9-12 tested default settings of other optimization algorithms:

- SGD, which wasn't very impressive
- Adagrad, which after some learning rate tuning outperformed RMSprop in the number of epochs needed to reach high accuracy
- Adadelta, which was not as good as Adagrad
- Adam, which significantly outperformed RMSprop without any tuning
- Nadam (experiment 12), which took the prize for the lowest 
number of epochs to reach a high top-3 accuracy: __0.63 after 20 epochs__

__From here on, all experiments used the Nadam optimizer with default settings.__
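
The optimizers above were compared by how many epochs they needed to reach a given top-3 accuracy. A rough sketch of that comparison, assuming each run yields a list of per-epoch accuracies, could look like this. The helper names and the curves are hypothetical placeholders, not results from these experiments.

```python
def epochs_to_reach(accuracy_per_epoch, threshold):
    """Return the 1-based epoch at which accuracy first reaches threshold, or None."""
    for epoch, accuracy in enumerate(accuracy_per_epoch, start=1):
        if accuracy >= threshold:
            return epoch
    return None


def rank_by_speed(curves, threshold):
    """Order run names by how quickly they reach threshold; runs that never do go last."""
    reached = {name: epochs_to_reach(curve, threshold) for name, curve in curves.items()}
    return sorted(reached, key=lambda name: (reached[name] is None, reached[name] or 0))


# Placeholder accuracy curves (NOT experiment data).
curves = {
    "optimizer_a": [0.20, 0.35, 0.50, 0.61, 0.64],
    "optimizer_b": [0.30, 0.55, 0.63, 0.66, 0.67],
    "optimizer_c": [0.10, 0.15, 0.20, 0.25, 0.30],
}

ranking = rank_by_speed(curves, threshold=0.6)
```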

### Fine-tuning deeper layers

Experiments 13-14 showed that fine-tuning the next deepest layer after the classification dense layer
produced a large increase in top-3 accuracy: __0.9281 after 30 epochs!__

However, this came at the expense of overfitting, as indicated by much worse performance on the validation set.
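
Fine-tuning deeper layers comes down to toggling which layers are trainable. The sketch below shows the idea on stand-in objects; it assumes layers expose a boolean `trainable` attribute, as Keras layers do. The helper `unfreeze_last` and the layer names are hypothetical, and which layers the notebooks actually unfroze is recorded in the notebooks, not here.

```python
from types import SimpleNamespace


def unfreeze_last(layers, n_trainable):
    """Freeze every layer except the last n_trainable ones; return trainable names."""
    if n_trainable < 0:
        raise ValueError("n_trainable must be non-negative")
    cutoff = len(layers) - n_trainable
    for index, layer in enumerate(layers):
        layer.trainable = index >= cutoff
    return [layer.name for layer in layers if layer.trainable]


# Stand-ins for real layers, ordered from input to output (hypothetical names).
layers = [SimpleNamespace(name=f"block_{i}", trainable=False) for i in range(5)]

stage_1 = unfreeze_last(layers, 1)  # train only the last block
stage_2 = unfreeze_last(layers, 2)  # then open up one more block
```

With a real Keras model, the model needs to be compiled again after changing `trainable` for the change to take effect.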

### Regularization

Experiments 15-21 tested two different regularization methods:

- L2 regularization on the prediction dense layer, which did not work: 
the model never learned, regardless of the regularization parameters.
- A dropout layer between the last pooling layer of the convolutional network and the prediction dense layer,
which worked like a charm.
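
For readers unfamiliar with dropout, here is a minimal pure-Python sketch of what a dropout layer does at training time (the "inverted dropout" variant, which rescales the surviving values so the expected activation is unchanged). It only illustrates the technique; the function below is hypothetical and is not the Keras implementation.

```python
import random


def dropout(values, rate, rng):
    """Zero each value with probability `rate`; scale survivors by 1 / (1 - rate)."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    return [value / keep if rng.random() >= rate else 0.0 for value in values]


rng = random.Random(21)          # fixed seed so the run is repeatable
features = [1.0] * 10000         # stand-in for pooled convolutional features
dropped = dropout(features, rate=0.5, rng=rng)

# About half of the values are zeroed, the rest are doubled,
# so the mean stays close to the original mean of 1.0.
mean_after = sum(dropped) / len(dropped)
```

At inference time dropout is switched off and the features pass through unchanged.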

## FINAL MODEL

[Experiment 21](Inceptionv3_21.ipynb) contains the most successful model, 
trained by progressively fine-tuning deeper layers over 40 epochs. 
__Top-5 accuracy on the test set was 95%__. Full validation can be found in the [model validation notebook](../5_model_validation.ipynb).

Settings:

```
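# Imports used by this snippet
from keras import optimizers
from keras.layers import Dense, Dropout
from keras.models import Sequential

# Configuration of the final prediction layer;
# NB_CLASSES (the number of classes) is defined in the notebook
pred_layer_config = {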
    'activation': 'softmax',
    'activity_regularizer': None,
    'bias_constraint': None,
    'bias_initializer': {'class_name': 'Zeros', 'config': {}},
    'bias_regularizer': None,
    'kernel_constraint': None,
    'kernel_initializer': {'class_name': 'VarianceScaling',
                           'config': {
                               'distribution': 'uniform',
                               'mode': 'fan_avg',
                               'scale': 1.0,
                               'seed': 8}
                          },
    'kernel_regularizer': None,
    'name': 'predictions',
    'trainable': True,
    'units': NB_CLASSES,
    'use_bias': True}

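# conv_base is the pretrained InceptionV3 base, defined in the notebook
model = Sequential()
model.add(conv_base)
model.add(Dropout(0.5, seed=21))  # dropout right before the prediction layer
model.add(Dense(**pred_layer_config))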

optimizer = optimizers.Nadam(lr=0.002, beta_1=0.9, beta_2=0.999, schedule_decay=0.004)
```