## Report on Model3 

This model implements the original LeNet architecture from the classic paper by Yann LeCun:

http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf
 
The code is contained in `/tf_lenet.py`.

We tried to stay as close to the original as possible, with two exceptions:

 1.  As the images are RGB, there is a design choice of whether they should be:
    1. converted to monochrome, essentially by applying `tf.reduce_mean` along the color-channel dimension, or
    2. treated as RGB, by setting the shape of the filters in the first convolutional layer to [5, 5, 3, 16] instead of the [5, 5, 1, 16] shape that matches the single-channel input of the original paper.
     We tried both possibilities and can easily switch between the two by means of the `drop_colors` hyper-parameter.
     If it is set to 1, the images are converted to monochrome in the first layer; if it is set to 0, the [5, 5, 3, 16] filters are used.
     See the first few lines of `tf_lenet.layer_c1` for the gory details.
     
     
 2. Due to the small number of images we had for training, it was necessary to employ some regularization technique. We chose to apply dropout to the first fully connected layer. The `dropout_rate` hyper-parameter controls what fraction of connections is dropped. See `tf_lenet.fully_connected` for details.
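
The two exceptions above can be sketched in plain Python. This is only an illustration, not the code in `tf_lenet.py`: the helper names `c1_filter_shape`, `to_monochrome`, and `dropout` are hypothetical, images are represented as nested lists rather than tensors, and the dropout shown is the common "inverted" variant applied to activations, which is an assumption about how `dropout_rate` is used.

```python
import random


def c1_filter_shape(drop_colors, kernel_size=5, n_filters=16):
    """Shape [height, width, in_channels, out_channels] of the first-layer filters."""
    # drop_colors == 1 means the input was reduced to a single channel.
    in_channels = 1 if drop_colors == 1 else 3
    return [kernel_size, kernel_size, in_channels, n_filters]


def to_monochrome(image):
    """Average the color channels of one image given as [row][col][channel]."""
    # Same idea as a mean over the channel axis, keeping a channel axis of size 1.
    return [[[sum(pixel) / len(pixel)] for pixel in row] for row in image]


def dropout(activations, dropout_rate, rng):
    """Inverted dropout: zero each value with probability dropout_rate.

    Surviving values are divided by the keep probability so that the
    expected value of each activation is unchanged (an assumption here).
    """
    keep_prob = 1.0 - dropout_rate
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]


# A tiny 1x2 RGB "image" standing in for a 32x32x3 input.
image = [[[0.0, 0.5, 1.0], [0.2, 0.2, 0.2]]]

print(c1_filter_shape(drop_colors=1))  # monochrome input
print(c1_filter_shape(drop_colors=0))  # RGB input
print(to_monochrome(image))
print(dropout([1.0] * 10, dropout_rate=0.3, rng=random.Random(0)))
```

With `drop_colors=1` the filter shape has a single input channel, so the monochrome conversion has to happen before the first convolution; with `drop_colors=0` the image is passed through unchanged.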
 
## A note about the non-linearity function and the final layer
 
Most recent implementations of the LeNet architecture out there are not really faithful to the original in at least two respects: 

  1. They use `relu` units to introduce non-linearities after most layers, instead of the $A \cdot \tanh$ function that LeCun used in his design.

  2. They use a standard 'sigmoidal' fully connected layer for the final layer, instead of LeCun's proposed Gaussian connections.
  
  
We have followed the original in both respects, but we also tried the more modern variants and noted, with some surprise, that the original version works better, or at least that good hyperparameters are easier to find for it. We suspect this might be due to the limited number of training images used.
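
To make the two differences concrete, here is a minimal sketch in plain Python of the scaled hyperbolic tangent and of a layer with Gaussian connections. The constants `A = 1.7159` and `S = 2/3` are the values given in LeCun's paper; whether `tf_lenet.py` uses exactly these values is an assumption, and the function names are hypothetical. Each output unit with Gaussian connections computes the squared Euclidean distance between the input vector and that unit's parameter vector, so the predicted class is the one with the smallest output.

```python
import math

A = 1.7159     # amplitude of the activation, as in the paper
S = 2.0 / 3.0  # slope at the origin, as in the paper


def scaled_tanh(a):
    """The A * tanh(S * a) squashing function."""
    return A * math.tanh(S * a)


def relu(a):
    """The rectifier used by most modern re-implementations."""
    return max(0.0, a)


def gaussian_connections(x, centers):
    """One squared Euclidean distance per output unit (smaller is better)."""
    return [sum((xj - wj) ** 2 for xj, wj in zip(x, w)) for w in centers]


def predict(x, centers):
    """Index of the output unit whose parameter vector is closest to x."""
    distances = gaussian_connections(x, centers)
    return min(range(len(distances)), key=distances.__getitem__)


# Two hypothetical class prototypes in a 2-dimensional feature space.
centers = [[1.0, 1.0], [-1.0, -1.0]]
features = [scaled_tanh(2.0), scaled_tanh(0.7)]

print(scaled_tanh(1.0), relu(-1.0))
print(gaussian_connections(features, centers))
print(predict(features, centers))
```

With these constants the activation satisfies $f(1) \approx 1$ and $f(-1) \approx -1$, which keeps the unit outputs in a range that matches the $\pm 1$ entries typically used for the parameter vectors of the final layer.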
     
     
## Fine tuning and accuracy

After some (not very systematic) experimentation we determined that a decent choice for hyperparameters is: 

```
{ "model_name" : "model3",
  "rescale_mode" : "",
  "batch_size" : 100,
  "drop_colors" : 1,
  "learning_rate" : 0.0005,
  "dropout_rate" : 0.3,
  "epochs" : 300 }
```

The accuracy obtained with this choice is roughly **91%**: not great, but a few points better than with logistic regression.

In [1]:

    import sys
    import os
    os.chdir('../')
    from train_test import run_test

    DEBUG:matplotlib.backends:backend module://ipykernel.pylab.backend_inline version unknown

In [2]:

    run_test( "model3", "images/test")

    Level 100:train_test:test_4d has shape = ((238, 32, 32, 3),)
    Level 99:train_utils:test_lenet : importing tensorflow
    Level 99:train_utils:Testing...

    INFO:tensorflow:Restoring parameters from models/model3/saved/

    0.9117646898542132
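
As a quick sanity check on the figure above, the reported value can be related to the 238 test images shown in the log. The sketch below assumes that the returned number is the plain fraction of correctly classified images; the count of correct predictions is inferred from that assumption, not read from the logs.

```python
n_test = 238                    # from the logged shape (238, 32, 32, 3)
reported = 0.9117646898542132   # value returned by run_test

# Under the assumption accuracy = correct / total, recover the count.
n_correct = round(reported * n_test)
n_wrong = n_test - n_correct

print(n_correct, n_wrong)
print(n_correct / n_test)
```

The small difference between `n_correct / n_test` and the reported value is consistent with the accuracy having been computed in single precision.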
