We describe the components of a convolutional neural network (CNN) and how to build one similar to that described in Gradient-Based Learning Applied to Document Recognition by LeCun et al. All the necessary layers and components are built using Theano. In addition, a number of gradient descent methods are tested on the MNIST dataset.
The network consists of a series of convolution layers, pooling layers, and a fully connected layer, and can be visualized by the following diagram:
We show how to implement various forms of gradient descent and how to define the Exponential Linear Unit (ELU), as described in Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs) by Clevert et al.
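To make the ELU definition concrete, here is a minimal sketch in plain Python rather than Theano, so it runs without any dependencies. It follows the formula from Clevert et al.: the identity for positive inputs and `alpha * (exp(x) - 1)` otherwise. The function names `elu` and `elu_grad` and the default `alpha=1.0` are illustrative assumptions, not names taken from the notebook.

```python
import math


def elu(x, alpha=1.0):
    """Exponential Linear Unit for a single scalar input.

    Positive inputs pass through unchanged; non-positive inputs
    saturate smoothly towards -alpha.
    """
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def elu_grad(x, alpha=1.0):
    """Derivative of ELU with respect to x.

    For x <= 0 this equals elu(x, alpha) + alpha, so the backward
    pass can reuse the forward value.
    """
    return 1.0 if x > 0 else alpha * math.exp(x)


if __name__ == "__main__":
    for value in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(value, elu(value), elu_grad(value))
```

A tensor version would apply the same two branches elementwise; how the notebook expresses that in Theano is not shown here.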
A full description can be found in the IPython notebook LeNet5.ipynb.
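As a small illustration of the gradient descent variants mentioned above, the sketch below compares a plain gradient descent update with a classical momentum update on a toy one-dimensional quadratic. Which variants the notebook actually covers is not stated here, so these two are chosen only as examples. The objective, the function names, and the hyperparameters (`lr=0.1`, `mu=0.9`) are all assumptions made for the demonstration.

```python
def grad_f(w):
    """Gradient of the toy objective f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)


def sgd_step(w, grad, lr):
    """Plain gradient descent: step against the gradient."""
    return w - lr * grad


def momentum_step(w, velocity, grad, lr, mu=0.9):
    """Classical momentum: accumulate a decaying velocity, then move."""
    velocity = mu * velocity - lr * grad
    return w + velocity, velocity


def minimize(steps=300, lr=0.1):
    """Run both update rules from w = 0 and return the final values."""
    w_plain = 0.0
    w_mom, velocity = 0.0, 0.0
    for _ in range(steps):
        w_plain = sgd_step(w_plain, grad_f(w_plain), lr)
        w_mom, velocity = momentum_step(w_mom, velocity, grad_f(w_mom), lr)
    return w_plain, w_mom


if __name__ == "__main__":
    plain, mom = minimize()
    print("plain gradient descent:", plain)
    print("momentum:", mom)
```

Both runs should approach the minimizer `w = 3`. In a real network the scalar `w` becomes the set of weight tensors and `grad_f` is replaced by the gradient of the loss on a mini-batch.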
- Bengio Yoshua, Glorot Xavier, Understanding the Difficulty of Training Deep Feedforward Neural Networks, AISTATS, 2010, pp. 249–256
- Clevert Djork-Arne, Unterthiner Thomas, Hochreiter Sepp, Fast And Accurate Deep Network Learning by Exponential Linear Units (ELU), ICLR 2016, https://arxiv.org/abs/1511.07289
- Goodfellow Ian, Bengio Yoshua, Courville Aaron, Deep Learning, MIT Press, 2016, http://www.deeplearningbook.org
- He Kaiming et al., Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, ICCV, 2015, pp. 1026-1034
- LeCun Yann et al., Gradient-Based Learning Applied to Document Recognition, Proceedings of the IEEE, Nov 1998
Rewrite LeNet5 code in a more modular fashion