Enhancement: Convolution networking #14
The purpose of this issue is to discuss implementing convolution neural networking in neural2d. Comments are appreciated.
Neural2d currently implements convolution filtering, but not convolution networking.
A convolution filter is an image processing operation such as edge detection, low pass filtering (smoothing), etc. In neural2d, any layer of neurons can be configured as a convolution filter. In a convolution filter layer, all the neurons share the same convolution kernel, which is just a matrix of weights. The weights do not undergo training; they remain constant throughout the life of the net.
A convolution neural network is somewhat similar to convolution filtering, except that the kernel weights undergo training, and the layer is replicated N times so that N separate kernels can be trained.
To be more precise, when we speak of a convolution neural net, we are usually talking about a regular neural net with one or more convolution network layers that form special subnets within the larger neural network.
It's unfortunate that the terms are so similar -- "convolution filtering" vs. "convolution networking." Is there a better terminology we can use?
The purpose of convolution neural networking is to uncover a number of image features during training that help with pattern recognition. Commonly, image features are little patches of pixels that look like bits of edges, corners, or other shapes. Ideally, all the features found during training are highly uncorrelated, forming a set of orthogonal mathematical bases that can help partition the classification space.
The output from one convolution network layer can become the input to another convolution network layer, so that each convolution stage can find higher level features.
Here are the minimum required changes to implement convolution neural networking in neural2d:

- Extend the topology configuration file syntax to specify convolution network layers
- Expand the Layer data structure to support a depth dimension
- Add backprop weight updates for convolution network layers

Additionally, it would be useful to implement the following features:

- A rectified linear unit (ReLU) transfer function
- Pooling of convolution layer outputs

These sub-tasks are discussed below.
Topology configuration file syntax
A convolution filter layer can be thought of as a degenerate convolution network layer with a depth of 1, and no weight updates during backprop.
The existing topology configuration syntax for convolution filtering could be expanded to allow a depth parameter. For example, for a convolution network layer of 10 kernels:
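As a sketch of what that might look like (the exact parameter spellings here are assumptions for illustration, following the existing config style):

```
# Hypothetical: a convolution network layer of 10 kernels,
# each 5x5, over a 32x32 input layer
input size 32x32
layerConv size 10*32x32 from input convolve 5x5
```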
Expand the Layer data structure
Currently a single layer contains a two-dimensional set of neurons. That can be expanded to implement a layer depth. If a layer is configured as a convolution network layer, the layer instance will be replicated N times to train N kernels. We'll refer to N as the layer depth. (Regular layers of neurons and convolution filter layers have a default depth of 1.)
Adding a depth dimension to the layer structure will make it a little more complicated to loop over neurons during forward or back propagation. Perhaps the Layer class should provide a variety of iterators to hide those details. That would be a nice little refactoring project that could be done before the other changes.
Backprop weight updates
Currently, neural2d does not update input weights on layers flagged as convolution filter layers. For convolution network layers, we would need to add a backprop function to update the weights in each convolution network layer.
Rectified linear units
Here is one possible ReLU transfer function and its derivative:
Care must be given to the derivative at zero.
This would be just another transfer function specified by the "tf" parameter in the topology config file.
Pooling the output
To pool the output of a convolution network layer, define a new layer smaller than the convolution layer with the convolution layer as the input, and give it the new "pool" parameter. The pool parameter takes a pooling method (max or avg), and a window size over which to apply the pooling method:
A pooling layer may also take its input from a regular layer of depth 1.
Following is an example of a convolution network pipeline using the syntax proposed above:
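A sketch of such a pipeline (layer names and exact parameter spellings are assumptions for illustration):

```
input size 64x64
layerConv size 10*64x64 from input convolve 5x5 tf relu
layerPool size 10*16x16 from layerConv pool max 4x4
layerHidden size 32 from layerPool
output size 2 from layerHidden
```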
Commits referencing this issue were added on Apr 28, 2015 and May 4, 2015.
Here are some design notes collected during the implementation of #14. These changes will soon be checked in:
A convolution network layer can be defined in the topology config file by specifying a depth on the size parameter and specifying the convolution kernel size with a convolve parameter. For example, to train 40 kernels of size 7x7 on an input image of 64x64 pixels:
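Following that description, such a layer definition might look like this (the exact token spellings are an assumption based on the notes above):

```
input size 64x64
layerConv size 40*64x64 from input convolve 7x7
```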
In the topology config syntax, the pool parameter requires the argument "avg" or "max" followed by the operator size. For example, to pool a 40-kernel layer of 64x64 neurons into a layer of 16x16 neurons:
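A sketch of that pooling definition (layer names are illustrative; a 4x4 window reduces 64x64 to 16x16):

```
layerPool size 40*16x16 from layerConv pool max 4x4
```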
There are some performance considerations in addition to the extra level of indirection of the Connection records. In convolution filtering and convolution networking layers, the weight and gradient members of the Connection records are unused, reducing cache efficiency when looping over the connections. We could define different kinds of connection records derived from a base class Connection, but making class Connection virtual immediately adds space overhead to every instance for the vtable, negating any advantages we hoped to achieve. So, I'm keeping class Connection a non-virtual POD object even though it is not as cache-efficient as possible for convolution layers.