C++ Neural Network in a Weekend

This repository is the companion code to the article "Neural Network in a Weekend." Readers are welcome to clone the repository and use the code herein as a reference if following along with the article. Pull requests and issues filed for errors and bugs in the code and/or documentation are welcome and appreciated. However, pull requests that introduce new features are unlikely to be considered, as the ultimate goal of this code is to remain tractable for a newer practitioner getting started with deep learning architectures.

Article PDF link

Compilation and Usage

mkdir build
cd build
# substitute Ninja for your preferred generator
cmake .. -G Ninja
ninja
# trains the network and writes the learned parameters to disk
./src/nn train ../data/train
# evaluate the model loss and accuracy based on the trained parameters
./src/nn evaluate ../data/test ./ff.params

Note that the actual location of the nn executable may depend on your build system and build type. For performance reasons, it is recommended to run the training itself with an optimized build, reverting to a development/debug build only when debugging is needed.
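With a single-configuration generator such as Ninja, the build type can be selected at configuration time, for example (standard CMake usage, not project-specific flags):

cmake .. -G Ninja -DCMAKE_BUILD_TYPE=Release
ninja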

Conventions

  1. Member variables have a single underscore suffix (e.g. member_variable_)
  2. The acronym F.T.R. stands for "For the reader" and precedes suggestions for experimentation, improvements, or alternative implementations
  3. Throughout, you may see the type aliases num_t and rne_t. These aliases refer to float and std::mt19937 respectively and are defined in Model.hpp so that alternative precisions and random number engines can easily be swapped in (see the sketch after this list). The reader may wish to make these parameters configurable by other means.
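For reference, the aliases amount to something like the following minimal sketch (the exact declarations in Model.hpp may differ slightly):

#include <random>

// Numeric precision used for weights, activations, and gradients
using num_t = float;

// Random number engine used for parameter initialization
using rne_t = std::mt19937;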

General Code Structure

The neural network is modeled as a computational graph. The graph itself is the Model defined in Model.hpp. Nodes in the computational graph derive from the Node base class and must implement various methods describing how data flows through the node (forwards and backwards).
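To illustrate the shape of that interface, a node might look something like the sketch below (illustrative only; the method names and signatures of the actual Node class may differ):

// Illustrative sketch of a computational graph node
class Node
{
public:
    virtual ~Node() = default;

    // Forward pass: consume activations from upstream nodes
    // and compute this node's outputs
    virtual void forward(num_t const* inputs) = 0;

    // Backward pass: consume loss gradients from downstream nodes,
    // accumulate gradients with respect to this node's parameters,
    // and propagate gradients upstream
    virtual void reverse(num_t const* gradients) = 0;
};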

The fully-connected feedforward node in this example is implemented as FFNode in FFNode.hpp. The cross-entropy loss node is implemented in CELossNode.hpp. Together, these two nodes are all that is needed to train our example on the MNIST dataset.
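Conceptually, training assembles these nodes into a graph along the following lines (hypothetical pseudocode: the add_node helper, constructor arguments, and hidden layer width of 32 are illustrative, not the repository's actual API):

// Hypothetical wiring of the MNIST training graph
Model model;

// MNIST images are 28x28 = 784 pixels; there are 10 digit classes
FFNode& hidden = model.add_node<FFNode>(784, 32);
FFNode& output = model.add_node<FFNode>(32, 10);
CELossNode& loss = model.add_node<CELossNode>(10);

// A forward pass computes activations through each node in turn; a
// backward pass propagates loss gradients so an optimizer can update
// the FFNode weights and biases.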

Data

For your convenience, the MNIST data used to train and test the network is provided uncompressed in the data/ subdirectory. The data is structured like so:

Images

Image data can be parsed using code provided in the MNIST.hpp header, but the format is described here as well. Multi-byte integers are stored with the most significant byte (MSB) first, meaning that on a little-endian architecture the bytes must be swapped. Image pixel data is stored in row-major order and packed contiguously, one image after another.

 Bytes
[00-03] 0x00000803 (Magic Number: 2051)
[04-07] image count
[08-11] rows
[12-15] columns
[16]    pixel[0, 0]
[17]    pixel[0, 1]
...
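Because the header fields are big-endian, a parser running on a little-endian machine must reverse the byte order when reading each 32-bit field. A minimal sketch of such a read (not the exact code in MNIST.hpp) might look like:

#include <cstdint>
#include <fstream>

// Read a 32-bit big-endian integer from the stream; assembling the
// result byte by byte yields the correct value regardless of the
// host's endianness.
uint32_t read_u32_be(std::ifstream& in)
{
    unsigned char bytes[4];
    in.read(reinterpret_cast<char*>(bytes), 4);
    return (static_cast<uint32_t>(bytes[0]) << 24)
         | (static_cast<uint32_t>(bytes[1]) << 16)
         | (static_cast<uint32_t>(bytes[2]) << 8)
         |  static_cast<uint32_t>(bytes[3]);
}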

Labels

Label data is parsed according to the following byte layout:

 Bytes
[00-03] 0x00000801 (Magic Number: 2049)
[04-07] label count
[08]    label[0]
[09]    label[1]
...

The parser provided by the MNIST input node validates the magic numbers to ensure the machine endianness is as expected, and also validates that the image data and label data sizes match.
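Using the read_u32_be helper sketched above, that validation amounts to something like the following (illustrative; the actual checks live in the MNIST input node):

#include <stdexcept>

uint32_t image_magic = read_u32_be(image_file); // expect 2051 (0x00000803)
uint32_t label_magic = read_u32_be(label_file); // expect 2049 (0x00000801)
if (image_magic != 2051 || label_magic != 2049)
{
    throw std::runtime_error("Unexpected MNIST magic number");
}

uint32_t image_count = read_u32_be(image_file);
uint32_t label_count = read_u32_be(label_file);
if (image_count != label_count)
{
    throw std::runtime_error("Image and label counts differ");
}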
