Train and Visualize Binary Neural Networks (Code for: The High-Dimensional Geometry of Binary Neural Networks)
data
.gitignore
README.md
bnn.py
bnn_test.py
gen_plots.ipynb
graph_ops.py
hd_figures.ipynb
plotting.py
run_bnn.py
utils.py


The High-Dimensional Geometry of Binary Neural Networks

This repository contains code to reproduce the results in the following paper:

The High-Dimensional Geometry of Binary Neural Networks

Alexander G. Anderson, Cory P. Berg

Abstract

Recent research has shown that one can train a neural network with binary weights and activations at train time by augmenting the weights with a high-precision continuous latent variable that accumulates small changes from stochastic gradient descent. However, there is a dearth of work to explain why one can effectively capture the features in data with binary weights and activations. Our main result is that the neural networks with binary weights and activations trained using the method of Courbariaux, Hubara et al. (2016) work because of the high-dimensional geometry of binary vectors. In particular, the ideal continuous vectors that extract out features in the intermediate representations of these BNNs are well-approximated by binary vectors in the sense that dot products are approximately preserved. Compared to previous research that demonstrated good classification performance with BNNs, our work explains why these BNNs work in terms of HD geometry. Furthermore, the results and analysis used on BNNs are shown to generalize to neural networks with ternary weights and activations. Our theory serves as a foundation for understanding not only BNNs but a variety of methods that seek to compress traditional neural networks. Furthermore, a better understanding of multilayer binary neural networks serves as a starting point for generalizing BNNs to other neural network architectures such as recurrent neural networks.
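The abstract's central claim, that a continuous vector is well approximated by its binarized version in high dimensions, can be checked numerically. The following is a minimal sketch and is not taken from this repository: the function names are illustrative, and it assumes the continuous vector has i.i.d. standard Gaussian entries. Under that assumption, the cosine of the angle between a vector and its sign vector concentrates around sqrt(2/pi), roughly 0.80 (an angle of about 37 degrees), as the dimension grows.

```python
import math
import random


def binarize(x):
    """Map each coordinate to +1 or -1 according to its sign."""
    return [1.0 if v >= 0 else -1.0 for v in x]


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_cosine_to_binarized(dim, trials=20, seed=0):
    """Average cosine between Gaussian vectors and their binarizations.

    Assumption: entries are i.i.d. standard normal, which is an
    idealization rather than a property of any trained network.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        total += cosine(x, binarize(x))
    return total / trials


if __name__ == "__main__":
    limit = math.sqrt(2.0 / math.pi)  # high-dimensional limit of the cosine
    for dim in (10, 100, 1000):
        print(dim, round(mean_cosine_to_binarized(dim), 4), round(limit, 4))
```

The spread of the measured cosine around the limit shrinks as `dim` increases, which is the sense in which the approximation is a high-dimensional effect.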

Quickstart:

  1. Download the relevant datasets. From the data directory, run:

python download_data.py --data_set MNIST

python download_data.py --data_set cifar-10

  2. (Optional) Run a quick test: python bnn_test.py

Run a longer test: python run_bnn.py --debug (trains a network on a small fraction of CIFAR-10)

  3. Train a full network: python run_bnn.py

  4. Visualize a trained network: jupyter notebook gen_plots.ipynb

  5. Analyze quantization of HD vectors assuming a Gaussian distribution: jupyter notebook hd_figures.ipynb

Be sure to choose the appropriate path for loading the model.

[Note: if the parameters are changed in run_bnn.py, they must also be changed in the visualization script.]
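The non-interactive Quickstart steps can also be chained from one script. The sketch below is a hypothetical helper, not part of this repository: `quickstart_commands` and `run_quickstart` are names introduced here, and it assumes the script is started from the repository root with `download_data.py` located in the data directory, as the Quickstart describes. The notebook steps are left out because they are interactive.

```python
import subprocess
import sys


def quickstart_commands(debug=True):
    """Return (working_directory, argv) pairs for the Quickstart steps.

    With debug=True the training step uses `run_bnn.py --debug`;
    otherwise it runs the full training.
    """
    py = sys.executable  # reuse the interpreter running this script
    train = [py, "run_bnn.py"] + (["--debug"] if debug else [])
    return [
        ("data", [py, "download_data.py", "--data_set", "MNIST"]),
        ("data", [py, "download_data.py", "--data_set", "cifar-10"]),
        (".", [py, "bnn_test.py"]),
        (".", train),
    ]


def run_quickstart(debug=True):
    """Run each step in order and stop at the first failing command."""
    for cwd, argv in quickstart_commands(debug):
        print("running in", cwd, ":", " ".join(argv))
        subprocess.run(argv, cwd=cwd, check=True)
```

Calling `run_quickstart()` runs the download, quick test, and debug training in sequence; `run_quickstart(debug=False)` trains the full network instead.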