Training stochastic binary networks (SBN) using gumbel-sigmoid and gumbel-softmax

hellolzc/SBN_gumbel

This is largely a work in progress.

Figures: Bernoulli SBN and Categorical SBN.

Main idea:

Explanation and motivation: https://arxiv.org/abs/1611.01144

There is a recently introduced trick that allows training networks with quasi-discrete categorical activations via a gumbel-softmax or gumbel-sigmoid nonlinearity. A great explanation of how it works can be found here.
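A minimal sketch of the gumbel-softmax trick in PyTorch (the function names here are illustrative, not this repo's API): Gumbel(0, 1) noise is sampled by inverse transform and added to the logits before a temperature-scaled softmax.

```python
import torch
import torch.nn.functional as F

def sample_gumbel(shape, eps=1e-20):
    # Gumbel(0, 1) samples via inverse transform: -log(-log(U)), U ~ Uniform(0, 1).
    u = torch.rand(shape)
    return -torch.log(-torch.log(u + eps) + eps)

def gumbel_softmax(logits, temperature=1.0):
    # Perturb the logits with Gumbel noise, then apply a temperature-scaled
    # softmax. As temperature -> 0 the samples approach one-hot vectors;
    # higher temperatures give smoother, more uniform distributions.
    noisy = logits + sample_gumbel(logits.shape)
    return F.softmax(noisy / temperature, dim=-1)
```

Note that recent PyTorch versions also ship a built-in `torch.nn.functional.gumbel_softmax`, including a `hard=True` straight-through variant.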

The trick is to add special noise to the softmax distribution that favors almost-one-hot outcomes. Such noise can be sampled from the Gumbel distribution. Since a sigmoid can be viewed as a special case of a two-class softmax (over logit and 0), we can use the same technique to implement an LSTM network with gates that are ultimately forced to converge to 0 or 1. Here's a demo of gumbel-sigmoid on a toy task.
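Under the two-class view above, the softmax over (logit, 0) with independent Gumbel noise on each class reduces to a sigmoid of the logit plus a difference of two Gumbel samples. A minimal sketch (assumed names, not this repo's API):

```python
import torch

def gumbel_sigmoid(logits, temperature=1.0, eps=1e-20):
    # sigmoid(x) == softmax([x, 0])[0], so perturbing both classes with
    # independent Gumbel(0, 1) noise is equivalent to adding the difference
    # of two Gumbel samples to the logit. A low temperature pushes the
    # outputs toward hard 0/1 values while keeping the sample differentiable.
    u1 = torch.rand_like(logits)
    u2 = torch.rand_like(logits)
    g1 = -torch.log(-torch.log(u1 + eps) + eps)
    g2 = -torch.log(-torch.log(u2 + eps) + eps)
    return torch.sigmoid((logits + g1 - g2) / temperature)
```

In training, the temperature is typically annealed from a high value toward a small one so that gates gradually sharpen to near-binary values.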

Such a network can then be binarized: multiplications can be replaced with if/else operations and fp16 arithmetic to drastically improve execution speed, especially when implemented on special-purpose hardware; see here and here.
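At test time the binarization amounts to dropping the noise and temperature and thresholding the logits, so each gate becomes a hard 0/1 decision. A minimal sketch under that assumption (not code from this repo):

```python
import torch

def binarize_gate(logits, threshold=0.0):
    # Deterministic test-time gate: no Gumbel noise, no temperature.
    # Each unit is a hard 0 or 1, so multiplying an activation by the gate
    # can be replaced by an if/else selection of the activation or zero.
    return (logits > threshold).float()
```

Usage: `binarize_gate(torch.tensor([-1.0, 0.5, 2.0]))` yields a 0/1 tensor that can gate activations via `torch.where` instead of a multiply.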

Contributors so far

  • Lambda Lab
  • Arseniy Ashukha (advice & useful comments)
  • hellolzc

Environments

  • Python 3
  • PyTorch
  • numpy
  • scikit-learn
