
Neural Arithmetic Logic Units #52

howardyclo opened this issue Apr 23, 2019 · 1 comment

howardyclo commented Apr 23, 2019

Metadata

howardyclo commented May 4, 2019

TL;DR

Presents a simple module capable of learning arithmetic functions such as addition, subtraction, multiplication, and division, which generalizes well to numbers outside the training range and to unseen inference schemes.

DNNs with Non-linearities Struggle to Learn Identity Function

  • Train an autoencoder to reconstruct its input, sampled from the range [-5, 5].
  • All autoencoders share the same parameterization (3 hidden layers of size 8) and differ only in their non-linearity.
  • Trained with MSE loss.
  • Tested on [-20, 20], the error increases severely both below and above the range of numbers seen during training.
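
A minimal sketch of this experiment, assuming the 3-hidden-layer, width-8 MLP described above; the optimizer, learning rate, and step count are illustrative choices, not taken from the paper:

```python
import torch
import torch.nn as nn

def make_autoencoder(act):
    # 3 hidden layers of size 8; the non-linearity is the only variable.
    return nn.Sequential(
        nn.Linear(1, 8), act(),
        nn.Linear(8, 8), act(),
        nn.Linear(8, 8), act(),
        nn.Linear(8, 1),
    )

model = make_autoencoder(nn.ReLU)  # swap in nn.Tanh, nn.Sigmoid, ... to compare
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

for _ in range(10_000):
    x = torch.empty(64, 1).uniform_(-5, 5)        # training range [-5, 5]
    loss = nn.functional.mse_loss(model(x), x)    # reconstruct the input
    opt.zero_grad(); loss.backward(); opt.step()

# Extrapolation test: the error grows sharply outside [-5, 5].
x_test = torch.linspace(-20, 20, 401).unsqueeze(1)
with torch.no_grad():
    err = (model(x_test) - x_test).abs().squeeze()
```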

The Neural Accumulator (NAC) & Neural Arithmetic Logic Unit (NALU)

  • NAC: A special case of a linear layer whose weight matrix W is biased towards values in {-1, 0, 1}, defined as:
    • W = tanh(\hat{W}) ⊙ σ(\hat{M}) (element-wise product)
    • The elements of W are guaranteed to lie in [-1, 1] and are biased towards {-1, 0, 1} during learning, since those values correspond to the saturation points of tanh(·) (at ±1) and σ(·) (at 0 and 1).
    • Its outputs are therefore additions and subtractions of elements of the input vector.
  • NALU: Learns a gated combination of two sub-cells (a sketch of both cells follows this list):
    • One is the original NAC, capable of learning to add and subtract.
    • The other operates in log space, capable of learning to multiply and divide, e.g., log(XY) = log X + log Y; log(X/Y) = log X - log Y; exp(log(X)) = X.
    • Together, the NALU can learn to perform general arithmetic operations.
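
A minimal PyTorch sketch of both cells, following the formulas above; the module and parameter names (NAC, NALU, W_hat, M_hat, G) are illustrative, not from any official implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NAC(nn.Module):
    """Neural Accumulator: a linear layer whose effective weights are
    biased towards {-1, 0, 1} via W = tanh(W_hat) * sigmoid(M_hat)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W_hat = nn.Parameter(torch.empty(out_dim, in_dim))
        self.M_hat = nn.Parameter(torch.empty(out_dim, in_dim))
        nn.init.xavier_uniform_(self.W_hat)
        nn.init.xavier_uniform_(self.M_hat)

    def weight(self):
        return torch.tanh(self.W_hat) * torch.sigmoid(self.M_hat)

    def forward(self, x):
        # a = Wx: signed sums of input elements (no arbitrary scaling)
        return F.linear(x, self.weight())

class NALU(nn.Module):
    """NALU: a sigmoid-gated mixture of the additive NAC cell and a
    multiplicative cell applying the same weights in log space."""
    def __init__(self, in_dim, out_dim, eps=1e-7):
        super().__init__()
        self.nac = NAC(in_dim, out_dim)
        self.G = nn.Parameter(torch.empty(out_dim, in_dim))
        nn.init.xavier_uniform_(self.G)
        self.eps = eps

    def forward(self, x):
        a = self.nac(x)                                  # add/subtract path
        # m = exp(W log(|x| + eps)): multiply/divide path in log space
        m = torch.exp(F.linear(torch.log(x.abs() + self.eps),
                               self.nac.weight()))
        g = torch.sigmoid(F.linear(x, self.G))           # learned gate
        return g * a + (1 - g) * m
```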

Limitations of a single NALU [Ref]

  • Can handle either add/subtract or mult/div operations, but not a combination of both.
  • For mult/div operations, it cannot produce negative targets: the mult/div sub-cell's output comes from an exponentiation, which always yields positive values (demonstrated in the snippet below).
  • Power operations are only possible when the exponent is in the range [0, 1].
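
A quick numeric illustration of the sign limitation, using the log-space cell m = exp(W log(|x| + ε)) from above; the input and weight values here are arbitrary:

```python
import torch

# The multiplicative cell ends in an exponentiation, so its output is
# positive for any weights and inputs.
x = torch.tensor([[-3.0, 2.0]])   # target -3 * 2 = -6 cannot be produced
W = torch.tensor([[1.0, 1.0]])    # ideal weights for plain multiplication
m = torch.exp(torch.log(x.abs() + 1e-7) @ W.t())
print(m.item())                   # ~6.0, not -6.0: the sign is lost
```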

Related Work
