iamtrask/NALU-2

NALU

Implementation of Neural Arithmetic Logic Units as discussed in https://arxiv.org/abs/1808.00508

This implementation

The implementation here deviates from the paper when it comes to computing the gate variable g.
The paper enforces a dependence of g on the input x with the equation g = σ(Gx).
However, for most purposes the gating function depends only on the task, not on the input,
and can be learnt independently of the input.
This implementation instead uses g = σ(G), where G is a learnt scalar.
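As a minimal NumPy sketch of a NALU cell with the input-independent scalar gate described above (function names and the epsilon value are illustrative, not taken from this repo's code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nalu_forward(x, W_hat, M_hat, G, eps=1e-7):
    """One NALU forward pass with an input-independent scalar gate.

    W_hat, M_hat: (in_dim, out_dim) weight matrices; G: a learnt scalar.
    Per this repo's deviation from the paper, g = sigmoid(G) ignores x.
    """
    W = np.tanh(W_hat) * sigmoid(M_hat)      # shared NAC weight matrix
    a = x @ W                                # additive path (add/subtract)
    m = np.exp(np.log(np.abs(x) + eps) @ W)  # multiplicative path (mult/div)
    g = sigmoid(G)                           # scalar gate, independent of x
    return g * a + (1.0 - g) * m
```

With G pushed strongly positive the gate saturates towards the additive path; strongly negative selects the multiplicative path, which is how a single unit commits to one operation family.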

For recurrent tasks, however, it does make sense to condition the gate value on the input.

Limitations of a single NALU

  • Can handle either add/subtract or mult/div operations but not a combination of both.
  • For mult/div operations, it cannot handle negative targets, as the mult/div path's
    output is the result of an exponentiation operation, which always yields positive results.
  • Power operations are only possible when the exponent is in the range of [0, 1].
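The mult/div limitation follows directly from the log-exp construction; a short illustrative check (the `mult_path` helper is hypothetical, not a function in this repo):

```python
import numpy as np

# Illustrative: the multiplicative path computes exp(log(|x| + eps) @ W),
# so its output is strictly positive for any learnt W -- the sign of
# negative inputs is discarded.
def mult_path(x, W, eps=1e-7):
    return np.exp(np.log(np.abs(x) + eps) @ W)

x = np.array([[2.0, -3.0]])  # true product is -6
W = np.ones((2, 1))          # an idealised "multiply both inputs" weight
out = mult_path(x, W)        # approximately 6.0: magnitude right, sign lost
```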

Advantages of using NALU

  • The careful design of the mathematics ensures that the learnt weights allow for both
    interpolation and extrapolation.
