# Neural Accumulator and ALU
This is a simple implementation of the Neural Accumulator (NAC) and the Neural Arithmetic Logic Unit (NALU) as Flux layers.

From the paper (Trask et al., 2018, *Neural Arithmetic Logic Units*):
> Here we propose two models that are able to learn to represent and manipulate numbers in a systematic
way. The first supports the ability to accumulate quantities additively, a desirable inductive bias for
linear extrapolation. This model forms the basis for a second model, which supports multiplicative
extrapolation.

## Neural Accumulator
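The NAC is a special case of a linear layer whose weight matrix $\mathbf{W}$ is biased toward the values -1, 0, and 1. Its outputs are therefore additions and subtractions of the input elements, which prevents the layer from arbitrarily rescaling the representations when mapping from input to output.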

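This is accomplished by combining the saturating nonlinearities $\tanh$ and $\sigma$:

$$
\begin{align}
\mathbf{a} &= \mathbf{Wx} \\
\mathbf{W} &= \tanh(\mathbf{\hat{W}}) \odot \sigma(\mathbf{\hat{M}})
\end{align}
$$
where $\mathbf{\hat{W}}$ and $\mathbf{\hat{M}}$ are unconstrained, learnable weight matrices. Because $\tanh$ ranges over $(-1, 1)$ and $\sigma$ over $(0, 1)$, every entry of $\mathbf{W}$ lies in $(-1, 1)$ and is pushed toward -1, 0, or 1 as the two functions saturate.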
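To make the saturation argument concrete, here is a minimal sketch in plain Julia (no Flux required). The names `sigmoid` and `effective_weight` are illustrative helpers, not part of the notebook or of Flux, and the raw parameter values are hand-picked rather than learned.

```julia
# Logistic sigmoid, defined locally so the snippet needs no packages.
sigmoid(z) = 1 / (1 + exp(-z))

# Effective NAC weight for one pair of raw parameters (w_hat, m_hat).
effective_weight(w_hat, m_hat) = tanh(w_hat) * sigmoid(m_hat)

w_plus  = effective_weight( 10.0,  10.0)  # tanh and sigmoid both saturate high
w_minus = effective_weight(-10.0,  10.0)  # tanh saturates low
w_zero  = effective_weight( 10.0, -10.0)  # sigmoid closes the connection

println((w_plus, w_minus, w_zero))
```

With large-magnitude raw parameters the three results land close to 1, -1, and 0 respectively; intermediate values remain possible, but they always stay strictly inside $(-1, 1)$.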

In [1]:
# plot of tanh(x)*σ(y)

In [2]:
using Flux

In [3]:
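# Neural Accumulator
struct NAC
    W  # raw parameters Ŵ, passed through tanh
    M  # raw parameters M̂, passed through σ
end

# `param` marks an array as trainable (Tracker-based Flux, v0.9 and earlier)
NAC(in::Integer, out::Integer) = 
    NAC(param(randn(out, in)), param(randn(out, in)))

# forward pass: a = (tanh(Ŵ) ⊙ σ(M̂)) x
(nac::NAC)(x) = (tanh.(nac.W) .* σ.(nac.M)) * x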
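As a quick illustration of what the layer can represent, the sketch below applies the same forward rule with hand-picked parameters. It is written as a plain function, `nac_forward`, which is a hypothetical stand-in for the `NAC` struct so that the snippet runs without Flux; the parameter values are chosen by hand, not trained.

```julia
sigmoid(z) = 1 / (1 + exp(-z))

# Same forward rule as the NAC layer, written as a plain function.
nac_forward(W_hat, M_hat, x) = (tanh.(W_hat) .* sigmoid.(M_hat)) * x

# Hand-picked raw parameters: 2 outputs, 3 inputs.
# Row 1 targets effective weights [1, 1, 0]   -> x1 + x2
# Row 2 targets effective weights [1, 0, -1]  -> x1 - x3
W_hat = [20.0  20.0  20.0;
         20.0  20.0 -20.0]
M_hat = [20.0  20.0 -20.0;
         20.0 -20.0  20.0]

x = [3.0, 4.0, 5.0]
y = nac_forward(W_hat, M_hat, x)
println(y)
```

Here `y` comes out very close to `[7, -2]`, that is, `3 + 4` and `3 - 5`. The inputs pass through without rescaling because every effective weight sits near -1, 0, or 1.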

## Neural Arithmetic Logic Unit
> The NALU consists of two NAC cells interpolated by a learned sigmoidal gate g, such that if the add/subtract subcell’s output value is applied with a weight of 1 (on), the multiply/divide subcell’s is 0 (off) and vice versa. The first NAC computes the accumulation vector a, which stores results of the NALU’s addition/subtraction operations; it is computed identically to the original NAC, (i.e., a = Wx). The second NAC operates in log space and is therefore capable of learning to multiply and divide, storing its results in m.

$$
\begin{align}
\mathbf{y} &= \mathbf{g} \odot \mathbf{a} + (1 - \mathbf{g}) \odot \mathbf{m} \\
\mathbf{m} &= \exp\left(\mathbf{W} \log(|\mathbf{x}| + \epsilon)\right) \\
\mathbf{g} &= \sigma(\mathbf{Gx})
\end{align}
$$
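The log-space path works because $\exp(\log x_1 + \log x_2) = x_1 x_2$ and $\exp(\log x_1 - \log x_2) = x_1 / x_2$. The following sketch checks this numerically with fixed weight matrices. `log_space_path` and its `epsilon` keyword are hypothetical names introduced for illustration, and the value `1e-10` is an arbitrary choice.

```julia
# Multiplicative path m = exp(W * log(|x| + ϵ)) for a fixed weight matrix W.
log_space_path(W, x; epsilon = 1e-10) = exp.(W * log.(abs.(x) .+ epsilon))

x = [6.0, 3.0]

product  = log_space_path([1.0  1.0], x)  # weights [1,  1] -> 6 * 3
quotient = log_space_path([1.0 -1.0], x)  # weights [1, -1] -> 6 / 3

# The absolute value discards the sign of the input.
unsigned = log_space_path([1.0 1.0], [-6.0, 3.0])

println((product[1], quotient[1], unsigned[1]))
```

The results are approximately 18, 2, and 18. The last value shows a limitation of this formulation: since $|\mathbf{x}|$ is taken before the logarithm, the path cannot produce a negative product.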

In [4]:
# Neural Arithmetic Logic Unit
struct NALU
    nacₐ :: NAC
    nacₘ :: NAC
    G
end

NALU(in::Integer, out::Integer) = 
    NALU(NAC(in, out), NAC(in, out), param(randn(out, in)))

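function (nalu::NALU)(x)
    # gate: g = σ(Gx), one value in (0, 1) per output
    g = σ.(nalu.G*x)
    
    # addition/subtraction path: a = Wx
    a = nalu.nacₐ(x)
    
    # multiplication/division path: the second NAC applied in log space
    m = exp.(nalu.nacₘ(log.(abs.(x) .+ eps())))
    
    # interpolate between the two paths
    return g .* a .+ (1 .- g) .* m
end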
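To see how the gate selects between the two paths, here is a minimal sketch that mirrors the forward pass above in plain Julia. `nalu_forward` is a hypothetical helper that takes already-saturated effective weights directly, so it needs neither Flux nor training; all values are hand-picked.

```julia
sigmoid(z) = 1 / (1 + exp(-z))

# Forward pass with explicit effective weights (hypothetical helper).
function nalu_forward(W_add, W_mul, G, x; epsilon = 1e-10)
    g = sigmoid.(G * x)                          # gate
    a = W_add * x                                # additive path
    m = exp.(W_mul * log.(abs.(x) .+ epsilon))   # multiplicative path
    return g .* a .+ (1 .- g) .* m
end

W = [1.0 1.0]   # select both inputs with weight 1
x = [2.0, 5.0]

y_add = nalu_forward(W, W, [ 10.0  10.0], x)  # gate ≈ 1 -> addition
y_mul = nalu_forward(W, W, [-10.0 -10.0], x)  # gate ≈ 0 -> multiplication

println((y_add[1], y_mul[1]))
```

With the gate driven toward 1 the output is approximately `2 + 5 = 7`, and with the gate driven toward 0 it is approximately `2 * 5 = 10`. Note that the gate depends on the input `x` through `G * x`, so the same `G` can open or close differently for inputs of different sign or magnitude.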