# Neural Arithmetic Logic Unit

This notebook contains basic experiments with the Neural Accumulator (NAC) and Neural Arithmetic Logic Unit (NALU) layers. We attempt to reproduce some of the results reported in "Neural Arithmetic Logic Units" ([arXiv:1808.00508](https://arxiv.org/abs/1808.00508)).

In [1]:
import keras as k
import keras.backend as K
from keras.layers import Input
from keras.models import Model

Using TensorFlow backend.


In [2]:
import nalu.GenData as gd

In [3]:
from nalu.NALU import NALU
from nalu.NAC import NAC

In [4]:
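def pred_close(y_true, y_pred):
    # Metric: 1.0 where the absolute error is below 1e-4, else 0.0 (Keras averages it).
    return K.cast(K.abs(y_true - y_pred) < 1e-4, K.floatx())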
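
The metric above marks a prediction as correct when its absolute error is below 1e-4, and the value Keras reports is the mean of those marks. A minimal NumPy sketch of the same idea follows; the name `pred_close_np`, the `tol` parameter, and the sample values are illustrative and not part of the notebook.

```python
import numpy as np

def pred_close_np(y_true, y_pred, tol=1e-4):
    """Fraction of predictions whose absolute error is below tol."""
    close = np.abs(y_true - y_pred) < tol
    return float(np.mean(close))

# Made-up values: three predictions fall within tolerance, one does not.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.0, 2.00005, 3.1, 4.0])
frac = pred_close_np(y_true, y_pred)
print(frac)
```

The second number returned by `m.evaluate` in the cells below can be read the same way, as the share of samples predicted to within the tolerance.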

## Identity

### NAC


In [43]:
_, Y = gd.gd_paper(size=100000)

In [46]:
x = Input((1,))  # one feature per sample; the target is the input itself
y = NAC(1)(x)  # a single NAC output unit
m = Model(x, y)
m.compile(k.optimizers.RMSprop(lr=0.01), "mse", metrics=[pred_close])
m.fit(Y, Y, batch_size=20, epochs=10)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x1481b08ac88>

In [54]:
_, Yext = gd.gd_uniform(500000,550000, size=50000)
m.evaluate(Yext, Yext)



[0.0, 1.0]

### NALU

In [55]:
x = Input((1,))  # one feature per sample; the target is the input itself
y = NALU(1)(x)  # a single NALU output unit
m = Model(x, y)
m.compile(k.optimizers.RMSprop(lr=0.01), "mse", metrics=[pred_close])
m.fit(Y, Y, batch_size=20, epochs=10)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x1481b05e860>

In [57]:
_, Yext = gd.gd_uniform(500000,550000, size=50000)
m.evaluate(Yext, Yext)




[<tf.Variable 'nalu_7/W_hat:0' shape=(1, 1) dtype=float32_ref>,
 <tf.Variable 'nalu_7/M_hat:0' shape=(1, 1) dtype=float32_ref>,
 <tf.Variable 'nalu_7/G:0' shape=(1, 1) dtype=float32_ref>]

In [64]:
m.get_weights()

[array([[9.012543]], dtype=float32),
 array([[16.635576]], dtype=float32),
 array([[0.8285337]], dtype=float32)]
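
To interpret these weights, it can help to plug them into the NALU equations from the paper: W = tanh(W_hat) * sigmoid(M_hat), an additive path a = xW, a multiplicative path m = exp(log(|x| + eps) W), and a gate g = sigmoid(xG) that mixes the two. The sketch below assumes that the `nalu` package follows this parameterization and that `get_weights()` returns `W_hat`, `M_hat`, `G` in the order of the variable list above. The helper names and the `eps` value are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nalu_forward(x, W_hat, M_hat, G, eps=1e-7):
    """NALU forward pass as described in the paper (assumed to match the layer)."""
    W = np.tanh(W_hat) * sigmoid(M_hat)      # effective weights, pushed towards {-1, 0, 1}
    a = x @ W                                # additive path
    m = np.exp(np.log(np.abs(x) + eps) @ W)  # multiplicative path in log space
    g = sigmoid(x @ G)                       # gate between the two paths
    return g * a + (1.0 - g) * m

# Learned weights copied from the output above.
W_hat = np.array([[9.012543]])
M_hat = np.array([[16.635576]])
G = np.array([[0.8285337]])

W_eff = np.tanh(W_hat) * sigmoid(M_hat)
x = np.array([[500000.0], [525000.0]])  # inputs in the extrapolation range
y = nalu_forward(x, W_hat, M_hat, G)
print(W_eff)
print(y)
```

Under these assumptions the effective weight is very close to 1 and the gate saturates for large positive inputs, so the unit behaves like an identity map through its additive path.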

## Addition

### NALU/paper

In [33]:
X, Y = gd.gd_paper(size=100000)

In [36]:
x = Input((2,))  # two features per sample: the operands to add
y = NALU(1)(x)  # a single NALU output unit
m = Model(x, y)
m.compile(k.optimizers.RMSprop(lr=0.01), "mse", metrics=[pred_close])
m.fit(X, Y, batch_size=20, epochs=50)

Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/50
Epoch 36/50
Epoch 37/50
Epoch 38/50
Epoch 39/50
Epoch 40/50
Epoch 41/50
Epoch 42/50
Epoch 43/50
Epoch 44/50
Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50


<keras.callbacks.History at 0x147df463128>

#### Interpolation

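The model may be overfitting to the training distribution: the fraction of predictions within 1e-4 of the target is 0.9998 on fresh `gd_paper` samples but drops to 0.9748 on `gd_uniform` samples. Should we train on uniformly distributed data instead?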
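
To make the distinction between evaluation ranges concrete, here is a hypothetical stand-in for a uniform data generator. It is not the implementation of `gd.gd_uniform` or `gd.gd_paper`, which this notebook does not show. The function name, the `seed` argument, and the 0 to 10 "training-like" range are assumptions chosen only for illustration; the 30000 to 35000 range is the one used in the extrapolation cell.

```python
import numpy as np

def make_uniform_addition(low, high, size, seed=0):
    """Hypothetical generator: two uniform operands per row, their sum as target."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(low, high, size=(size, 2))
    Y = X.sum(axis=1, keepdims=True)
    return X, Y

# Assumed small range for interpolation-style data.
X_near, Y_near = make_uniform_addition(0.0, 10.0, size=1000)
# Range far outside it, as in the extrapolation experiment.
X_far, Y_far = make_uniform_addition(30000.0, 35000.0, size=1000)

print(X_near.shape, Y_near.shape)
print(X_far.min(), X_far.max())
```

Evaluating the same trained model on both kinds of data separates how well it fits the training distribution from how well it has learned the operation itself.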

In [37]:
Xint, Yint = gd.gd_paper(size=50000)
m.evaluate(Xint, Yint)



[4.6140787588626605e-08, 0.99982]

In [38]:
Xint, Yint = gd.gd_uniform(size=50000)
m.evaluate(Xint, Yint)



[2.1089257575710006e-06, 0.97478]

#### Extrapolation

In [39]:
Xext, Yext = gd.gd_uniform(size=50000)
m.evaluate(Xext, Yext)



[2.13256411180609e-06, 0.97538]

In [41]:
Xext, Yext = gd.gd_uniform(30000, 35000, size=50000)
m.evaluate(Xext, Yext)



[7.498779296875e-06, 0.75252]
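
When reading the `pred_close` figure at these magnitudes, it may be worth checking how finely float32 can represent the targets, since the weights printed earlier are float32. The sketch below prints the gap between adjacent float32 values at a few magnitudes. The value 65000 assumes the targets are sums of two inputs drawn from 30000 to 35000, which is an assumption about `gd_uniform`; 500000 corresponds to the identity extrapolation range, and 10 is an arbitrary small reference.

```python
import numpy as np

tol = 1e-4  # tolerance used by pred_close

# Gap between a float32 value and the next representable float32.
gaps = {}
for value in (10.0, 65000.0, 500000.0):
    gaps[value] = float(np.spacing(np.float32(value)))
    print(value, gaps[value], gaps[value] < tol)
```

At the larger magnitudes the gap exceeds the 1e-4 tolerance, so a prediction can only count as close if it rounds to exactly the target value. Part of the drop in the metric may therefore reflect float32 rounding rather than a failure to learn addition.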