## Modulo Experiments

This notebook documents an attempt to get a neural network to learn a modulo function.

The general consensus is that a neural network can approximate any continuous function on a compact domain (Cybenko, 1989): http://cognitivemedium.com/magic_paper/assets/Cybenko.pdf
However, each neuron in a traditional neural network computes only a weighted sum of its inputs (input weight * input data value, added together) followed by a nonlinearity, so multiplying or dividing two quantities is difficult to approximate.

Computing a modulo involves both addition/subtraction and multiplication/division, hence it is a fundamentally challenging target.

However, can a neural network learn a concept, such as how to calculate a modulo function? Such concepts are typically computed with a procedural set of instructions, which is not exactly the domain of traditional neural networks.

The modulo function returns the remainder of an integer division. Some examples are:
- 5 % 2 = 1
- 5 % 3 = 2
- 6 % 2 = 0

One set of instructions for computing a % b is as follows:
1) Calculate c = a//b (the floor of a/b)
2) Calculate modulo = a - c*b
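
The two steps above can be sketched in plain Python. This is only an illustration of the procedure, not part of the notebook's model; the function name `modulo_by_steps` is made up for this example, and it assumes `b` is non-zero.

```python
def modulo_by_steps(a: int, b: int) -> int:
    """Compute a % b with the two-step procedure (assumes b != 0)."""
    c = a // b        # step 1: floor of a / b
    return a - c * b  # step 2: remove the largest multiple of b that fits in a


# Check the procedure against Python's built-in % operator
# on the examples listed earlier.
for a, b in [(5, 2), (5, 3), (6, 2)]:
    assert modulo_by_steps(a, b) == a % b
```

Note that step 2 multiplies two computed quantities (`c * b`), which is the kind of operation a weighted sum with fixed weights does not provide directly.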

Dependencies:
- tensorflow
- numpy

## Inputs/Outputs

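Train Inputs:
250,000 random integers from 0 (inclusive) to 2^20 (exclusive), each encoded as 21 binary digits

Test Inputs:
10,000 random integers from 2^20 (inclusive) to 2^21 (exclusive), encoded the same way

Outputs:
The input modulo 7 (the divisor), one-hot encoded over the 7 possible remainders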
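
To make the encoding concrete, here is a minimal sketch of how one integer becomes an input/output pair. It mirrors the notebook's `int2bits` helper in pure Python; the constant names and the `one_hot_remainder` helper are introduced here for illustration only.

```python
DIVISOR = 7  # same divisor as in the notebook
N_BITS = 21  # input width (the default `fill` of int2bits)


def int2bits(i, fill=N_BITS):
    """List of bits, most significant first, zero-padded to `fill` entries."""
    return [int(ch) for ch in bin(i)[2:].zfill(fill)]


def one_hot_remainder(i, divisor=DIVISOR):
    """Encode i % divisor as a list of length `divisor` with a single 1."""
    return int2bits(2 ** (i % divisor), divisor)


x = int2bits(10)           # 21 bits, ending in 1, 0, 1, 0
y = one_hot_remainder(10)  # 10 % 7 == 3
```

With this encoding the single 1 for remainder `r` sits at index `DIVISOR - 1 - r`, so the class order is reversed relative to the remainder. This does not affect training, but it matters when decoding predictions.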

## Base Model
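The base model in this notebook is a fully connected network with four hidden layers of 1000 ReLU units each and skip connections between all layers. The skip connections are implemented by concatenation rather than by the addition used in the original ResNet, so each hidden layer also sees the raw input bits.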
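
Because the skip connections in the code concatenate tensors, the input width of each hidden layer grows with depth. The sketch below only does the bookkeeping, under the assumption that the architecture is exactly the one defined in the model cell (21 input bits, four hidden `Dense(1000)` layers, one `Dense(7)` output); the variable names are made up for this example.

```python
INPUT_BITS = 21      # width of the binary input
HIDDEN_UNITS = 1000  # units in each hidden Dense layer
N_CLASSES = 7        # one output per possible remainder


def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


# Width of the tensor fed into each of the four hidden layers.
# After every hidden layer, its 1000 outputs are concatenated with
# whatever that layer received as input.
layer_inputs = []
carried = INPUT_BITS
for _ in range(4):
    layer_inputs.append(carried)
    carried = HIDDEN_UNITS + carried

total_params = sum(dense_params(w, HIDDEN_UNITS) for w in layer_inputs)
total_params += dense_params(HIDDEN_UNITS, N_CLASSES)  # softmax output layer

print(layer_inputs)  # [21, 1021, 2021, 3021]
print(total_params)  # 6095007
```

Roughly six million parameters for 250,000 training examples leaves plenty of capacity to memorize the training set.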

## Results
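We are able to get a train accuracy of up to 100%, but the test accuracy is 0%!
This is strong overfitting to the training range.
Because the test integers (2^20 to 2^21) lie entirely outside the training range (0 to 2^20), the test measures extrapolation, and the result means that the model is unable to generalize to magnitudes it has not seen.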
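
One possible reading of the 0% figure, offered as an illustration rather than a diagnosis of the trained network: in the 21-bit encoding, the most significant bit is always 0 for training inputs and always 1 for test inputs. A hypothetical predictor that computes the modulo perfectly on the low 20 bits but ignores the top bit would be wrong on every test input, because 2^20 leaves remainder 4 when divided by 7. The sketch below checks this with pure Python; `ignores_top_bit` is an invented stand-in, not the network.

```python
import random

DIVISOR = 7
N_BITS = 21


def int2bits(i, fill=N_BITS):
    """List of bits, most significant first, zero-padded to `fill` entries."""
    return [int(ch) for ch in bin(i)[2:].zfill(fill)]


rng = random.Random(0)
train = [rng.randrange(0, 2**20) for _ in range(1000)]
test = [rng.randrange(2**20, 2**21) for _ in range(1000)]

# Values taken by the most significant input bit in each split.
train_top_bits = {int2bits(i)[0] for i in train}  # {0}
test_top_bits = {int2bits(i)[0] for i in test}    # {1}


def ignores_top_bit(i):
    """Hypothetical predictor: exact modulo of the low 20 bits only."""
    return (i % 2**20) % DIVISOR


shift = 2**20 % DIVISOR  # 4: how much the top bit changes the remainder
hits = sum(ignores_top_bit(i) == i % DIVISOR for i in test)  # 0
```

Under this assumption, every test prediction is off by 4 (mod 7), which would be consistent with a test accuracy of exactly 0% rather than the roughly 14% expected from random guessing over 7 classes.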

In [10]:
import numpy as np
import tensorflow as tf

# hyperparameters here
divisor = 7

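# Convert a non-negative integer into a list of bits (most significant first),
# zero-padded on the left to `fill` digits
def int2bits(i, fill=21):
    return list(map(int, bin(i)[2:].zfill(fill)))

# Inverse of int2bits: convert a list of bits (most significant first) back to an integer
def bits2int(b):
    return sum(i * 2**n for n, i in enumerate(reversed(b)))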

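# Training data: 250,000 random integers in [0, 2**20), encoded as 21-bit inputs.
I = np.random.randint(0, 2**20, size=(250000,))
X = np.array(list(map(int2bits, I)))
# Targets: the remainder r = I % divisor, one-hot encoded as the bits of 2**r
Y = np.array([int2bits(2**i, divisor) for i in I % divisor])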

# Test data: 10,000 random integers in [2**20, 2**21), outside the training range.
It = np.random.randint(2**20, 2**21, size=(10000,))
Xt = np.array(list(map(int2bits, It)))
Yt = np.array([int2bits(2**i, divisor) for i in It % divisor])

In [11]:
# Model.
from tensorflow.keras.layers import Dense, Input, Concatenate
from tensorflow.keras import Model


### Change the model architecture here
########################################################
inputs = Input(shape=(21,))  # shape must be a tuple; (21) is just the integer 21
x = Dense(1000, 'relu')(inputs)

# Skip connection by concatenation: the next layer sees the hidden output and the raw inputs
layer1 = Concatenate()([x, inputs])
x = Dense(1000, 'relu')(layer1)

# Second skip connection: concatenate with layer1, which still carries the raw inputs
layer2 = Concatenate()([x, layer1])
x = Dense(1000, 'relu')(layer2)

# Third skip connection: concatenate with layer2
layer3 = Concatenate()([x, layer2])
x = Dense(1000, 'relu')(layer3)
outputs = Dense(divisor, 'softmax')(x)

model = Model(inputs=inputs, outputs=outputs)

########################################################

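# Keyword arguments avoid depending on the positional order of compile(), which differs between Keras versions
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])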

# Train (report the final score at the 20th epoch)
model.fit(X, Y, batch_size=10_000, epochs=20, validation_data=(Xt, Yt))

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<tensorflow.python.keras.callbacks.History at 0x63cd96650>