# Add 2 binary numbers using RNNs

In this Jupyter notebook we focus on the $\textit{Recurrent Neural Network}$ (RNN), a neural network that makes use of sequential information. It uses the order of its inputs to build a logical connection between them, which is very useful in tasks such as natural language processing, speech recognition and video activity recognition.

We will use an RNN to teach a model how to add 2 binary numbers. It is a very simple learning problem, because the main goal of this notebook is to become familiar with using RNNs in $\textit{TensorFlow 2.0}$.



<img src="images/binary_summation.gif" style="width:30% ;height:30%;">
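
For reference, the operation the network has to learn is ordinary carry-based addition. The snippet below is a minimal sketch of that rule in plain NumPy and is not part of the model. The helper name `add_bits` is an illustrative assumption, and the bit order (most significant bit first) follows what `np.unpackbits` produces.

```python
import numpy as np

def add_bits(a_bits, b_bits):
    """Add two equal-length, MSB-first bit arrays; a carry out of the top bit is dropped."""
    result = np.zeros_like(a_bits)
    carry = 0
    # Walk from the least significant bit (last position) to the most significant one
    for k in range(len(a_bits) - 1, -1, -1):
        total = int(a_bits[k]) + int(b_bits[k]) + carry
        result[k] = total % 2
        carry = total // 2
    return result

a = np.unpackbits(np.array([1], dtype=np.uint8))   # 00000001
b = np.unpackbits(np.array([23], dtype=np.uint8))  # 00010111
print(add_bits(a, b))  # bits of 24: 00011000
```

The carry is the only piece of state passed from one bit position to the next, which is why the task is small enough to serve as a first RNN exercise.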


### To install TF 2.0 on Google Colab, uncomment and run the code line below

In [1]:
# !pip install -q tensorflow==2.0.0-alpha0

# Imports

In [2]:
from tensorflow.keras import datasets, layers, models
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense, SimpleRNN, RepeatVector, TimeDistributed
import numpy as np
import tensorflow as tf
from tqdm import tqdm_notebook

# Set training parameters

In [3]:
EPOCHS = 300
BATCH_SIZE = 1024
BUFFER_SIZE = 128
LEARNING_RATE = 1e-3

# Prepare training data

## Generate data

In [4]:
def get_data():
    """
    :return: 2 np.arrays x and y. x of size (N, 2, 8) and y of size (N, 8, 1). x[n]
            consists of 2 binary numbers (8 bits each, most significant bit first) and
            y[n] is the 8-bit result of adding those 2 numbers
    """
    x = []
    y = []

    # Each operand runs over 0..126, so the largest sum (252) still fits in 8 bits
    for i in range(0, 127):
        x_1 = np.unpackbits(np.uint8(i))
        for j in range(0, 127):
            x_2 = np.unpackbits(np.uint8(j))
            x.append((x_1, x_2))
            y.append(np.unpackbits(np.uint8(i + j)))

    x = np.array(x)
    y = np.array(y)

    y = np.expand_dims(y, axis=2)

    return x, y


In [5]:
x, y = get_data()

x = tf.cast(x, tf.float32)
y = tf.cast(y, tf.float32)

print(f"X shape: {x.shape}, we have {x.shape[0]} examples, {x.shape[1]} numbers of {x.shape[2]} shape each")
print(f"y shape {y.shape}")


X shape: (16129, 2, 8), we have 16129 examples, 2 numbers of 8 shape each
y shape (16129, 8, 1)


## Example data

In [15]:
print(f"0 b:{x[1,0]} + 1 b:{x[1,1]} = 1 b: {y[1, :, 0]}")

print(f"0 b:{x[10,0]} + 10 b:{x[10,1]} = 10 b: {y[10, :, 0]}")

print(f"0 b:{x[90,0]} + 90 b:{x[90,1]} = 90 b: {y[90, :, 0]}")

print(f"1 b:{x[150,0]} + 23 b:{x[150,1]} = 24 b: {y[150, :, 0]}")

0 b:[0. 0. 0. 0. 0. 0. 0. 0.] + 1 b:[0. 0. 0. 0. 0. 0. 0. 1.] = 1 b: [0. 0. 0. 0. 0. 0. 0. 1.]
0 b:[0. 0. 0. 0. 0. 0. 0. 0.] + 10 b:[0. 0. 0. 0. 1. 0. 1. 0.] = 10 b: [0. 0. 0. 0. 1. 0. 1. 0.]
0 b:[0. 0. 0. 0. 0. 0. 0. 0.] + 90 b:[0. 1. 0. 1. 1. 0. 1. 0.] = 90 b: [0. 1. 0. 1. 1. 0. 1. 0.]
1 b:[0. 0. 0. 0. 0. 0. 0. 1.] + 23 b:[0. 0. 0. 1. 0. 1. 1. 1.] = 24 b: [0. 0. 0. 1. 1. 0. 0. 0.]
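
When checking examples like these by eye, it can be easier to convert a bit vector back to an integer. A minimal sketch, assuming 8-bit, MSB-first vectors as produced by `np.unpackbits`; the helper name `bits_to_int` is hypothetical and not part of the notebook:

```python
import numpy as np

def bits_to_int(bits):
    """Convert an 8-element, MSB-first bit vector (ints or floats) to a Python int."""
    bits = np.asarray(bits).round().astype(np.uint8)
    # np.packbits packs 8 bits into one uint8, most significant bit first
    return int(np.packbits(bits)[0])

print(bits_to_int([0., 0., 0., 1., 0., 1., 1., 1.]))  # 23
print(bits_to_int([0., 0., 0., 1., 1., 0., 0., 0.]))  # 24
```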


## Dataset API 

In [7]:
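# Note: shuffle() is applied after batch(), so whole batches (not individual examples) are reordered
dataset = tf.data.Dataset.from_tensor_slices((x, y)).batch(BATCH_SIZE).shuffle(buffer_size=BUFFER_SIZE)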
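
The order of `batch` and `shuffle` changes what gets shuffled. The toy example below is only a sketch on a six-element dataset (the variable names are illustrative) and contrasts the two orderings:

```python
import tensorflow as tf

ds = tf.data.Dataset.range(6)

# batch() first: shuffle() then permutes whole batches, so 0 and 1 always stay together
batch_then_shuffle = [b.numpy().tolist() for b in ds.batch(2).shuffle(buffer_size=3)]

# shuffle() first: individual elements are mixed before being grouped into batches
shuffle_then_batch = [b.numpy().tolist() for b in ds.shuffle(buffer_size=6).batch(2)]

print(batch_then_shuffle)
print(shuffle_then_batch)
```

With the first ordering, each batch always contains the same examples and only the batch order varies between epochs. With the second, the composition of each batch changes as well, provided the buffer is large enough.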

# Prepare model

In [8]:
def get_model():
    """
    :return: keras.models.Sequential mapping an input of shape (2, 8) to an output of shape (8, 1)
    """
    model = Sequential()
    model.add(Input(shape=(2,8)))
    model.add(SimpleRNN(128, activation='relu'))
    model.add(RepeatVector(8))
    model.add(SimpleRNN(64, activation='relu', return_sequences=True))
    model.add(TimeDistributed(Dense(1)))
    
    return model
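
To make the shapes concrete, here is a rough NumPy sketch of what the first two layers compute. A simple RNN layer updates its state as $h_t = \mathrm{relu}(x_t W + h_{t-1} U + b)$ and, without `return_sequences`, returns only the last state. The random weights and the function name `simple_rnn_forward` are assumptions for illustration and have nothing to do with the trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

def simple_rnn_forward(x_seq, W, U, b):
    """Run a ReLU simple-RNN over x_seq (time_steps, features) and return the last state."""
    h = np.zeros(U.shape[0])
    for x_t in x_seq:
        h = np.maximum(0.0, x_t @ W + h @ U + b)
    return h

# One example: 2 time steps (the two numbers), 8 features each (their bits)
x_seq = rng.integers(0, 2, size=(2, 8)).astype(np.float32)
W = rng.normal(scale=0.1, size=(8, 128))    # input-to-hidden weights
U = rng.normal(scale=0.1, size=(128, 128))  # hidden-to-hidden weights
b = np.zeros(128)

h_last = simple_rnn_forward(x_seq, W, U, b)
repeated = np.tile(h_last, (8, 1))  # what RepeatVector(8) does for a single example

print(h_last.shape, repeated.shape)  # (128,) (8, 128)
```

Note that in this architecture the recurrent axis of the first layer runs over the two operands, not over bit positions. The second RNN then unrolls over the 8 output positions.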

## Model summary

In [9]:
model = get_model()
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
simple_rnn (SimpleRNN)       (None, 128)               17536     
_________________________________________________________________
repeat_vector (RepeatVector) (None, 8, 128)            0         
_________________________________________________________________
simple_rnn_1 (SimpleRNN)     (None, 8, 64)             12352     
_________________________________________________________________
time_distributed (TimeDistri (None, 8, 1)              65        
Total params: 29,953
Trainable params: 29,953
Non-trainable params: 0
_________________________________________________________________
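
The parameter counts in the summary can be checked by hand: a simple RNN layer has an input kernel, a recurrent kernel and a bias, while a dense layer has a kernel and a bias. A short sketch, with illustrative helper names:

```python
def simple_rnn_params(input_dim, units):
    # input kernel (input_dim x units) + recurrent kernel (units x units) + bias (units)
    return units * (input_dim + units + 1)

def dense_params(input_dim, units):
    # kernel (input_dim x units) + bias (units)
    return units * (input_dim + 1)

counts = [
    simple_rnn_params(8, 128),   # simple_rnn
    simple_rnn_params(128, 64),  # simple_rnn_1
    dense_params(64, 1),         # time_distributed (the Dense layer is shared across time steps)
]
print(counts, sum(counts))  # [17536, 12352, 65] 29953
```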


# Training

## Define loss function and optimizer

In [10]:
# `learning_rate` is the current argument name; `lr` is a deprecated alias
optimizer = tf.keras.optimizers.Adam(learning_rate=LEARNING_RATE)
loss = tf.keras.losses.MeanSquaredError()

## Define accuracy measurements

In [11]:
def get_binary_number_accuracy(y_true, y_pred):
    """
    Function returns the fraction of the N examples for which every digit is predicted correctly,
    where N is the number of examples and d is the number of digits in a single number
    :param y_true: np.array of size (N, d) or (N, d, 1)
    :param y_pred: np.array of the same size as y_true
    :return: scalar, accuracy
    """
    accuracy = np.mean(np.prod(y_pred == y_true, axis=1))

    return accuracy
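
This metric counts an example as correct only if all of its digits match. The sketch below shows the same idea written with `np.all` on a made-up batch of two 3-digit numbers; the function name `exact_match_accuracy` is illustrative, not part of the notebook.

```python
import numpy as np

def exact_match_accuracy(y_true, y_pred):
    """Fraction of examples whose digits are all predicted correctly."""
    n = y_true.shape[0]
    # Flatten everything except the example axis, then require every digit to match
    matches = np.all(y_pred.reshape(n, -1) == y_true.reshape(n, -1), axis=1)
    return matches.mean()

y_true = np.array([[0, 1, 1], [1, 0, 0]], dtype=np.float32)[..., None]  # shape (2, 3, 1)
y_pred = y_true.copy()
y_pred[1, 2, 0] = 1.0  # a single wrong digit in the second example

print(exact_match_accuracy(y_true, y_pred))  # 0.5
```

One wrong digit out of six makes the whole second number count as wrong, so this measure is stricter than per-bit accuracy.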


## Define training function

In [12]:
def train(model, dataset):
    """
    Function optimizes model parameters, model training is based on data from dataset
    :param model: tensorflow model
    :param dataset: tensorflow Dataset API
    :return: None
    """
    def train_step(inputs, labels):
        with tf.GradientTape() as tape:
            # Forward pass; the outputs are real-valued bit estimates, not logits
            predictions = model(inputs, training=True)
            loss_value = loss(labels, predictions)

        grads = tape.gradient(loss_value, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))

    for epoch in tqdm_notebook(range(EPOCHS)):
        for (batch, (data, labels)) in enumerate(dataset):
            train_step(data, labels)


## Launch training

In [13]:
train(model, dataset)





# Training accuracy

In [18]:
predictions = model.predict(x)
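y_pred = np.round(predictions)  # round the real-valued outputs to whole numbers
y_pred = np.abs(y_pred)  # turn -0. (from rounding small negative outputs) into 0.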
y_true = y.numpy()

print(f"Binary Accuracy: {get_binary_number_accuracy(y_pred=y_pred, y_true=y_true)}")

Binary Accuracy: 1.0
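
The post-processing step above can be illustrated on a few made-up output values. This is only a sketch; the numbers are invented and the `clipped` variant is an alternative that the notebook does not use.

```python
import numpy as np

raw = np.array([-0.2, 0.1, 0.9, 1.3], dtype=np.float32)  # invented network outputs

rounded = np.round(raw)  # the first entry becomes negative zero
bits = np.abs(rounded)   # np.abs removes the sign of that zero

# Alternative: clip to [0, 1] so that outputs such as -0.6 or 1.6 also map to valid bits
clipped = np.clip(np.round(raw), 0.0, 1.0)

print(rounded)
print(bits)
print(clipped)
```

Note that `np.abs` would turn an output that rounds to -1 into 1, so it is only a safe shortcut when the outputs stay close to the 0 to 1 range, as they do for the trained model here.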


# Example results

In [19]:
print(f"0 b:{x[1,0]} + 1 b:{x[1,1]} = 1 b: {y_pred[1, :, 0]}")

print(f"0 b:{x[10,0]} + 10 b:{x[10,1]} = 10 b: {y_pred[10, :, 0]}")

print(f"0 b:{x[90,0]} + 90 b:{x[90,1]} = 90 b: {y_pred[90, :, 0]}")

print(f"1 b:{x[150,0]} + 23 b:{x[150,1]} = 24 b: {y_pred[150, :, 0]}")

0 b:[0. 0. 0. 0. 0. 0. 0. 0.] + 1 b:[0. 0. 0. 0. 0. 0. 0. 1.] = 1 b: [0. 0. 0. 0. 0. 0. 0. 1.]
0 b:[0. 0. 0. 0. 0. 0. 0. 0.] + 10 b:[0. 0. 0. 0. 1. 0. 1. 0.] = 10 b: [0. 0. 0. 0. 1. 0. 1. 0.]
0 b:[0. 0. 0. 0. 0. 0. 0. 0.] + 90 b:[0. 1. 0. 1. 1. 0. 1. 0.] = 90 b: [0. 1. 0. 1. 1. 0. 1. 0.]
1 b:[0. 0. 0. 0. 0. 0. 0. 1.] + 23 b:[0. 0. 0. 1. 0. 1. 1. 1.] = 24 b: [0. 0. 0. 1. 1. 0. 0. 0.]


# Summary

As we can see, our model reproduces every training example perfectly. This is because it is a relatively simple learning problem, and because the accuracy is measured on the same examples the model was trained on. It is possible to make the problem more complicated, e.g. by adding different operations like "minus" or "multiplication", or by using numbers in a different form, e.g. in the hexadecimal system.