# The Parity Problem
The problem is simple to state, and a direct solution takes only a few lines of code.<br />
You have a binary list, for example ```[1,1,0,1,0,1,0,1,0]```, and you need to check whether the number of ones is even or odd.<br />
Later in this notebook we will train a Recurrent Neural Network (RNN) to solve it, but first here is the direct solution.<br />

In [None]:
def iseven(lst):
    return sum(lst) % 2 == 0

def isodd(lst):
    return not iseven(lst)

print(f"is 101010 even: {iseven([1,0,1,0,1,0])}")
print(f"is 101010  odd: { isodd([1,0,1,0,1,0])}")
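
The same check can also be written as a running computation that reads one bit at a time and keeps a single bit of state. This is only a sketch to show why a recurrent model is a natural fit for the problem; the names `running_parity` and `iseven_streaming` are hypothetical and are not used elsewhere in this notebook.

```python
def running_parity(lst):
    """Return 1 if the number of ones seen is odd, 0 if it is even."""
    state = 0                # no ones seen yet, so the count is even
    for bit in lst:
        state ^= bit         # XOR flips the state whenever the bit is 1
    return state

def iseven_streaming(lst):
    return running_parity(lst) == 0

print(iseven_streaming([1, 0, 1, 0, 1, 0]))  # three ones -> False
```

The loop never looks back at earlier bits: everything it needs is in `state`. A recurrent network has to learn an equivalent update for its hidden state.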

Now we need to build a dataset of these problems together with their solutions.

In [None]:
dataset_size = 10000
max_length = 10

import random
def gen_problem():
    size = random.randint(1,max_length)
    return [random.randint(0,1) for _ in range(size)] + [0] * ( max_length - size )

def gen_dataset():
    return [gen_problem() for _ in range(dataset_size)]

dataset = gen_dataset()
solutions = [1 if iseven(problem) else 0 for problem in dataset]
for i in range(5):
    print(f"problem {i}: {dataset[i]}, sol: {solutions[i]}.")

Now we have our own dataset. <br />
We need to split it into training, validation, and test slices. <br />
- The training slice will be used to train our model.
- The validation slice will be used to check how our model is doing during training.
- The test slice will be used to check how our model is doing after training.

It is important that all three slices are large enough: <br />
the training slice needs enough examples to learn from, and the validation and test slices need enough examples to give reliable estimates and reveal overfitting.
A common ratio is:
- 70% for training.
- 15% for validating.
- 15% for testing.

In [None]:
trs = int(dataset_size * 0.70)
vls = int(dataset_size * 0.15)
tss = int(dataset_size * 0.15)
training_set   = (dataset[:trs], solutions[:trs])
validation_set = (dataset[trs:trs+vls], solutions[trs:trs+vls])
test_set       = (dataset[trs+vls:trs+vls+tss], solutions[trs+vls:trs+vls+tss])
print(f"training size: {len(training_set[0])}, validation size: {len(validation_set[0])}, test size: {len(test_set[0])}")

### Tensors
TensorFlow processes data using its own data type, the tensor.<br />
Tensors are multi-dimensional arrays that must be homogeneous: all entries along a dimension have the same length.<br />
For example: 
- Tensor([1,1,1,1]) is a tensor of dimension 1 and shape (4,).
- Tensor([[1,1], [1,1], [1,1]]) is a tensor of dimension 2 and shape (3, 2).
- Tensor([[1,1,1], [1,1], [1,1]]) is not a valid tensor, as its rows have different lengths.
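
To make the homogeneity rule concrete, here is a small pure-Python sketch that computes the shape of a nested list and rejects ragged nesting. The helper `shape_of` is a hypothetical name introduced for illustration; it is not part of TensorFlow and only mimics the rule described above.

```python
def shape_of(nested):
    """Return the shape of a nested list, or raise ValueError if it is ragged."""
    if not isinstance(nested, list):
        return ()  # a scalar has an empty shape
    child_shapes = [shape_of(item) for item in nested]
    # every child must have the same shape, otherwise the nesting is ragged
    if any(s != child_shapes[0] for s in child_shapes):
        raise ValueError(f"ragged nesting: {child_shapes}")
    inner = child_shapes[0] if child_shapes else ()
    return (len(nested),) + inner

print(shape_of([1, 1, 1, 1]))              # (4,)
print(shape_of([[1, 1], [1, 1], [1, 1]]))  # (3, 2)
try:
    shape_of([[1, 1, 1], [1, 1], [1, 1]])
except ValueError as err:
    print("not a valid tensor:", err)
```

This is also why `gen_problem` pads every list with zeros up to `max_length`: all problems then have the same length and the dataset forms a valid 2-dimensional tensor.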

Let's convert our datasets into tensors.

In [None]:
import tensorflow as tf
training_set   = (tf.convert_to_tensor(  training_set[0], dtype=float), tf.convert_to_tensor(  training_set[1], dtype=float))
validation_set = (tf.convert_to_tensor(validation_set[0], dtype=float), tf.convert_to_tensor(validation_set[1], dtype=float))
test_set       = (tf.convert_to_tensor(      test_set[0], dtype=float), tf.convert_to_tensor(      test_set[1], dtype=float))

### Network architecture
Next, we need to define a neural network architecture. <br />
We will use an RNN, implemented from scratch out of dense layers. <br />
Note, however, that TensorFlow already ships implementations of many popular architectures, including recurrent layers such as `tf.keras.layers.SimpleRNN`.

<img src="https://static.wixstatic.com/media/3eee0b_969c1d3e8d7943f0bd693d6151199f69~mv2.gif"  width="600" height="300">
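
Before training anything, it may help to see that a recurrent cell with ReLU activations can represent parity exactly. The sketch below uses hand-picked weights, not weights learned by the model in this notebook, and the names `relu` and `rnn_parity` are hypothetical. Note that it returns 1 for an odd count, which is the opposite of the `solutions` labels above (1 for even).

```python
def relu(v):
    return max(0.0, v)

def rnn_parity(bits):
    """Hand-weighted recurrent cell: h = relu(h + x) - 2 * relu(h + x - 1)."""
    h = 0.0  # initial hidden state: no ones seen yet
    for x in bits:
        a = relu(h + x)        # hidden unit 1
        b = relu(h + x - 1.0)  # hidden unit 2, active only when h = x = 1
        h = a - 2.0 * b        # equals XOR(h, x) when h and x are 0 or 1
    return int(h)              # 1 if the number of ones is odd

print(rnn_parity([1, 1, 0, 1, 0, 1, 0, 1, 0]))  # five ones -> 1
```

The same two weights are reused at every step, which is the defining property of a recurrent network. Whether gradient descent actually finds such a solution is a separate question from whether one exists.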

In [None]:
class Model(tf.keras.Model):
    def __init__(self, max_length, batch_size, size=32):
        super().__init__()
        
        self.max_length = max_length
        self.batch_size = batch_size
        
        self.tob = tf.keras.layers.Dense(size, activation="relu")
        self.x2h = tf.keras.layers.Dense(size, activation="relu")
        self.h2y = tf.keras.layers.Dense(size, activation="relu")
        self.h2h = tf.keras.layers.Dense(size, activation="relu")
        self.y2s = tf.keras.layers.Dense(1)
        self.h0  = tf.zeros((1, size))
        
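    def call(self, batch):
        # project the (batch, max_length) input to (batch, size);
        # the loop reads the first max_length columns, so size must be >= max_length
        b = self.tob(batch)
        
        h = self.h0
        for i in range(self.max_length):
            # column i of the projection, as a (batch, 1) input for this step
            x = tf.expand_dims(b[:,i],-1)
            # add the step input to the hidden state, then compute output and next state
            h = h + self.x2h(x)
            y = self.h2y(h)
            h = self.h2h(h)
        
        # only the output of the last step is used for the prediction
        s = self.y2s(y)
        
        # flatten (batch, 1) to (batch,) and squash each score into (0, 1)
        return tf.math.sigmoid(tf.reshape(s, [-1]))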
    
batch_size = 50
model = Model(max_length, batch_size, size=64)

### Training loop
Now that we have a model and the data, we can finally start training. <br />
We feed the model small batches of problems, so each optimizer step processes `batch_size` problems at once.
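
Batching itself is independent of TensorFlow. As a minimal sketch, the hypothetical helper `iter_batches` below cuts plain Python lists into consecutive slices. Unlike a reshape into a fixed number of equal batches, slicing does not require the number of problems to be a multiple of the batch size; the last batch is simply smaller.

```python
def iter_batches(problems, solutions, batch_size):
    """Yield consecutive (problems, solutions) slices of at most batch_size items."""
    for start in range(0, len(problems), batch_size):
        end = start + batch_size
        yield problems[start:end], solutions[start:end]

# toy data: 7 problems of length 3, labelled 1 when the number of ones is even
problems = [[i % 2, 1, 0] for i in range(7)]
solutions = [1 if sum(p) % 2 == 0 else 0 for p in problems]

sizes = [len(x) for x, _ in iter_batches(problems, solutions, batch_size=3)]
print(sizes)  # [3, 3, 1]
```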

In [None]:
epochs = 10

def MSE(preds, truths):
    return tf.math.reduce_mean((preds - truths) ** 2)

def acc(preds, truths):
    return tf.math.reduce_mean(tf.cast(tf.math.round(preds) == truths, dtype=float))

optimizer = tf.keras.optimizers.Adam(learning_rate=0.01)

# split the training set in batches
training_batches = (tf.reshape(training_set[0],[trs//batch_size, batch_size, max_length]), tf.reshape(training_set[1],[trs//batch_size, batch_size]))
      

for e in range(epochs):
    
    # Train 
    for i, (batchX, batchY) in enumerate(zip(*training_batches)):

        # register computational graph
        with tf.GradientTape() as tape:
            preds = model(batchX)
            loss  = MSE(preds, batchY)

        # compute gradients from graph wrt. model variables
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
        
        # log results
        print(f"\r epoch: {e}/{epochs}, batch: {i}/{trs//batch_size}, loss: {tf.get_static_value(loss):.5f}, acc: {tf.get_static_value(acc(preds, batchY)):.5f}", end="")



## Exercises
- (⭐) Validate the model on the validation slice at the end of every epoch.
- (⭐) Test the model on the test slice after the last epoch.
- (⭐⭐⭐) Rewrite the model so that it works with max_length >= 50.