<a href="https://colab.research.google.com/github/ianomunga/XOR-LSTM-Problem/blob/main/XOR_LSTM.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Solving the XOR Logic Gate Output Problem using an LSTM Recurrent Neural Network
XOR stands for 'Exclusive-Or', a logical operator that evaluates to 'True' when exactly one of its two inputs is true, but not both. That mutual exclusivity is what the 'exclusive' in its name refers to.
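
As a quick illustration of that definition, the truth table can be enumerated directly. This is a minimal sketch; the helper name `xor` is introduced here for illustration and is not used elsewhere in the notebook.

```python
def xor(a: int, b: int) -> int:
    """Return 1 when exactly one of the two bits is 1, otherwise 0."""
    return a ^ b

# Enumerate all four input pairs together with their XOR output.
truth_table = [(a, b, xor(a, b)) for a in (0, 1) for b in (0, 1)]
for a, b, out in truth_table:
    print(a, b, "->", out)
```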

This relationship is hard to represent linearly, so a Logistic Regression model cannot learn it: the meaning of the output comes from the relationship between the two inputs rather than from either input on its own, and no single straight line separates the 'True' cases from the 'False' ones.
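
One way to see the limitation is to fit the best possible linear function to the four XOR points. The sketch below uses an ordinary least-squares fit with NumPy as a simple stand-in for a linear model (it is not Logistic Regression itself); the variable names are illustrative. The fitted outputs should all come out at roughly 0.5, meaning the linear fit cannot tell the 'True' rows from the 'False' ones.

```python
import numpy as np

# The four XOR input pairs and their outputs.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)

# Append a constant column so the fit includes a bias term.
A = np.hstack([X, np.ones((4, 1))])

# Least-squares solution for weights and bias.
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
linear_preds = A @ coeffs
print(linear_preds)
```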

A non-linear model can capture this relationship, and that's where the LSTM comes in. Its 'Long Short-Term Memory' lets the running result of the XOR evaluations be carried forward recurrently through the sequence.
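
The "carried forward" idea can be sketched without any neural network: the parity after each bit is the XOR of the previous parity with the new bit, which is the same as the cumulative sum of the bits modulo 2. The short example below uses a made-up five-bit sequence to show both views agree.

```python
from itertools import accumulate
from operator import xor

bits = [1, 1, 0, 1, 0]  # illustrative sequence, not taken from the dataset

# Carry a single state forward: new_parity = old_parity XOR bit.
running_parity = list(accumulate(bits, xor))

# Equivalent view: cumulative sum of the bits, reduced modulo 2.
cumsum_parity = [total % 2 for total in accumulate(bits)]

print(running_parity)
print(cumsum_parity)
```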

This is what the code below will implement.


In [8]:
#get all your dependencies in check
from tensorflow.keras import optimizers
from tensorflow.keras.layers import Dense, Input, LSTM
from tensorflow.keras.models import Sequential
import numpy as np
import random

In [9]:
#encapsulate some key variables, i.e

#the sequence_length
SEQ_LEN = 50

#the number of sequences to generate
COUNT = 100000

In [10]:
#encode each bit x as the pair [x, 1 - x], then build COUNT random input sequences;
#the target at each position is the running parity (cumulative sum mod 2) of the bits so far
bin_pair = lambda x: [x, 1 - x]
training = np.array([[bin_pair(random.choice([0, 1])) for _ in range(SEQ_LEN)] for _ in range(COUNT)])
target = np.array([[bin_pair(x) for x in np.cumsum(example[:,0]) % 2] for example in training])
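
The same dataset layout can also be built without Python-level loops. The following is a hedged, vectorized sketch rather than the notebook's own method: it uses NumPy's `default_rng` instead of `random.choice`, a smaller `COUNT` so it runs quickly, and the names `bits`, `parity`, `training_vec`, and `target_vec` are introduced here for illustration.

```python
import numpy as np

SEQ_LEN = 50
COUNT = 1000  # assumption: reduced from the notebook's 100000 to keep the sketch fast

rng = np.random.default_rng(0)

# Random bits, one row per sequence.
bits = rng.integers(0, 2, size=(COUNT, SEQ_LEN))

# Running parity at every position: cumulative sum along the sequence, modulo 2.
parity = np.cumsum(bits, axis=1) % 2

# Encode each value x as the pair [x, 1 - x] along a new last axis.
training_vec = np.stack([bits, 1 - bits], axis=-1)
target_vec = np.stack([parity, 1 - parity], axis=-1)

print(training_vec.shape, target_vec.shape)
```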

In [11]:
#check that the input and target shapes match before we go ahead
print('shape check:', training.shape, '=', target.shape)

shape check: (100000, 50, 2) = (100000, 50, 2)


In [12]:
model = Sequential()
#pass in the sequence-length so that every possible example's dimension is accounted for 
model.add(Input(shape=(SEQ_LEN, 2), dtype='float32'))
#a single LSTM unit carries the running parity forward; return_sequences=True emits an output at every time step
model.add(LSTM(1, return_sequences=True))
#two softmax outputs, one for each possible parity value
model.add(Dense(2, activation='softmax'))

In [13]:
#now fit the model to the data and run the epochs
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(training, target, epochs=10, batch_size=128)
model.summary()

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 lstm_1 (LSTM)               (None, 50, 1)             16        
                                                                 
 dense_1 (Dense)             (None, 50, 2)             4         
                                                                 
Total params: 20
Trainable params: 20
Non-trainable params: 0
_________________________________________________________________
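
The parameter counts in the summary can be checked by hand. A standard LSTM layer has four gates, each with input weights, recurrent weights, and a bias, and a Dense layer has one weight per input-output pair plus one bias per output. The helper names below are hypothetical and exist only for this arithmetic check.

```python
def lstm_param_count(input_dim: int, units: int) -> int:
    """Four gates, each with (input_dim + units) weights per unit plus a bias."""
    return 4 * (units * (input_dim + units) + units)

def dense_param_count(input_dim: int, units: int) -> int:
    """One weight per input-output pair plus one bias per output."""
    return input_dim * units + units

lstm_params = lstm_param_count(input_dim=2, units=1)    # 4 * (1 * 3 + 1)
dense_params = dense_param_count(input_dim=1, units=2)  # 1 * 2 + 2
print(lstm_params, dense_params, lstm_params + dense_params)
```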


In [15]:
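#predict on the training sequences and inspect one at random
predictions = model.predict(training)
i = random.randrange(COUNT)
#probability that the parity after the final bit is 1
chance = predictions[i,-1,0]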
print('randomly selected sequence:', training[i,:,0])
print('prediction:', int(chance > 0.5))
print('confidence: {:0.2f}%'.format((chance if chance > 0.5 else 1 - chance) * 100))
print('actual:', np.sum(training[i,:,0]) % 2)

randomly selected sequence: [1 1 0 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 0 0 1 1 0 1 0 1 0 1 0 0 1 0 1 0
 0 0 0 0 1 1 0 0 0 1 1 0 1]
prediction: 1
confidence: 99.73%
actual: 1


The single-unit LSTM does carry the running parity forward through the sequence. For the randomly selected 50-bit sequence above, the model predicts the correct final parity with a confidence of 99.73 percent, after training on 100,000 randomly generated sequences of 50 bits each. Note that this sequence is drawn from the training set, and that a single example shows the model's confidence on that example rather than its overall accuracy.
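
To go beyond one sampled sequence, the final-step predictions can be scored across a whole array of sequences. The sketch below assumes arrays shaped like `predictions` and `training` above; the function name `final_parity_accuracy` and the toy arrays are hypothetical, used here so the example runs without a trained model.

```python
import numpy as np

def final_parity_accuracy(predictions: np.ndarray, inputs: np.ndarray) -> float:
    """Fraction of sequences whose final-step prediction matches the true parity."""
    # Channel 0 at the last time step is the predicted probability of parity 1.
    predicted = (predictions[:, -1, 0] > 0.5).astype(int)
    # True parity of each sequence: sum of its bits modulo 2.
    actual = inputs[:, :, 0].sum(axis=1) % 2
    return float(np.mean(predicted == actual))

# Toy stand-in data: two sequences of three bits, encoded as [x, 1 - x].
toy_inputs = np.array([
    [[1, 0], [1, 0], [0, 1]],  # bits 1, 1, 0 -> parity 0
    [[1, 0], [0, 1], [0, 1]],  # bits 1, 0, 0 -> parity 1
])
toy_predictions = np.zeros((2, 3, 2))
toy_predictions[0, -1] = [0.1, 0.9]  # predicts parity 0, which is correct
toy_predictions[1, -1] = [0.2, 0.8]  # predicts parity 0, which is wrong

toy_accuracy = final_parity_accuracy(toy_predictions, toy_inputs)
print(toy_accuracy)
```

With the notebook's own arrays, the call would be `final_parity_accuracy(predictions, training)`; scoring a freshly generated set of sequences instead would give a better picture of performance on unseen data.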