 # Solving XOR With a 2x2x1 Neural Network with tflearn
 
The simplest possible feed-forward neural network capable of solving the XOR problem.

The example serves as a syntax guide for developers learning to use TensorFlow with tflearn.

In [2]:
from tflearn import DNN
from tflearn.layers.core import input_data, fully_connected
from tflearn.layers.estimator import regression

We define a dataset consisting of the four possible combinations of inputs (X) and their corresponding classes (Y).

In [3]:
X = [[0,0], [0,1], [1,0], [1,1]]
Y = [[0], [1], [1], [0]]

In [4]:
input_layer = input_data(shape=[None, 2])  # input layer of size 2
hidden_layer = fully_connected(input_layer, 2, activation='tanh')  # hidden layer of size 2
output_layer = fully_connected(hidden_layer, 1, activation='tanh')  # output layer of size 1

# Use a separate name so the imported regression() function is not overwritten
net = regression(output_layer, optimizer='sgd', loss='binary_crossentropy', learning_rate=5)
model = DNN(net)

Let's train the model:

In [None]:
model.fit(X, Y, n_epoch=5000, show_metric=True);

Training reports an accuracy of 99.9%.

Now let's check whether our model works:

In [6]:
[i[0] > 0 for i in model.predict(X)]

[False, True, True, False]

## Weight analysis

Unlike AND and OR, the outputs of XOR are not linearly separable, so a network without a hidden layer cannot represent it. We therefore had to introduce a hidden layer. Each node in the hidden layer can then represent one of the linearly separable logical operations (AND, OR, NOR, ...), and the output layer acts as another logical operation applied to the outputs of the previous layer.

A simple example of such an expression is:
$XOR(X_1, X_2) = AND(OR(X_1, X_2), NAND(X_1, X_2))$
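Both claims can be checked with a few lines of plain Python. The sketch below is not part of the tflearn model: it assumes a hypothetical hard-threshold unit (`fire`) and searches a small integer grid of weights (`find_unit`, also a name made up here). A grid search cannot prove that XOR is not linearly separable, but it illustrates the point: single units are found for AND, OR and NAND, none is found for XOR, and composing the three found units reproduces XOR.

```python
from itertools import product

INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
TARGETS = {
    'AND':  [0, 0, 0, 1],
    'OR':   [0, 1, 1, 1],
    'NAND': [1, 1, 1, 0],
    'XOR':  [0, 1, 1, 0],
}

def fire(unit, x1, x2):
    """Hard-threshold unit: outputs 1 if w1*x1 + w2*x2 + b > 0, else 0."""
    w1, w2, b = unit
    return int(w1 * x1 + w2 * x2 + b > 0)

def find_unit(targets, grid=range(-3, 4)):
    """Return the first (w1, w2, b) on the grid matching the truth table, or None."""
    for unit in product(grid, repeat=3):
        if [fire(unit, x1, x2) for x1, x2 in INPUTS] == targets:
            return unit
    return None

units = {name: find_unit(targets) for name, targets in TARGETS.items()}
for name, unit in units.items():
    print(name, unit)  # XOR has no single-unit solution on this grid

def xor_composed(x1, x2):
    """XOR built as AND(OR(x1, x2), NAND(x1, x2)) from the units found above."""
    return fire(units['AND'], fire(units['OR'], x1, x2), fire(units['NAND'], x1, x2))

composed = [xor_composed(x1, x2) for x1, x2 in INPUTS]
print(composed)
```

The composed function uses two units in a first stage and one unit in a second stage, which mirrors the 2x2x1 layout of the network.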

Now let's see what our network came up with:

In [7]:
print('Weights in layer1: ', model.get_weights(hidden_layer.W), ', Biases in layer1: ', model.get_weights(hidden_layer.b))
print('Weights in layer2: ', model.get_weights(output_layer.W), ', Biases in layer2: ', model.get_weights(output_layer.b))


Weights in layer1:  [[ 3.86708593 -3.11288071]
 [ 3.87053323 -3.1126008 ]] , Biases in layer1:  [-1.82562542  4.58438063]
Weights in layer2:  [[ 5.19325304]
 [ 5.23881006]] , Biases in layer2:  [-4.87336922]


Based on the (rounded) activations for different inputs, we can estimate what logical operation each node represents:

| Input | Node1 (Hidden Layer) | Node2 (Hidden Layer)   | Output Node |
|------|------|------|------|
|   [0,0]  |-1|1|-1|
|   [0,1]  |1|1|1|
|   [1,0]  |1|1|1|
|   [1,1]  |1|-1|-1|

Treating 1 as true and -1 as false:

$Node1$ output is true except when both inputs are false. It represents $OR$.

$Node2$ output is false only when both inputs are true. It represents $NAND$.

$Output Node$ is true only when both $Node1$ and $Node2$ are true. It represents the $AND$ operation applied to the activations of the previous layer.
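The activation table can be reproduced by hand from the printed weights. The following is a minimal sketch in plain Python rather than a tflearn call. It assumes the weight values printed above (rounded to three decimals) and assumes that each column of the first weight matrix belongs to one hidden node, which is the layout consistent with the table. The helper name `forward` is made up for this illustration.

```python
import math

X = [[0, 0], [0, 1], [1, 0], [1, 1]]

# Rounded copies of the printed weights (your own run will give different values).
# Assumed layout: rows of W1 are inputs, columns are hidden nodes.
W1 = [[3.867, -3.113],
      [3.871, -3.113]]
b1 = [-1.826, 4.584]
W2 = [5.193, 5.239]
b2 = -4.873

def forward(x):
    """Return (hidden activations, output activation) for one input pair."""
    hidden = [math.tanh(x[0] * W1[0][j] + x[1] * W1[1][j] + b1[j]) for j in range(2)]
    output = math.tanh(hidden[0] * W2[0] + hidden[1] * W2[1] + b2)
    return hidden, output

rows = []
for x in X:
    hidden, output = forward(x)
    # Keep only the sign of each tanh activation, as in the table
    rows.append([1 if v > 0 else -1 for v in hidden + [output]])
    print(x, [round(v, 2) for v in hidden], round(output, 2))
```

Each entry of `rows` holds the signs for Node1, Node2 and the output node, in the same order as the table columns.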


Voilà! Our network has learned to solve XOR on its own, using this equation:

$$XOR(X_1, X_2) = AND(Node1(X_1, X_2), Node2(X_1, X_2)) = AND(OR(X_1, X_2), NAND(X_1, X_2))$$


Note that your weights will be different every time you fit the model, because the weights are initialized randomly.