<a target="_blank" href="https://colab.research.google.com/github/ArtificialIntelligenceToolkit/aitk/blob/master/notebooks/Advanced/XOR.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Explore how hidden representations evolve

In this notebook we will train a network to learn to solve XOR. This is one of the simplest examples of a problem that is **not** linearly separable and thus requires a network with at least one hidden layer that uses a non-linear activation function.
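To make the separability claim concrete, the sketch below brute-forces a single threshold unit over a small grid of weights and biases. The grid range and step, and the names `points` and `separable_on_grid`, are choices made for this illustration only; they are not part of aitk. A grid search cannot prove non-separability in general, but it agrees with the known result that no single line splits the XOR classes, while a solution for AND turns up on the same grid.

```python
from itertools import product

# The four binary input patterns, with targets for AND and XOR
points = [[0, 0], [0, 1], [1, 0], [1, 1]]
and_targets = [0, 0, 0, 1]
xor_targets = [0, 1, 1, 0]


def separable_on_grid(targets, values):
    """Return True if some (w1, w2, b) drawn from `values` lets a single
    threshold unit reproduce `targets` on all four points."""
    for w1, w2, b in product(values, repeat=3):
        preds = [1 if w1 * x1 + w2 * x2 + b > 0 else 0 for x1, x2 in points]
        if preds == targets:
            return True
    return False


# Hypothetical search grid: -2.0, -1.5, ..., 2.0
grid = [i * 0.5 for i in range(-4, 5)]

print("AND separable on grid:", separable_on_grid(and_targets, grid))
print("XOR separable on grid:", separable_on_grid(xor_targets, grid))
```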

We will record how the weights change over time during training. Then we can visualize how the weight changes create linearly separable hidden representations from the non-linearly separable inputs.

In [1]:
%pip install aitk --quiet

In [2]:
from aitk.networks import SimpleNetwork
from time import sleep

## Training set

The inputs and outputs for the XOR problem are defined below.

In [3]:
inputs = [
    [0, 0],
    [0, 1],
    [1, 0],
    [1, 1],
]
targets = [[0], [1], [1], [0]]

## Create the network

We will create a simple network with an input layer of size 2, a hidden layer of size 3, and an output layer of size 1, using the sigmoid activation function.
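As a rough picture of what a 2-3-1 sigmoid network computes, here is a plain-Python forward pass. It is a sketch under assumptions, not aitk's or Keras's implementation: the random initial weights, the zero biases, and the names `sigmoid` and `forward` are all made up for illustration.

```python
import math
import random


def sigmoid(z):
    """Logistic function, squashing any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Propagate one 2-element input through 3 hidden units to 1 output."""
    hidden = [
        sigmoid(sum(xi * wi for xi, wi in zip(x, weights)) + bias)
        for weights, bias in zip(w_hidden, b_hidden)
    ]
    output = sigmoid(sum(h * w for h, w in zip(hidden, w_out)) + b_out)
    return hidden, output


# Assumed initialization: small random weights, zero biases
rng = random.Random(0)
w_hidden = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(3)]
b_hidden = [0.0, 0.0, 0.0]
w_out = [rng.uniform(-1, 1) for _ in range(3)]
b_out = 0.0

for x in [[0, 0], [0, 1], [1, 0], [1, 1]]:
    hidden, output = forward(x, w_hidden, b_hidden, w_out, b_out)
    print(x, [round(h, 3) for h in hidden], round(output, 3))
```

The three hidden activations are the "hidden representation" of each input; training adjusts the weights so that these representations become separable by the output unit.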

In [4]:
net = SimpleNetwork(2, 3, 1, activation="sigmoid")

## Test the network

The following block of code can be used to test all of the XOR inputs on the network.
* If you run this block before training, the outputs should be approximately the same for every input.
* If you run this block after training, the outputs should be close to 0 for the first and last inputs and close to 1 for the middle two.

In [7]:
for pattern in inputs:
    # Forward-propagate one input pattern and show the resulting activations
    output = net.propagate(pattern)
    net.display(pattern)
    print("output is: ", output)
    sleep(1.0)  # pause so each display can be seen

output is:  [0.17749889194965363]


## Train the network

We will train the network until it solves the problem, or for at most 1500 epochs. An output counts as correct when it is within `tolerance` (0.2 here) of the desired target value. For example, the target for the first input is 0, so an output of 0.17 is close enough to count as correct.
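The tolerance criterion described above can be sketched in a few lines. This is an illustration of the idea, not aitk's actual code: the helper names, the flat target list, the sample outputs, and the choice of a non-strict `<=` comparison are all assumptions.

```python
def within_tolerance(output, target, tolerance=0.2):
    """An output is counted as correct if it is close enough to the target."""
    return abs(output - target) <= tolerance


def tolerance_accuracy(outputs, targets, tolerance=0.2):
    """Fraction of patterns whose output falls within tolerance of the target."""
    correct = sum(
        within_tolerance(o, t, tolerance) for o, t in zip(outputs, targets)
    )
    return correct / len(targets)


xor_targets = [0, 1, 1, 0]

# Made-up outputs: every pattern is within 0.2 of its target
print(tolerance_accuracy([0.17, 0.85, 0.9, 0.1], xor_targets))

# Made-up outputs: the last pattern (0.3 vs. target 0) misses
print(tolerance_accuracy([0.17, 0.85, 0.9, 0.3], xor_targets))
```

With `accuracy=1.0` as the goal, training stops early only once every pattern passes this kind of check.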

In [6]:
net.fit(
    inputs, targets,
    epochs=1500,
    batch_size=4,
    report_rate=100,
    accuracy=1.0,
    tolerance=0.2,
    save=100
)

Stopped because accuracy beat goal of 1.0
Epoch 972/1500 loss: 0.034054189920425415 - tolerance_accuracy: 1.0


<keras.src.callbacks.history.History at 0x7892a414b910>

## Re-test the network

Go back to the code block in the "Test the network" section and run it again now that training is complete. The network should have learned to solve the problem.

## Visualize evolving representations

Now we can watch how the network's representations evolved over time. The red dots represent the data for which we want the network to output 0 and the blue dots represent the data for which we want the network to output 1.

In [10]:
for epoch, weights in net.get_weights_from_history():
    net.set_weights(weights)
    net.predict_pca(inputs, scale=.5, colors=["r", "b", "b", "r"], sizes=300)
    print(f"epoch {epoch}")
    sleep(1.0)

epoch 972
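
For intuition about what a linearly separable hidden representation looks like, the following sketch hard-codes one. It uses two hidden units and a step function instead of the three sigmoid units trained above, and the weights are chosen by hand, so treat it as an illustration and not as a description of what this network learned. The names `step` and `xor_via_hidden` are hypothetical.

```python
def step(z):
    """Threshold activation: 1 if the weighted sum is positive, else 0."""
    return 1 if z > 0 else 0


def xor_via_hidden(x1, x2):
    """Map an input to a 2-unit hidden representation, then to an output."""
    h_or = step(x1 + x2 - 0.5)    # fires if at least one input is 1
    h_and = step(x1 + x2 - 1.5)   # fires only if both inputs are 1
    y = step(h_or - h_and - 0.5)  # a single line in hidden space
    return (h_or, h_and), y


for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    hidden, y = xor_via_hidden(x1, x2)
    print((x1, x2), "-> hidden", hidden, "-> output", y)
```

In hidden space the two inputs with target 1 both land on (1, 0), while the inputs with target 0 land on (0, 0) and (1, 1), so one straight line now separates the classes.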
