# Solution for the XOR problem using TensorFlow

The XOR problem is the task of separating the crosses from the circles shown in the following figure. It is famous because it shows in a very simple way that a neural network with a non-linear activation function can solve a classification problem that is not linearly separable.

![Problem](xor_problem.svg)
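
To make the non-linearity concrete, the following sketch searches a coarse grid of linear classifiers of the form `w1*x1 + w2*x2 + b > 0` and counts how many reproduce a given labelling of the four points. The grid range, the step size, and the helper name `count_linear_solutions` are assumptions chosen for illustration; the AND labelling is included only as a linearly separable contrast.

```python
from itertools import product

# The four corner points and two labellings of them
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_labels = [0, 1, 1, 0]
and_labels = [0, 0, 0, 1]  # linearly separable, for comparison


def count_linear_solutions(labels, grid):
    """Count the (w1, w2, b) triples on the grid that reproduce `labels`."""
    count = 0
    for w1, w2, b in product(grid, repeat=3):
        predictions = [int(w1 * x1 + w2 * x2 + b > 0) for x1, x2 in points]
        if predictions == labels:
            count += 1
    return count


# Candidate values from -4.0 to 4.0 in steps of 0.5 (an arbitrary choice)
grid = [i / 2 for i in range(-8, 9)]

print("Linear classifiers solving XOR:", count_linear_solutions(xor_labels, grid))
print("Linear classifiers solving AND:", count_linear_solutions(and_labels, grid))
```

The search finds no solution for XOR, while AND is solved by many of the candidates. A grid search is only a demonstration, but the result holds for all real weights: the two positive points require `w1 + b > 0` and `w2 + b > 0`, which together with `b <= 0` forces `w1 + w2 + b > 0` and so contradicts the label of the point (1, 1).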

In this example, we solve this problem with the simplest possible neural network, implemented in TensorFlow. The following architecture serves our needs.

![Net](xor_net.svg)

To implement this neural network with TensorFlow, the architecture has to be translated into a computational graph. The following flow graph displays the graph we want to implement. To build it, TensorFlow provides three basic building blocks that can be connected: placeholders, variables and operations. The code in this example uses the TensorFlow 1.x graph API (`tf.placeholder`, `tf.get_variable`, `tf.Session`).

![Graph](xor_graph.svg)

## Imports

In [1]:
import numpy as np
import tensorflow as tf


## Variables

Variables are the free parameters of the graph. They can be altered during training to find the set of parameters that best solves the posed problem.
Here, we assume that we already know a set of values that solves the problem and skip the training for now. The following code declares the variables we want to use in our neural network.

In [2]:
# Weights for layer 1
w1 = tf.get_variable("W1", initializer=np.array([[1.0, 1.0],
                                                 [1.0, 1.0]]))

# Bias for layer 1
b1 = tf.get_variable("b1", initializer=np.array([0.0, -1.0]))

# Weights for layer 2
w2 = tf.get_variable("W2", initializer=np.array([[1.0], [-2.0]]))

# Bias for layer 2
b2 = tf.get_variable("b2", initializer=np.array([0.0]))
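
To see why these particular values work, the weights can be written out by hand. The derivation below assumes the architecture shown above, with a ReLU hidden layer and a linear output as defined in the Operations section; the symbols $h_1$, $h_2$ for the hidden units and $x_1$, $x_2$ for the inputs are introduced here for illustration.

$$
h_1 = \max(0,\; x_1 + x_2), \qquad
h_2 = \max(0,\; x_1 + x_2 - 1), \qquad
y = h_1 - 2\,h_2
$$

$$
\begin{aligned}
(0,0) &: \quad h_1 = 0,\; h_2 = 0 \;\Rightarrow\; y = 0 \\
(0,1),\,(1,0) &: \quad h_1 = 1,\; h_2 = 0 \;\Rightarrow\; y = 1 \\
(1,1) &: \quad h_1 = 2,\; h_2 = 1 \;\Rightarrow\; y = 2 - 2 = 0
\end{aligned}
$$

The first hidden unit adds up the active inputs, while the second one only fires when both inputs are active and then cancels the first unit's contribution.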

## Placeholders

Placeholders are the entry points of your graph. Their values are supplied from outside when running the graph, using the `feed_dict` argument. Typical use cases are feeding inputs and labels to a graph for training, or inputs only for inference.

In [3]:
# Placeholder for input values
x = tf.placeholder(tf.float64)

## Operations

Operations are functions that create new nodes in the graph from placeholders, variables or other operations. The following cell sets up the neural network model shown above by stacking the mathematical operations.

In [4]:
# Definition of the hidden layer
hidden_layer = tf.nn.relu(b1 + tf.matmul(x, w1))

# Definition of the output layer
y = b2 + tf.matmul(hidden_layer, w2)
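
The two operations above can be mirrored in plain Python, which is a quick way to check the arithmetic without building a graph. This is a minimal sketch rather than the TensorFlow implementation: the helper names `relu`, `matvec`, `add` and `forward` are hypothetical, and the weights are copied from the Variables section.

```python
def relu(v):
    """Element-wise max(0, a), the same non-linearity as tf.nn.relu."""
    return [max(0.0, a) for a in v]


def matvec(x, w):
    """Row vector x (length n) times matrix w (n x m)."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]


def add(u, v):
    """Element-wise sum of two vectors of equal length."""
    return [a + b for a, b in zip(u, v)]


# Same values as the TensorFlow variables W1, b1, W2, b2
W1 = [[1.0, 1.0], [1.0, 1.0]]
B1 = [0.0, -1.0]
W2 = [[1.0], [-2.0]]
B2 = [0.0]


def forward(x):
    """Hidden layer with ReLU followed by a linear output layer."""
    hidden = relu(add(B1, matvec(x, W1)))
    return add(B2, matvec(hidden, W2))[0]


inputs = [[0, 0], [0, 1], [1, 0], [1, 1]]
outputs = [forward(x) for x in inputs]
for x, y_value in zip(inputs, outputs):
    print(x, "->", y_value)
```

The four outputs are 0.0, 1.0, 1.0 and 0.0, which is the XOR truth table.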

## The session

The session is the environment for your graph to be executed. It manages the devices you want to use and runs the previously defined graph or only parts of it.

Note that the first operation that (almost) always has to be executed before extracting results from the graph is the initialization of your variables!

In [5]:
# Run network on XOR inputs
with tf.Session() as sess:
    # Initialize the variables
    sess.run(tf.global_variables_initializer())
    
    # Run inference
    x_in = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    y_out = sess.run(y, feed_dict={x: x_in})

## Results

Let us inspect the response of the neural network and check whether it solved the XOR problem successfully!

In [6]:
# Print network response
print("{} : {}".format("Input", "Output"))
for x_, y_ in zip(x_in, y_out):
    print("{} : {}".format(x_, int(np.squeeze(y_))))

Input : Output
[0 0] : 0
[0 1] : 1
[1 0] : 1
[1 1] : 0
