# A Multilayer Neural Network implementation for solving the XOR problem using TensorFlow

This code could be significantly simplified by using Keras or other libraries, but the aim here is to understand the step-by-step process.

In [3]:
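# TensorFlow 1.x graph API; on TF 2.x use `import tensorflow.compat.v1 as tf` and `tf.disable_v2_behavior()`
import tensorflow as tf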

Defining the sigmoid function

In [0]:
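import math

def sigmoid(x):
  # Logistic function; the two branches avoid overflow in math.exp for large |x|
  if x >= 0:
    return 1 / (1 + math.exp(-x))
  z = math.exp(x)
  return z / (1 + z)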
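As a quick sanity check of the sigmoid, the sketch below evaluates it at a few points and verifies the identity sigmoid(-x) = 1 - sigmoid(x). The sample points are arbitrary choices made for illustration.

```python
import math

def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1 / (1 + math.exp(-x))

# Arbitrary sample points, chosen only for illustration
samples = [-4.0, -1.0, 0.0, 1.0, 4.0]
values = [sigmoid(x) for x in samples]

for x, s in zip(samples, values):
    print(f"sigmoid({x:+.1f}) = {s:.4f}")

# The curve is symmetric about (0, 0.5): sigmoid(-x) == 1 - sigmoid(x)
symmetric = all(abs(sigmoid(-x) - (1 - sigmoid(x))) < 1e-12 for x in samples)
print("symmetric:", symmetric)
```

The values increase monotonically and stay strictly between 0 and 1, which is why a sigmoid output can be read as a soft version of a binary label.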

Defining the XOR problem input space. Note that even though the inputs are binary, the output of the neural net is a real number between 0 and 1 (the range of the sigmoid).

X holds the two-dimensional inputs (x1, x2), and Y holds the corresponding labels given by the XOR function.

![alternative text](https://drive.google.com/uc?id=1q0y13JLtQqGTL_J4PWz9q5Hr8gvlcrzt)


In [0]:
# XOR definition
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
Y = [[0], [1], [1], [0]]
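To make the link between the inputs and the labels explicit, here is a small sketch that rebuilds the label column with Python's bitwise XOR operator `^`. The name `Y_check` is introduced only for this illustration.

```python
# Same input space as in the XOR definition
X = [[0, 0], [0, 1], [1, 0], [1, 1]]

# Python's ^ operator is bitwise XOR, so it reproduces the label column
Y_check = [[x1 ^ x2] for x1, x2 in X]

for (x1, x2), (y,) in zip(X, Y_check):
    print(f"{x1} XOR {x2} = {y}")
```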

In [0]:
# Neural Network Parameters
N_STEPS = 100000
N_TRAINING = len(X) # in this case, 4

Defining the ANN topology. In this case, 2-2-1: two input nodes, two hidden nodes, and one output node.

![alternative text](https://drive.google.com/uc?id=1JOgtE0qjqvDTMFE5drMXAeZGV33Yrtv0)



In [0]:
N_INPUT_NODES = 2
N_HIDDEN_NODES = 2
N_OUTPUT_NODES = 1
LEARNING_RATE = 0.01

Defining the placeholders and variables. The variables are the parameters to train; the placeholders are the 'room' for the inputs and labels.

In [0]:
# Create placeholders for the inputs (x_) and labels (y_); the batch size is fixed to N_TRAINING
x_ = tf.placeholder(tf.float32, shape=[N_TRAINING, N_INPUT_NODES], name="x-input")
y_ = tf.placeholder(tf.float32, shape=[N_TRAINING, N_OUTPUT_NODES], name="y-input")

In [0]:
# Weights, initialised uniformly at random in [-1, 1)
theta1 = tf.Variable(tf.random_uniform([N_INPUT_NODES, N_HIDDEN_NODES], -1, 1), name="theta1")
theta2 = tf.Variable(tf.random_uniform([N_HIDDEN_NODES, N_OUTPUT_NODES], -1, 1), name="theta2")

In [0]:
# Biases, initialised to zero
bias1 = tf.Variable(tf.zeros([N_HIDDEN_NODES]), name="bias1")
bias2 = tf.Variable(tf.zeros([N_OUTPUT_NODES]), name="bias2")
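To get a feel for how small this network is, the sketch below counts the trainable parameters implied by the shapes of `theta1`, `bias1`, `theta2` and `bias2`. The `shapes` dictionary and the `size` helper are names introduced only for this illustration.

```python
N_INPUT_NODES = 2
N_HIDDEN_NODES = 2
N_OUTPUT_NODES = 1

# Shapes mirror theta1, bias1, theta2 and bias2 as defined in the notebook
shapes = {
    "theta1": (N_INPUT_NODES, N_HIDDEN_NODES),
    "bias1": (N_HIDDEN_NODES,),
    "theta2": (N_HIDDEN_NODES, N_OUTPUT_NODES),
    "bias2": (N_OUTPUT_NODES,),
}

def size(shape):
    """Number of scalar entries in a tensor of the given shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total

n_params = sum(size(shape) for shape in shapes.values())
print(n_params)  # 4 + 2 + 2 + 1 = 9
```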

Defining the function computed by each layer: a matrix product with the weights, plus the bias, passed through a sigmoid.

In [0]:
# Use a sigmoidal activation function
layer1 = tf.sigmoid(tf.matmul(x_, theta1) + bias1)
output = tf.sigmoid(tf.matmul(layer1, theta2) + bias2)
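The same two-layer computation can be sketched in plain Python to show that a 2-2-1 sigmoid network is able to represent XOR. The weights below are hand-picked for illustration (they are an assumption, not what training produces), and `dense_sigmoid` is a hypothetical helper standing in for `tf.sigmoid(tf.matmul(...) + bias)`.

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def dense_sigmoid(rows, weights, bias):
    """Compute sigmoid(rows @ weights + bias) with plain Python lists."""
    return [
        [
            sigmoid(sum(r[i] * weights[i][j] for i in range(len(r))) + bias[j])
            for j in range(len(bias))
        ]
        for r in rows
    ]

X = [[0, 0], [0, 1], [1, 0], [1, 1]]

# Hand-picked (NOT trained) weights: hidden unit 1 acts like OR,
# hidden unit 2 like NAND, and the output unit like AND of the two.
theta1 = [[20.0, -20.0], [20.0, -20.0]]
bias1 = [-10.0, 30.0]
theta2 = [[20.0], [20.0]]
bias2 = [-30.0]

layer1 = dense_sigmoid(X, theta1, bias1)
output = dense_sigmoid(layer1, theta2, bias2)

for x, o in zip(X, output):
    print(x, round(o[0], 4))
```

Rounding each output to the nearest integer gives the XOR labels, so at least one setting of the nine parameters solves the problem; training searches for such a setting starting from random weights.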

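Defining the cost function. In this case, cross entropy:

$C = -\frac{1}{n} \sum_{x} \left[ y \ln(o) + (1 - y) \ln(1 - o) \right]$

where $o$ is the output of the net, $y$ is the corresponding label, and the sum runs over the $n$ training inputs $x$.

The cross-entropy cost is a suitable measure of our errors: it is close to zero when every output is close to its label and grows as the outputs move away from the labels. (This is what you need to know!)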



In [0]:
cost = - tf.reduce_mean((y_ * tf.log(output)) + (1 - y_) * tf.log(1.0 - output))
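To see how the cost behaves, the sketch below evaluates the same cross-entropy formula by hand for two sets of outputs. Both sets are made-up values for illustration, not the result of a training run, and `cross_entropy` is a helper name introduced here.

```python
import math

def cross_entropy(labels, outputs):
    """Mean binary cross entropy over paired labels and outputs."""
    terms = [
        y * math.log(o) + (1 - y) * math.log(1 - o)
        for y, o in zip(labels, outputs)
    ]
    return -sum(terms) / len(terms)

Y = [0, 1, 1, 0]

# Made-up outputs for illustration, not the result of a training run
undecided = [0.5, 0.5, 0.5, 0.5]
confident = [0.01, 0.99, 0.99, 0.01]

cost_undecided = cross_entropy(Y, undecided)
cost_confident = cross_entropy(Y, confident)
print(cost_undecided)
print(cost_confident)
```

A network that outputs 0.5 for every input has a cost of ln 2 (about 0.693), so a training cost that stays near that value suggests the network has not yet learned anything about XOR.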

Defining the learning process: gradient descent.

In [0]:
train_step = tf.train.GradientDescentOptimizer(LEARNING_RATE).minimize(cost)
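The update rule behind gradient descent can be sketched on a toy problem. The example below minimises the one-parameter cost f(w) = (w - 3)^2, which is an assumption chosen for illustration and has nothing to do with the XOR network itself. Each step moves the parameter against the gradient, scaled by the learning rate.

```python
LEARNING_RATE = 0.1  # illustrative value, larger than the 0.01 used for the network

def grad(w):
    """Derivative of the toy cost f(w) = (w - 3)^2."""
    return 2 * (w - 3)

w = 0.0  # arbitrary starting point
for _ in range(100):
    # Gradient descent update: step against the gradient
    w = w - LEARNING_RATE * grad(w)

print(w)
```

In the notebook, the optimizer applies this kind of update to `theta1`, `theta2`, `bias1` and `bias2` at once, using gradients of the cross-entropy cost computed by TensorFlow.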

In [0]:
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

Training and testing our algorithm. Note that the whole input space consists of just four points, so there are no separate training and test datasets: the same four examples are used for both.

In [0]:
for i in range(N_STEPS):
  # One full-batch gradient descent step on the four XOR examples
  sess.run(train_step, feed_dict={x_: X, y_: Y})
  # Report progress every 10000 steps
  if i % 10000 == 0:
    print('Step ', i)
    print('Input:', X, 'Output(Inference)', sess.run(tf.transpose(output), feed_dict={x_: X, y_: Y}))
    print('Cost ', sess.run(cost, feed_dict={x_: X, y_: Y}))
    print("theta1:", sess.run(theta1))
    print("theta2:", sess.run(theta2))
    print("bias1:", sess.run(bias1))
    print("bias2:", sess.run(bias2))
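Because the network outputs real numbers rather than bits, a final thresholding step is needed to read them as XOR predictions. The sketch below assumes a threshold of 0.5 and uses made-up output values for illustration; they are not the result of an actual run.

```python
# Made-up network outputs for illustration, not the result of an actual run
outputs = [0.02, 0.97, 0.98, 0.03]
Y = [[0], [1], [1], [0]]

# Threshold each real-valued output at 0.5 to obtain a binary prediction
predictions = [[1 if o >= 0.5 else 0] for o in outputs]
accuracy = sum(p == y for p, y in zip(predictions, Y)) / len(Y)

print(predictions)
print(accuracy)
```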


Printing the final trained params.

In [0]:
print(sess.run(theta1))
print(sess.run(theta2))
print(sess.run(bias1))
print(sess.run(bias2))