# MLP XOR Solution

We will implement the XOR operation in TensorFlow using a multilayer perceptron (MLP).

## A single perceptron only works on linearly separable problems
A single perceptron draws one decision boundary, so it can only solve linearly separable problems. XOR is not linearly separable, so one perceptron cannot learn it.

![Image of Perceptron](https://qph.fs.quoracdn.net/main-qimg-a6c557af4280d1f85cacc66e048e82f3)
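
To make the limitation concrete, here is a minimal sketch that brute-forces a small grid of weights and biases for a single perceptron with a step activation. The grid range, the step size, and the helper names (`step_perceptron`, `find_weights`) are assumptions chosen for illustration and are not part of the original notebook.

```python
from itertools import product

INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]

def step_perceptron(x, w1, w2, b):
    """Single perceptron with a step activation: fires when w.x + b > 0."""
    return 1 if w1 * x[0] + w2 * x[1] + b > 0 else 0

def find_weights(targets):
    """Search a coarse grid (an illustrative assumption) for weights that
    reproduce `targets` on all four inputs. Returns None if nothing fits."""
    grid = [v / 2 for v in range(-4, 5)]  # -2.0, -1.5, ..., 2.0
    for w1, w2, b in product(grid, repeat=3):
        outputs = [step_perceptron(x, w1, w2, b) for x in INPUTS]
        if outputs == targets:
            return (w1, w2, b)
    return None

print("AND:", find_weights([0, 0, 0, 1]))
print("OR :", find_weights([0, 1, 1, 1]))
print("XOR:", find_weights([0, 1, 1, 0]))
```

The search finds weights for AND and OR, but returns `None` for XOR. A finer grid would not help, because no single line separates the XOR classes.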

An MLP with two neurons in its hidden layer can solve XOR.  
The two hidden neurons draw two boundary lines, and their outputs are z1 and z2.

From z1 and z2 we can build a truth table like the one below:

| z1 | z2 | value |
|----|----|-------|
| 0  | 0  | 0     |
| 0  | 1  | 1     |
| 1  | 0  | 1     |

As the upper 2D chart below shows, the data is now linearly separable on the z1 and z2 axes,  
so one perceptron in the next layer can classify the hidden layer's output.

![Image of Perceptron](http://cps0715.weebly.com/uploads/7/4/0/3/74035485/8009014_orig.png)
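
The two-stage idea can be sketched with hand-picked weights and a step activation. The specific weights below are assumptions chosen so that the hidden layer produces exactly the three (z1, z2) combinations in the truth table; they are not taken from the figure, and a trained network may find different boundaries.

```python
def step(v):
    """Step activation: 1 when the weighted sum is positive, else 0."""
    return 1 if v > 0 else 0

def hidden_layer(x1, x2):
    """Two hidden perceptrons, each drawing one boundary line."""
    z1 = step(x1 - x2 - 0.5)  # fires only for input (1, 0)
    z2 = step(x2 - x1 - 0.5)  # fires only for input (0, 1)
    return z1, z2

def output_layer(z1, z2):
    """One perceptron separating (0, 0) from (0, 1) and (1, 0) in z-space."""
    return step(z1 + z2 - 0.5)

def xor_mlp(x1, x2):
    return output_layer(*hidden_layer(x1, x2))

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print((x1, x2), "-> z =", hidden_layer(x1, x2), "-> y =", xor_mlp(x1, x2))
```

With these weights the inputs (0, 0) and (1, 1) both map to z = (0, 0), which is why the truth table needs only three rows.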

The step function is hard to optimize with backpropagation because it is not differentiable at the threshold and its gradient is zero everywhere else,  
so we will use sigmoid as the activation instead.

![Image of Perceptron](https://raw.githubusercontent.com/minsuk-heo/deeplearning/master/img/MLP_XOR.png) 
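
The difference between the two activations shows up in their gradients. The sketch below compares a numerical derivative of the step function with the analytic sigmoid derivative, sigmoid(v) * (1 - sigmoid(v)). The helper names and the sample points are assumptions for illustration.

```python
import math

def step(v):
    return 1.0 if v > 0 else 0.0

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def sigmoid_grad(v):
    """Analytic derivative of the sigmoid."""
    s = sigmoid(v)
    return s * (1.0 - s)

def numeric_grad(f, v, eps=1e-4):
    """Central-difference estimate of df/dv."""
    return (f(v + eps) - f(v - eps)) / (2 * eps)

for v in (-2.0, 0.5, 2.0):
    print(v, "step grad:", numeric_grad(step, v), "sigmoid grad:", sigmoid_grad(v))
```

Away from the threshold the step gradient is zero, so gradient descent gets no signal about which way to move the weights, while the sigmoid gradient is positive everywhere.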

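# Practice with TensorFlow

The code below uses the TensorFlow 1.x graph API (tf.placeholder, tf.Session). On TensorFlow 2.x these calls are available through tensorflow.compat.v1 after calling tf.disable_v2_behavior().
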
In [1]:
import tensorflow as tf

# Define TensorFlow Graph
First, we define the shape of the training data.  
The XOR training data has an input X and an output Y.  

X has shape [4,2]:  
[[0, 0], [0, 1], [1, 0], [1, 1]]  

Y has shape [4,1]:  
[[0], [1], [1], [0]]  

In [2]:
X = tf.placeholder(tf.float32, shape=[4,2])
Y = tf.placeholder(tf.float32, shape=[4,1])

# First Layer

In [3]:
# The first layer has two neurons, each taking the two input values.
W1 = tf.Variable(tf.random_uniform([2,2]))
# Each neuron has one bias.
B1 = tf.Variable(tf.zeros([2]))
# The first layer's output Z is sigmoid(X W1 + B1), where X W1 is a matrix product.
Z = tf.sigmoid(tf.matmul(X, W1) + B1)

# Second Layer

In [4]:
# The second layer has one neuron taking the two hidden outputs.
W2 = tf.Variable(tf.random_uniform([2,1]))
# The single neuron has one bias.
B2 = tf.Variable(tf.zeros([1]))
# The second layer's output Y_hat is sigmoid(Z W2 + B2), where Z W2 is a matrix product.
Y_hat = tf.sigmoid(tf.matmul(Z, W2) + B2)
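
To see how the shapes flow through the two layers, here is a plain-Python sketch of the same forward pass. The weight values are hypothetical constants (the notebook draws them at random), and the lowercase names are chosen so they do not shadow the notebook's tensors.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def matmul(a, b):
    """Matrix product of two nested lists: [n, k] x [k, m] -> [n, m]."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def dense(inputs, weights, bias):
    """sigmoid(inputs x weights + bias), with the bias added to every row."""
    pre = matmul(inputs, weights)
    return [[sigmoid(v + bias[j]) for j, v in enumerate(row)] for row in pre]

x_rows = [[0, 0], [0, 1], [1, 0], [1, 1]]   # shape [4, 2]
w1 = [[0.1, 0.4], [0.2, 0.3]]               # shape [2, 2], hypothetical values
b1 = [0.0, 0.0]
w2 = [[0.5], [0.6]]                         # shape [2, 1], hypothetical values
b2 = [0.0]

z_rows = dense(x_rows, w1, b1)              # shape [4, 2]
y_hat_rows = dense(z_rows, w2, b2)          # shape [4, 1]

print("Z shape:", len(z_rows), len(z_rows[0]))
print("Y_hat shape:", len(y_hat_rows), len(y_hat_rows[0]))
```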

# Loss Function

In [5]:
# Binary cross entropy, averaged over the four samples
loss = tf.reduce_mean(-1*((Y*tf.log(Y_hat))+((1-Y)*tf.log(1.0-Y_hat))))
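
The same loss can be written out in plain Python to check how it behaves. This is a sketch: the `eps` clipping is an added safeguard against log(0) that the notebook's expression does not have, and the sample predictions are made up for illustration.

```python
import math

def binary_cross_entropy(targets, predictions, eps=1e-7):
    """Mean of -(y*log(p) + (1-y)*log(1-p)) over all samples."""
    total = 0.0
    for y, p in zip(targets, predictions):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0) (assumption, not in the notebook)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(targets)

targets = [0, 1, 1, 0]
undecided = [0.5, 0.5, 0.5, 0.5]      # made-up predictions
confident = [0.02, 0.98, 0.98, 0.02]  # made-up predictions

print("undecided:", binary_cross_entropy(targets, undecided))
print("confident:", binary_cross_entropy(targets, confident))
```

A network that outputs 0.5 for every input scores ln 2 (about 0.693), and the loss falls toward 0 as the predictions approach the targets.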

# Optimizer

In [6]:
# Gradient descent with learning rate 0.05
train_step = tf.train.GradientDescentOptimizer(0.05).minimize(loss)
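
TensorFlow derives the gradients automatically. As a rough illustration of what one gradient descent update involves for this 2-2-1 network, here is a hand-written sketch in plain Python. The function names, the flattened layout of the second-layer weights, and the random seed are assumptions; this is not how TensorFlow implements the optimizer internally.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def forward(params, x):
    """Return hidden outputs z and prediction y_hat for one input row."""
    w1, b1, w2, b2 = params
    z = [sigmoid(x[0] * w1[0][j] + x[1] * w1[1][j] + b1[j]) for j in range(2)]
    y_hat = sigmoid(z[0] * w2[0] + z[1] * w2[1] + b2)
    return z, y_hat

def loss(params, xs, ys):
    """Mean binary cross entropy over the data set."""
    total = 0.0
    for x, y in zip(xs, ys):
        _, p = forward(params, x)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(xs)

def gradient_step(params, xs, ys, lr):
    """One full-batch gradient descent update using backpropagation."""
    w1, b1, w2, b2 = params
    n = len(xs)
    g_w1 = [[0.0, 0.0], [0.0, 0.0]]
    g_b1 = [0.0, 0.0]
    g_w2 = [0.0, 0.0]
    g_b2 = 0.0
    for x, y in zip(xs, ys):
        z, p = forward(params, x)
        d2 = (p - y) / n                 # gradient at the output pre-activation
        g_b2 += d2
        for j in range(2):
            g_w2[j] += d2 * z[j]
            d1 = d2 * w2[j] * z[j] * (1 - z[j])  # gradient at hidden pre-activation j
            g_b1[j] += d1
            for i in range(2):
                g_w1[i][j] += x[i] * d1
    new_w1 = [[w1[i][j] - lr * g_w1[i][j] for j in range(2)] for i in range(2)]
    new_b1 = [b1[j] - lr * g_b1[j] for j in range(2)]
    new_w2 = [w2[j] - lr * g_w2[j] for j in range(2)]
    new_b2 = b2 - lr * g_b2
    return new_w1, new_b1, new_w2, new_b2

random.seed(0)  # arbitrary seed so the sketch is repeatable
initial = ([[random.random() for _ in range(2)] for _ in range(2)],
           [0.0, 0.0],
           [random.random() for _ in range(2)],
           0.0)
xs = [[0, 0], [0, 1], [1, 0], [1, 1]]
ys = [0, 1, 1, 0]

updated = gradient_step(initial, xs, ys, lr=0.05)
print("loss before:", loss(initial, xs, ys))
print("loss after :", loss(updated, xs, ys))
```

Each call moves every weight a small step against its gradient, so the loss after the update is slightly lower than before. The training loop simply repeats this many times.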

# Train

In [7]:
# train data
train_X = [[0,0],[0,1],[1,0],[1,1]]
train_Y = [[0],[1],[1],[0]]

In [10]:
# Op that initializes all variables (W1, B1, W2, B2)
init = tf.global_variables_initializer()
# Start training
with tf.Session() as sess:
    # Run the initializer
    sess.run(init)
    print("train data: "+str(train_X))
    for i in range(20000):
        sess.run(train_step, feed_dict={X: train_X, Y: train_Y})
        if i % 5000 == 0:
            print('Epoch : ', i)
            print('Output : ', sess.run(Y_hat, feed_dict={X: train_X, Y: train_Y}))
    
    print('Final Output : ', sess.run(Y_hat, feed_dict={X: train_X, Y: train_Y}))

train data: [[0, 0], [0, 1], [1, 0], [1, 1]]
Epoch :  0
Output :  [[0.5759099]
 [0.5988261]
 [0.5978792]
 [0.617011 ]]
Epoch :  5000
Output :  [[0.24778686]
 [0.63629276]
 [0.55345565]
 [0.59068745]]
Epoch :  10000
Output :  [[0.075223  ]
 [0.9185553 ]
 [0.91934305]
 [0.10548425]]
Epoch :  15000
Output :  [[0.0316886 ]
 [0.97331434]
 [0.9735103 ]
 [0.03062824]]
Final Output :  [[0.01958315]
 [0.9845865 ]
 [0.98467547]
 [0.01702813]]
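
The printed values are probabilities, not class labels. A short sketch of turning them into 0/1 predictions follows, using the numbers from the run above. The 0.5 threshold is a conventional choice and an assumption here, since the notebook does not apply one.

```python
def to_labels(outputs, threshold=0.5):
    """Convert sigmoid outputs of shape [n, 1] into 0/1 labels."""
    return [[1 if row[0] > threshold else 0] for row in outputs]

expected = [[0], [1], [1], [0]]
epoch_5000 = [[0.24778686], [0.63629276], [0.55345565], [0.59068745]]
final_output = [[0.01958315], [0.9845865], [0.98467547], [0.01702813]]

print(to_labels(epoch_5000))                 # [[0], [1], [1], [1]]
print(to_labels(final_output))               # [[0], [1], [1], [0]]
print(to_labels(final_output) == expected)   # True
```

At step 5000 the network still misclassifies the input [1, 1], while the final output matches the XOR truth table for all four inputs.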
