## ML Lab 3
### Neural Networks

In this exercise class we explore several ways to design and train neural networks.

#### Prerequisites:

In order to follow the exercises you need to:
1. Activate your conda environment from last week via: `source activate <env-name>`
2. Install TensorFlow (https://www.tensorflow.org) via: `pip install tensorflow` (CPU-only)
3. Install Keras, a high-level wrapper for TensorFlow (https://keras.io), via: `pip install keras`

## Exercise 1: Create a two-layer network that acts as an XOR gate using NumPy

XOR is a fundamental logic gate that outputs one whenever its input contains an odd number of ones and zero otherwise. For two inputs this is the exclusive-or operation, and the associated Boolean function is fully characterized by the following truth table.

| X | Y | XOR(X,Y) |
|---|---|----------|
| 0 | 0 |    0     |
| 0 | 1 |    1     |
| 1 | 0 |    1     |
| 1 | 1 |    0     |

The function of an XOR gate can also be understood as a classification problem on $v \in \{0,1\}^2$, so we can think about designing a classifier that acts as an XOR gate. It turns out that this problem cannot be solved by any single-layer perceptron (https://en.wikipedia.org/wiki/Perceptron) because the two classes, $\{(0,1), (1,0)\}$ (output 1) and $\{(0,0), (1,1)\}$ (output 0), are not linearly separable: no single straight line in the plane puts the two classes on opposite sides.
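The claim about linear separability can be illustrated numerically. The following is a minimal sketch, not a proof: it tries every single threshold unit whose weights and bias lie on a small grid (the grid itself and the names `grid` and `best_correct` are assumptions made for this illustration) and records the best number of XOR points classified correctly.

```python
import itertools
import numpy as np

# The four XOR inputs and their labels.
X = np.array([(0, 0), (0, 1), (1, 0), (1, 1)], dtype=float)
y = np.array([0, 1, 1, 0])

# Candidate values for w1, w2 and b: multiples of 0.5 between -2 and 2
# (an arbitrary grid chosen for this illustration).
grid = np.arange(-4, 5) * 0.5

best_correct = 0
for w1, w2, b in itertools.product(grid, repeat=3):
    # Single-layer perceptron with a hard threshold at zero.
    pred = (X @ np.array([w1, w2]) + b >= 0).astype(int)
    best_correct = max(best_correct, int((pred == y).sum()))

print("Best number of correctly classified points:", best_correct, "of 4")
```

No setting on the grid classifies all four points correctly. The best single unit gets three of them right, for example by acting as an OR gate and failing on the input (1, 1).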

**Design a two-layer perceptron using basic NumPy matrix operations that implements an XOR gate on two inputs. Think about the flow of information and set the weight values by hand accordingly.**

### Data

In [2]:
import numpy as np

def generate_xor_data():
    X = [(i,j) for i in [0,1] for j in [0,1]]
    y = [int(np.logical_xor(x[0], x[1])) for x in X]
    return X, y
    
print(generate_xor_data())

([(0, 0), (0, 1), (1, 0), (1, 1)], [0, 1, 1, 0])


### Hints
A single layer in a multilayer perceptron can be described by the equation $\vec{y} = f(\vec{b} + W\vec{x})$, where $W$ is the weight matrix, $\vec{b}$ is the so-called bias (a constant offset vector), and $f$ is the logistic function $f(z) = \frac{1}{1+e^{-z}}$, a smooth, differentiable version of the step function that is applied element-wise. Since we set the weights by hand, feel free to use hard thresholding instead of the logistic function. Write down the equation for a two-layer MLP and implement it with NumPy. For documentation see https://docs.scipy.org/doc/numpy-1.13.0/reference/
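As a starting point, the single-layer equation can be sketched in NumPy as follows. The helper names `logistic`, `hard_threshold` and `layer`, as well as the example weights, are assumptions made for this illustration and are not part of the exercise solution. The hard threshold here cuts at zero, which corresponds to the logistic function crossing 0.5.

```python
import numpy as np

def logistic(z):
    # Smooth activation: f(z) = 1 / (1 + exp(-z)), applied element-wise.
    return 1.0 / (1.0 + np.exp(-z))

def hard_threshold(z):
    # Step activation: 1 where z >= 0, else 0.
    return (z >= 0).astype(float)

def layer(x, W, b, f):
    # One layer of an MLP: y = f(b + W x).
    return f(b + W @ x)

# Arbitrary example values (2 inputs, 2 output neurons).
W = np.array([[2.0, -1.0],
              [0.5,  0.5]])
b = np.array([-0.5, 0.0])
x = np.array([1.0, 0.0])

print("logistic:      ", layer(x, W, b, logistic))
print("hard threshold:", layer(x, W, b, hard_threshold))
```

Both calls share the same pre-activation $\vec{b} + W\vec{x}$ and differ only in $f$, which is why the activation is passed in as an argument.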

In [145]:
"""
Implement your solution here.
"""


### Solution

| X | Y | AND(NOT X, Y) | AND(X, NOT Y) | OR[AND(NOT X, Y), AND(X, NOT Y)] | XOR(X,Y) |
|---|---|---------------|--------------|---------------------------------|----------|
| 0 | 0 |    0          |      0       |                 0               |    0     |
| 0 | 1 |    1          |      0       |                 1               |    1     |
| 1 | 0 |    0          |      1       |                 1               |    1     |
| 1 | 1 |    0          |      0       |                 0               |    0     |

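Implement XOR as a combination of two AND gates (each with one negated input) and one OR gate, where each neuron in the network acts as one of these gates. The two AND neurons form the first layer and the OR neuron forms the second layer.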

In [167]:
"""
Definitions:

Input = np.array([X,Y])

0 if value < 0.5
1 if value >= 0.5
"""

def threshold(vector):
    return (vector>=0.5).astype(float)

def mlp(x, W0, W1, b0, b1, f):
    x0 = f(np.dot(W0, x) + b0)
    x1 = f(np.dot(W1, x0) + b1)
    return x1

# AND(NOT X, Y)
w_andnotxy = np.array([-1.0, 1.0])
# AND(X, NOT Y)
w_andxnoty = np.array([1.0, -1.0])
# W0 weight matrix:
W0 = np.vstack([w_andnotxy, w_andxnoty])

# OR(X,Y)
w_or = np.array([1., 1.])
W1 = w_or

# No biases needed: the cut at 0.5 inside threshold() already acts as the offset
b0 = np.array([0.0,0.0])
b1 = 0.0

print("Input", "Output", "XOR")
xx,yy = generate_xor_data()
for x,y in zip(xx, yy):
    print(x, int(mlp(x, W0, W1, b0, b1, threshold)),"  ", y)

Input Output XOR
(0, 0) 0    0
(0, 1) 1    1
(1, 0) 1    1
(1, 1) 0    0
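The same gate structure also works with the logistic function from the hints if the weights are scaled up and the 0.5 cut-off is moved into the biases. The following is a minimal sketch under those assumptions: the steepness factor `k = 20` and the `_soft` names are chosen for this illustration only, and all four inputs are processed as one batch.

```python
import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

k = 20.0  # steepness factor (assumed value); larger k is closer to a hard step

# Same gates as in the hand-set solution, scaled by k.
# The 0.5 cut-off of the hard threshold becomes a bias of -0.5 * k.
W0_soft = k * np.array([[-1.0, 1.0],    # AND(NOT X, Y)
                        [1.0, -1.0]])   # AND(X, NOT Y)
b0_soft = k * np.array([-0.5, -0.5])
W1_soft = k * np.array([1.0, 1.0])      # OR
b1_soft = k * -0.5

# All four inputs at once, one per row.
X = np.array([(0, 0), (0, 1), (1, 0), (1, 1)], dtype=float)
hidden = logistic(X @ W0_soft.T + b0_soft)   # shape (4, 2)
out = logistic(hidden @ W1_soft + b1_soft)   # shape (4,)

print(np.round(out, 3))
```

Rounding the outputs to the nearest integer reproduces the XOR column of the truth table. Without the biases the hidden units would output 0.5 for a zero input, so the logistic version needs them even though the hard-threshold version does not.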
