# 01 Perceptron

**Adapted from Adrin Jalali's talk**

https://github.com/adrinjalali/2017-05-talk-dl


## Perceptron
* linear binary classifier
* simplest neural network with a single neuron (by [Rosenblatt, 1957](https://blogs.umass.edu/brain-wars/files/2016/03/rosenblatt-1957.pdf))
    * one or more inputs with weights
    * processor with **activation function**
    * single output
* feed-forward model: data flows from the inputs (left) to the output (right), with no cycles

### Artificial Neuron (Perceptron - 1957)
[Source](http://natureofcode.com/book/chapter-10-neural-networks/)

![](http://natureofcode.com/book/imgs/chapter10/ch10_05.png)

$$output = f(x \times w_x + y \times w_y)$$
$$f(z) = \operatorname{sign}(z)$$

![](https://upload.wikimedia.org/wikipedia/commons/thumb/4/4f/Signum_function.svg/200px-Signum_function.svg.png)
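
A minimal sketch of the neuron above, assuming two inputs and hand-picked weights (the weight values are illustrative and not taken from the talk). One assumption to note: `sign(0)` is treated as +1 here so that the output is always one of two classes, whereas the signum function plotted above returns 0 at 0.

```python
def sign(value):
    # Map to +1 or -1; zero is treated as +1 (an assumption of this sketch).
    return 1 if value >= 0 else -1


def neuron(x, y, w_x, w_y):
    # Weighted sum of the two inputs, passed through the activation function.
    return sign(x * w_x + y * w_y)


# Illustrative weights, chosen only for this example.
print(neuron(2.0, 0.5, w_x=0.5, w_y=-1.0))   # weighted sum is 0.5, so 1
print(neuron(1.0, 3.0, w_x=0.5, w_y=-1.0))   # weighted sum is -2.5, so -1
```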

### Classification

![](http://natureofcode.com/book/imgs/chapter10/ch10_04.png)

#### Add bias
![](http://natureofcode.com/book/imgs/chapter10/ch10_06.png)

$$output = f(x \times w_x + y \times w_y + bias \times w_{bias})$$
$$f(z) = \operatorname{sign}(z)$$
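
Why the bias is needed can be sketched in a few lines. Without it, the input (0, 0) produces a weighted sum of 0 whatever the weights are, so the decision line is forced through the origin. The function names and weight values below are illustrative assumptions, and the bias input is fixed at 1.

```python
def sign(value):
    # Zero is treated as +1 (an assumption of this sketch).
    return 1 if value >= 0 else -1


def neuron_no_bias(x, y, w_x, w_y):
    return sign(x * w_x + y * w_y)


def neuron_with_bias(x, y, w_x, w_y, w_bias, bias=1):
    # The bias is a constant extra input with its own weight.
    return sign(x * w_x + y * w_y + bias * w_bias)


# Without a bias, the origin gets the same class for every choice of weights.
origin_outputs = [neuron_no_bias(0, 0, w_x, w_y)
                  for w_x, w_y in [(1.0, 1.0), (-3.0, 2.0), (0.5, -0.5)]]
print(origin_outputs)                                   # [1, 1, 1]

# With a bias weight, the origin can fall on either side of the line.
print(neuron_with_bias(0, 0, 1.0, 1.0, w_bias=-0.5))    # -1
print(neuron_with_bias(0, 0, 1.0, 1.0, w_bias=0.5))     # 1
```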

#### Feed the data
![](http://natureofcode.com/book/imgs/chapter10/ch10_07.png)

### Derive the weight update formula

$$output = sign(x \times w_x + y \times w_y + b \times w_b)$$

$$error = desired - output$$

$$w_{new} = w + \Delta w$$

$$\Delta w = error \times input$$

Scaling the step by a learning rate controls how far each update moves the weights:

$$\Delta w = error \times input \times \text{learning rate}$$
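
A worked single update step, as a sketch of the formulas above. The starting weights, the sample, and the helper names `predict` and `update` are assumptions chosen for illustration. Because outputs are ±1, the error is always 0, +2, or -2.

```python
def sign(value):
    # Zero is treated as +1 (an assumption of this sketch).
    return 1 if value >= 0 else -1


def predict(inputs, weights):
    return sign(sum(x * w for x, w in zip(inputs, weights)))


def update(inputs, weights, desired, learning_rate):
    error = desired - predict(inputs, weights)   # 0, +2 or -2
    # w_new = w + error * input * learning_rate, applied per weight
    return [w + error * x * learning_rate for x, w in zip(inputs, weights)]


inputs = [1, 0, 1]             # [x, y, bias]
weights = [-0.2, 0.4, -0.3]    # illustrative starting weights
desired = 1

print(predict(inputs, weights))            # weighted sum is -0.5, so -1
new_weights = update(inputs, weights, desired, learning_rate=0.5)
print(new_weights)                         # approximately [0.8, 0.4, 0.7]
print(predict(inputs, new_weights))        # weighted sum is now 1.5, so 1
```

Only the weights whose input is non-zero change: the weight for `y` stays at 0.4 because that input is 0.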

### Linear Separability

![](http://natureofcode.com/book/imgs/chapter10/ch10_12.png)
![](http://natureofcode.com/book/imgs/chapter10/ch10_13.png)
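
To make linear separability concrete, the sketch below searches a coarse grid of weights for one that classifies all four points of a logic gate. The grid, the helper name `find_separating_weights`, and the [x, y, bias] encoding are assumptions for illustration. A grid search is evidence rather than a proof, since it only tries a finite set of weights.

```python
from itertools import product


def sign(value):
    # Zero is treated as +1 (an assumption of this sketch).
    return 1 if value >= 0 else -1


# Each point is [x, y, bias], with the bias input fixed at 1.
POINTS = [(0, 0, 1), (1, 0, 1), (0, 1, 1), (1, 1, 1)]
TARGETS = {
    "OR": [-1, 1, 1, 1],
    "AND": [-1, -1, -1, 1],
    "XOR": [-1, 1, 1, -1],
}


def find_separating_weights(targets, grid):
    # Return the first weight triple on the grid that classifies every
    # point correctly, or None if no triple on the grid does.
    for weights in product(grid, repeat=3):
        predictions = [sign(sum(x * w for x, w in zip(point, weights)))
                       for point in POINTS]
        if predictions == targets:
            return weights
    return None


GRID = [i * 0.5 for i in range(-4, 5)]   # -2.0, -1.5, ..., 2.0

for name, targets in TARGETS.items():
    print(name, find_separating_weights(targets, GRID))
```

OR and AND each yield a weight triple, while XOR yields `None`. For XOR this holds for any weights, not just those on the grid: the constraints require $w_b < 0$, $w_x + w_b \ge 0$ and $w_y + w_b \ge 0$, which together give $w_x + w_y + w_b \ge -w_b > 0$ and contradict the requirement $w_x + w_y + w_b < 0$ for the point (1, 1).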

## Classification

### Logical OR

#### Initialize variables

In [1]:
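import math

import numpy as np


def sign(x):
    # copysign(1, x) returns +1.0 or -1.0; note that sign(0) is +1 here
    return math.copysign(1, x)


def f(X, W):
    # Perceptron output: sign of the weighted sum of the inputs
    return sign(sum(x * w for x, w in zip(X, W)))


# Each row is [x, y, bias]; the bias input is always 1
data = np.array([[0, 0, 1],
                 [1, 0, 1],
                 [0, 1, 1],
                 [1, 1, 1]])
# Desired outputs for logical OR, encoded as -1 (false) and 1 (true)
output = np.array([-1, 1, 1, 1])

# Random initial weights drawn from a standard normal distribution
W = np.random.normal(0, size=3)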

#### Error

In [2]:
for i in range(len(data)):
    predicted = f(data[i], W)
    print('predicted: %2d, desired: %2d, error: %2d' %
          (predicted, output[i], output[i] - predicted))

predicted: -1, desired: -1, error:  0
predicted:  1, desired:  1, error:  0
predicted:  1, desired:  1, error:  0
predicted:  1, desired:  1, error:  0


In [3]:
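def total_error(X, W, y):
    # Sum of absolute errors over all samples.
    # Outputs are +1 or -1, so each misclassified sample contributes 2.
    return sum(abs(y[i] - f(X[i], W)) for i in range(len(X)))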

#### Update Weights

In [4]:
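def new_w(X, W, y, learning_rate):
    # X is a single sample [x, y, bias]; y is its desired output
    output = f(X, W)
    error = y - output
    # delta_w = error * input * learning rate, one entry per weight
    delta_w = np.array([error * x * learning_rate for x in X])
    return W + delta_w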

#### Train

In [5]:
learning_rate = 0.5
print('starting error: %2g' % total_error(data, W, output))

for epoch in range(100):
    for i in range(len(data)):
        W = new_w(data[i,:], W, output[i], learning_rate)
        
    print('error: %2g, W: %s' % (total_error(data, W, output), str(W)))
    if total_error(data, W, output) == 0:
        break

starting error:  0
error:  0, W: [ 2.38564177  0.88884477 -0.2028783 ]


### Logical AND

#### Initialize variables

In [6]:
data = np.array([[0, 0, 1],
                 [1, 0, 1],
                 [0, 1, 1],
                 [1, 1, 1]])
output = np.array([-1, -1, -1, 1])

W = np.random.normal(0, size=3)

#### Train

In [7]:
learning_rate = 0.5
print('starting error: %2g' % total_error(data, W, output))

for epoch in range(100):
    for i in range(len(data)):
        W = new_w(data[i,:], W, output[i], learning_rate)
        
    print('error: %2g, W: %s' % (total_error(data, W, output), str(W)))
    if total_error(data, W, output) == 0:
        break

starting error:  6
error:  4, W: [ 1.68591822 -0.4002626   0.10451537]
error:  2, W: [ 1.68591822  0.5997374  -0.89548463]
error:  4, W: [ 1.68591822  1.5997374  -0.89548463]
error:  0, W: [ 0.68591822  1.5997374  -1.89548463]


### Logical XOR

#### Initialize variables

In [8]:
data = np.array([[0, 0, 1],
                 [1, 0, 1],
                 [0, 1, 1],
                 [1, 1, 1]])
output = np.array([-1, 1, 1, -1])

W = np.random.normal(0, size=3)

#### Train

In [9]:
learning_rate = 0.1
print('starting error: %2g' % total_error(data, W, output))

for epoch in range(100):
    for i in range(len(data)):
        W = new_w(data[i,:], W, output[i], learning_rate)
        
    print('error: %2g, W: %s' % (total_error(data, W, output), str(W)))
    if total_error(data, W, output) == 0:
        break

starting error:  2
error:  2, W: [-0.31711116  1.42594613 -1.369042  ]
error:  4, W: [-0.31711116  1.22594613 -1.369042  ]
error:  2, W: [-0.11711116  1.22594613 -1.169042  ]
error:  4, W: [-0.11711116  1.02594613 -1.169042  ]
error:  4, W: [-0.11711116  0.82594613 -1.169042  ]
error:  4, W: [-0.11711116  0.82594613 -0.969042  ]
error:  4, W: [-0.11711116  0.62594613 -0.969042  ]
error:  4, W: [-0.11711116  0.62594613 -0.769042  ]
error:  4, W: [-0.11711116  0.42594613 -0.769042  ]
error:  4, W: [-0.11711116  0.42594613 -0.569042  ]
error:  4, W: [-0.11711116  0.22594613 -0.569042  ]
error:  4, W: [-0.11711116  0.22594613 -0.369042  ]
error:  4, W: [-0.11711116  0.02594613 -0.369042  ]
error:  4, W: [-0.11711116  0.02594613 -0.169042  ]
error:  4, W: [-0.11711116 -0.17405387 -0.169042  ]
error:  6, W: [-0.11711116 -0.17405387  0.030958  ]
error:  6, W: [-0.11711116 -0.17405387  0.030958  ]
error:  6, W: [-0.11711116 -0.17405387  0.030958  ]
error:  6, W: [-0.11711116 -0.17405387  0.030

#### Train (observe more closely)

In [10]:
learning_rate = 0.1
print('starting error: %2g' % total_error(data, W, output))

for epoch in range(100):
    for i in range(len(data)):
        W = new_w(data[i,:], W, output[i], learning_rate)
        
        print('error: %2g, W: %s' % (total_error(data, W, output), str(W)))
    if total_error(data, W, output) == 0:
        break

starting error:  6
error:  4, W: [-0.11711116 -0.17405387 -0.169042  ]
error:  4, W: [ 0.08288884 -0.17405387  0.030958  ]
error:  4, W: [0.08288884 0.02594613 0.230958  ]
error:  6, W: [-0.11711116 -0.17405387  0.030958  ]
error:  4, W: [-0.11711116 -0.17405387 -0.169042  ]
error:  4, W: [ 0.08288884 -0.17405387  0.030958  ]
error:  4, W: [0.08288884 0.02594613 0.230958  ]
error:  6, W: [-0.11711116 -0.17405387  0.030958  ]
error:  4, W: [-0.11711116 -0.17405387 -0.169042  ]
error:  4, W: [ 0.08288884 -0.17405387  0.030958  ]
error:  4, W: [0.08288884 0.02594613 0.230958  ]
error:  6, W: [-0.11711116 -0.17405387  0.030958  ]
error:  4, W: [-0.11711116 -0.17405387 -0.169042  ]
error:  4, W: [ 0.08288884 -0.17405387  0.030958  ]
error:  4, W: [0.08288884 0.02594613 0.230958  ]
error:  6, W: [-0.11711116 -0.17405387  0.030958  ]
error:  4, W: [-0.11711116 -0.17405387 -0.169042  ]
error:  4, W: [ 0.08288884 -0.17405387  0.030958  ]
error:  4, W: [0.08288884 0.02594613 0.230958  ]
error:  

#### Multilayer perceptron
![](http://natureofcode.com/book/imgs/chapter10/ch10_14.png)
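
A sketch of how a second layer gets around the XOR problem. The weights below are hand-picked for illustration and not trained (the update rule above does not train hidden layers): one hidden neuron computes OR, another computes NAND, and the output neuron computes AND of the two, which equals XOR. All names here are assumptions for this example.

```python
def sign(value):
    # Zero is treated as +1 (an assumption of this sketch).
    return 1 if value >= 0 else -1


def neuron(inputs, weights):
    return sign(sum(x * w for x, w in zip(inputs, weights)))


# Hand-picked weights in [w_1, w_2, w_bias] order (illustrative, not learned).
W_OR = [1.0, 1.0, -0.5]      # -1 only for the input (0, 0)
W_NAND = [-1.0, -1.0, 1.5]   # -1 only for the input (1, 1)
W_AND = [1.0, 1.0, -1.5]     # +1 only when both hidden outputs are +1


def xor_network(x, y):
    # Hidden layer: two perceptrons looking at the same inputs.
    hidden_or = neuron([x, y, 1], W_OR)
    hidden_nand = neuron([x, y, 1], W_NAND)
    # Output layer: a perceptron fed with the hidden outputs plus a bias.
    return neuron([hidden_or, hidden_nand, 1], W_AND)


for x, y in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print(x, y, xor_network(x, y))
```

The four inputs map to -1, 1, 1, -1, matching the XOR targets used above.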