# Introduction to Deep Learning with Python

## Chapter 1: Introduction

### 1.1 Rosenblatt's Perceptron

The **perceptron** was one of the first types of artificial neuron, introduced by [Frank Rosenblatt][1] in the late 1950s. Its design was inspired by the McCulloch-Pitts model of a neuron. Although perceptrons have since been replaced by other types of neurons, their basic design lives on in modern neural networks.

A perceptron can learn a *linearly separable* classification task. It takes **inputs** ${[x_1, x_2, ..., x_n]}$ and computes a binary output $y_i$. The **weights** ${[w_1, w_2, ..., w_n]}$ express the importance of the respective inputs to the output. The output is computed from a weighted sum of the inputs:

$$y_i = \sum_{i}w_i x_i$$
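
The weighted sum can be sketched in a few lines of NumPy. The input and weight values below are made up purely for illustration; the snippet only shows that an explicit loop over the terms and a dot product compute the same quantity.

```python
import numpy as np

# Hypothetical example values, chosen only for illustration
x = np.array([1.0, 0.0, 1.0])   # inputs x_1, x_2, x_3
w = np.array([0.5, -0.2, 0.3])  # weights w_1, w_2, w_3

# Weighted sum written out term by term
weighted_sum_loop = sum(w_i * x_i for w_i, x_i in zip(w, x))

# The same quantity as a dot product
weighted_sum_dot = np.dot(w, x)

print(weighted_sum_loop, weighted_sum_dot)
```

Both values are approximately 0.8 here, because only the first and third inputs are active.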

[1]: http://www.ling.upenn.edu/courses/cogs501/Rosenblatt1958.pdf

![Perceptron](00_ressources/img/chapter_1/perceptron.png)

In order to ensure that $y_i$ is a binary outcome, the perceptron applies a *step function* with a learned *threshold*, usually expressed through a **bias** term $b$:

$$y_i = 
\begin{cases}
    0 &\text{if $w \cdot x+b < 0$}\cr
    1 &\text{if $w \cdot x+b \geq 0$}
\end{cases}$$

where $w \cdot x \equiv \sum_{i}w_i x_i$ is the *dot product* of $w$ and $x$, and $b$ is the bias (the negative of the threshold).




![Stepfunction](00_ressources/img/chapter_1/step_function.png)
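
As a small sketch of the step function in action, the snippet below evaluates a perceptron with hand-picked parameters on the four inputs of a logical OR. The names `step` and `perceptron_output` and the values of `w` and `b` are assumptions made for this illustration, not learned parameters. None of the inputs lands exactly on the decision boundary, so the result does not depend on how the case $w \cdot x + b = 0$ is handled.

```python
import numpy as np

def step(z):
    """Return 1 if z >= 0, otherwise 0."""
    return 1 if z >= 0 else 0

def perceptron_output(x, w, b):
    """Apply the step function to the weighted sum plus bias."""
    return step(np.dot(w, x) + b)

# Hand-picked (hypothetical) parameters that implement logical OR
w = np.array([1.0, 1.0])
b = -0.5

inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
outputs = [perceptron_output(x, w, b) for x in inputs]
print(outputs)  # [0, 1, 1, 1]
```

Only the input `[0, 0]` yields a negative value ($-0.5$) and is mapped to 0; the other three inputs give $0.5$ or $1.5$ and are mapped to 1.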

Finally, the perceptron learns by iteratively updating the weight vector $w$ in the following way:

$$\dot{w} \leftarrow w + \nu \cdot (y_i - \hat{y_i}) \cdot x_i $$

where $\dot{w}$ is the new weight vector, $\nu$ is the *learning rate*, $(y_i - \hat{y_i})$ is the error in the current iteration, and $x_i$ is the current input.
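
To make the update rule concrete, here is a minimal sketch of a single update step. The helper `perceptron_update` and all numeric values are hypothetical and chosen so that the sample is misclassified at first; the last input entry is a constant 1, so the last weight acts as the bias.

```python
import numpy as np

def unit_step(z):
    """Return 1 if z >= 0, otherwise 0."""
    return 1 if z >= 0 else 0

def perceptron_update(w, x, y, lr):
    """Return the updated weights w + lr * (y - y_hat) * x and the prediction y_hat."""
    y_hat = unit_step(np.dot(w, x))
    return w + lr * (y - y_hat) * x, y_hat

# Hypothetical values for illustration
w = np.array([0.2, -0.4, -0.1])  # current weights (last entry acts as bias)
x = np.array([0.0, 1.0, 1.0])    # current input (last entry is the constant 1)
y = 1                            # true label
lr = 0.5                         # learning rate

# First call: w . x = -0.5, so the prediction is 0 and the error is +1
w_new, y_hat = perceptron_update(w, x, y, lr)
print(y_hat, w_new)

# Second call on the same sample: the prediction is now correct, so nothing changes
w_again, y_hat_again = perceptron_update(w_new, x, y, lr)
print(y_hat_again, w_again)
```

The first call moves the weights to approximately `[0.2, 0.1, 0.4]`. Because the error is zero for a correctly classified sample, the second call leaves them untouched, which is why the perceptron stops changing once every training sample is classified correctly.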

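```python
# Coding Rosenblatt's Perceptron from scratch
# -------------------------------------------
import numpy as np
import random

# Seeds Python's `random` module only (used for the row sampling below).
# NumPy's generator is not seeded, so the initial weights differ between runs.
random.seed(1)

# Step function: 0 for negative activations, 1 otherwise
def unit_step(x):
    return 0 if x < 0 else 1

# Data: two binary inputs plus a constant 1 that acts as the bias input
X = np.array([[0, 0, 1],
              [0, 1, 1],
              [1, 0, 1],
              [1, 1, 1]])
# Labels: logical OR of the first two columns
y = np.array([0, 1, 1, 1])

w = np.random.rand(3)  # Weights (the last entry plays the role of the bias)
errors = []            # Error of each iteration
eta = 0.2              # Learning rate
n = 100                # Number of training iterations (one sample each)

# Training
for i in range(n):
    # Pick a random row index
    index = random.randint(0, 3)
    # Select a single sample (online learning)
    x_batch = X[index, :]
    y_batch = y[index]
    # Calculate activation
    y_hat = unit_step(np.dot(w, x_batch))
    # Calculate error
    error = y_batch - y_hat
    errors.append(error)
    # Update weights
    w += eta * error * x_batch

# Prediction: raw activation -> predicted class | true label
for index, x in enumerate(X):
    y_hat = np.dot(x, w)
    print("{}: {} -> {} | {}".format(index, round(y_hat, 3), unit_step(y_hat), y[index]))
```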


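One run printed the following (row index: raw activation -> predicted class | true label). The exact activations vary between runs because the initial weights are drawn at random:

```
0: -0.108 -> 0 | 0
1: 0.841 -> 1 | 1
2: 0.861 -> 1 | 1
3: 1.81 -> 1 | 1
```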
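
The same training procedure can also be written as a reusable function with a stopping criterion. The following is a sketch under a few assumptions that differ from the code above: the function name `train_perceptron` and the parameter `max_epochs` are hypothetical, the weights start at zero instead of random values, and the samples are visited in a fixed order once per epoch rather than drawn at random. This makes the run deterministic.

```python
import numpy as np

def unit_step(z):
    """Return 0 for negative activations, 1 otherwise."""
    return 0 if z < 0 else 1

def train_perceptron(X, y, lr=0.2, max_epochs=100):
    """Train until one full pass over the data produces no errors.

    Returns the weight vector and the number of epochs that were run.
    """
    w = np.zeros(X.shape[1])  # assumption: zero initialization
    for epoch in range(1, max_epochs + 1):
        n_errors = 0
        for x_i, y_i in zip(X, y):
            error = y_i - unit_step(np.dot(w, x_i))
            if error != 0:
                w += lr * error * x_i
                n_errors += 1
        if n_errors == 0:
            return w, epoch
    return w, max_epochs

# Same OR data as above; the constant third column acts as the bias input
X = np.array([[0, 0, 1],
              [0, 1, 1],
              [1, 0, 1],
              [1, 1, 1]])
y = np.array([0, 1, 1, 1])

w, epochs_run = train_perceptron(X, y)
predictions = [unit_step(np.dot(w, x_i)) for x_i in X]
print(predictions, epochs_run)
```

Stopping after an error-free epoch works here because the OR task is linearly separable; for data that is not linearly separable the loop would simply run until `max_epochs` is reached.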
