# Introduction to Feedforward Neural Networks

A neural network is a computational model that has slightly different meanings in different communities.

* **Cognitive science view**: a computational model of the brain consisting of artificial neurons (perceptrons)

* **Machine Learning view**: 
   * **Linear algebra view**: a network of perceptron-like nodes, i.e., a set of matrix multiplication operations
   * **Graph theory view**: a computational graph model (with automatic differentiation)


As the name suggests, a neural network is a network: a model that is built up from basic building blocks. Let's first look at one such building block, a single perceptron.

<img src="pics/lego.jpg" width=300>

## From biological neurons to artificial neural networks

To get started, I will first introduce one type of artificial neuron, the **perceptron**. It was introduced together with the well-known perceptron algorithm by Rosenblatt (1957), inspired by earlier work by McCulloch and Pitts on modeling neurons in the brain. In layman's terms, a neuron receives information through its dendrites, and if enough information accumulates, the neuron 'fires' and sends information down the axon:

<img src="pics/neuron.jpg" width="350" style="float: left"><img src="pics/neuron-simple.png" width="350">

### How does the perceptron work?
The basic perceptron takes **inputs** $x_1, \ldots, x_n$ and produces an **output** $y$. It does so by **weighting** the inputs by $w_1, \ldots, w_n$, summing the weighted inputs, and sending this weighted sum through an **activation function** $\sigma$ to see if the neuron "fires". That is, if the weighted sum is above some **threshold**, it outputs 1, otherwise 0.

Mathematically, the perceptron is formulated as: 

$y = \sigma\left(\sum_{j=1}^n w_{kj} x_j\right)$

We can visualize the perceptron as (for a given perceptron node $k$): <img src="pics/perceptron.png" width=400> 
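
The computation described above can be sketched in a few lines of plain Python. This is only a minimal illustration: the names `weighted_sum`, `perceptron_output` and `fires_above_half`, the example weights, and the threshold of 0.5 are assumptions made for this sketch and do not come from the text.

```python
def weighted_sum(weights, inputs):
    """Return sum_j w_j * x_j for two equally long sequences."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    return sum(w * x for w, x in zip(weights, inputs))


def perceptron_output(weights, inputs, activation):
    """Send the weighted sum of the inputs through the activation function."""
    return activation(weighted_sum(weights, inputs))


def fires_above_half(z):
    # Hypothetical activation: output 1 if z exceeds an arbitrary threshold of 0.5
    return 1 if z > 0.5 else 0


print(perceptron_output([0.4, 0.3], [1, 1], fires_above_half))  # weighted sum above 0.5
print(perceptron_output([0.4, 0.3], [1, 0], fires_above_half))  # weighted sum below 0.5
```

Passing the activation in as a function keeps the weighted sum separate from the decision of whether the neuron "fires", which mirrors the two steps in the formula.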

## What is $\sigma$?

In the perceptron, $\sigma$ is a **threshold** function. Intuitively, the perceptron only fires if the weighted sum is above some threshold. We can formalize this intuition as:


$$\begin{equation}
    y=
    \begin{cases}
      1 & \text{if } \sum_j w_j x_j > \text{threshold}\\
      0 & \text{otherwise}\\
    \end{cases}
  \end{equation}$$
  
Let's rewrite the equation of the perceptron. First, notice that $\sum_{j=1}^n w_j x_j$ is the **dot product** of the weights and the inputs, and can be written as:

$$\sum_{j=1}^n w_j x_j = \vec{w} \cdot \vec{x}$$

where $\vec{w}$ and $\vec{x}$ are now vectors. If it is clear from context, we drop the explicit vector notation and simply write $w \cdot x$. Second, we move the threshold inside the equation by introducing the bias term $b = -\text{threshold}$. With these two changes, the equation becomes:

$$\begin{equation}
    y=
    \begin{cases}
      1 & \text{if } w \cdot x + b > 0\\
      0 & \text{otherwise}\\
    \end{cases}
  \end{equation}$$
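
As a quick sanity check, the threshold form and the bias form can be compared on a few inputs. The following is a minimal sketch: the function names, the weights `[2, -1]`, and the threshold `1` are illustrative assumptions; only the relation $b = -\text{threshold}$ comes from the text.

```python
def dot(w, x):
    """Dot product of two equally long sequences."""
    return sum(w_j * x_j for w_j, x_j in zip(w, x))


def fires_with_threshold(w, x, threshold):
    # Original form: fire if the weighted sum exceeds the threshold
    return 1 if dot(w, x) > threshold else 0


def fires_with_bias(w, x, b):
    # Rewritten form: fire if w . x + b is positive
    return 1 if dot(w, x) + b > 0 else 0


w = [2, -1]      # illustrative weights (assumption)
threshold = 1    # illustrative threshold (assumption)
b = -threshold   # bias term as defined in the text

for x in [[0, 0], [0, 1], [1, 0], [1, 1]]:
    # Both formulations must agree on every input
    assert fires_with_threshold(w, x, threshold) == fires_with_bias(w, x, b)
    print(x, fires_with_bias(w, x, b))
```

With these particular values only the input `[1, 0]` makes the perceptron fire, and both formulations agree on all four inputs.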

### Example
Suppose we have a perceptron with two inputs, weights -2 and -2, and bias term 3. This is illustrated as: <img src="pics/and.png">
What function does this simple perceptron compute?

In [11]:
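def compute(input_array):
    # Weighted sum of the two inputs (weights -2 and -2) plus the bias term 3
    a = input_array[0] * -2 + input_array[1] * -2 + 1 * 3
    # Threshold activation: the perceptron fires only if the sum is positive
    if a > 0:
        return 1
    else:
        return 0


i = [0, 0]
print(compute(i))
print(compute([0, 1]))
print(compute([1, 1]))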

1
1
0
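
To see the whole function at once, one option is to enumerate all four binary input pairs. The sketch below re-implements the example perceptron under the hypothetical name `example_perceptron` so that it is self-contained; the weights -2, -2 and the bias 3 are taken from the example.

```python
from itertools import product


def example_perceptron(x1, x2):
    # Weights -2 and -2 and bias 3, as in the example
    z = -2 * x1 + -2 * x2 + 3
    return 1 if z > 0 else 0


# Enumerate every combination of binary inputs and print one row per pair
for x1, x2 in product([0, 1], repeat=2):
    print(x1, x2, example_perceptron(x1, x2))
```

The output is 0 only when both inputs are 1, which matches the truth table of NAND.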


In [None]:
i = [0, 1]
y = compute(i)

* threshold
* sigma
* network, activation functions

### References
* More details in [Michael Nielsen's book chapter 1](http://neuralnetworksanddeeplearning.com/chap1.html)