# Deep Learning Fast Start

### Feed Forward

Alfred Essa, Shirin Mojarad



# Forward Propagation

**Learning Objective 1**: State the definition of a *feedforward* network

**Learning Objective 2**: Describe the data structures, in terms of *matrix algebra*, for representing a feedforward network 

**Learning Objective 3**: Trace the end-to-end *computation* (forward propagation) of a feedforward network

**Learning Objective 4**: Write a Python class to compute the forward propagation steps resulting in the final output

# Feedforward Networks

<img src="images/feedforward1.png" width="75%" height="75%" />

> Feedforward networks are the quintessential neural network. They are referred to as feedforward because information flows sequentially through each layer without feedback or loops.

# Forward Propagation Computation

<img src="images/feedforward2.png" width="75%" height="75%" />

- A feedforward neural network composes a series of functions, one per layer. For a three-layer network: $f(x) = f^{(3)}(f^{(2)}(f^{(1)}(x)))$

- $f^{(1)}$ is the first layer of the network, $f^{(2)}$ is the second layer, and so on
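
The composition above can be sketched in a few lines of Python. The three layer functions `f1`, `f2`, `f3` and the helper `compose` are hypothetical, chosen only to make the arithmetic easy to follow; real layers would apply weights, biases, and an activation.

```python
import numpy as np

def compose(*layers):
    """Return a function that applies the given layers in order."""
    def network(x):
        for layer in layers:
            x = layer(x)
        return x
    return network

# Hypothetical stand-ins for f^(1), f^(2), f^(3)
f1 = lambda x: 2.0 * x
f2 = lambda x: x + 1.0
f3 = lambda x: np.maximum(0.0, x)

f = compose(f1, f2, f3)
x = np.array([-3.0, 0.5])

result = f(x)
print(result)  # identical to f3(f2(f1(x)))
```

The output of each layer is the only thing passed to the next one, which is what "without feedback or loops" means in practice.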

### Network Class

In [1]:
import numpy as np

In [2]:
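def relu(z):
    # Rectified linear unit: element-wise max(0, z)
    return np.maximum(0, z)

def sigmoid(z):
    # Logistic function: maps any real z into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def heaviside(z):
    # Step function: 0 where z < 0, 1 otherwise.
    # np.where keeps this element-wise, so it also accepts arrays.
    return np.where(np.asarray(z) < 0, 0, 1)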

In [3]:
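class Network(object):
    def __init__(self, sizes):
        # sizes[i] is the number of neurons in layer i; sizes[0] is the input size
        self.num_layers = len(sizes)
        self.sizes = sizes
        # One bias column vector of shape (y, 1) per non-input layer
        self.biases = [np.random.randn(y, 1) for y in sizes[1:]]
        # One weight matrix of shape (y, x) per pair of adjacent layers:
        # y neurons in the layer, x inputs feeding it
        self.weights = [np.random.randn(y, x)
                        for x, y in zip(sizes[:-1], sizes[1:])]

    def feedforward(self, a, phi):
        # Layer by layer: z = w.a + b, then a = phi(z)
        for b, w in zip(self.biases, self.weights):
            z = np.dot(w, a) + b
            a = phi(z)
        # Show the last layer's pre-activation and activation
        print(z)
        print(a)
        return a

    def set_biases(self, newbiases):
        # Replaces every layer's biases with one array (single-layer use only)
        self.biases = [np.array(newbiases)]

    def set_weights(self, newweights):
        # Wraps one neuron's weight list as a (1, n) matrix (single-neuron use only)
        self.weights = [np.array([newweights])]

    def show_parameters(self):
        print("biases: =", self.biases)
        print("weights: =", self.weights)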
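
To see what the constructor builds for a deeper network, the sketch below repeats its two list comprehensions outside the class. The layer sizes `[3, 4, 2]` are an assumption made for illustration; they do not appear in the examples that follow.

```python
import numpy as np

sizes = [3, 4, 2]  # hypothetical: 3 inputs, 4 hidden neurons, 2 outputs

# Same comprehensions as Network.__init__
biases = [np.random.randn(y, 1) for y in sizes[1:]]
weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])]

for w, b in zip(weights, biases):
    print(w.shape, b.shape)
# (4, 3) (4, 1)
# (2, 4) (2, 1)

# One forward pass with a (3, 1) column vector and a sigmoid activation
a = np.ones((3, 1))
for b, w in zip(biases, weights):
    a = 1.0 / (1.0 + np.exp(-(np.dot(w, a) + b)))
print(a.shape)  # (2, 1)
```

Each weight matrix has one row per neuron in its layer and one column per input to that layer, so the matrix products chain without any transposes.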
        

### Neuron 1: Training Example 1,2

In [4]:
x1 = [2.3,4.5,1.3]
x2 = [1.3,2.5,4.3]

In [20]:
N1 = Network([3,2])

In [21]:
N1.show_parameters()

biases: = [array([[ 0.55539814],
       [ 1.43197763]])]
weights: = [array([[-0.88430825,  0.35230626, -1.42274356],
       [ 1.20126055, -1.19043282,  0.91832867]])]


In [22]:
N1.set_biases([[-1]])

In [23]:
N1.set_weights([3.2,-1.9,2.5])

In [24]:
N1.show_parameters()

biases: = [array([[-1]])]
weights: = [array([[ 3.2, -1.9,  2.5]])]


In [11]:
N1.feedforward(x1,sigmoid)

[[ 1.06]]
[[ 0.74269055]]


array([[ 0.74269055]])

In [12]:
N1.feedforward(x2,sigmoid)

[[ 9.16]]
[[ 0.99989485]]


array([[ 0.99989485]])

In [13]:
X = [[2.3,1.3],[4.5,2.5],[1.3,4.3]]


In [14]:
N1.feedforward(X,sigmoid)

[[ 1.06  9.16]]
[[ 0.74269055  0.99989485]]


array([[ 0.74269055,  0.99989485]])
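
The batched result can be checked without the class. This sketch reuses the weights, bias, and inputs from the cells above and computes both training examples in a single matrix product; only the variable names `w`, `b`, `z`, and `a` are introduced here.

```python
import numpy as np

w = np.array([[3.2, -1.9, 2.5]])   # weights passed to set_weights
b = np.array([[-1.0]])             # bias passed to set_biases
X = np.array([[2.3, 1.3],          # each column is one training example
              [4.5, 2.5],
              [1.3, 4.3]])

z = w @ X + b                      # shape (1, 2): one pre-activation per example
a = 1.0 / (1.0 + np.exp(-z))       # sigmoid applied element-wise

print(z)
print(a)
```

For the first example the arithmetic is $3.2 \cdot 2.3 - 1.9 \cdot 4.5 + 2.5 \cdot 1.3 - 1 = 1.06$, which matches the first entry printed above.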

### Neuron 2: Training Example 1,2

In [15]:
N2 = Network([3,1])

In [16]:
N2.set_weights([1.1,-1.5,-1.2])

In [17]:
N2.set_biases(11)

In [18]:
N2.feedforward(x1,sigmoid)

[ 5.22]
[ 0.99462175]


array([ 0.99462175])

In [19]:
N2.feedforward(x2,sigmoid)

[ 3.52]
[ 0.9712515]


array([ 0.9712515])
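
The outputs for Neuron 2 are one-dimensional (`[ 5.22]`), while those for Neuron 1 are two-dimensional (`[[ 1.06]]`). This appears to follow from the bias shape: `set_biases(11)` stores a scalar, whereas `set_biases([[-1]])` stores a `(1, 1)` array that broadcasts the result up to two dimensions. A minimal sketch of the difference, using the Neuron 2 weights:

```python
import numpy as np

w = np.array([[1.1, -1.5, -1.2]])   # shape (1, 3), as built by set_weights
x1 = [2.3, 4.5, 1.3]

wx = np.dot(w, x1)                  # shape (1,)

z_scalar_bias = wx + np.array(11)       # scalar bias, as in set_biases(11)
z_column_bias = wx + np.array([[11]])   # (1, 1) bias, as in set_biases([[11]])

print(z_scalar_bias.shape)  # (1,)
print(z_column_bias.shape)  # (1, 1)
```

Both versions hold the same value, 5.22; only the shape differs. Passing biases as a nested list keeps the shapes consistent with what the constructor creates.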

### Data Representation

We represent the input $\mathbf{x}$ as a column vector. $\mathbf{x} =\begin{bmatrix}
    x_1\\
    x_2\\
    \vdots \\
    x_n
\end{bmatrix}$

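We represent each layer's weights as a matrix with one row per neuron and one column per input. A layer with $m$ neurons and $n$ inputs has an $m \times n$ weight matrix. $$\mathbf{W}^\top =  \begin{bmatrix} { w }_{ 11 } & { w }_{ 12 } & \cdots & { w }_{ 1n }\\ { w }_{ 21 } & { w }_{ 22 } & \cdots & { w }_{ 2n } \\ \vdots & \vdots & \ddots & \vdots \\ { w }_{ m1 } & { w }_{ m2 } & \cdots & { w }_{ mn } \end{bmatrix}$$

We represent the biases $\mathbf{b}$ as a column vector with one entry per neuron. $\mathbf{b} =\begin{bmatrix}
    b_1\\
    b_2\\
    \vdots \\
    b_m
\end{bmatrix}$

The pre-activation $\mathbf{z}$ is the product of the weight matrix and the input, plus the biases.

$$ \mathbf{z} = \mathbf{W}^\top \mathbf{x} + \mathbf{b} = \begin{bmatrix} { w }_{ 11 } & { w }_{ 12 } & \cdots & { w }_{ 1n }\\ { w }_{ 21 } & { w }_{ 22 } & \cdots & { w }_{ 2n } \\ \vdots & \vdots & \ddots & \vdots \\ { w }_{ m1 } & { w }_{ m2 } & \cdots & { w }_{ mn } \end{bmatrix} \begin{bmatrix}
    x_1\\
    x_2\\
    \vdots \\
    x_n
\end{bmatrix} + \begin{bmatrix}
    b_1\\
    b_2\\
    \vdots \\
    b_m
\end{bmatrix}$$

The layer's output is the activation function $\phi$ applied element-wise to $\mathbf{z}$.

$$ \mathbf{a} = \phi (\mathbf{W}^\top \mathbf{x} + \mathbf{b}) $$

When several inputs are stacked as the columns of a matrix $\mathbf{X}$, the same formula applies, with $\mathbf{b}$ added to every column.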
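
As a rough check on the shapes involved, the sketch below evaluates one layer in matrix form and compares it with computing each input column separately. The sizes, the random values, and the names `W_T`, `A`, and `A_loop` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

n_inputs, n_neurons, n_examples = 3, 2, 4   # hypothetical sizes
W_T = rng.standard_normal((n_neurons, n_inputs))   # one row per neuron
b = rng.standard_normal((n_neurons, 1))            # one bias per neuron
X = rng.standard_normal((n_inputs, n_examples))    # one column per example

def relu(z):
    return np.maximum(0, z)

# All examples at once; b is broadcast across the columns of X
A = relu(W_T @ X + b)

# The same computation, one column vector at a time
columns = [relu(W_T @ X[:, [j]] + b) for j in range(n_examples)]
A_loop = np.hstack(columns)

print(A.shape)                  # (2, 4)
print(np.allclose(A, A_loop))   # True
```

The batched form gives one output column per input column, which is why a whole set of training examples can be pushed through the network in a single call.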