In [1]:
# default_exp forward
# default_cls_lvl 2

# Introduction to Neural Prediction: Forward Propagation
>

In this Chapter:
- A Simple Network Making a Prediction.
- What is a Neural Network, and What Does It Do?
- Making a Prediction with Multiple Inputs.
- Making a Prediction with Multiple Outputs.
- Making a Prediction with Multiple Inputs and Outputs.
- Predicting on Predictions.

*"I try not to get involved in the business of prediction. It's a quick way to look like an idiot."*

## Predict
### This Chapter is about prediction

- You learned about the paradigm "Predict, Compare, Learn." In this chapter, we'll dive deep into the first step: "**Predict**".
- In your first Neural Network, you're going to predict one data point at a time, like so:
<div style="text-align:center;"><img style="width:600px;" src="static/imgs/03/one-to-one.png" /></div>

- The number of data points you process has a significant impact on what a network looks like.
- "How do I choose the number of data points I propagate at a time?"
    - The answer depends on whether you think the neural network can be accurate with the data you give it.
    - If I'm trying to predict the presence of a cat in a picture, I definitely want to send all the pixels of the image at once.
        - Can you predict cat presence from one pixel? Me neither!
- You can create a network only after you understand the `shape` of the input & output data sets.

- You're going to build a network with a single knob mapping from the input point to the output.
    - These knobs are actually called "weights".
- Here's your first network, with a single weight mapping from the input "# toes" to the output "win?":
<div style="text-align:center;"><img style="width:600px;" src="static/imgs/03/first-network.png" /></div>

## A Simple Neural Network Making a prediction
### Let's Start with the simplest neural network possible

In [1]:
# an empty network.
weight = .1
def neural_network(input, weight):
    prediction = input * weight
    return prediction

In [2]:
# inserting one input datapoint.
number_of_toes = [8.5, 9.5, 10, 9]
input = number_of_toes[0]
pred = neural_network(input, weight)
print(pred)

0.8500000000000001


- So What is a Neural Network ?
    - For Now, It's one or more weights that you can multiply by the input data to make a prediction.
- What is Input Data ?
    - It's a number that you recorded in the real world somewhere.
- What is a Prediction ?
    - A Prediction is what the neural network tells you, given the input data.
- Is this Prediction always right ?
    - No. Sometimes the neural network will make mistakes, but it can learn from them.
- How does the network learn ?
    - **Trial and Error**.
    - First, it tries to make a prediction.
    - Then, it sees whether the prediction was too high or too low.
    - Finally, it changes the weight (up or down) to predict more accurately the next time it sees the same input.

## What Does this neural network do ?
### It multiplies the Input by a Weight. It "Scales" the Input by a Certain Amount

- A neural network, in its simplest form, uses the power of *multiplication*.
- Some Weight Values make the Input Bigger, & other values make it Smaller.
- **A Neural Network Accepts an Input Value as Information and a Weight Value as Knowledge and Outputs a Prediction**.
- A NN uses the knowledge stored in its weights to interpret the Information in the Input.
- In the Previous Example, The **NN Doesn't have access to any information except one instance**.
    - If, After this Prediction, you were to feed in `number_of_toes[1]`, the NN wouldn't remember the prediction it made in the last timestep.
    - A NN knows only what you feed it as input; it forgets everything else.
- Later, you'll learn how to give a NN a **"short-term memory" by feeding in multiple inputs at once**.
- The weights can be interpreted as a measure of sensitivity between the input and the prediction.
    - Weight is a Volume Knob.
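
A minimal sketch of the "volume knob" idea, reusing the single-weight `neural_network` from above. The weight values in the loop are arbitrary illustrations (assumptions, not values from the text). The last lines show that the network keeps no memory between calls:

```python
def neural_network(input, weight):
    # a single-weight network: the prediction is the input scaled by the weight.
    return input * weight

toes = 8.5
for weight in [0.1, 1.0, 2.0, -1.0]:  # illustrative weights (assumed values)
    print(weight, neural_network(toes, weight))

# the network is stateless: the same input and weight always give the same prediction,
# no matter what was predicted in between.
first = neural_network(8.5, 0.1)
neural_network(9.5, 0.1)  # an unrelated prediction in between
second = neural_network(8.5, 0.1)
print(first == second)
```

A weight below 1 turns the input down, a weight above 1 turns it up, and a negative weight flips its sign.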

## Making a Prediction with Multiple Inputs
### Neural Networks can Combine Intelligence from multiple Data Points

<div style="text-align:center;"><img style="width:333px;" src="static/imgs/03/many-to-one.png" /></div>

In [3]:
# input [list], weights [list]: my implementation.
def w_sum(ws, ins):
    # one weight per input, otherwise the pairing below is undefined.
    assert len(ws) == len(ins)
    muls = []
    for i in range(len(ws)):
        muls.append(ws[i]*ins[i])
    return sum(muls)

In [11]:
# Book's Implementation.
def w_sum(a,b):
    assert(len(a) == len(b))
    output = 0
    for i in range(len(a)):
        output += (a[i] * b[i])
    return output

In [4]:
# An empty network with multiple inputs.
weights = [.1, .2, 0]
def neural_network(input, weights):
    pred = w_sum(input, weights)
    return pred

This data set is the current status at the beginning of each game for the first four games in a season:
- *toes*: current average number of toes per player.
- *wlrec*: current games won (in percent)
- *nfans*: fan count (in millions)

In [5]:
toes = [8.5, 9.5, 9.9, 9]
wlrec = [.65, .8, .8, .9]
nfans = [1.2, 1.3, .5, 1]
input = [toes[0], wlrec[0], nfans[0]]

In [6]:
neural_network(input, weights)

0.9800000000000001

## Multiple Inputs: What Does this Network Do?
### It Multiplies 3 Inputs by 3 Knob Weights & Sums them.
### This is a Weighted Sum.

- You Multiply Each Input by its own weight.
- Because you have multiple inputs, you have to sum their respective local predictions.
- This is called the weighted Sum, or the Dot Product.
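
The weighted sum can be unpacked step by step. This is a small sketch using the first game's inputs and the weights from above; the name `local_predictions` is only a label chosen for this illustration:

```python
inputs = [8.5, 0.65, 1.2]   # toes, wlrec, nfans for the first game
weights = [0.1, 0.2, 0.0]

# each input is scaled by its own weight, giving one "local prediction" per input.
local_predictions = [i * w for i, w in zip(inputs, weights)]
print(local_predictions)

# the weighted sum (dot product) adds the local predictions together.
prediction = sum(local_predictions)
print(prediction)
```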

#### Challenge: Vector Math
Being able to manipulate vectors is a cornerstone technique for deep learning. See if you can write functions that perform the following operations:

In [10]:
def elementwise_multiplication(vec_a, vec_b):
    assert(len(vec_a) == len(vec_b))
    c = []
    for i in range(len(vec_a)):
        c.append(vec_a[i] * vec_b[i])
    return c

In [12]:
def elementwise_addition(vec_a, vec_b):
    assert(len(vec_a) == len(vec_b))
    c = []
    for i in range(len(vec_b)):
        c.append(vec_a[i] + vec_b[i])
    return c

In [21]:
def vector_sum(vec_a):
    assert isinstance(vec_a, list)
    return sum(vec_a)

In [24]:
def vector_avg(vec_a):
    assert isinstance(vec_a, list)
    return (sum(vec_a)/len(vec_a))

In [26]:
a, b = [1,2,3], [4,5,6]
vector_sum(elementwise_multiplication(a, b))

32

- The Intuition behind how & why a dot product works is easily one of the most important parts of truly understanding how neural networks make predictions.
- Loosely stated, **a dot product gives you a notion of similarity between two vectors**.
- Consider the following example:
<div style="text-align:center;"><img style="width:666px;" src="static/imgs/03/dot-product.png" /></div>

- You can Equate the properties of the dot product to the logical `AND`.
- Neural Networks are also able to model partial `AND`ing.
- In this Analogy, Negative weights tend to imply a Logical `NOT` operator.
    - Any positive weight paired with a negative weight will cause the overall score to go down.
- After `AND`s, Comes the `OR`s, because if any of the rows show weight, the score is affected.
- Amusingly, this gives us a kind of crude language for reading weights.
- The following examples assume you're performing the dot product and the `then` to these `if` statements is an abstract "then give high score":

<div style="text-align:center;"><img style="width:666px;" src="static/imgs/03/dot-product-analysis.png" /></div>

- This Analogy will help you significantly in the future, especially when putting networks together in increasingly complex ways.

- **The Neural Network Gives a High score to the Input Most similar to the Weights**.
- Notice that `nfans` is completely ignored in the prediction because the weight associated with it is $0$.
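- **You can't shuffle weights**: each weight belongs to a specific input position, so reordering the weights without reordering the inputs changes the prediction.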
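
A short sketch of the similarity and `NOT` ideas, plus what happens when weights are shuffled. The vectors `a`, `b`, `c`, `e` and the `shuffled` ordering are assumed values chosen for this illustration, and `w_sum` is restated so the snippet runs on its own:

```python
def w_sum(a, b):
    assert len(a) == len(b)
    return sum(x * y for x, y in zip(a, b))

# illustrative vectors (assumed values, chosen for this sketch).
a = [0, 1, 0, 1]
b = [1, 0, 1, 0]
c = [0, 1, 1, 0]
e = [0, 1, -1, 0]

print(w_sum(a, b))  # no position is "on" in both vectors
print(w_sum(b, c))  # exactly one position is "on" in both
print(w_sum(c, c))  # identical vectors score highest
print(w_sum(c, e))  # the negative weight cancels the positive match

# weights are tied to input positions, so shuffling them changes the prediction.
game = [8.5, 0.65, 1.2]
original = [0.1, 0.2, 0.0]
shuffled = [0.0, 0.1, 0.2]  # same values, different positions
print(w_sum(game, original), w_sum(game, shuffled))
```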

## Multiple Inputs: Complete Runnable Code

- There is a Python library called `NumPy`, which stands for "Numerical Python".
- Here is the same Code using `NumPy`:

In [2]:
import numpy as np

In [5]:
weights = np.array([.1, .2, 0])
toes = [8.5, 9.5, 9.9, 9]
wlrec = [.65, .8, .8, .9]
nfans = [1.2, 1.3, .5, 1]
def neural_network(weights, input):
    assert(weights.shape[0] == input.shape[0])
    return np.dot(weights, input)

In [6]:
neural_network(weights, np.array([toes[0], wlrec[0], nfans[0]]))

0.9800000000000001

## Making a Prediction with Multiple Outputs
### Neural Networks can also make multiple predictions using only a single input.

- Prediction occurs the same as if there were three disconnected single-weight neural networks.
<div style="text-align:center;"><img style="width:333px;" src="static/imgs/03/NN-1-to-many.png" /></div>

In [7]:
def neural_network(weights, input):
    return input * weights

In [8]:
neural_network(np.array([.2, .7, 0]), np.array([3]))

array([0.6, 2.1, 0. ])

In [9]:
# Book's Implementation.
weights = [.3, .2, .9]
def neural_network(weights, input):
    pred = ele_mul(input, weights)
    return pred

In [10]:
# let's implement `ele_mul`:
def ele_mul(c, l):
    # the single input may be an int or a float (e.g. 0.65).
    assert isinstance(c, (int, float))
    result = []
    for i in range(len(l)):
        result.append(c * l[i])
    return result

- **Notice that the 3 predictions are completely separate. Unlike NNs with multiple inputs & a single output, this network truly behaves as three independent components, each receiving the same input data.**
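
To make the independence concrete, here is a hedged sketch comparing one multi-output network with three separate single-weight networks. The input value `0.65` is an assumption (the first win/loss record), and `ele_mul` is restated in compact form so the snippet is self-contained:

```python
def ele_mul(number, vector):
    # scale every weight by the same single input.
    return [number * v for v in vector]

weights = [0.3, 0.2, 0.9]  # weights from the listing above
single_input = 0.65        # assumed single input (first win/loss record)

together = ele_mul(single_input, weights)

# three disconnected single-weight networks fed the same input.
separate = [single_input * weights[0],
            single_input * weights[1],
            single_input * weights[2]]

print(together)
print(together == separate)
```

Changing one weight changes only the prediction at the same position; the other two are unaffected.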

## Predicting with Multiple Inputs & Outputs
### Neural Networks can predict Multiple Outputs Given Multiple Inputs

<div style="text-align:center;"><img style="width:222px;" src="static/imgs/03/NN-many-to-many.png" /></div>

In [3]:
        # toes # %win # fans
weights = [[.1, .1, -.3],  # 1st Neuron: Hurt ?
           [.1, .2, .0],   # win ?
           [.0, 1.3, .1]]  # Sad ?

In [4]:
def neural_network(input, weights):
    pred = vect_mat_mult(input, weights)
    return pred

In [12]:
# Input [R_{1x3}] ; Weights [R_{3x3}], one row of weights per output.
def vect_mat_mult(vect, matrix):
    output = [0] * len(matrix)
    for i in range(len(matrix)):
        # w_sum asserts that the row has one weight per input.
        output[i] = w_sum(vect, matrix[i])
    return output

In [13]:
# inputs.
toes = [8.5, 9.5, 9.9, 9.0]
wlrec = [.65, .8, .8, .9]
nfans = [1.2, 1.3, .5, 1.0]

In [14]:
# one column.
input = [toes[0], wlrec[0], nfans[0]]

In [15]:
pred = neural_network(input, weights); pred

[0.555, 0.9800000000000001, 0.9650000000000001]

## Multiple Inputs & Outputs: How does it work ?
### It performs three independent weighted sums of the input to make three predictions

<div style="text-align:center;"><img style="width:333px;" src="static/imgs/03/many-to-many-NN.png" /></div>

- Think of it as three weights going into each output node.
- Think about this neural network as three independent dot products: three independent weighted sums of the input.
- A list of vectors is called a matrix.
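
The "three independent dot products" view can be checked directly. The sketch below uses NumPy with the same weights and first-game input as above; the variable names `weight_matrix`, `game`, `one_by_one`, and `all_at_once` are chosen only for this illustration:

```python
import numpy as np

weight_matrix = np.array([[0.1, 0.1, -0.3],   # hurt?
                          [0.1, 0.2,  0.0],   # win?
                          [0.0, 1.3,  0.1]])  # sad?
game = np.array([8.5, 0.65, 1.2])             # toes, wlrec, nfans

# one dot product per row: each output node has its own three weights.
one_by_one = np.array([np.dot(row, game) for row in weight_matrix])

# the same computation written as a single matrix-vector product.
all_at_once = weight_matrix.dot(game)

print(one_by_one)
print(np.allclose(one_by_one, all_at_once))
```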

## Predicting on Predictions 
### Neural Networks can be Stacked!

<div style="text-align:center;"><img style="width:333px;" src="static/imgs/03/deep-NN.png" /></div>

- You can also take the output of one network and feed it as input to another network.
- This results in two consecutive vector-matrix multiplications.

In [32]:
# A Network with multiple inputs and outputs.
ih_wgt = [[0.1, 0.2, -0.1],
          [-0.1,0.1, 0.9],
          [0.1, 0.4, 0.1]]
hp_wgt = [[0.3, 1.1, -0.3],
          [0.1, 0.2, 0.0],
          [0.0, 1.3, 0.1]]

In [19]:
weights = [ih_wgt, hp_wgt]

In [20]:
def neural_network(input, weights):
    hid = vect_mat_mult(input, weights[0])
    pred = vect_mat_mult(hid, weights[1])
    return pred

In [23]:
pred = neural_network(input, weights); pred

[0.21350000000000002, 0.14500000000000002, 0.5065]

The following listing shows how to do the same operations as in the previous section using a convenient Python library called `NumPy`. Using libraries like `NumPy` makes your code faster and easier to read and write.

In [24]:
import numpy as np 

In [33]:
ih_wgt = np.array(ih_wgt).transpose()  # transpose so input.dot(...) matches vect_mat_mult (rows hold per-output weights).
hp_wgt = np.array(hp_wgt).transpose()  # same reason.
weights = [ih_wgt, hp_wgt]

In [34]:
def neural_network(input, weights):
    out = input.dot(weights[0])
    pred = out.dot(weights[1])
    return pred

In [35]:
toes = np.array(toes)
wlrec = np.array(wlrec)
nfans = np.array(nfans)

In [36]:
input = np.array([toes[0], wlrec[0], nfans[0]])

In [37]:
pred = neural_network(input, weights); pred

array([0.2135, 0.145 , 0.5065])
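
Why the transpose is needed: `vect_mat_mult` pairs the input with each *row* of the matrix, while `input.dot(matrix)` pairs it with each *column*. The following sketch, using the hidden-layer weights from above under illustrative variable names, suggests the two agree only once the matrix is transposed:

```python
import numpy as np

# hidden-layer weights as written in the list-based listing (one row per output node).
rows_per_output = np.array([[ 0.1, 0.2, -0.1],
                            [-0.1, 0.1,  0.9],
                            [ 0.1, 0.4,  0.1]])
x = np.array([8.5, 0.65, 1.2])

by_rows = rows_per_output.dot(x)        # what vect_mat_mult computes: one weighted sum per row
by_columns = x.dot(rows_per_output.T)   # the same numbers once the matrix is transposed
untransposed = x.dot(rows_per_output)   # pairs the input with columns instead

print(by_rows)
print(np.allclose(by_rows, by_columns))
print(np.allclose(by_rows, untransposed))
```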

## A Quick Primer on NumPy
### NumPy does a few things for you. Let's reveal the magic.

- You'll keep using native python functions to be sure you fully understand what's going on inside them.
- If you create a matrix with only one row, you are creating a vector.
- You create matrices by listing (rows, columns): **Rows Come first, Columns come Second**
- Let's check some examples:

In [38]:
a = np.array([0,1,2,3])  # a vector.
b = np.array([4,5,6,7])  # another vector.
c = np.array([[0,1,2,3], [4,5,6,7]])  # A Matrix.
d = np.zeros((2,4))  # 2x4 matrix of zeros.
e = np.random.rand(2,5)  # 2x5 matrix of random numbers between 0 & 1.

In [39]:
print(a,b,c,d,e)

[0 1 2 3] [4 5 6 7] [[0 1 2 3]
 [4 5 6 7]] [[0. 0. 0. 0.]
 [0. 0. 0. 0.]] [[0.93936902 0.03476097 0.02302402 0.54727859 0.34876852]
 [0.54575811 0.76579746 0.58947409 0.07631537 0.00246894]]


In [40]:
# element-wise multiplication.
print(a*.2)

[0.  0.2 0.4 0.6]


In [41]:
# element-wise multiplication.
print(c*.1)

[[0.  0.1 0.2 0.3]
 [0.4 0.5 0.6 0.7]]


In [42]:
# multiply two vectors (element wise).
print(a*b)

[ 0  5 12 21]


In [43]:
# complex element-wise multiplications.
print(a*b*.3)

[0.  1.5 3.6 6.3]


In [44]:
# element-wise row multiplications (because of compatible shapes).
print(a*c)

[[ 0  1  4  9]
 [ 0  5 12 21]]


In [45]:
# error in case of incompatible shapes.
print(a*e)

ValueError: operands could not be broadcast together with shapes (4,) (2,5) 

- When you multiply two variables with the `*` operator, NumPy automatically detects what kind of variables you're working with and tries to figure out what kind of operation you mean.
- When you use an elementwise operator (`+`, `-`, `*`, `/`), either the two variables must have the same number of columns, or one of them must have only one column (for example, a single number).
- When you read NumPy code, you're doing two things: reading the operations and keeping track of the shapes.
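
A small sketch of keeping track of shapes for elementwise operations. It reuses shapes from the examples above; `e` is filled with zeros here instead of random numbers, since only its shape matters:

```python
import numpy as np

a = np.array([0, 1, 2, 3])                   # shape (4,)
c = np.array([[0, 1, 2, 3], [4, 5, 6, 7]])   # shape (2, 4)
e = np.zeros((2, 5))                         # shape (2, 5)

print((a * 0.2).shape)   # a single number pairs with every element
print((a * c).shape)     # 4 columns match 4 columns; `a` is reused for each row

try:
    a * e                # 4 columns vs 5 columns: the elements cannot be paired up
except ValueError as err:
    print("ValueError:", err)
```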

In [47]:
a = np.zeros((1,4))
b = np.zeros((4,3))
c = a.dot(b)
print(c.shape)

(1, 3)


- The golden rule for the dot product: write the (rows, columns) shapes of the two variables next to each other. The neighboring numbers must match, and the result keeps the outer two, so (1,4) dotted with (4,3) gives (1,3). (**Order Matters**)
- Let's check more examples that demonstrate the concept of `shape`:

In [1]:
import numpy as np

In [2]:
a = np.zeros((2,4))
b = np.zeros((4,3))
c = a.dot(b)
print(c.shape)

(2, 3)


In [3]:
e = np.zeros((2,1))
f = np.zeros((1,3))
g = e.dot(f)
print(g.shape)

(2, 3)


In [6]:
h = np.zeros((5,4)).T
i = np.zeros((5,6))
j = h.dot(i)
print(j.shape)

(4, 6)


In [7]:
import numpy as np

h = np.zeros((5,4))
i = np.zeros((5,6))
j = h.dot(i)
print(j.shape)

ValueError: shapes (5,4) and (5,6) not aligned: 4 (dim 1) != 5 (dim 0)
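
The shape rule behind these examples can be sketched as a tiny helper. `dot_shape` is a hypothetical function written for this illustration (it is not part of NumPy) and only covers 2-D shapes:

```python
import numpy as np

def dot_shape(shape_a, shape_b):
    # hypothetical helper (not part of NumPy): predicts the shape of a.dot(b) for 2-D arrays.
    rows_a, cols_a = shape_a
    rows_b, cols_b = shape_b
    if cols_a != rows_b:
        raise ValueError(f"inner dimensions differ: {cols_a} != {rows_b}")
    return (rows_a, cols_b)

# compare the predicted shape with what NumPy actually produces.
for shape_a, shape_b in [((1, 4), (4, 3)), ((2, 1), (1, 3)), ((4, 5), (5, 6))]:
    predicted = dot_shape(shape_a, shape_b)
    actual = np.zeros(shape_a).dot(np.zeros(shape_b)).shape
    print(shape_a, shape_b, predicted, predicted == actual)
```

Calling `dot_shape((5, 4), (5, 6))` raises a `ValueError`, mirroring the failing example above.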

## Summary
### To Predict, Neural Networks Perform Repeated Weighted Sums of the Input

- The Network's Intelligence depends on the Weight values you give it.
- Everything we've done in this chapter is a form of what's called forward propagation.
    - It's called this because you're propagating activations forward through the network.
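
As a closing sketch, forward propagation through any number of layers can be written as a loop of vector-matrix multiplications. The function name `forward` and its `layers` argument are assumptions made for this illustration; the weights are the ones from "Predicting on Predictions", transposed as in the NumPy listing:

```python
import numpy as np

def forward(input, layers):
    # propagate activations forward: each layer is one vector-matrix multiplication.
    activation = input
    for layer_weights in layers:
        activation = activation.dot(layer_weights)
    return activation

# the two weight matrices from "Predicting on Predictions", transposed for input.dot(...).
ih_wgt = np.array([[0.1, 0.2, -0.1], [-0.1, 0.1, 0.9], [0.1, 0.4, 0.1]]).T
hp_wgt = np.array([[0.3, 1.1, -0.3], [0.1, 0.2, 0.0], [0.0, 1.3, 0.1]]).T

pred = forward(np.array([8.5, 0.65, 1.2]), [ih_wgt, hp_wgt])
print(pred)
```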

# Sketches

<div style="text-align:center;"><img style="width:333px" src="static/imgs/03/1-to-1-NN-Note.jpg" /><img style="width:333px" src="static/imgs/03/Reminder-1-to-1-NN-Note.jpg" /></div>