#### Parametric vs Non-Parametric Learning

Trial-and-error method 
**vs** 
Solving by counting, probability, etc.

The number of parameters is fixed in advance; only the value each one should be set to has to be learned 
**vs** 
Count based: keeps adding new parameters as it finds something new to learn
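
A minimal sketch of the contrast above, using made-up data and hypothetical helper names (`parametric_predict`, `count_based_fit`). The parametric model always has exactly one weight, while the count-based model gains an entry each time it sees a value it has not seen before.

```python
from collections import Counter

# Parametric: the number of parameters is fixed up front (here, one weight).
# Learning only changes the value stored in that parameter.
def parametric_predict(x, weight):
    return x * weight

# Non-parametric (count based): the model grows with the data.
# Each previously unseen value adds a new entry, i.e. a new "parameter".
def count_based_fit(observations):
    return Counter(observations)

small = count_based_fit(["win", "loss", "win"])
large = count_based_fit(["win", "loss", "win", "draw", "forfeit"])

print(len(small), len(large))        # 2 4
print(parametric_predict(2.0, 0.5))  # 1.0
```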

#### Workflow of Parametric Learning:

1. Use some data to predict
2. Compare the output prediction to the truth
3. Adjust the weights/values for the parameters, to predict better next time on similar data

#### Sample neural network

In [1]:
def neural_network(input, weight):
    prediction = input * weight
    return prediction
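
The cell above only defines the function. As a hedged usage sketch, the call below feeds it the first `toes` value used in later cells; the weight of `0.1` is an assumption chosen for illustration.

```python
def neural_network(input, weight):
    prediction = input * weight
    return prediction

# Assumed values: the input is borrowed from the toes list used later on,
# and the weight is an arbitrary starting point.
toes = [8.5, 9.5, 9.9, 9.0]
weight = 0.1

pred = neural_network(toes[0], weight)
print(pred)  # roughly 0.85
```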

#### Multiple inputs and weights

In [4]:
def neural_network_2(input, weights):
    prediction = w_sum(input, weights)
    return prediction

def w_sum(a, b):
    assert len(a) == len(b)
    output = 0
    for i in range(len(a)):
        output += a[i] * b[i]

    return output

weights = [0.1, 0.2, 0]
inputs = [8.5, 0.6, 1.2]

pred = neural_network_2(inputs, weights)
pred

0.9700000000000001

#### Multiple outputs and weights

In [3]:
def neural_network_3(input, weights):
    pred = ele_mul(input, weights)
    return pred

def ele_mul(number, vector):
    output = [0, 0, 0]
    assert len(output) == len(vector)
    for i in range(len(vector)):
        output[i] = number * vector[i]

    return output

weights = [0.3, 0.2, 0.9]
win_loss_records = [0.65, 0.8, 0.8, 0.9]
input = win_loss_records[0]
pred = neural_network_3(input, weights)
pred

[0.195, 0.13, 0.5850000000000001]

#### Multiple inputs and outputs


![](images\img_1.png)

In [6]:
def neural_network_4(input, weights):
    pred = vec_mat_mul(input, weights)
    return pred

def vec_mat_mul(vector, matrix):
    output = [0] * len(vector)
    for i in range(len(vector)):
        output[i] = w_sum(vector, matrix[i])
    
    return output

def w_sum(a, b):
    assert(len(a) == len(b))
    output = 0
    for i in range(len(a)):
        output += a[i] * b [i]
    return output

toes = [8.5, 9.5, 9.9, 9.0]
wlrec = [0.65, 0.8, 0.8, 0.9]
nfans = [1.2, 1.3, 0.5, 1.0]
inputs = [toes[0], wlrec[0], nfans[0]]

weights = [[0.1, 0.1, -0.3],[0.1, 0.2, 0.0], [0.0, 1.3, 0.1]]

pred = neural_network_4(inputs, weights)
pred

[0.555, 0.9800000000000001, 0.9650000000000001]

#### Stacking predictions
![](deep-learning\images\img_2.png)

In [7]:
def neural_network_4(input, weights):
    pred = vec_mat_mul(input, weights[0])
    pred = vec_mat_mul(pred, weights[1])

    return pred

def vec_mat_mul(vector, matrix):
    output = [0] * len(vector)
    for i in range(len(vector)):
        output[i] = w_sum(vector, matrix[i])
    
    return output

def w_sum(a, b):
    assert(len(a) == len(b))
    output = 0
    for i in range(len(a)):
        output += a[i] * b [i]
    return output

toes = [8.5, 9.5, 9.9, 9.0]
wlrec = [0.65, 0.8, 0.8, 0.9]
nfans = [1.2, 1.3, 0.5, 1.0]
inputs = [toes[0], wlrec[0], nfans[0]]

weights_1 = [[0.1, 1.2, -0.1],[-0.1, 0.1, 0.9], [0.1, 0.4, 0.1]]
weights_2 = [[0.1, 0.1, -0.3],[0.1, 0.2, 0.0], [0.0, 1.3, 0.1]]
weights = [weights_1, weights_2]

pred = neural_network_4(inputs, weights)
pred

[-0.18849999999999997, 0.21000000000000002, 0.5065]

##### Same using numpy

In [6]:
import numpy as np

# Each row holds the weights for one output node (as in the list version above),
# so the matrices are transposed before being used with input.dot(matrix)
weights_1 = np.array([[0.1, 1.2, -0.1],[-0.1, 0.1, 0.9], [0.1, 0.4, 0.1]]).T
weights_2 = np.array([[0.1, 0.1, -0.3],[0.1, 0.2, 0.0], [0.0, 1.3, 0.1]]).T
weights = [weights_1, weights_2]

def neural_network(input, weights):
    hid = input.dot(weights[0])
    # the second layer takes the hidden layer as its input, not the raw input
    pred = hid.dot(weights[1])
    return pred

toes = np.array([8.5])
wlrec = np.array([0.65])
nfans = np.array([1.2])

input = np.array([toes[0], wlrec[0], nfans[0]])
pred = neural_network(input, weights)
pred

array([-0.1885,  0.21  ,  0.5065])

#### Compare and Learn
After predicting, the next step is to evaluate the prediction against the actual result to see how far off it was.

##### Mean Squared Error (the compare part)
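One way to do so is mean squared error (MSE). The raw difference between prediction and target tells whether the prediction was accurate, too high by some amount, or too low by some amount; squaring it turns that into a non-negative measure of the miss.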
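
A minimal sketch of MSE over several predictions. The helper name `mean_squared_error` and the sample values are assumptions for illustration; the notebook's own cells only compute the squared error of a single prediction.

```python
def mean_squared_error(predictions, targets):
    # Average of the squared differences between predictions and targets
    assert len(predictions) == len(targets)
    total = 0.0
    for p, t in zip(predictions, targets):
        total += (p - t) ** 2
    return total / len(predictions)

# Made-up values: one prediction misses by 0.5, the other two are exact
predictions = [0.5, 1.0, 0.0]
targets = [1.0, 1.0, 0.0]

mse = mean_squared_error(predictions, targets)
print(mse)  # 0.25 / 3, roughly 0.0833
```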

##### Gradient Descent (the learn part) 
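Gradient descent (GD) reduces the error by adjusting the weights:
1. Compute a number for each weight that says in which direction, and by how much, that weight affects the error
2. Move the weight according to that number
3. Repeat in the next iteration if needed.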

In [7]:
# Example of measuring in python
knob_weight = 0.5
input = 0.5
goal_pred = 0.8

pred = input * knob_weight
error = (pred - goal_pred)**2
error

0.30250000000000005

We square the difference to avoid negative errors, since the raw difference can be negative in some cases. For example, an archer's arrow can land higher than the target or lower than it; either way it is a miss.

We measure error so that we can adjust the weights accordingly. The end goal in deep learning is to find weights that bring the error as close to 0 as possible, so that the predictions are as close as possible to the actual values.

Squaring also helps prioritize which errors to focus on: a raw error of 10 becomes 100, whereas a raw error of 0.01 becomes 0.0001. Large misses are amplified and small ones shrink, so correcting the miss of 10 yields the bigger improvement.
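
The effect of squaring can be checked directly. The raw errors below are made-up values chosen to show a large miss, a tiny miss, and a negative miss.

```python
# Made-up raw errors (prediction minus goal): large, tiny, and negative
raw_errors = [10, 0.01, -3]

squared = [e ** 2 for e in raw_errors]

for raw, sq in zip(raw_errors, squared):
    print(raw, "->", sq)
# The large miss grows (10 -> 100), the tiny miss shrinks (0.01 -> about 0.0001),
# and the negative miss becomes positive (-3 -> 9).
```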

#### Hot and Cold Learning

In [8]:
# Running hot and cold learning
weight = 0.5
input = 0.5
goal_prediction = 0.8
step_amount = 0.001

def neural_network(input, weight):
    prediction = input * weight
    return prediction

for iter in range(1101):
    prediction = neural_network(input, weight)
    error = (prediction - goal_prediction) ** 2

    up_pred = neural_network(input, weight + step_amount)
    up_error = (goal_prediction - up_pred) ** 2

    down_pred = neural_network(input, weight - step_amount)
    down_error = (goal_prediction - down_pred) ** 2

    if down_error < up_error:
        weight = weight - step_amount
    elif down_error > up_error:
        weight = weight + step_amount

print("Error", error, "Prediction", prediction)

Error 1.0799505792475652e-27 Prediction 0.7999999999999672



Hot and cold learning is inefficient: every iteration runs the prediction function three times (once for the current weight, once for the weight nudged up, and once for the weight nudged down).
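
To make that cost concrete, here is a sketch of the same loop with a call counter added. The `counter` dictionary and the name `input_value` are additions for illustration; everything else mirrors the cell above.

```python
# Counts how many times the prediction function is called
counter = {"calls": 0}

def neural_network(input_value, weight):
    counter["calls"] += 1
    return input_value * weight

weight = 0.5
input_value = 0.5
goal_prediction = 0.8
step_amount = 0.001
iterations = 1101

for _ in range(iterations):
    prediction = neural_network(input_value, weight)
    up_error = (goal_prediction - neural_network(input_value, weight + step_amount)) ** 2
    down_error = (goal_prediction - neural_network(input_value, weight - step_amount)) ** 2
    if down_error < up_error:
        weight -= step_amount
    elif down_error > up_error:
        weight += step_amount

print(counter["calls"])                # 3303, i.e. 3 * 1101
print(counter["calls"] // iterations)  # 3 predictions per iteration
```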

#### Gradient Descent and using it to learn

In [5]:
weight = 0
input = 1.1
goal_prediction = 0.8
alpha = 1

def neural_network(input, weight):
    prediction = input * weight
    return prediction

for iter in range(4):
    print("Weight", weight)
    prediction = neural_network(input, weight)
    # square method
    error = (prediction - goal_prediction) ** 2
    # error = ((input*weight) - goal_prediction) ** 2
    delta = prediction - goal_prediction
    weight_delta = input * delta
    weight -= weight_delta * alpha
    print("Error", error, "Prediction", prediction)
    print("Delta", delta, "Weight Delta", weight_delta)
    print("---")

Weight 0
Error 0.6400000000000001 Prediction 0.0
Delta -0.8 Weight Delta -0.8800000000000001
---
Weight 0.8800000000000001
Error 0.02822400000000005 Prediction 0.9680000000000002
Delta 0.16800000000000015 Weight Delta 0.1848000000000002
---
Weight 0.6951999999999999
Error 0.0012446784000000064 Prediction 0.76472
Delta -0.03528000000000009 Weight Delta -0.0388080000000001
---
Weight 0.734008
Error 5.4890317439999896e-05 Prediction 0.8074088
Delta 0.007408799999999993 Weight Delta 0.008149679999999992
---


As this line states:
```python
error = ((input*weight) - goal_prediction) ** 2
```
If we fix `input` and `goal_prediction`, both of which are known from the beginning, then the error depends directly on the weight alone.

The goal is therefore to adjust the weight until the error reaches the bottom of this bowl-shaped (parabolic) curve.

That is the optimal value of the weight.

In the example above, after 4 iterations the prediction is about 0.807, close to the goal prediction of 0.8.
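
A small sketch of that relationship, assuming the same `input` and `goal_prediction` values as the cell above (renamed `input_value` here). The candidate weights are arbitrary picks; the error falls toward `goal_prediction / input_value` and rises again on the other side.

```python
input_value = 1.1
goal_prediction = 0.8

def error_for(weight):
    # Same formula as the commented-out line in the cell above
    return ((input_value * weight) - goal_prediction) ** 2

# Arbitrary candidates; the middle one is the weight that hits the goal exactly
candidate_weights = [0.0, 0.4, goal_prediction / input_value, 1.0, 1.4]
errors = [error_for(w) for w in candidate_weights]

for w, e in zip(candidate_weights, errors):
    print(round(w, 4), round(e, 4))

best_weight = candidate_weights[errors.index(min(errors))]
print("Lowest error at weight", round(best_weight, 4))  # about 0.7273
```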

#### Error vs Derivative

Error is how much we missed.

The derivative is the relationship between a weight and how much we missed, i.e. how much the error changes when the weight changes.

Moreover, the derivative is:
1. The rate of change of a function with respect to a variable at a particular point
2. The slope at a point on a line or curve
3. A measure of the sensitivity between 2 variables.
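
As a hedged check of the definition, the sketch below compares the analytic derivative of the squared error with a finite-difference estimate, using the starting values from the gradient descent cell. Note that the derivative of `(input * weight - goal) ** 2` with respect to the weight is `2 * input * delta`; the `weight_delta` used in these notes drops the constant 2, which only rescales the step size.

```python
input_value = 1.1
goal_prediction = 0.8
weight = 0.0

def error_for(w):
    return ((input_value * w) - goal_prediction) ** 2

delta = (input_value * weight) - goal_prediction
weight_delta = input_value * delta             # what the notebook cells use
analytic_derivative = 2 * input_value * delta  # exact derivative of the error

# Central finite difference: slope of the error curve around the current weight
h = 1e-4
numeric_derivative = (error_for(weight + h) - error_for(weight - h)) / (2 * h)

print(analytic_derivative, numeric_derivative, weight_delta)
```

Both derivative values come out at about -1.76, and the negative sign says that increasing the weight lowers the error at this point.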

#### Gradient Descent with multiple inputs

In [20]:
def w_sum(input_arr, weight_arr):
    """
    Calculate the weighted sum of input_arr and weight_arr:
    output = input_arr[0] * weight_arr[0]
           + input_arr[1] * weight_arr[1]
           + ... and so on
    """
    assert len(input_arr) == len(weight_arr)
    output = 0
    for i in range(len(input_arr)):
        output += (input_arr[i] * weight_arr[i])
    return output

def neural_network(input, weight):
    """
    Call method responsible for doing weighted sum here
    """
    prediction = w_sum(input, weight)
    return prediction

def element_mul(number, input_arr):
    """
    Initialize output array as [0, 0, 0]
    Multiply every value in input_arr by number (the delta) and
    return the results as a new list, one weight delta per input
    """
    output = [0, 0, 0]
    assert len(output) == len(input_arr)
    for i in range(len(input_arr)):
        output[i] = number * input_arr[i]
    return output

toes = [8.5, 9.5, 9.9, 9.0]
win_loss_rec = [0.65, 0.8, 0.8, 0.9]
num_fans = [1.2, 1.3, 0.5, 1.0]
win_or_loss = [1, 1, 0, 1]

weights = [0.1, 0.2, -0.1]
true = win_or_loss[0]
input = [toes[0], win_loss_rec[0], num_fans[0]]

pred = neural_network(input, weights)

error = (pred - true) ** 2
delta = pred - true
weight_delta = element_mul(delta, input)
alpha = 0.01
# once the weight deltas are calculated, update all the weights accordingly
for i in range(len(weights)):
    weights[i] -= alpha * weight_delta[i]

print(weights)

[0.1119, 0.20091, -0.09832]


Based on above:

**Delta**: how much we want the node's value to change (prediction minus truth)

**Weight Delta**: an estimate, derived from the derivative, of the direction and amount each weight should move to reduce the node's delta

##### Gradient Descent with multiple inputs + multiple iterations

In [19]:

def neural_network(inputs, weights):
    output = 0
    for i in range(len(inputs)):
        output += (inputs[i] * weights[i])
    return output

def element_mul(number, input_arr):
    output = [0, 0, 0]
    assert len(output) == len(input_arr)
    for i in range(len(input_arr)):
        output[i] = number * input_arr[i]
    return output

toes = [8.5, 9.5, 9.9, 9.0]
win_loss_rec = [0.65, 0.8, 0.8, 0.9]
num_fans = [1.2, 1.3, 0.5, 1.0]
win_or_loss = [1, 1, 0, 1]

alpha = 0.3
weights = [0.1, 0.2, -0.1]
true = win_or_loss[0]
input = [toes[0], win_loss_rec[0], num_fans[0]]


for iter in range(3):
    pred = neural_network(input, weights)
    error = (pred - true) ** 2
    delta = pred - true
    weight_delta = element_mul(delta, input)
    # freeze the first weight (toes): its delta is zeroed so it is never updated
    weight_delta[0] = 0
    print("Prediction:", pred, "Error:", error, "Delta:", delta)
    print("Weights:", weights)
    print("Weight Deltas", weight_delta)
    print("---")
    for i in range(len(weights)):
        weights[i] -= alpha * weight_delta[i]


Prediction: 0.8600000000000001 Error: 0.01959999999999997 Delta: -0.1399999999999999
Weights: [0.1, 0.2, -0.1]
Weight Deltas [0, -0.09099999999999994, -0.16799999999999987]
---
Prediction: 0.9382250000000001 Error: 0.003816150624999989 Delta: -0.06177499999999991
Weights: [0.1, 0.2273, -0.04960000000000005]
Weight Deltas [0, -0.040153749999999946, -0.07412999999999989]
---
Prediction: 0.97274178125 Error: 0.000743010489422852 Delta: -0.027258218750000007
Weights: [0.1, 0.239346125, -0.02736100000000008]
Weight Deltas [0, -0.017717842187500006, -0.032709862500000006]
---


So the end goal in a neural network is to find the lowest point on the error plane, i.e. the combination of weights that gives the lowest error.
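
A minimal sketch of that descent, assuming the inputs and starting weights from the single-step cell, `alpha = 0.01`, and all three weights being updated. The helper name `predict` is an addition for illustration. The recorded error shrinks on every iteration as the weights move toward the low point.

```python
def predict(inputs, weights):
    # Weighted sum of inputs and weights
    return sum(i * w for i, w in zip(inputs, weights))

inputs = [8.5, 0.65, 1.2]
weights = [0.1, 0.2, -0.1]
true = 1
alpha = 0.01

errors = []
for _ in range(5):
    pred = predict(inputs, weights)
    delta = pred - true
    errors.append(delta ** 2)
    # Each weight moves against its own weight delta (delta * input)
    weights = [w - alpha * delta * i for w, i in zip(weights, inputs)]

for step, e in enumerate(errors):
    print(step, e)
```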

#### Gradient Descent with multiple inputs + multiple outputs

In [22]:
def neural_network(inputs, weights):
    pred = element_mul(inputs, weights)
    return pred

def element_mul(number, input_arr):
    output = [0, 0, 0]
    assert len(output) == len(input_arr)
    for i in range(len(input_arr)):
        output[i] = number * input_arr[i]
    return output

weights = [0.3, 0.2, 0.9]
win_loss_rec = [0.65, 1.0, 1.0, 0.9]

hurt = [0.1]
win = [1]
sad = [0.1]

input = win_loss_rec[0]
goal_prediction = [hurt[0], win[0], sad[0]]

pred = neural_network(input, weights)
error = [0, 0, 0]
delta = [0, 0, 0]

alpha = 0.1
for i in range(len(goal_prediction)):
    error[i] = (pred[i] - goal_prediction[i]) ** 2
    delta[i] = pred[i] - goal_prediction[i]

weight_deltas = element_mul(input, delta)

for i in range(len(weights)):
    weights[i] -= (weight_deltas[i] * alpha)

print("Prediction:", pred, "Error:", error, "Delta:", delta)
print("Weights:", weights)
# weight_deltas (plural) is the list computed in this cell; rounded for display
print("Weight Deltas", [round(d, 5) for d in weight_deltas])
print("---")

Prediction: [0.195, 0.13, 0.5850000000000001] Error: [0.009025, 0.7569, 0.2352250000000001] Delta: [0.095, -0.87, 0.4850000000000001]
Weights: [0.293825, 0.25655, 0.868475]
Weight Deltas [0.06175, -0.5655, 0.31525]
---


Updated to have multiple inputs and multiple outputs

In [14]:
def neural_network(inputs, weights):
    # 6. Perform vector (arrays of values) multiplication
    # instead of single value
    pred = vec_mat_mul(inputs, weights)
    return pred

def vec_mat_mul(vector_arr, matrix_arr):
    output = [0, 0, 0] # to return value, same size as vector_arr
    for i in range(len(vector_arr)):
        # multiply [1x3] * [1x3]
        # get single output value, as weighted sum
        output[i] = w_sum(vector_arr, matrix_arr[i]) 

    return output

def w_sum(a, b):
    # 7. Between arrays of 1x3 and 1x3, multiply and keep adding to come up
    # with single value
    assert len(a) == len(b)
    output = 0
    for i in range(len(a)):
        output += a[i] * b[i]
    
    # return single value
    return output

def outer_prod(vec_a, vec_b):
    # 10. Generate a 2D array with len(vec_a) rows and len(vec_b) columns
    # where out[i][j] = vec_a[i] * vec_b[j]
    out = [[0] * len(vec_b) for _ in range(len(vec_a))]
    for i in range(len(vec_a)):
        for j in range(len(vec_b)):
            out[i][j] = vec_a[i] * vec_b[j]
        
    return out


# 1. Multiple weights for each input
              #toes #win #fans
weight_arr = [[0.1, 0.1, -0.3], #hurt
              [0.1, 0.2, 0.0], #win
              [0.0, 1.3, 0.1]] #sad

# 2. Each feature/input and its value
toes = [8.5, 9.5, 9.9, 9.0]
win_loss_rec = [0.65, 1.0, 1.0, 0.9]
num_fans = [1.2, 1.3, 0.5, 1.0]

# 3. Goal predictions (true values); only the first game is listed
hurt = [0.1]
win = [1]
sad = [0.1]

# 4. Identifying input and output (goal prediction)
input_arr = [toes[0], win_loss_rec[0], num_fans[0]]
goal_prediction = [hurt[0], win[0], sad[0]]

# 5. Generate predictions
generated_prediction = neural_network(input_arr, weight_arr)

# 8. prediction returned from NN for given input arr and weight arr
# is [0.555, 0.9800000000000001, 0.9650000000000001]

# initialize error and delta
error = [0, 0, 0]
delta = [0, 0, 0]
alpha = 0.1 # step size, i.e. how much of each weight delta is applied per update

# for each value, compare against generated prediction and goal
# save it in error and delta
for i in range(len(goal_prediction)):
    error[i] = (generated_prediction[i] - goal_prediction[i]) ** 2
    delta[i] = generated_prediction[i] - goal_prediction[i]

# 9. Get weight deltas from delta and input_arr; delta comes first so that
# weight_deltas[i][j] = delta[i] * input_arr[j] lines up with weight_arr[i][j]
weight_deltas = outer_prod(delta, input_arr)

# 11. Update every weight according to its weight delta
for i in range(len(weight_arr)):
    for j in range(len(weight_arr[0])):
        weight_arr[i][j] -= (weight_deltas[i][j] * alpha)

print("Prediction:", generated_prediction)
print("Error:", error)
print("Delta:", delta)
# weights and weight deltas are rounded for display
print("Weights:", [[round(w, 6) for w in row] for row in weight_arr])
print("Weight Deltas", [[round(d, 6) for d in row] for row in weight_deltas])
print("---")


Prediction: [0.555, 0.9800000000000001, 0.9650000000000001]
Error: [0.20702500000000007, 0.0003999999999999963, 0.7482250000000001]
Delta: [0.45500000000000007, -0.019999999999999907, 0.8650000000000001]
Weights: [[-0.28675, 0.070425, -0.3546], [0.117, 0.2013, 0.0024], [-0.73525, 1.243775, -0.0038]]
Weight Deltas [[3.8675, 0.29575, 0.546], [-0.17, -0.013, -0.024], [7.3525, 0.56225, 1.038]]
---
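
A hedged sanity check on which delta pairs with which input. Assuming each row of the weight matrix belongs to one output (hurt, win, sad), as the comments in the cell above indicate, the weight delta for row `i`, column `j` should be `delta[i] * input[j]`. The sketch below compares that against a finite-difference estimate for one weight; the helper names `predict` and `total_error` are additions for illustration.

```python
def predict(inputs, weights):
    # One weighted sum per row of the weight matrix (one row per output)
    return [sum(i * w for i, w in zip(inputs, row)) for row in weights]

def total_error(inputs, weights, goals):
    return sum((p - g) ** 2 for p, g in zip(predict(inputs, weights), goals))

inputs = [8.5, 0.65, 1.2]   # toes, win/loss record, fans
goals = [0.1, 1, 0.1]       # hurt, win, sad
weights = [[0.1, 0.1, -0.3],
           [0.1, 0.2, 0.0],
           [0.0, 1.3, 0.1]]

preds = predict(inputs, weights)
deltas = [p - g for p, g in zip(preds, goals)]

# weight_deltas[i][j] pairs the delta of output i with input j
weight_deltas = [[d * x for x in inputs] for d in deltas]

# Finite-difference estimate for weights[0][1]: nudge it and measure the error change
h = 1e-6
bumped = [row[:] for row in weights]
bumped[0][1] += h
numeric = (total_error(inputs, bumped, goals) - total_error(inputs, weights, goals)) / h

# The exact derivative of the squared error is 2 * delta * input
print(numeric, 2 * weight_deltas[0][1])
```

The two printed numbers agree closely (about 0.5915), whereas the transposed entry `weight_deltas[1][0]` is about -0.17, so the orientation of the outer product matters whenever the inputs and deltas differ.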
