# Compare

Comparing measures how much a prediction “missed” by.

In [1]:
knob_weight = 0.5
input = 0.5
goal_pred = 0.8

pred = input * knob_weight
error = (pred - goal_pred) ** 2  # Forces the raw error to be positive

print(error)

0.30250000000000005
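
To see what squaring buys us, the sketch below compares a prediction that lands below the goal with one that lands the same distance above it. The helper name `squared_error` and the sample predictions are illustrative assumptions, not part of the original notebook.

```python
def squared_error(pred, goal_pred):
    """Squared distance between a prediction and its goal (never negative)."""
    return (pred - goal_pred) ** 2


goal_pred = 0.8
for pred in (0.3, 1.3):  # illustrative values: 0.5 below and 0.5 above the goal
    raw_error = pred - goal_pred
    print("pred:", pred,
          "raw error:", raw_error,
          "squared error:", squared_error(pred, goal_pred))
```

The raw errors have opposite signs, but both squared errors come out at roughly 0.25, so a miss in either direction counts the same.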


# Learn

Learning tells each weight how it can change to reduce the error.

In [2]:
weight = 0.1
lr = 0.01  # Step size: how far to nudge the weight up or down


def neural_network(input, weight):
    prediction = input * weight
    return prediction

In [3]:
# Data
number_of_toes = [8.5]
win_or_lose_binary = [1] # (won!!!)

# Predict and calculate the error
input = number_of_toes[0]
true = win_or_lose_binary[0]

pred = neural_network(input,weight)
error = (pred - true) ** 2

print("error:", error)

error: 0.022499999999999975


In [4]:
# "Turn the dial" up and down to find a smaller error

p_up = neural_network(input,weight+lr)
e_up = (p_up - true) ** 2
print("e_up:", e_up)

p_dn = neural_network(input,weight-lr)
e_dn = (p_dn - true) ** 2
print("e_down:", e_dn)

e_up: 0.004224999999999993
e_down: 0.05522499999999994


Results from 1 iteration:

- Original error: `0.022499999999999975`
- Dial up: `0.004224999999999993`
- Dial down: `0.05522499999999994`

`Dial up` reduced the error, so `weight + lr` will be the next weight!
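
The comparison above can be packaged into a single step. In this sketch, `hot_cold_step` and `squared_error` are hypothetical helper names, and keeping the current weight as a third candidate is an added assumption so the weight stays put when neither direction helps.

```python
def neural_network(input, weight):
    return input * weight


def squared_error(pred, true):
    return (pred - true) ** 2


def hot_cold_step(input, true, weight, lr):
    """Try weight - lr, weight, and weight + lr; return the one with the smallest error."""
    candidates = [weight - lr, weight, weight + lr]  # the middle candidate is an assumption
    return min(candidates,
               key=lambda w: squared_error(neural_network(input, w), true))


# Same data as the cells above: 8.5 toes, the team won (1), weight 0.1, lr 0.01
new_weight = hot_cold_step(input=8.5, true=1, weight=0.1, lr=0.01)
print("new weight:", new_weight)
```

With the values from this section, the step picks `weight + lr`, in line with the `e_up` and `e_down` results printed earlier.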

# Hot and cold learning

Hot and cold learning is easy to understand, but:

- It's inefficient.
- Sometimes it's impossible to hit the exact goal prediction, because the weight only moves in fixed steps.

The next example iterates the "dial" up and down (by `step_amount = 0.001`) 1101 times.

(We end up with `Error:1.0799505792475652e-27` and `Prediction:0.7999999999999672`)

In [5]:
weight = 0.5
input = 0.5
goal_prediction = 0.8
step_amount = 0.001  # How much to move the weights each iteration

for iteration in range(1101):  # Repeat learning many times so the error can keep getting smaller.
    
    prediction = input * weight
    error = (prediction - goal_prediction) ** 2

    if iteration%100 == 0:
        print("Iteration:" + str(iteration))
        print("\tError:" + str(error), "\tPrediction:" + str(prediction))

    up_prediction = input * (weight + step_amount)
    up_error = (goal_prediction - up_prediction) ** 2
    down_prediction = input * (weight - step_amount)
    down_error = (goal_prediction - down_prediction) ** 2
    
    if down_error < up_error:
        weight = weight - step_amount
    elif down_error > up_error:
        weight = weight + step_amount

print("Error:" + str(error) + " Prediction:" + str(prediction))


Iteration:0
	Error:0.30250000000000005 	Prediction:0.25
Iteration:100
	Error:0.25 	Prediction:0.30000000000000004
Iteration:200
	Error:0.20249999999999996 	Prediction:0.3500000000000001
Iteration:300
	Error:0.15999999999999992 	Prediction:0.40000000000000013
Iteration:400
	Error:0.1224999999999999 	Prediction:0.4500000000000002
Iteration:500
	Error:0.0899999999999999 	Prediction:0.5000000000000002
Iteration:600
	Error:0.06250000000000266 	Prediction:0.5499999999999947
Iteration:700
	Error:0.04000000000000434 	Prediction:0.5999999999999892
Iteration:800
	Error:0.0225000000000049 	Prediction:0.6499999999999837
Iteration:900
	Error:0.01000000000000437 	Prediction:0.6999999999999782
Iteration:1000
	Error:0.0025000000000027357 	Prediction:0.7499999999999727
Iteration:1100
	Error:1.0799505792475652e-27 	Prediction:0.7999999999999672
Error:1.0799505792475652e-27 Prediction:0.7999999999999672
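
The second drawback can be made concrete by choosing a step that cannot land on the goal. The sketch below reuses the loop above with an illustrative `step_amount = 0.2` (an assumed value, chosen only to make the effect visible): the goal needs `weight = 1.6`, but the weight can only visit values near 0.5, 0.7, ..., 1.5, 1.7.

```python
weight = 0.5
input = 0.5
goal_prediction = 0.8
step_amount = 0.2  # assumed value, deliberately too coarse to reach weight = 1.6

errors = []
for iteration in range(20):
    prediction = input * weight
    error = (prediction - goal_prediction) ** 2
    errors.append(error)

    up_error = (goal_prediction - input * (weight + step_amount)) ** 2
    down_error = (goal_prediction - input * (weight - step_amount)) ** 2

    if down_error < up_error:
        weight = weight - step_amount
    elif down_error > up_error:
        weight = weight + step_amount

print("Smallest error seen:", min(errors))
print("Final weight:", weight)
```

Once the weight gets close, it bounces between roughly 1.5 and 1.7 and the error stops improving, no matter how many iterations are added.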


# Calculating both direction and amount from error


> note: The `pure error` indicates the raw direction and amount you missed. If this is a positive number, you predicted too high, and vice versa. If this is a big number, you missed by a big amount, and so on.

```
pure error = (your_pred - goal_pred)
```

Next comes gradient descent: multiplying the pure error by `input` adds three effects (scaling, negative reversal, and stopping).

These three attributes have the combined effect of translating the pure error into the absolute amount you want to change weight. They do so by addressing three major edge cases where the pure error isn’t sufficient to make a good modification to weight.

```
direction_and_amount = (pred - goal_pred) * input
```
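
As a rough illustration of those three effects, the sketch below holds the pure error fixed and varies only `input`. The function name `direction_and_amount` mirrors the formula above; the sample values are assumptions chosen for readability.

```python
def direction_and_amount(pred, goal_pred, input):
    """Pure error multiplied by the input, as in the formula above."""
    return (pred - goal_pred) * input


pred, goal_pred = 1.0, 0.8  # the prediction is too high, so the pure error is positive

# Stopping: with input == 0 the weight had no effect on the prediction, so nothing changes.
print("input  0.0:", direction_and_amount(pred, goal_pred, 0.0))

# Negative reversal: a negative input flips the sign, so the weight moves the other way.
print("input -2.0:", direction_and_amount(pred, goal_pred, -2.0))

# Scaling: a larger input produces a proportionally larger weight update.
print("input  2.0:", direction_and_amount(pred, goal_pred, 2.0))
print("input  4.0:", direction_and_amount(pred, goal_pred, 4.0))
```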


In [6]:
weight = 0.5
goal_pred = 0.8
input = 0.5

for iteration in range(20):
    pred = input * weight
    error = (pred - goal_pred) ** 2
    direction_and_amount = (pred - goal_pred) * input  # gradient descent!!!
    weight = weight - direction_and_amount
    print("Error:" + str(error) + " Prediction:" + str(pred))

Error:0.30250000000000005 Prediction:0.25
Error:0.17015625000000004 Prediction:0.3875
Error:0.095712890625 Prediction:0.49062500000000003
Error:0.05383850097656251 Prediction:0.56796875
Error:0.03028415679931642 Prediction:0.6259765625
Error:0.0170348381996155 Prediction:0.669482421875
Error:0.00958209648728372 Prediction:0.70211181640625
Error:0.005389929274097089 Prediction:0.7265838623046875
Error:0.0030318352166796153 Prediction:0.7449378967285156
Error:0.0017054073093822882 Prediction:0.7587034225463867
Error:0.0009592916115275371 Prediction:0.76902756690979
Error:0.0005396015314842384 Prediction:0.7767706751823426
Error:0.000303525861459885 Prediction:0.7825780063867569
Error:0.00017073329707118678 Prediction:0.7869335047900676
Error:9.603747960254256e-05 Prediction:0.7902001285925507
Error:5.402108227642978e-05 Prediction:0.7926500964444131
Error:3.038685878049206e-05 Prediction:0.7944875723333098
Error:1.7092608064027242e-05 Prediction:0.7958656792499823
Error:9.614592036015323

# Calculating the weight delta and putting it on the weight


The main difference is that we now use a `delta` parameter.

> `delta` is a measurement of how much a node missed. Suppose the true prediction is 1.0 and the network’s prediction was 0.85, so the network was low by 0.15. Thus, delta is negative 0.15.

In [7]:
pred = neural_network(input,weight)

error = (pred - goal_pred) ** 2
delta = pred - goal_pred  # Here's the new delta!
weight_delta = input * delta  # measure of how much a weight caused the network to miss

alpha = 0.01  # A hyperparameter that controls how "fast" the NN learns
weight -= weight_delta * alpha
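
The cell above relies on variables left over from earlier cells. A self-contained sketch of the same update, repeated a few times, might look like this; `update_weight` is a hypothetical helper name and the starting values are assumptions borrowed from the earlier examples.

```python
def update_weight(input, weight, goal_pred, alpha):
    """One alpha-scaled update; returns the new weight and the error before the update."""
    pred = input * weight
    error = (pred - goal_pred) ** 2
    delta = pred - goal_pred       # how much the prediction missed
    weight_delta = input * delta   # how much this weight contributed to the miss
    return weight - alpha * weight_delta, error


weight = 0.5  # assumed starting value
errors = []
for _ in range(5):
    weight, error = update_weight(input=0.5, weight=weight, goal_pred=0.8, alpha=0.01)
    errors.append(error)
    print("Error:", error, "Weight:", weight)
```

With such a small `alpha` the error falls on every step, but only slightly.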

# Learning is just reducing error

**KEY POINT!!!**: For any `input` and `goal_pred`, an **exact relationship** is defined between `error` and `weight`, found by combining the prediction and error formulas. In this case:

$\text{error} = ((0.5 \times \text{weight}) - 0.8)^2$

So we need to find the `weight` where `error` is at its minimum.
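
One way to make this concrete, as a sketch rather than a full derivation: the error is a squared term, so it can never go below zero, and it reaches zero exactly when the term inside the square is zero.

$$
(0.5 \times \text{weight}) - 0.8 = 0 \quad\Longrightarrow\quad \text{weight} = \frac{0.8}{0.5} = 1.6
$$

Differentiating the same formula (a step from basic calculus that the text has not introduced, so treat it as an aside) gives the slope of the error at any weight:

$$
\frac{d\,\text{error}}{d\,\text{weight}} = 2 \times 0.5 \times \big((0.5 \times \text{weight}) - 0.8\big)
$$

This is twice `input * (pred - goal_pred)`, so `weight_delta` has the same sign as the slope, and it shrinks to zero as `weight` approaches 1.6.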

In [8]:
weight, goal_pred, input = (0.0, 0.8, 0.5)

for iteration in range(4):
    pred = input * weight
    error = (pred - goal_pred) ** 2
    delta = pred - goal_pred
    weight_delta = delta * input
    weight = weight - weight_delta
    print("Error:" + str(error) + " Prediction:" + str(pred))

Error:0.6400000000000001 Prediction:0.0
Error:0.3600000000000001 Prediction:0.2
Error:0.2025 Prediction:0.35000000000000003
Error:0.11390625000000001 Prediction:0.4625


> We’ll spend the rest of this book (and many deep learning researchers will spend the rest of their lives) trying everything you can imagine on that `pred` calculation so that **it can make good predictions**

# Alpha in code

Using the method we already know, we can run into a new problem: what happens when `input` is not well scaled?

Too large, too small, and so on.

Because `weight_delta = delta * input` depends on `input`, and the weight update is `weight = weight - weight_delta`, the final weight either changes far too much (`input = 2.0`), reaches the expected result (`input = 1.1`), or takes a VERY long time to reach the goal (`input = 0.1`)!

In [25]:
for input in [0.1, 1.1, 2.0]:
    weight = 0.0
    goal_pred = 0.8
    print("-"*40, f"For the input = {input}:", sep="\n")
    for iteration in range(5):
        pred = input * weight
        error = (pred - goal_pred) ** 2
        delta = pred - goal_pred
        weight_delta = delta * input
        weight = weight - weight_delta
        print(f"Iteration({iteration}):", 
              f"Error: {round(error, 2)}",
              f"Prediction: {round(pred, 2)}",
              sep="\n\t")
    print("-"*10,
          f"Final Prediction = {pred}",
          f"Final Error = {error}",
          sep="\n")


----------------------------------------
For the input = 0.1:
Iteration(0):
	Error: 0.64
	Prediction: 0.0
Iteration(1):
	Error: 0.63
	Prediction: 0.01
Iteration(2):
	Error: 0.61
	Prediction: 0.02
Iteration(3):
	Error: 0.6
	Prediction: 0.02
Iteration(4):
	Error: 0.59
	Prediction: 0.03
----------
Final Prediction = 0.031523192
Final Error = 0.5905566044338689
----------------------------------------
For the input = 1.1:
Iteration(0):
	Error: 0.64
	Prediction: 0.0
Iteration(1):
	Error: 0.03
	Prediction: 0.97
Iteration(2):
	Error: 0.0
	Prediction: 0.76
Iteration(3):
	Error: 0.0
	Prediction: 0.81
Iteration(4):
	Error: 0.0
	Prediction: 0.8
----------
Final Prediction = 0.798444152
Final Error = 2.4206629991042546e-06
----------------------------------------
For the input = 2.0:
Iteration(0):
	Error: 0.64
	Prediction: 0.0
Iteration(1):
	Error: 5.76
	Prediction: 3.2
Iteration(2):
	Error: 51.84
	Prediction: -6.4
Iteration(3):
	Error: 466.56
	Prediction: 22.4
Iteration(4):
	Error: 4199.04
	Predi

Now all we need to do is add `alpha`, like this:

```
weight = weight - (alpha * (input * (pred - goal_pred)))
```

And... guess the best possible value...
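
A rough way to judge a candidate `alpha` before running the loop: substituting the update rule into `pred = input * weight` shows that each iteration multiplies the miss `pred - goal_pred` by `1 - alpha * input ** 2`. The helper name `miss_multiplier` below is hypothetical, and this shortcut only holds for this one-input, one-weight network.

```python
def miss_multiplier(input, alpha):
    """Factor applied to (pred - goal_pred) by one update of a single-weight network."""
    return 1 - alpha * input ** 2


# alpha = 1 reproduces the runs without alpha; the other pairs are the ones used in this section
for input, alpha in [(0.1, 1), (1.1, 1), (2.0, 1), (0.1, 75), (2.0, 0.2)]:
    factor = miss_multiplier(input, alpha)
    verdict = "shrinks" if abs(factor) < 1 else "grows"
    print(f"input={input}, alpha={alpha}: "
          f"miss is multiplied by {factor:.4f} each step, so it {verdict}")
```

A multiplier whose absolute value is below 1 converges, and the closer it is to 0 the faster. For `input = 2.0` without `alpha` the multiplier is -3, which matches the error growing ninefold per iteration in the output above.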

In [38]:

for input, alpha in [(0.1, 75), (1.1, 1), (2.0, 0.2)]:
    weight = 0.0
    goal_pred = 0.8
    print("-"*40, f"For the input = {input}:", sep="\n")
    for iteration in range(5):
        pred = input * weight
        error = (pred - goal_pred) ** 2
        delta = pred - goal_pred
        weight_delta = delta * input
        weight = weight - alpha * weight_delta  # same as alpha * (input * (pred - goal_pred))
        print(f"Iteration({iteration}):", 
              f"Error: {round(error, 2)}",
              f"Prediction: {round(pred, 2)}",
              sep="\n\t")
    print("-"*10,
          f"Final Prediction = {pred}",
          f"Final Error = {error}",
          sep="\n")


----------------------------------------
For the input = 0.1:
Iteration(0):
	Error: 0.64
	Prediction: 0.0
Iteration(1):
	Error: 0.04
	Prediction: 0.6
Iteration(2):
	Error: 0.0
	Prediction: 0.75
Iteration(3):
	Error: 0.0
	Prediction: 0.79
Iteration(4):
	Error: 0.0
	Prediction: 0.8
----------
Final Prediction = 0.796875
Final Error = 9.765625000000278e-06
----------------------------------------
For the input = 1.1:
Iteration(0):
	Error: 0.64
	Prediction: 0.0
Iteration(1):
	Error: 0.03
	Prediction: 0.97
Iteration(2):
	Error: 0.0
	Prediction: 0.76
Iteration(3):
	Error: 0.0
	Prediction: 0.81
Iteration(4):
	Error: 0.0
	Prediction: 0.8
----------
Final Prediction = 0.798444152
Final Error = 2.4206629991042546e-06
----------------------------------------
For the input = 2.0:
Iteration(0):
	Error: 0.64
	Prediction: 0.0
Iteration(1):
	Error: 0.03
	Prediction: 0.64
Iteration(2):
	Error: 0.0
	Prediction: 0.77
Iteration(3):
	Error: 0.0
	Prediction: 0.79
Iteration(4):
	Error: 0.0
	Prediction: 0.8
-