# Calculating Both Direction and Amount from Error

In [1]:
weight = 0.5
goal_pred = 0.8
input = 0.5

for iteration in range(20):
    pred = input * weight
    error = (pred - goal_pred) ** 2
    direction_and_amount = (pred - goal_pred) * input
    weight = weight - direction_and_amount
    
    print("Error: " + str(error) + "\tPrediction: " + str(pred) + "\tdirection_and_amount: " + str(direction_and_amount))

Error: 0.30250000000000005	Prediction: 0.25	direction_and_amount: -0.275
Error: 0.17015625000000004	Prediction: 0.3875	direction_and_amount: -0.20625000000000002
Error: 0.095712890625	Prediction: 0.49062500000000003	direction_and_amount: -0.1546875
Error: 0.05383850097656251	Prediction: 0.56796875	direction_and_amount: -0.11601562500000001
Error: 0.03028415679931642	Prediction: 0.6259765625	direction_and_amount: -0.08701171875000002
Error: 0.0170348381996155	Prediction: 0.669482421875	direction_and_amount: -0.06525878906250004
Error: 0.00958209648728372	Prediction: 0.70211181640625	direction_and_amount: -0.04894409179687503
Error: 0.005389929274097089	Prediction: 0.7265838623046875	direction_and_amount: -0.03670806884765626
Error: 0.0030318352166796153	Prediction: 0.7449378967285156	direction_and_amount: -0.02753105163574221
Error: 0.0017054073093822882	Prediction: 0.7587034225463867	direction_and_amount: -0.020648288726806685
Error: 0.0009592916115275371	Prediction: 0.76902756690979	d

**What is the direction_and_amount?**

It represents how we want to change our weight. It has two parts. The first is what we
call the "pure error", which equals (pred - goal_pred). This number represents "the raw
direction and amount that we missed". The second part is the multiplication by the
input, which performs scaling, negative reversal, and stopping, modifying the "pure
error" so that it's ready to update our weight.
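
As a minimal sketch, the two parts can be computed on separate lines using the starting values from the first cell. The name `pure_error` is introduced here only for illustration; the original cell folds both parts into one expression.

```python
# Starting values from the first cell
weight, goal_pred, input = 0.5, 0.8, 0.5

pred = input * weight                      # prediction before any update
pure_error = pred - goal_pred              # part 1: raw direction and amount of the miss
direction_and_amount = pure_error * input  # part 2: the input scales, reverses, or stops it

print("pure_error:", pure_error)
print("direction_and_amount:", direction_and_amount)
```

The second value should agree with the first `direction_and_amount` printed by the loop above, up to floating-point rounding.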


**What is the "pure error"?**

It's the (pred - goal_pred), which indicates "the raw direction and amount that we
missed". If this is a positive number, then we predicted too high, and vice versa. If
this is a big number, then we missed by a big amount.
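
A small sketch of how to read the sign and size of the pure error. The helper `describe_miss` and the sample predictions are hypothetical and used only for illustration.

```python
def describe_miss(pred, goal_pred):
    """Hypothetical helper: interpret the sign of the pure error."""
    pure_error = pred - goal_pred
    if pure_error > 0:
        return pure_error, "predicted too high"
    if pure_error < 0:
        return pure_error, "predicted too low"
    return pure_error, "exactly on target"

goal_pred = 0.8
for pred in (1.0, 0.25, 0.8):
    pure_error, verdict = describe_miss(pred, goal_pred)
    print(f"pred={pred}: pure error {pure_error:+.2f} ({verdict})")
```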


**What is "scaling, negative reversal, and stopping"?**

These three attributes have the combined effect of translating our "pure error" into
"the absolute amount that we want to change our weight". They do so by addressing
three major edge cases in which the "pure error" is not sufficient to make a
good modification to our weight.

**What is "stopping"?**

This is the first (and simplest) effect on our "pure error" caused by multiplying it
by our input. Imagine plugging a CD player into your stereo. If you turned the
volume all the way up but the CD player was off, it simply wouldn't matter.
"Stopping" addresses this in our neural network: if our input is 0, then it will force
our direction_and_amount to also be 0. We don't learn (i.e., "change the volume") when
our input is 0 because there's nothing to learn: every weight value has the same
error, and moving it makes no difference because the pred is always 0.
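
The following sketch checks the claim for a zero input. The three weight values are arbitrary choices for illustration; `goal_pred` is taken from the first cell.

```python
goal_pred = 0.8
input = 0.0  # the CD player is off

results = []
for weight in (0.1, 0.5, 10.0):  # arbitrary weights for illustration
    pred = input * weight
    error = (pred - goal_pred) ** 2
    direction_and_amount = (pred - goal_pred) * input
    results.append((weight, pred, error, direction_and_amount))
    print(f"weight={weight}: pred={pred}, error={error:.2f}, "
          f"update is zero: {direction_and_amount == 0}")
```

Every weight gives the same prediction and the same error, and the computed update is zero each time, so the weight never moves.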

**What is "negative reversal"?**

This is probably our most difficult and important effect. Normally (when input is
positive), moving our weight upward makes our prediction move upward. However, if
our input is negative, then all of a sudden our weight changes direction!
When our input is negative, moving our weight up makes the prediction go
down. It's reversed! How do we address this? Well, multiplying our "pure error" by
our input will reverse the sign of our direction_and_amount in the event that our
input is negative. This is "negative reversal", ensuring that our weight moves in
the correct direction, even if the input is negative.
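
As a rough illustration, the sketch below applies one update with a positive input and one with a negative input of the same size. The helper `one_step` and the input values of 0.5 and -0.5 are assumptions made for this example; `weight` and `goal_pred` come from the first cell.

```python
def one_step(weight, input, goal_pred):
    """Hypothetical helper: apply one weight update and report the result."""
    pred = input * weight
    direction_and_amount = (pred - goal_pred) * input
    new_weight = weight - direction_and_amount
    return pred, new_weight, input * new_weight

goal_pred, weight = 0.8, 0.5
for input in (0.5, -0.5):
    pred, new_weight, new_pred = one_step(weight, input, goal_pred)
    move = "up" if new_weight > weight else "down"
    print(f"input={input:+}: weight moved {move} ({weight} -> {new_weight:.3f}), "
          f"pred {pred:+.4f} -> {new_pred:+.4f}")
```

In both cases the prediction starts below the goal. With the positive input the weight moves up, with the negative input it moves down, and in both cases the new prediction lands closer to `goal_pred`.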

**What is "scaling"?**

Scaling is the third effect on our "pure error" caused by multiplying it by our
input. Logically, it means that if our input is big, our weight update should also be
big. This is more of a "side effect", as it can often go out of control. Later, we will
use alpha to address when this scaling goes out of control.
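
One way to see when scaling goes out of control is a short derivation that is not part of the original text. Substituting the update into the next prediction gives new_pred - goal_pred = (1 - alpha * input**2) * (pred - goal_pred), where alpha is 1 when no alpha is used. The sketch below evaluates that factor for the (input, alpha) pairs used in this section's cells; the helper name `shrink_factor` is hypothetical.

```python
def shrink_factor(input, alpha=1.0):
    """Factor applied to (pred - goal_pred) by one weight update.

    From new_pred = input * (weight - alpha * (pred - goal_pred) * input):
    new_pred - goal_pred = (1 - alpha * input**2) * (pred - goal_pred)
    """
    return 1 - alpha * input ** 2

# (input, alpha) pairs taken from the cells in this section
for input, alpha in ((0.5, 1.0), (1.5, 1.0), (8, 0.01), (5, 0.01)):
    factor = shrink_factor(input, alpha)
    status = "converges" if abs(factor) < 1 else "does not converge"
    print(f"input={input}, alpha={alpha}: factor={factor:+.2f} ({status})")
```

A factor with magnitude below 1 shrinks the miss on every step. A magnitude above 1 makes each miss larger than the last, and a negative factor flips its sign each time.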

In [2]:
weight, goal_pred, input = (0.5, 0.8, 1.5)

for iteration in range(20):
    pred = input * weight
    error = (pred - goal_pred) ** 2
    delta = pred - goal_pred
    weight_delta = delta * input
    weight = weight - weight_delta
        
    # print("Error: " + str(error) + " Prediction: " + str(pred))
    print(f"weight_delta: {weight_delta}")

weight_delta: -0.07500000000000007
weight_delta: 0.09375
weight_delta: -0.1171875
weight_delta: 0.146484375
weight_delta: -0.18310546875
weight_delta: 0.2288818359375
weight_delta: -0.286102294921875
weight_delta: 0.35762786865234375
weight_delta: -0.4470348358154297
weight_delta: 0.5587935447692871
weight_delta: -0.6984919309616088
weight_delta: 0.8731149137020108
weight_delta: -1.0913936421275132
weight_delta: 1.3642420526593917
weight_delta: -1.7053025658242398
weight_delta: 2.1316282072803
weight_delta: -2.6645352591003753
weight_delta: 3.3306690738754696
weight_delta: -4.163336342344338
weight_delta: 5.204170427930423


In [3]:
alpha, weight, goal_pred, input = (0.01, 0.5, 0.8, 8)

for iteration in range(20):
    pred = input * weight
    error = (pred - goal_pred) ** 2  # squared error against the goal, not the input
    derivative = (pred - goal_pred) * input
    weight = weight - alpha * derivative

    print(f"Error: {error:.3e} Prediction: {pred}")
    print(f"weight: {weight}")

Error: 1.024e+01 Prediction: 4.0
weight: 0.244
Error: 1.327e+00 Prediction: 1.952
weight: 0.15184
Error: 1.720e-01 Prediction: 1.21472
weight: 0.1186624
Error: 2.229e-02 Prediction: 0.9492992
weight: 0.106718464
Error: 2.889e-03 Prediction: 0.853747712
weight: 0.10241864704
Error: 3.744e-04 Prediction: 0.81934917632
weight: 0.1008707129344
Error: 4.852e-05 Prediction: 0.8069657034752
weight: 0.100313456656384
Error: 6.288e-06 Prediction: 0.802507653251072
weight: 0.10011284439629824
Error: 8.150e-07 Prediction: 0.8009027551703859
weight: 0.10004062398266737
Error: 1.056e-07 Prediction: 0.800324991861339
weight: 0.10001462463376026
Error: 1.369e-08 Prediction: 0.8001169970700821
weight: 0.1000052648681537
Error: 1.774e-09 Prediction: 0.8000421189452296
weight: 0.10000189535253534
Error: 2.299e-10 Prediction: 0.8000151628202827
weight: 0.10000068232691273
Error: 2.980e-11 Prediction: 0.80000545861

In [4]:
alpha = 0.01
weight = 0.5
input = 5
goal = 0.8

for i in range(20):
    pred = input * weight
    error = (pred - goal) ** 2
    derivative = (pred - goal) * input
    weight = weight - (derivative * alpha)
    
    print(pred)

2.5
2.0749999999999997
1.75625
1.5171875000000001
1.337890625
1.20341796875
1.1025634765625
1.026922607421875
0.9701919555664062
0.9276439666748048
0.8957329750061036
0.8717997312545777
0.8538497984409332
0.8403873488307
0.830290511623025
0.8227178837172687
0.8170384127879515
0.8127788095909637
0.8095841071932228
0.8071880803949171
