# The mechanics of learning 
--- 
This chapter covers:
- Understanding how algorithms can learn from data
- Reframing learning as parameter estimation, using differentiation and gradient descent
- Walking through a simple learning algorithm
- How PyTorch supports learning with autograd

## Learning is just parameter estimation
---
We start with a temperature prediction model. Suppose we have a thermometer that reports readings in an unknown unit.   
Our training data consists of temperatures in Celsius paired with the corresponding readings in the unknown unit.   
The goal is to predict the temperature in Celsius from a reading in the unknown unit.

In [11]:
import torch

t_c = [0.5, 14.0, 15.0, 28.0, 11.0, 8.0, 3.0, -4.0, 6.0, 13.0, 21.0]
t_u = [35.7, 55.9, 58.2, 81.9, 56.3, 48.9, 33.9, 21.8, 48.4, 60.4, 68.4]
t_c = torch.tensor(t_c)
t_u = torch.tensor(t_u)

In [4]:
# define the model as a plain function rather than an nn.Module
def model(t_u, w, b):
    return w * t_u + b

In [5]:
# define the loss function as the mean squared error
def loss_fn(t_p, t_c):
    squared_diffs = (t_p - t_c) ** 2
    return squared_diffs.mean()
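
In equation form, the two cells above amount to a linear model and a mean squared error loss. The symbol $N$ (the number of samples, 11 for this data) and the index $i$ are introduced only for this note:

$$
t_{p,i} = w \, t_{u,i} + b,
\qquad
L(w, b) = \frac{1}{N} \sum_{i=1}^{N} \left( t_{p,i} - t_{c,i} \right)^2
$$

Learning then means finding the values of $w$ and $b$ that make $L(w, b)$ as small as possible.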

In [12]:
w = torch.ones(())
b = torch.zeros(())
print(w, b)

t_p = model(t_u, w, b)
t_p

tensor(1.) tensor(0.)


tensor([35.7000, 55.9000, 58.2000, 81.9000, 56.3000, 48.9000, 33.9000, 21.8000,
        48.4000, 60.4000, 68.4000])

In [13]:
loss = loss_fn(t_p, t_c)
loss

tensor(1763.8848)

In [14]:
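# estimate the rate of change of the loss with respect to w
# using a central finite difference around the current w

delta = 0.1
loss_rate_of_change_w = \
    (loss_fn(model(t_u, w + delta, b), t_c) -
     loss_fn(model(t_u, w - delta, b), t_c)) / (2.0 * delta)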
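
The cell above estimates how the loss changes when `w` is nudged. As a minimal sketch of how such an estimate could drive one gradient descent step, the block below repeats the idea for both `w` and `b` using a central difference, then moves each parameter against its rate of change. It makes several assumptions that are not in the notebook: it uses plain Python lists instead of tensors so that every arithmetic step is visible, the helper `rate_of_change` is a hypothetical name, and the `learning_rate` value is chosen only for illustration.

```python
# Plain-Python stand-ins for the notebook's tensors (an assumption of this
# sketch, so that it runs without PyTorch and shows every arithmetic step).
t_c = [0.5, 14.0, 15.0, 28.0, 11.0, 8.0, 3.0, -4.0, 6.0, 13.0, 21.0]
t_u = [35.7, 55.9, 58.2, 81.9, 56.3, 48.9, 33.9, 21.8, 48.4, 60.4, 68.4]


def model(t_u, w, b):
    # Same linear model as in the notebook, applied element by element.
    return [w * u + b for u in t_u]


def loss_fn(t_p, t_c):
    # Mean squared error between predictions and targets.
    return sum((p - c) ** 2 for p, c in zip(t_p, t_c)) / len(t_c)


def rate_of_change(w, b, delta=0.1):
    # Hypothetical helper: central finite differences, nudging one
    # parameter at a time by +delta and -delta.
    d_w = (loss_fn(model(t_u, w + delta, b), t_c)
           - loss_fn(model(t_u, w - delta, b), t_c)) / (2.0 * delta)
    d_b = (loss_fn(model(t_u, w, b + delta), t_c)
           - loss_fn(model(t_u, w, b - delta), t_c)) / (2.0 * delta)
    return d_w, d_b


w, b = 1.0, 0.0
learning_rate = 1e-4  # illustrative value, not taken from the notebook

loss_before = loss_fn(model(t_u, w, b), t_c)
d_w, d_b = rate_of_change(w, b)

# One gradient descent step: move each parameter against its rate of change.
w = w - learning_rate * d_w
b = b - learning_rate * d_b
loss_after = loss_fn(model(t_u, w, b), t_c)

print(loss_before, loss_after)
```

Run this separately from the notebook cells, since it reuses the names `t_c`, `t_u`, `model`, `loss_fn`, `w`, and `b` with list-based versions. The rate of change with respect to `w` is much larger than the one with respect to `b` for this data, which is why a small learning rate is used here.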

The following steps are simple enough that I will skip them.