<a href="https://colab.research.google.com/github/Noodle96/Topicos_Inteligencia_Artificial/blob/main/introduccion_deep_learning_with_pytorch/05_using_derivates_to_update_the_model.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [76]:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn import CrossEntropyLoss

**Model and model description**

In [77]:
# Create the model and print its architecture
model = nn.Sequential(
    nn.Linear(5, 3),
    nn.Linear(3, 4),
    nn.Linear(4, 2)
)

print(model)

Sequential(
  (0): Linear(in_features=5, out_features=3, bias=True)
  (1): Linear(in_features=3, out_features=4, bias=True)
  (2): Linear(in_features=4, out_features=2, bias=True)
)
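As a quick check on the printed architecture, the sketch below counts the parameters per layer. It rebuilds the same `nn.Sequential` so that it runs on its own; the names `counts` and `total` are illustrative only.

```python
import torch.nn as nn

# Same architecture as the model above
model = nn.Sequential(
    nn.Linear(5, 3),
    nn.Linear(3, 4),
    nn.Linear(4, 2),
)

# nn.Linear(in_features, out_features) stores a weight of shape
# (out_features, in_features) and a bias of shape (out_features,)
counts = {name: p.numel() for name, p in model.named_parameters()}
total = sum(counts.values())

print(counts)
print(total)  # 15 + 3 + 12 + 4 + 8 + 2 = 44
```

Each layer contributes `in_features * out_features + out_features` values, which is why the first layer holds 18 of the 44 parameters.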


In [78]:
for name, param in model.named_parameters():
    print(f'''name: {name}, param: {param}, shape: {param.shape}\n''')

name: 0.weight, param: Parameter containing:
tensor([[-0.0839, -0.4413, -0.3870, -0.3310,  0.2520],
        [-0.0645, -0.4109,  0.1575, -0.1019,  0.2477],
        [ 0.2617, -0.2029,  0.4198, -0.1179, -0.2146]], requires_grad=True), shape: torch.Size([3, 5])

name: 0.bias, param: Parameter containing:
tensor([ 0.0392, -0.3165, -0.1976], requires_grad=True), shape: torch.Size([3])

name: 1.weight, param: Parameter containing:
tensor([[-0.0545,  0.2838, -0.3261],
        [ 0.2636, -0.4711,  0.3656],
        [-0.3387,  0.1374, -0.1095],
        [-0.1880, -0.4313, -0.4147]], requires_grad=True), shape: torch.Size([4, 3])

name: 1.bias, param: Parameter containing:
tensor([-0.2977, -0.1985,  0.0346,  0.2299], requires_grad=True), shape: torch.Size([4])

name: 2.weight, param: Parameter containing:
tensor([[ 0.4224,  0.4897, -0.4568, -0.1278],
        [ 0.0516,  0.1470, -0.0409,  0.3786]], requires_grad=True), shape: torch.Size([2, 4])

name: 2.bias, param: Parameter containing:
tensor([-0.18

In [79]:
sample = torch.randn(5)
sample

tensor([-1.7776, -0.7555, -2.0096,  1.2856,  0.4624])

> The `prediction` tensor is computed from every weight w_ij and bias b_i in the model, so it stays linked to them through the autograd graph (note its `grad_fn`).

In [80]:
prediction = model(sample)
prediction

tensor([-0.3724,  0.0762], grad_fn=<ViewBackward0>)

> Recall that `model[i]` refers to the **weights** and **biases** between layer[i] and layer[i+1]
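To make the role of each `model[i]` concrete, here is a minimal sketch that reproduces the forward pass by hand, applying `W x + b` layer by layer. The seed and the variable `manual_prediction` are assumptions for illustration; the random values will differ from the ones printed in this notebook.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only for repeatability

model = nn.Sequential(nn.Linear(5, 3), nn.Linear(3, 4), nn.Linear(4, 2))
sample = torch.randn(5)

# Apply each layer by hand: y = W x + b
x = sample
for layer in model:
    x = layer.weight @ x + layer.bias

manual_prediction = x
prediction = model(sample)

# Both paths should agree up to floating-point rounding
print(torch.allclose(manual_prediction, prediction, atol=1e-6))
```

Because every step uses the layer's own `weight` and `bias` tensors, the result carries a `grad_fn` and can be differentiated with respect to them.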

In [81]:
grad_before = model[0].weight.grad
print(grad_before) # None

None


In [82]:
grad_before_bias = model[0].bias.grad
print(grad_before_bias) # None

None


In [83]:
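# Define the loss function and a target
criterion = CrossEntropyLoss()
# Note: a float target is interpreted as class probabilities; random normal
# values are not a valid distribution, so the resulting loss can be negative
target = torch.randn(size=(2,))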
print(target)

tensor([-0.4196,  0.4259])


> The `loss` tensor is also linked to the model, because `prediction`, from which it is computed, references the model's parameters.

In [84]:
# Calculate the loss
loss = criterion(prediction, target)
print(loss)
# Compute the gradients
loss.backward()

tensor(-0.1851, grad_fn=<DivBackward1>)
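The negative loss value can be reproduced by hand. The sketch below assumes the `prediction` and `target` values printed above (copied with four decimals and given a batch dimension of 1) and compares a manual computation with `F.cross_entropy`. The class-index target at the end is a hypothetical alternative, not something this notebook uses.

```python
import torch
import torch.nn.functional as F

# Values copied from the outputs above, with a batch dimension of 1 added
prediction = torch.tensor([[-0.3724, 0.0762]])
target_probs = torch.tensor([[-0.4196, 0.4259]])

# With a floating-point target, cross-entropy is -sum(target * log_softmax(input))
manual_loss = -(target_probs * F.log_softmax(prediction, dim=1)).sum()
library_loss = F.cross_entropy(prediction, target_probs)

# A class-index target (class 1 chosen arbitrarily) gives -log_softmax[1],
# which cannot be negative
index_loss = F.cross_entropy(prediction, torch.tensor([1]))

print(manual_loss.item(), library_loss.item(), index_loss.item())
```

The first two values should match the loss printed above up to rounding. The loss is negative only because the random target contains a negative entry; with a class index or a valid probability vector, cross-entropy stays at or above zero.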


In [85]:
grad_after = model[0].weight.grad
print(grad_after)

tensor([[-0.2297, -0.0976, -0.2597,  0.1661,  0.0597],
        [-0.0788, -0.0335, -0.0891,  0.0570,  0.0205],
        [-0.1950, -0.0829, -0.2204,  0.1410,  0.0507]])
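The before/after behaviour of `.grad` can be checked for all parameters at once. This is a minimal sketch on a freshly built model; the batch of one sample and the class-index target are assumptions made so that the example is self-contained.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(5, 3), nn.Linear(3, 4), nn.Linear(4, 2))
sample = torch.randn(1, 5)       # batch containing one sample
target = torch.tensor([1])       # hypothetical class index

# Before backward(), no parameter has a gradient yet
grads_before = [p.grad for p in model.parameters()]

loss = nn.CrossEntropyLoss()(model(sample), target)
loss.backward()

all_none_before = all(g is None for g in grads_before)
# After backward(), each gradient has the same shape as its parameter
shapes_match = all(p.grad.shape == p.shape for p in model.parameters())

print(all_none_before, shapes_match)
```

The matching shapes are what make the element-wise update `parameter - lr * gradient` possible.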


**Accessing each layer's gradients**

In [86]:
# Access each layer's gradients
model[0].weight.grad, model[0].bias.grad

(tensor([[-0.2297, -0.0976, -0.2597,  0.1661,  0.0597],
         [-0.0788, -0.0335, -0.0891,  0.0570,  0.0205],
         [-0.1950, -0.0829, -0.2204,  0.1410,  0.0507]]),
 tensor([0.1292, 0.0443, 0.1097]))

In [87]:
model[1].weight.grad, model[1].bias.grad

(tensor([[ 0.1550, -0.0351, -0.2510],
         [ 0.1433, -0.0324, -0.2320],
         [-0.1739,  0.0394,  0.2815],
         [-0.2117,  0.0479,  0.3428]]),
 tensor([ 0.1565,  0.1446, -0.1755, -0.2137]))

In [88]:
model[2].weight.grad, model[2].bias.grad

(tensor([[ 0.0455, -0.1765, -0.0659,  0.3400],
         [-0.0455,  0.1765,  0.0659, -0.3400]]),
 tensor([ 0.4220, -0.4220]))
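The printed gradients follow a pattern that comes from the chain rule: for a linear layer `y = W x + b`, the bias gradient equals the gradient with respect to `y`, and the weight gradient is the outer product of that vector with the layer input `x`. For instance, in the first layer above, `0.1292 * -1.7776` is approximately `-0.2297`. The sketch below checks this relation on a fresh model; the seed, batch shape, and class-index target are illustrative assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only for repeatability

model = nn.Sequential(nn.Linear(5, 3), nn.Linear(3, 4), nn.Linear(4, 2))
sample = torch.randn(1, 5)       # batch containing one sample
target = torch.tensor([1])       # hypothetical class index

loss = nn.CrossEntropyLoss()(model(sample), target)
loss.backward()

# First layer: dL/dW = outer(dL/db, x), where x is the layer input
first = model[0]
expected = torch.outer(first.bias.grad, sample[0])

print(torch.allclose(first.weight.grad, expected, atol=1e-6))
```

The same relation explains why the two rows of `model[2].weight.grad` above are negatives of each other: the two entries of `model[2].bias.grad` are.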

**Updating model parameters**
*   Update the weights by subtracting local gradients scaled by the **learning rate**
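As a worked instance of the rule `w_new = w - lr * grad`, the arithmetic below applies it to a single entry, using `model[0].weight[0, 0]` and its gradient as printed earlier (four-decimal values, so the result is approximate).

```python
lr = 0.001

# Values copied from the outputs above: model[0].weight[0, 0] and its gradient
w = -0.0839
grad = -0.2297

# A negative gradient means increasing w lowers the loss, so w moves up slightly
w_new = w - lr * grad
print(round(w_new, 4))  # -0.0837
```

With a small learning rate the change is on the order of `1e-4`, which is why most entries look almost unchanged after one step.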

In [89]:
# Learning rate is typically small
lr = 0.001
# Update the weights
weight = model[0].weight
print("weight before: ", weight)
weight_grad = model[0].weight.grad
# Note: this builds a new tensor; model[0].weight itself is not modified
weight = weight - lr * weight_grad
print("weight after: ", weight)

weight before:  Parameter containing:
tensor([[-0.0839, -0.4413, -0.3870, -0.3310,  0.2520],
        [-0.0645, -0.4109,  0.1575, -0.1019,  0.2477],
        [ 0.2617, -0.2029,  0.4198, -0.1179, -0.2146]], requires_grad=True)
weight after:  tensor([[-0.0837, -0.4412, -0.3867, -0.3311,  0.2520],
        [-0.0644, -0.4109,  0.1576, -0.1019,  0.2477],
        [ 0.2619, -0.2028,  0.4201, -0.1180, -0.2147]], grad_fn=<SubBackward0>)
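Note that the "weight after" tensor carries `grad_fn=<SubBackward0>`: the subtraction produced a new tensor, and the parameter stored in the model kept its old values. One common way to change the model itself is an in-place update inside `torch.no_grad()`. The sketch below contrasts the two on a fresh model; the class-index target and the names `rebound`, `unchanged`, and `updated` are illustrative.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(5, 3), nn.Linear(3, 4), nn.Linear(4, 2))
loss = nn.CrossEntropyLoss()(model(torch.randn(1, 5)), torch.tensor([1]))
loss.backward()

lr = 0.001
weight = model[0].weight
old_values = weight.detach().clone()

# Out-of-place: creates a new tensor, the model parameter is untouched
rebound = weight - lr * weight.grad
unchanged = torch.equal(model[0].weight, old_values)

# In-place update, kept out of the autograd graph
with torch.no_grad():
    weight -= lr * weight.grad

updated = torch.allclose(model[0].weight, old_values - lr * weight.grad)
print(unchanged, updated)
```

After the in-place step, `model[0].weight` is still a leaf `Parameter` with `requires_grad=True`, so it can take part in the next forward and backward pass.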


In [90]:
# Update the biases
bias = model[0].bias
print("bias before: ", bias)
bias_grad = model[0].bias.grad
# Note: this builds a new tensor; model[0].bias itself is not modified
bias = bias - lr * bias_grad
print("bias after: ", bias)


bias before:  Parameter containing:
tensor([ 0.0392, -0.3165, -0.1976], requires_grad=True)
bias after:  tensor([ 0.0391, -0.3165, -0.1977], grad_fn=<SubBackward0>)
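The same rule can be applied to every parameter in one loop. The sketch below does that and, as a sanity check, compares the result with one step of `torch.optim.SGD` (plain SGD, no momentum) on an identical copy of the model. The seed, the `reference` copy, and the class-index target are assumptions for illustration.

```python
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only for repeatability

model = nn.Sequential(nn.Linear(5, 3), nn.Linear(3, 4), nn.Linear(4, 2))
reference = copy.deepcopy(model)  # identical copy, updated by the optimizer

sample = torch.randn(1, 5)        # batch containing one sample
target = torch.tensor([1])        # hypothetical class index
criterion = nn.CrossEntropyLoss()
lr = 0.001

# Manual update of every weight and bias
criterion(model(sample), target).backward()
with torch.no_grad():
    for param in model.parameters():
        param -= lr * param.grad

# The same step performed by the built-in optimizer
optimizer = torch.optim.SGD(reference.parameters(), lr=lr)
criterion(reference(sample), target).backward()
optimizer.step()

same = all(
    torch.allclose(p, q, atol=1e-7)
    for p, q in zip(model.parameters(), reference.parameters())
)
print(same)
```

In a training loop the gradients would also need to be reset between steps (for example with `optimizer.zero_grad()`), since `backward()` accumulates into `.grad`.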
