#### Exercise 1: "Using a GPU" (optional)

Run the previous code on a GPU, using a T4 on Google Colab.

In [2]:
import torch
import numpy as np

In [6]:
A = torch.rand(1000,1000)
B = torch.rand(1000,1000)

In [7]:
%timeit A@B

11.9 ms ± 423 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [8]:
if torch.cuda.is_available():
    A = A.cuda()
    B = B.cuda()
    print("CUDA available")
else:
    print("CUDA not available")

CUDA not available


In [9]:
%timeit A@B

12.2 ms ± 1.04 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


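In this run the output above reports `CUDA not available`, so both measurements were taken on the CPU and are essentially identical (11.9 ms vs. 12.2 ms). On a GPU such as the T4, the multiplication would be expected to run considerably faster than on the CPU.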
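The comparison can also be timed outside `%timeit`. The following is a minimal sketch that works whether or not a GPU is attached. CUDA kernels are launched asynchronously, so it calls `torch.cuda.synchronize()` before reading the clock. The helper name `time_matmul` and its parameters are illustrative and not part of the original exercise.

```python
import time
import torch

def time_matmul(device, size=1000, repeats=10):
    """Average wall-clock time of one (size x size) matrix product on `device`."""
    A = torch.rand(size, size, device=device)
    B = torch.rand(size, size, device=device)
    A @ B  # warm-up run, excluded from the measurement
    if device.type == "cuda":
        torch.cuda.synchronize()  # wait for pending GPU work before starting the clock
    start = time.perf_counter()
    for _ in range(repeats):
        A @ B
    if device.type == "cuda":
        torch.cuda.synchronize()  # wait for the GPU to finish before stopping the clock
    return (time.perf_counter() - start) / repeats

cpu_time = time_matmul(torch.device("cpu"))
print(f"CPU: {cpu_time * 1e3:.2f} ms per product")

if torch.cuda.is_available():
    gpu_time = time_matmul(torch.device("cuda"))
    print(f"GPU: {gpu_time * 1e3:.2f} ms per product")
else:
    print("CUDA not available; only the CPU timing was measured")
```

If the second branch prints, the runtime has no usable GPU and any "GPU" timing taken in that session is really a CPU timing.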

#### Exercise 2: "Computing the gradient" (optional)

Given the following definitions:

$a = 2$

$b = 3$

$c = -1$

$d = 7$

$m = 3 \times a + 4 \times b - 2 \times c + 10 \times d$

$n = 7 \times a + b + c + 4 \times d$

$p = max(m,0)$

$q = max(n,0)$

$z = p - q$

$ pred = \frac{1}{1 + e^{-x}}$

$loss = log(pred)$

Compute:

1. The computation graph (an arrow diagram like the one above, showing the dependency relationships between the variables).
2. The derivatives $\large \frac{\partial loss}{\partial a}$, $\large \frac{\partial loss}{\partial b}$, $\large \frac{\partial loss}{\partial c}$, $\large \frac{\partial loss}{\partial d}$ by hand.
3. The derivatives $\large \frac{\partial loss}{\partial a}$, $\large \frac{\partial loss}{\partial b}$, $\large \frac{\partial loss}{\partial c}$, $\large \frac{\partial loss}{\partial d}$ using PyTorch Autograd (`backward`).
4. An approximation of the derivatives $\large \frac{\partial loss}{\partial a}$, $\large \frac{\partial loss}{\partial b}$, $\large \frac{\partial loss}{\partial c}$, $\large \frac{\partial loss}{\partial d}$ via the ratios of change $\large \frac{\Delta loss}{\Delta a}$, $\large \frac{\Delta loss}{\Delta b}$, $\large \frac{\Delta loss}{\Delta c}$, $\large \frac{\Delta loss}{\Delta d}$, using small variations $\Delta a$, $\Delta b$, $\Delta c$, $\Delta d$.

In [3]:
# Exercise 2

# Define the input tensors
a = torch.tensor(2.0, requires_grad=True)
b = torch.tensor(3.0, requires_grad=True)
c = torch.tensor(-1.0, requires_grad=True)
d = torch.tensor(7.0, requires_grad=True)

# Note: as written, this line evaluates 3a - 8c + 10d (the factor b after `4 *`
# in the definition of m is missing); the recorded outputs correspond to this expression.
m = 3 * a + 4 * - 2 * c + 10 * d
n = 7 * a + b + c + 4 * d

# max(., 0) is an element-wise ReLU; torch.max(m, 0) would instead reduce along dim 0
p = torch.relu(m)
q = torch.relu(n)

z = p - q


In [4]:
# Define the functions (with the corrections): pred is the sigmoid, loss is its logarithm
def pred(x):
    return 1 / (1 + torch.exp(-x))

def loss(x):
    return torch.log(pred(x))


In [5]:
# Retain the gradients of the intermediate tensors before backpropagation
for t in [a, b, c, d, n, m, p, q, z]:
    t.retain_grad()

# Run backpropagation
z.backward()
z

tensor(40., grad_fn=<SubBackward0>)

In [6]:
x = torch.tensor(1.0, requires_grad=True)  # Additional variable x
pred1 = pred(x)
loss1 = loss(x)

pred1, loss1

(tensor(0.7311, grad_fn=<MulBackward0>),
 tensor(-0.3133, grad_fn=<LogBackward0>))
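The cell above evaluates `pred` and `loss` at an independent value `x`, so the result does not depend on `a`, `b`, `c` or `d`. The sketch below connects them instead. It rests on an assumption: that the exercise intends `pred` to be applied to `z`. It uses the definition of `m` as stated in the exercise (including the `4 * b` term) and `torch.nn.functional.logsigmoid`, which computes the same quantity as `log(sigmoid(z))`. The name `loss_z` is illustrative.

```python
import torch
import torch.nn.functional as F

# Same input values as in the exercise, in float64
a, b, c, d = (torch.tensor(v, dtype=torch.float64, requires_grad=True)
              for v in (2.0, 3.0, -1.0, 7.0))

m = 3 * a + 4 * b - 2 * c + 10 * d   # definition of m as stated in the exercise
n = 7 * a + b + c + 4 * d
z = torch.relu(m) - torch.relu(n)    # p - q with p = max(m, 0), q = max(n, 0)

# Assumption: pred = sigmoid(z), so loss = log(sigmoid(z))
loss_z = F.logsigmoid(z)
loss_z.backward()

grads = torch.stack([a.grad, b.grad, c.grad, d.grad])
print("z =", z.item())
print("d loss / d (a, b, c, d) =", grads)
```

Because `z` is large and positive here, the sigmoid is saturated and the resulting gradients are extremely small in magnitude.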

In [7]:
# 2. Gradients of z with respect to a, b, c and d (populated by z.backward())
a.grad, b.grad, c.grad, d.grad

(tensor(-4.), tensor(-1.), tensor(-9.), tensor(6.))
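These values can be checked by hand. In the cell that defines the tensors, `m` is evaluated as $3a - 8c + 10d$, and both $m = 84$ and $n = 44$ are positive, so $p = m$, $q = n$ and $z = m - n$. The chain rule then gives:

$$\frac{\partial z}{\partial a} = 3 - 7 = -4, \qquad \frac{\partial z}{\partial b} = 0 - 1 = -1, \qquad \frac{\partial z}{\partial c} = -8 - 1 = -9, \qquad \frac{\partial z}{\partial d} = 10 - 4 = 6$$

With $m$ as stated in the exercise, $m = 3a + 4b - 2c + 10d$, only the $b$ and $c$ terms change:

$$\frac{\partial z}{\partial b} = 4 - 1 = 3, \qquad \frac{\partial z}{\partial c} = -2 - 1 = -3$$

If $pred$ is assumed to be applied to $z$ (an assumption, since the statement writes it in terms of $x$), the derivatives of $loss$ follow from one more factor:

$$\frac{\partial loss}{\partial z} = \frac{1}{pred} \cdot pred \, (1 - pred) = 1 - pred, \qquad \frac{\partial loss}{\partial a} = (1 - pred) \, \frac{\partial z}{\partial a}$$

and likewise for $b$, $c$ and $d$.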

In [8]:
# 3. Using autograd

def calculate_l(a, b, c, d):
    # Same expressions as in the cell above (m is evaluated as 3a - 8c + 10d)
    m = 3 * a + 4 * - 2 * c + 10 * d
    n = 7 * a + b + c + 4 * d

    # Element-wise max(., 0)
    p = torch.relu(m)
    q = torch.relu(n)

    return p - q

ov = calculate_l(a, b, c, d)

In [9]:
# 4. Introduce small changes

small_change = 0.001
nv = calculate_l(a + small_change, b, c, d)
print(f'New value: {(nv).item()}')
print(f'Change value: {(nv - ov).item()}')
print(f'Change ratio: {((nv - ov) / small_change).item()}')

# The change ratio for a is approximately -4, matching the gradient of z with respect to a computed earlier.

New value: 39.99599838256836
Change value: -0.004001617431640625
Change ratio: -4.001617431640625
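The same check can be repeated for all four variables. The sketch below compares central finite differences with the autograd gradients of `z`, using the definition of `m` as stated in the exercise (with the `4 * b` term) and float64 to reduce rounding error. The names `z_of`, `x0` and `delta` are illustrative.

```python
import torch

def z_of(values):
    """z = max(m, 0) - max(n, 0) for values = (a, b, c, d)."""
    a, b, c, d = values
    m = 3 * a + 4 * b - 2 * c + 10 * d
    n = 7 * a + b + c + 4 * d
    return torch.relu(m) - torch.relu(n)

x0 = torch.tensor([2.0, 3.0, -1.0, 7.0], dtype=torch.float64, requires_grad=True)

# Gradients from autograd
z_of(x0).backward()
autograd_grads = x0.grad.clone()

# Central finite differences, one variable at a time
delta = 1e-3
finite_diffs = []
with torch.no_grad():
    for i in range(4):
        step = torch.zeros(4, dtype=torch.float64)
        step[i] = delta
        finite_diffs.append((z_of(x0 + step) - z_of(x0 - step)) / (2 * delta))
finite_diffs = torch.stack(finite_diffs)

print("autograd:          ", autograd_grads)
print("finite differences:", finite_diffs)
```

Since `z` is piecewise linear in the inputs, the two estimates should agree up to floating-point rounding as long as the perturbation does not cross the point where `m` or `n` changes sign.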


In [54]:
# 1. Graph diagram

!pip install torchvision
!pip install torchaudio
!pip install tensorboard

print(torch.__version__)


In [17]:
# Visualize the dependency graph
import torch
from torch.utils.tensorboard import SummaryWriter


# Create a TensorBoard writer
writer = SummaryWriter('runs/grafo_de_dependencias')

# Use torch.jit.trace to trace the graph
traced_model = torch.jit.trace(loss, (x,))
#writer.add_graph(traced_model, x)

# Close the writer
writer.close()
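The dependency graph can also be inspected without TensorBoard by walking the autograd graph through `grad_fn.next_functions`. The sketch below prints the graph of `z` as an indented tree. The helper `describe_graph` is a hypothetical name, and `m` follows the definition stated in the exercise.

```python
import torch

def describe_graph(fn, depth=0, lines=None):
    """Collect the names of the autograd nodes reachable from fn, indented by depth."""
    lines = [] if lines is None else lines
    if fn is None:  # inputs that do not require gradients have no node
        return lines
    lines.append("  " * depth + type(fn).__name__)
    for next_fn, _ in fn.next_functions:
        describe_graph(next_fn, depth + 1, lines)
    return lines

a = torch.tensor(2.0, requires_grad=True)
b = torch.tensor(3.0, requires_grad=True)
c = torch.tensor(-1.0, requires_grad=True)
d = torch.tensor(7.0, requires_grad=True)

m = 3 * a + 4 * b - 2 * c + 10 * d
n = 7 * a + b + c + 4 * d
z = torch.relu(m) - torch.relu(n)

graph_lines = describe_graph(z.grad_fn)
print("\n".join(graph_lines))
```

Each `AccumulateGrad` leaf corresponds to one of the input tensors. Nodes shared by several paths are printed once per path, so the inputs appear under both the `m` branch and the `n` branch.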

#### Exercise 3: "Solving a system of equations using gradient descent" (optional)

Use gradient descent to solve the following system of equations:

$ 3 x + 4 y - 2 z = 0$

$ 2 x - 3 y + 4 z = 11$

$ x - 2 y + 3 z = 7$

Hint: recall the matrix representation of the system of equations.

In [94]:
# Build the matrix of the system of equations
values = [[3,4,-2, 0],
          [2,-3,4,-11],
          [1,-2,3,-7],
          [0,0,0,0]]
A = torch.tensor(values, dtype=torch.float)
A

tensor([[  3.,   4.,  -2.,   0.],
        [  2.,  -3.,   4., -11.],
        [  1.,  -2.,   3.,  -7.],
        [  0.,   0.,   0.,   0.]])

In [95]:
# Define the objective function
def f(W):
    dim = A.shape[0]
    I = torch.eye(dim)
    return torch.norm(A @ W - I)

In [96]:
# Initialize W with random values as the starting point

W = torch.rand((4,4), requires_grad=True)
W

tensor([[0.1441, 0.2708, 0.9284, 0.1473],
        [0.6843, 0.4514, 0.2903, 0.2161],
        [0.6810, 0.7366, 0.0601, 0.1720],
        [0.7562, 0.0343, 0.0887, 0.2896]], requires_grad=True)

In [98]:
# Define one gradient descent iteration

def iteration_gradient_descent(learning_rate = 0.01, show_value=False):
    value = f(W)
    if show_value:
        print(f'f(W) = {value.item()}')
        print(W)
    value.backward()
    # Update W in place without recording the update in the autograd graph
    with torch.no_grad():
        W.sub_(learning_rate * W.grad)
    W.grad.zero_()

iteration_gradient_descent(show_value=True)

f(W) = 10.377083778381348
tensor([[0.1441, 0.2708, 0.9284, 0.1473],
        [0.6843, 0.4514, 0.2903, 0.2161],
        [0.6810, 0.7366, 0.0601, 0.1720],
        [0.7562, 0.0343, 0.0887, 0.2896]], requires_grad=True)


In [103]:
# Run the iterations to minimize the objective function

for lr_exponent in range(1,5):
    n_iter_to_show = 100000
    n_iter = n_iter_to_show * 3
    learning_rate = 0.1 ** lr_exponent
    for iteration_number in range(n_iter):
        iteration_gradient_descent(show_value = (iteration_number%n_iter_to_show == 0), learning_rate = learning_rate)

f(W) = 1.0725104808807373
tensor([[ 0.3800,  0.5948, -0.0175,  0.1556],
        [ 0.0828, -1.0744,  1.2305, -0.0798],
        [ 0.2525, -1.2591,  2.4362,  0.0806],
        [ 0.1113, -0.1437,  0.5448,  0.0686]], requires_grad=True)
f(W) = 10.725147247314453
tensor([[ 0.4648,  0.5825, -0.0102,  0.1894],
        [-0.0920, -1.0492,  1.2155, -0.1494],
        [ 0.4796, -1.2919,  2.4558,  0.1710],
        [-0.4602, -0.0613,  0.4955, -0.1592]], requires_grad=True)
f(W) = 10.725147247314453
tensor([[ 0.4648,  0.5825, -0.0102,  0.1894],
        [-0.0920, -1.0492,  1.2155, -0.1494],
        [ 0.4796, -1.2919,  2.4558,  0.1710],
        [-0.4602, -0.0613,  0.4955, -0.1592]], requires_grad=True)
f(W) = 10.725147247314453
tensor([[ 0.4648,  0.5825, -0.0102,  0.1894],
        [-0.0920, -1.0492,  1.2155, -0.1494],
        [ 0.4796, -1.2919,  2.4558,  0.1710],
        [-0.4602, -0.0613,  0.4955, -0.1592]], requires_grad=True)
f(W) = 1.0725104808807373
tensor([[ 0.3736,  0.5957, -0.0181,  0.1530],
    

In [104]:
W

tensor([[ 0.3768,  0.5952, -0.0178,  0.1543],
        [ 0.0894, -1.0754,  1.2311, -0.0771],
        [ 0.2440, -1.2579,  2.4355,  0.0772],
        [ 0.1329, -0.1468,  0.5466,  0.0771]], requires_grad=True)

These are the values of W reached after that number of iterations.

In [105]:
f(W)

tensor(1., grad_fn=<LinalgVectorNormBackward0>)

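This objective cannot reach 0. The last row of A is all zeros, so the last row of A @ W - I is always (0, 0, 0, -1) and f(W) ≥ 1 for every W. The value 1 is therefore the minimum of this objective, and more iterations will not reduce it.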
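A more direct formulation applies gradient descent to the unknown vector itself. The sketch below writes the system as $M v = rhs$ and minimizes the squared residual $\|M v - rhs\|^2$. This is an alternative to the approach above, not the notebook's method. The names `M`, `rhs`, `v`, the learning rate and the iteration count are illustrative choices.

```python
import torch

# Coefficient matrix and right-hand side of the system
M = torch.tensor([[3.0, 4.0, -2.0],
                  [2.0, -3.0, 4.0],
                  [1.0, -2.0, 3.0]], dtype=torch.float64)
rhs = torch.tensor([0.0, 11.0, 7.0], dtype=torch.float64)

# Unknowns (x, y, z), starting from zero
v = torch.zeros(3, dtype=torch.float64, requires_grad=True)
learning_rate = 0.01

for _ in range(20000):
    residual = M @ v - rhs
    objective = (residual ** 2).sum()  # squared norm of the residual
    objective.backward()
    with torch.no_grad():
        v.sub_(learning_rate * v.grad)  # gradient descent step
    v.grad.zero_()

print("solution estimate:", v.detach())
print("residual norm:", torch.norm(M @ v.detach() - rhs).item())
```

The exact solution of the system is $x = 2$, $y = -1$, $z = 1$, which can be verified by substitution into the three equations, so the estimate can be compared against it.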