<a href="https://colab.research.google.com/github/vicentcamison/idal_ia3/blob/main/3%20Aprendizaje%20profundo%20(II)/Sesion%201/04_Pytorch_con_GPU_CUDA.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

![IDAL](https://i.imgur.com/tIKXIG1.jpg)  

#<strong>**Máster en Inteligencia Artificial Avanzada y Aplicada  IA^3**</strong>
---


# Que es CUDA?

Mucha gente confunde CUDA con un lengaje o con una API. No lo es. Es más que eso. CUDA en una plataforma de cálculo computerizado paralelo y un modelo de programación que permite aprovechar las GPUs para tareas de propósito general de una forma fácil y elegante. Los desarrolladores pueden continuar trabajando en C, C++, Fortran, Python y una lista cada día más amplia e incorpora extensiones de estos lenguajes en forma de unas pocas palabras clave básicas.

Estas palabras clave permiten al desarrollador expresar cantidades masivas de paralelismo y dirigir al compilador a la porción de la aplicación que se mapea a la GPU. En definitiva, hace que el acceso a la gran potencia computacional de las GPUs se haya incorporado en los lenguajes de programación de propósito general, permitiendo una gran expansión de técnicas y tecnologías que requieren de esa potencia, como las técnicas de aprensdizaje máquina, inteligencia artificial y más concretamente aprendizaje profundo (_deep learning_)

# Como instalo PyTorch para GPU?

En primer lugar es necesario tener una tarjeta gñrafica NVIDIA compatible y con los drivers CUDA instalado y actualizados correctamente.  A continuación selecciona la versión de Pytorch correspondiente al descargarlo de la [página oficial](https://pytorch.org/get-started/locally/)

# Como saber si tienes CUDA disponible

In [1]:
import torch
torch.cuda.is_available()
# True

True

# Usando GPU y CUDA


In [2]:
## Id del dispositivo por defecto
torch.cuda.current_device()

0

In [None]:
# 0
torch.cuda.get_device_name(0) # Obtenemos el nombre del dispositivo ID '0'

'Tesla T4'

In [None]:
# Retorna el uso de memoria actual provocado por
# tensores en bytes para el dispositivo dado
torch.cuda.memory_allocated()

173056

In [None]:
# Retorna la memoria gestionada por el Returns the current GPU memory managed by the
# gestor de memoria en bytes para el dispositivo dado
torch.cuda.memory_cached()



2097152

# Usando CUDA en lugar de CPU

In [None]:
# CPU
a = torch.FloatTensor([1.,2.])

In [None]:
a

tensor([1., 2.])

In [None]:
a.device

device(type='cpu')

In [None]:
# GPU
a = torch.FloatTensor([1., 2.]).cuda()

In [None]:
a.device

device(type='cuda', index=0)

In [None]:
torch.cuda.memory_allocated()

173056

## Enviando modelos a la GPU

In [None]:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader
from sklearn.model_selection import train_test_split

import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
class Model(nn.Module): 
# Regresion logística
    def __init__(self, input_size=4, num_classes=3):
        super().__init__()
        self.linear = nn.Linear(input_size, num_classes)
        
    def forward(self, xb):
        out = self.linear(xb)
        return out

In [None]:
class MLP(nn.Module):  #Opcional, solo para probar
  # MLP 2 capa oculta
    def __init__(self, in_features=4, h1=8, h2=9, out_features=3):
        super().__init__()
        self.fc1 = nn.Linear(in_features,h1)    # input layer
        self.fc2 = nn.Linear(h1, h2)            # hidden layer
        self.out = nn.Linear(h2, out_features)  # output layer
        
    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.out(x)
        return x


In [None]:
torch.manual_seed(32)
model = Model()

In [None]:
# Comprobación: discuss.pytorch.org/t/how-to-check-if-model-is-on-cuda
next(model.parameters()).is_cuda

False

In [None]:
gpumodel = model.cuda()

In [None]:
next(gpumodel.parameters()).is_cuda

True

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
df = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/Datos/iris.csv')
X = df.drop('target',axis=1).values
y = df['target'].values


In [None]:
df.head()

Unnamed: 0,sepal length (cm),sepal width (cm),petal length (cm),petal width (cm),target
0,5.1,3.5,1.4,0.2,0.0
1,4.9,3.0,1.4,0.2,0.0
2,4.7,3.2,1.3,0.2,0.0
3,4.6,3.1,1.5,0.2,0.0
4,5.0,3.6,1.4,0.2,0.0


## Conjuntos Train-Test

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.2,random_state=33)

## Convertir Tensores a .cuda() tensors

In [None]:
X_train = torch.FloatTensor(X_train).cuda()
X_test = torch.FloatTensor(X_test).cuda()
y_train = torch.LongTensor(y_train).cuda()
y_test = torch.LongTensor(y_test).cuda()

## Preparacion de datos

In [None]:
trainloader = DataLoader(X_train, batch_size=60, shuffle=True)
testloader = DataLoader(X_test, batch_size=60, shuffle=False)

## Función de coste, optimizador y evaluador

In [None]:
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

In [None]:
def accuracy(outputs, labels):
    _, preds = torch.max(outputs, dim=1)
    return torch.tensor(torch.sum(preds == labels).item() / len(preds))

## Entrenamiento con GPU

In [None]:
import time
epochs = 300
losses = []
accs =[]
start = time.time()
for i in range(epochs):
    i+=1
    y_pred = gpumodel.forward(X_train)
    loss = criterion(y_pred, y_train)
    acc = accuracy(y_pred, y_train)
    losses.append(loss)
    accs.append(acc)
    
    # log:
    if i%10 == 1:
        print(f'epoch: {i:2}  loss: {loss.item():10.8f}  acc: {acc.item():10.8f}')

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    
print(f'TOTAL TRAINING TIME: {time.time()-start}')

epoch:  1  loss: 1.90774024  acc: 0.64999998
epoch: 11  loss: 1.05946147  acc: 0.64999998
epoch: 21  loss: 0.94435710  acc: 0.35833332
epoch: 31  loss: 0.79906476  acc: 0.63333333
epoch: 41  loss: 0.72149867  acc: 0.69166666
epoch: 51  loss: 0.65472120  acc: 0.93333334
epoch: 61  loss: 0.60460609  acc: 0.89166665
epoch: 71  loss: 0.56408793  acc: 0.94166666
epoch: 81  loss: 0.53180271  acc: 0.94166666
epoch: 91  loss: 0.50473028  acc: 0.95833331
epoch: 101  loss: 0.48167929  acc: 0.95833331
epoch: 111  loss: 0.46155009  acc: 0.95833331
epoch: 121  loss: 0.44366735  acc: 0.96666664
epoch: 131  loss: 0.42752853  acc: 0.96666664
epoch: 141  loss: 0.41278446  acc: 0.96666664
epoch: 151  loss: 0.39917985  acc: 0.95833331
epoch: 161  loss: 0.38652790  acc: 0.96666664
epoch: 171  loss: 0.37468818  acc: 0.96666664
epoch: 181  loss: 0.36355385  acc: 0.95833331
epoch: 191  loss: 0.35304171  acc: 0.95833331
epoch: 201  loss: 0.34308663  acc: 0.95833331
epoch: 211  loss: 0.33363569  acc: 0.9583333

In [None]:
_, preds = torch.max(y_pred, dim=1)
print(f'Aciertos: {torch.sum(preds == y_train).item()}')
print(f'Muestras totales: {len(preds)}')


Aciertos: 116
Muestras totales: 120


# Curiosidad: Volviendo a CPU


In [None]:
torch.manual_seed(32)
model2 = Model()

X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.2,random_state=33)

X_train = torch.FloatTensor(X_train)
X_test = torch.FloatTensor(X_test)
y_train = torch.LongTensor(y_train)
y_test = torch.LongTensor(y_test)

trainloader = DataLoader(X_train, batch_size=60, shuffle=True)
testloader = DataLoader(X_test, batch_size=60, shuffle=False)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

import time
epochs = 300
losses = []
start = time.time()
for i in range(epochs):
    i+=1
    y_pred = model2(X_train)
    loss = criterion(y_pred, y_train)
    losses.append(loss)
    
    # a neat trick to save screen space:
    if i%10 == 1:
        print(f'epoch: {i:2}  loss: {loss.item():10.8f}')

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    
print(f'TOTAL TRAINING TIME: {time.time()-start}')

epoch:  1  loss: 1.90773988
epoch: 11  loss: 1.90773988
epoch: 21  loss: 1.90773988
epoch: 31  loss: 1.90773988
epoch: 41  loss: 1.90773988
epoch: 51  loss: 1.90773988
epoch: 61  loss: 1.90773988
epoch: 71  loss: 1.90773988
epoch: 81  loss: 1.90773988
epoch: 91  loss: 1.90773988
epoch: 101  loss: 1.90773988
epoch: 111  loss: 1.90773988
epoch: 121  loss: 1.90773988
epoch: 131  loss: 1.90773988
epoch: 141  loss: 1.90773988
epoch: 151  loss: 1.90773988
epoch: 161  loss: 1.90773988
epoch: 171  loss: 1.90773988
epoch: 181  loss: 1.90773988
epoch: 191  loss: 1.90773988
epoch: 201  loss: 1.90773988
epoch: 211  loss: 1.90773988
epoch: 221  loss: 1.90773988
epoch: 231  loss: 1.90773988
epoch: 241  loss: 1.90773988
epoch: 251  loss: 1.90773988
epoch: 261  loss: 1.90773988
epoch: 271  loss: 1.90773988
epoch: 281  loss: 1.90773988
epoch: 291  loss: 1.90773988
TOTAL TRAINING TIME: 0.21269488334655762


## Fin del Notebook

Referencias y modelos empleados para el Notebook: 

*   Documentación de [Pytorch](https://pytorch.org/docs/stable/index.html) 
*   [PyTorch Tutorial for Deep Learning Researchers](https://github.com/yunjey/pytorch-tutorial) by Yunjey Choi
*   [FastAI](https://www.fast.ai/) development notebooks by Jeremy Howard.
*   Documentación y cursos en [Pierian Data](https://www.pieriandata.com/)
*   Tutoriales y notebooks del curso "Deep Learning with Pytorch: Zero to GANs" de [Aakash N S](https://jovian.ai/aakashns)
* [A visual proof that neural networks can compute any function](http://neuralnetworksanddeeplearning.com/chap4.html), también conocido como Teorema de Aproximación Universal
* [But what *is* a neural network?](https://www.youtube.com/watch?v=aircAruvnKk) - Una introducción muy intuitiva a lo que son las redes neuronales y lo que implican las capas ocultas.