# 10 - PyTorch Basics: Neural Network Fundamentals

## 🎯 Objectives
- Understand the fundamental concepts of PyTorch
- Work with tensors and basic operations
- Build neural networks from scratch
- Train models with different optimizers
- Evaluate and save models

## 📚 Technologies
- **PyTorch**: Deep learning framework
- **torchvision**: Datasets and transforms
- **matplotlib**: Visualization

## ⭐ Complexity: Basic

## 1. Installation and Setup

In [None]:
# Install PyTorch and dependencies
!pip install torch torchvision torchaudio matplotlib numpy pandas scikit-learn -q

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset
import torchvision
import torchvision.transforms as transforms
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification, make_moons
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import warnings
warnings.filterwarnings('ignore')

print(f"✅ PyTorch version: {torch.__version__}")
print(f"✅ CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"✅ CUDA device: {torch.cuda.get_device_name(0)}")

# Select the device (GPU if available, otherwise CPU)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"\n🖥️ Using device: {device}")

## 2. Tensors: The Heart of PyTorch

Tensors are multidimensional arrays similar to NumPy arrays, with added support for GPU execution.
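
The NumPy relationship is close enough that the two can share memory. The sketch below is illustrative only (the variable names `shared`, `copied`, `on_device` and `back` are not part of the notebook) and shows the difference between `torch.from_numpy`, which shares the array's buffer, and `torch.tensor`, which copies it, followed by a round trip through the selected device.

```python
import numpy as np
import torch

# torch.from_numpy shares memory with the source array (no copy is made).
np_array = np.array([[1, 2, 3], [4, 5, 6]])
shared = torch.from_numpy(np_array)
np_array[0, 0] = 100
print(shared[0, 0].item())  # 100: the tensor sees the in-place change

# torch.tensor copies the data, so later changes to the array are not visible.
copied = torch.tensor(np_array)
np_array[0, 0] = -1
print(copied[0, 0].item())  # 100: the copy is unaffected

# .to(device) returns a tensor on the target device; on a CPU-only
# machine the tensor simply stays on the CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
on_device = copied.to(device)
print(on_device.device)

# .cpu().numpy() converts back for NumPy or matplotlib code.
back = on_device.cpu().numpy()
print(type(back))
```

Sharing memory avoids a copy but means in-place edits on either side are visible on the other, so `torch.tensor` is the safer choice when the source array will be modified later.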

In [None]:
# Create tensors in different ways
print("📊 Tensor creation:\n")

# Tensor from a list
t1 = torch.tensor([1, 2, 3, 4, 5])
print(f"1. From a list: {t1}")

# Tensor of zeros
t2 = torch.zeros(3, 4)
print(f"\n2. Zeros (3x4):\n{t2}")

# Tensor of ones
t3 = torch.ones(2, 3)
print(f"\n3. Ones (2x3):\n{t3}")

# Random tensor
t4 = torch.randn(2, 3)  # Standard normal distribution
print(f"\n4. Random normal (2x3):\n{t4}")

# Tensor from NumPy
np_array = np.array([[1, 2, 3], [4, 5, 6]])
t5 = torch.from_numpy(np_array)
print(f"\n5. From NumPy:\n{t5}")

# Tensor information
print(f"\n📊 Tensor information:")
print(f"   Shape: {t4.shape}")
print(f"   Dtype: {t4.dtype}")
print(f"   Device: {t4.device}")

## 3. Tensor Operations

In [None]:
# Element-wise math operations
a = torch.tensor([1, 2, 3, 4], dtype=torch.float32)
b = torch.tensor([5, 6, 7, 8], dtype=torch.float32)

print("🔢 Math operations:\n")
print(f"a = {a}")
print(f"b = {b}")
print(f"\nAddition: a + b = {a + b}")
print(f"Subtraction: a - b = {a - b}")
print(f"Multiplication: a * b = {a * b}")
print(f"Division: a / b = {a / b}")
print(f"Power: a ** 2 = {a ** 2}")

# Matrix operations
A = torch.randn(3, 4)
B = torch.randn(4, 5)
C = torch.matmul(A, B)  # Matrix multiplication (same as A @ B)

print(f"\n🔢 Matrix multiplication:")
print(f"   A shape: {A.shape}")
print(f"   B shape: {B.shape}")
print(f"   C = A @ B shape: {C.shape}")

# Aggregation operations
print(f"\n📊 Aggregations:")
print(f"   Sum: {a.sum()}")
print(f"   Mean: {a.mean()}")
print(f"   Max: {a.max()}")
print(f"   Min: {a.min()}")
print(f"   Std: {a.std()}")

## 4. Autograd: Automatic Differentiation

PyTorch computes gradients automatically, which is what makes training possible.
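
One detail worth seeing early is that gradients accumulate: each call to `backward()` adds to `.grad` rather than overwriting it. The sketch below reuses the function y = x^2 + 3x + 1 from this section (the names `first` and `second` are illustrative) to show why training loops reset gradients before every backward pass.

```python
import torch

x = torch.tensor([2.0, 3.0], requires_grad=True)

# First backward pass: d/dx sum(x^2 + 3x + 1) = 2x + 3
(x**2 + 3*x + 1).sum().backward()
first = x.grad.clone()
print(first)   # tensor([7., 9.])

# Second backward pass without resetting: the new gradient is ADDED to x.grad
(x**2 + 3*x + 1).sum().backward()
second = x.grad.clone()
print(second)  # tensor([14., 18.])

# Reset the gradient, then run backward again to get the correct value
x.grad.zero_()
(x**2 + 3*x + 1).sum().backward()
print(x.grad)  # tensor([7., 9.])
```

With an optimizer, `optimizer.zero_grad()` performs this reset for every parameter it manages.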

In [None]:
# Create a tensor with requires_grad=True to track gradients
x = torch.tensor([2.0, 3.0], requires_grad=True)
print(f"x = {x}")

# Operation: y = x^2 + 3x + 1
y = x**2 + 3*x + 1
print(f"y = x^2 + 3x + 1 = {y}")

# Reduce to a scalar so backward() can be called without arguments
z = y.sum()
print(f"z = sum(y) = {z}")

# Backward: compute gradients
z.backward()

# The gradient of z with respect to x is dz/dx = 2x + 3 (element-wise)
print(f"\n📊 Gradients:")
print(f"   dz/dx = {x.grad}")
print(f"   Expected (2x + 3) for x=[2,3]: {2*x.detach() + 3}")

## 5. Simple Neural Network: Binary Classification

We will build a neural network for binary classification.
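
The classifier defined in the following sections outputs two raw scores (logits) per sample and is trained with `nn.CrossEntropyLoss`. The sketch below, on made-up logits and targets, illustrates why the network needs no softmax layer: the loss applies `log_softmax` internally, and the predicted class is the same whether `argmax` is taken over logits or over probabilities.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 2)            # 4 samples, 2 classes, raw scores (illustrative)
targets = torch.tensor([0, 1, 1, 0])  # class indices, dtype int64

# CrossEntropyLoss on logits equals NLL loss on log-softmax outputs
loss_ce = nn.CrossEntropyLoss()(logits, targets)
loss_manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)
print(torch.allclose(loss_ce, loss_manual))  # True

# Softmax is monotonic, so it does not change which class scores highest
probs = F.softmax(logits, dim=1)
same_prediction = torch.equal(probs.argmax(dim=1), logits.argmax(dim=1))
print(same_prediction)  # True
```

This is also why the labels are converted to integer class indices (`LongTensor`) rather than one-hot vectors.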

In [None]:
# Generate synthetic data
X, y = make_moons(n_samples=1000, noise=0.2, random_state=42)

# Standardize features
scaler = StandardScaler()
X = scaler.fit_transform(X)

# Train/test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Convert to tensors (float32 features, int64 class labels)
X_train = torch.tensor(X_train, dtype=torch.float32)
y_train = torch.tensor(y_train, dtype=torch.long)
X_test = torch.tensor(X_test, dtype=torch.float32)
y_test = torch.tensor(y_test, dtype=torch.long)

# Visualize the data
plt.figure(figsize=(10, 4))
plt.subplot(1, 2, 1)
plt.scatter(X_train[:, 0], X_train[:, 1], c=y_train, cmap='viridis', alpha=0.6)
plt.title('Training Data')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')

plt.subplot(1, 2, 2)
plt.scatter(X_test[:, 0], X_test[:, 1], c=y_test, cmap='viridis', alpha=0.6)
plt.title('Test Data')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.tight_layout()
plt.show()

print(f"📊 Dataset:")
print(f"   Train: {X_train.shape}")
print(f"   Test: {X_test.shape}")

## 6. Defining the Neural Network

In [None]:
# Define the network architecture
class SimpleNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(SimpleNN, self).__init__()
        # Layers
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, hidden_size)
        self.fc3 = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # Forward pass: two hidden layers with ReLU, then raw logits
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        x = self.relu(x)
        x = self.fc3(x)
        return x

# Create the model
input_size = 2  # 2 features
hidden_size = 16
output_size = 2  # 2 classes

model = SimpleNN(input_size, hidden_size, output_size).to(device)

print("🧠 Network architecture:\n")
print(model)

# Count parameters
total_params = sum(p.numel() for p in model.parameters())
trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)

print(f"\n📊 Parameters:")
print(f"   Total: {total_params:,}")
print(f"   Trainable: {trainable_params:,}")

## 7. Training the Model
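
The training cell in this section uses Adam. Other optimizers in `torch.optim` share the same interface (`zero_grad()`, `step()`), so they can be swapped in by changing one line. The sketch below is a toy comparison under assumed settings: the helper `train`, the synthetic data, and the learning rates are illustrative choices, not tuned values or part of the notebook.

```python
import torch
import torch.nn as nn
import torch.optim as optim

def train(make_optimizer, steps=200):
    """Train a small network full-batch and return (first_loss, last_loss)."""
    torch.manual_seed(0)  # same data and same initial weights for every optimizer
    X = torch.randn(256, 2)
    y = (X[:, 0] + X[:, 1] > 0).long()  # linearly separable toy labels
    model = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 2))
    optimizer = make_optimizer(model.parameters())
    criterion = nn.CrossEntropyLoss()

    first_loss = None
    for _ in range(steps):
        loss = criterion(model(X), y)
        if first_loss is None:
            first_loss = loss.item()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return first_loss, loss.item()

# Each entry builds an optimizer from the model's parameters
candidates = {
    'SGD': lambda params: optim.SGD(params, lr=0.1),
    'SGD + momentum': lambda params: optim.SGD(params, lr=0.1, momentum=0.9),
    'RMSprop': lambda params: optim.RMSprop(params, lr=0.01),
    'Adam': lambda params: optim.Adam(params, lr=0.01),
}

results = {name: train(factory) for name, factory in candidates.items()}
for name, (first_loss, last_loss) in results.items():
    print(f"{name:15s} loss: {first_loss:.4f} -> {last_loss:.4f}")
```

Which optimizer works best depends on the problem and on the learning rate, so a comparison like this is only meaningful when each optimizer's hyperparameters are tuned separately.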

In [None]:
# Training setup
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

# Move data to the device
X_train = X_train.to(device)
y_train = y_train.to(device)
X_test = X_test.to(device)
y_test = y_test.to(device)

# Training history
train_losses = []
test_losses = []
train_accs = []
test_accs = []

# Training (full-batch: the whole training set is used at every step)
num_epochs = 100

print("🚀 Starting training...\n")

for epoch in range(num_epochs):
    # Training mode
    model.train()

    # Forward pass
    outputs = model(X_train)
    loss = criterion(outputs, y_train)

    # Backward pass
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Compute accuracy (from the outputs produced before this epoch's update)
    predicted = outputs.argmax(dim=1)
    train_acc = (predicted == y_train).sum().item() / y_train.size(0)

    # Evaluation on the test set
    model.eval()
    with torch.no_grad():
        test_outputs = model(X_test)
        test_loss = criterion(test_outputs, y_test)
        test_predicted = test_outputs.argmax(dim=1)
        test_acc = (test_predicted == y_test).sum().item() / y_test.size(0)

    # Store metrics
    train_losses.append(loss.item())
    test_losses.append(test_loss.item())
    train_accs.append(train_acc)
    test_accs.append(test_acc)

    # Print progress
    if (epoch + 1) % 10 == 0:
        print(f"Epoch [{epoch+1}/{num_epochs}]")
        print(f"  Train Loss: {loss.item():.4f}, Train Acc: {train_acc:.4f}")
        print(f"  Test Loss: {test_loss.item():.4f}, Test Acc: {test_acc:.4f}")

print("\n✅ Training complete!")

## 8. Visualizing Training

In [None]:
# Plot loss and accuracy
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 4))

# Loss
ax1.plot(train_losses, label='Train Loss', linewidth=2)
ax1.plot(test_losses, label='Test Loss', linewidth=2)
ax1.set_xlabel('Epoch')
ax1.set_ylabel('Loss')
ax1.set_title('Training and Test Loss')
ax1.legend()
ax1.grid(True, alpha=0.3)

# Accuracy
ax2.plot(train_accs, label='Train Accuracy', linewidth=2)
ax2.plot(test_accs, label='Test Accuracy', linewidth=2)
ax2.set_xlabel('Epoch')
ax2.set_ylabel('Accuracy')
ax2.set_title('Training and Test Accuracy')
ax2.legend()
ax2.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print(f"📊 Final results:")
print(f"   Train Accuracy: {train_accs[-1]:.4f}")
print(f"   Test Accuracy: {test_accs[-1]:.4f}")

## 9. Visualizing Decision Boundaries

In [None]:
# Build a grid to visualize the decision boundary
def plot_decision_boundary(model, X, y):
    # Move to CPU for plotting
    X = X.cpu().numpy()
    y = y.cpu().numpy()

    # Create the mesh
    h = 0.02
    x_min, x_max = X[:, 0].min() - 0.5, X[:, 0].max() + 0.5
    y_min, y_max = X[:, 1].min() - 0.5, X[:, 1].max() + 0.5
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                         np.arange(y_min, y_max, h))

    # Predict over the whole grid
    model.eval()
    with torch.no_grad():
        Z = model(torch.FloatTensor(np.c_[xx.ravel(), yy.ravel()]).to(device))
        Z = torch.argmax(Z, dim=1).cpu().numpy()

    Z = Z.reshape(xx.shape)

    # Plot
    plt.figure(figsize=(10, 8))
    plt.contourf(xx, yy, Z, alpha=0.3, cmap='viridis')
    plt.scatter(X[:, 0], X[:, 1], c=y, cmap='viridis', edgecolors='black', s=50)
    plt.xlabel('Feature 1')
    plt.ylabel('Feature 2')
    plt.title('Decision Boundary')
    plt.colorbar()
    plt.show()

plot_decision_boundary(model, X_test, y_test)

## 10. Saving and Loading Models
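
The checkpoint dictionary saved in this section (model state, optimizer state, epoch) is what makes resuming training possible. The sketch below is a self-contained illustration of the restore side; `make_model` is a hypothetical stand-in for the notebook's network and `demo_checkpoint.pth` is a throwaway file in a temporary directory.

```python
import os
import tempfile

import torch
import torch.nn as nn
import torch.optim as optim

def make_model():
    # Hypothetical stand-in for the notebook's SimpleNN
    return nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 2))

torch.manual_seed(0)
model = make_model()
optimizer = optim.Adam(model.parameters(), lr=0.01)

# Take one training step so the optimizer has internal state worth saving
X = torch.randn(32, 2)
y = (X[:, 0] > 0).long()
loss = nn.CrossEntropyLoss()(model(X), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Save a checkpoint with the same kind of keys used in this section
path = os.path.join(tempfile.mkdtemp(), 'demo_checkpoint.pth')
torch.save({
    'epoch': 1,
    'model_state_dict': model.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
}, path)

# Restore: rebuild the objects first, then load their saved states
checkpoint = torch.load(path, map_location='cpu')
restored_model = make_model()
restored_model.load_state_dict(checkpoint['model_state_dict'])
restored_optimizer = optim.Adam(restored_model.parameters(), lr=0.01)
restored_optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
start_epoch = checkpoint['epoch']  # training would continue from here

# The restored model reproduces the original model's outputs
with torch.no_grad():
    same_outputs = torch.allclose(model(X), restored_model(X))
print(start_epoch, same_outputs)  # 1 True
```

Restoring the optimizer state matters for optimizers such as Adam, which keep running statistics per parameter; loading only the weights would restart those statistics from zero.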

In [None]:
# Save the model weights (state_dict only, not the full model object)
torch.save(model.state_dict(), 'simple_nn_model.pth')
print("✅ Model saved: simple_nn_model.pth")

# Save a full checkpoint (model + optimizer + epoch)
checkpoint = {
    'epoch': num_epochs,
    'model_state_dict': model.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
    'train_loss': train_losses[-1],
    'test_loss': test_losses[-1],
    'train_acc': train_accs[-1],
    'test_acc': test_accs[-1]
}
torch.save(checkpoint, 'simple_nn_checkpoint.pth')
print("✅ Checkpoint saved: simple_nn_checkpoint.pth")

# Load the model: rebuild the architecture, then load the weights
loaded_model = SimpleNN(input_size, hidden_size, output_size).to(device)
loaded_model.load_state_dict(torch.load('simple_nn_model.pth', map_location=device))
loaded_model.eval()
print("\n✅ Model loaded successfully")

# Verify that it works
with torch.no_grad():
    test_outputs = loaded_model(X_test)
    _, predicted = torch.max(test_outputs, 1)
    accuracy = (predicted == y_test).sum().item() / y_test.size(0)
    print(f"📊 Accuracy of the loaded model: {accuracy:.4f}")

## 11. Example with MNIST (Classic Dataset)

In [None]:
# Load MNIST
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])

train_dataset = torchvision.datasets.MNIST(
    root='./data', 
    train=True, 
    download=True, 
    transform=transform
)

test_dataset = torchvision.datasets.MNIST(
    root='./data', 
    train=False, 
    download=True, 
    transform=transform
)

train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=64, shuffle=False)

print(f"📊 MNIST Dataset:")
print(f"   Train samples: {len(train_dataset)}")
print(f"   Test samples: {len(test_dataset)}")

# Visualize a few examples
examples = iter(train_loader)
images, labels = next(examples)

fig, axes = plt.subplots(2, 5, figsize=(12, 5))
for i, ax in enumerate(axes.flat):
    ax.imshow(images[i].squeeze(), cmap='gray')
    ax.set_title(f'Label: {labels[i].item()}')
    ax.axis('off')
plt.tight_layout()
plt.show()

## 12. Network for MNIST

In [None]:
class MNISTNet(nn.Module):
    def __init__(self):
        super(MNISTNet, self).__init__()
        self.fc1 = nn.Linear(28*28, 128)
        self.fc2 = nn.Linear(128, 64)
        self.fc3 = nn.Linear(64, 10)
        self.dropout = nn.Dropout(0.2)
        
    def forward(self, x):
        x = x.view(-1, 28*28)  # Flatten
        x = F.relu(self.fc1(x))
        x = self.dropout(x)
        x = F.relu(self.fc2(x))
        x = self.dropout(x)
        x = self.fc3(x)
        return x

# Create the MNIST model
mnist_model = MNISTNet().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(mnist_model.parameters(), lr=0.001)

print("🧠 MNIST model:")
print(mnist_model)
print(f"\n📊 Parameters: {sum(p.numel() for p in mnist_model.parameters()):,}")

In [None]:
# Train on MNIST (only 5 epochs for the demo)
num_epochs = 5

print("🚀 Training on MNIST...\n")

for epoch in range(num_epochs):
    mnist_model.train()
    train_loss = 0
    correct = 0
    total = 0
    
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        
        optimizer.zero_grad()
        output = mnist_model(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()
        
        train_loss += loss.item()
        _, predicted = output.max(1)
        total += target.size(0)
        correct += predicted.eq(target).sum().item()
    
    # Evaluation
    mnist_model.eval()
    test_loss = 0
    test_correct = 0
    test_total = 0
    
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = mnist_model(data)
            test_loss += criterion(output, target).item()
            _, predicted = output.max(1)
            test_total += target.size(0)
            test_correct += predicted.eq(target).sum().item()
    
    print(f"Epoch {epoch+1}/{num_epochs}:")
    print(f"  Train Loss: {train_loss/len(train_loader):.4f}, Acc: {100.*correct/total:.2f}%")
    print(f"  Test Loss: {test_loss/len(test_loader):.4f}, Acc: {100.*test_correct/test_total:.2f}%")

print("\n✅ MNIST training complete!")

## 13. Prediction on MNIST

In [None]:
# Make predictions on examples from the test set
mnist_model.eval()

# Get a test batch
test_examples = iter(test_loader)
test_images, test_labels = next(test_examples)

# Predict
with torch.no_grad():
    test_images = test_images.to(device)
    outputs = mnist_model(test_images)
    _, predictions = torch.max(outputs, 1)

# Visualize predictions
fig, axes = plt.subplots(3, 5, figsize=(15, 9))
for i, ax in enumerate(axes.flat):
    ax.imshow(test_images[i].cpu().squeeze(), cmap='gray')
    pred_label = predictions[i].cpu().item()
    true_label = test_labels[i].item()
    color = 'green' if pred_label == true_label else 'red'
    ax.set_title(f'Pred: {pred_label} | True: {true_label}', color=color)
    ax.axis('off')
plt.tight_layout()
plt.show()

## 14. Summary and Best Practices

### ✅ Key Concepts:
1. **Tensors**: Multidimensional arrays with GPU support
2. **Autograd**: Automatic differentiation for gradients
3. **nn.Module**: Base class for neural networks
4. **forward()**: Defines how data flows through the network
5. **Loss Functions**: CrossEntropyLoss, MSELoss, etc.
6. **Optimizers**: Adam, SGD, RMSprop, etc.
7. **DataLoader**: Efficient handling of data in batches
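
`DataLoader` works with in-memory tensors as well as with torchvision datasets. As a small sketch on made-up data (the tensors `X` and `y` below are illustrative), `TensorDataset` pairs features with labels and `DataLoader` yields shuffled mini-batches, with a smaller final batch when the dataset size is not a multiple of `batch_size`.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
X = torch.randn(100, 2)        # 100 samples, 2 features (illustrative data)
y = (X[:, 0] > 0).long()       # toy binary labels

# TensorDataset indexes all tensors together: dataset[i] == (X[i], y[i])
dataset = TensorDataset(X, y)
loader = DataLoader(dataset, batch_size=32, shuffle=True)

batch_sizes = []
for xb, yb in loader:
    # xb has shape (batch, 2) and yb has shape (batch,)
    batch_sizes.append(xb.shape[0])

print(len(dataset), len(loader), batch_sizes)  # 100 4 [32, 32, 32, 4]
```

Passing `drop_last=True` to `DataLoader` discards the short final batch when a fixed batch size is required.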

### 💡 Best Practices:
- ✅ Use the GPU when available (`.to(device)`)
- ✅ Normalize your data before training
- ✅ Use `model.train()` and `model.eval()` appropriately
- ✅ Use `torch.no_grad()` for inference (saves memory)
- ✅ Save checkpoints during training
- ✅ Monitor metrics on train and test
- ✅ Use dropout for regularization
- ✅ Experiment with different architectures and hyperparameters
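
To make the `model.train()` / `model.eval()` and `torch.no_grad()` items concrete, here is a minimal sketch with a throwaway network (`net` and its layer sizes are illustrative). Dropout is active only in training mode, so repeated forward passes differ there, while evaluation mode gives repeatable outputs and `no_grad` skips gradient tracking.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(
    nn.Linear(8, 64),
    nn.ReLU(),
    nn.Dropout(0.5),   # randomly zeroes activations in training mode only
    nn.Linear(64, 3),
)
x = torch.randn(5, 8)

# Training mode: each forward pass samples a new dropout mask
net.train()
a, b = net(x), net(x)
print(torch.equal(a, b))   # the two passes differ because the masks differ

# Evaluation mode: dropout is disabled, so outputs are repeatable
net.eval()
with torch.no_grad():
    c, d = net(x), net(x)
print(torch.equal(c, d))   # True

# Under no_grad no computation graph is built for the outputs
print(a.requires_grad, c.requires_grad)  # True False
```

Forgetting `model.eval()` before measuring test accuracy is a common source of noisy or pessimistic metrics in networks that use dropout.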

### 🚀 Next Steps:
- Convolutional Neural Networks (CNN) for images
- Recurrent Neural Networks (RNN/LSTM) for sequences
- Transfer learning with pre-trained models
- Integration with MLflow for experiment tracking

In [None]:
print("🎉 PyTorch Basics tutorial complete!")
print(f"\n📊 Summary:")
print(f"   Device used: {device}")
print(f"   Simple NN model - Test Acc: {test_accs[-1]:.4f}")
print(f"   MNIST model - Test Acc: {100.*test_correct/test_total:.2f}%")
print(f"\n📁 Saved files:")
print(f"   - simple_nn_model.pth")
print(f"   - simple_nn_checkpoint.pth")