# RNN Model Without Memory Cells (Simple RNN)

This notebook implements a basic Recurrent Neural Network (RNN), without gated memory cells such as LSTM or GRU, for text classification.
The goal is to train and evaluate the model on the preprocessed embeddings.

In [4]:
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import TensorDataset, DataLoader
from sklearn.metrics import classification_report, accuracy_score

# Device configuration (GPU if available)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

Using device: cpu


## Data Loading and Preprocessing

The `.npy` files containing the embeddings and labels are loaded.
The input data has shape `(N, 384)`. To feed it to an RNN, we reshape it to `(N, 1, 384)`, treating each embedding as a sequence of length 1.
The labels are shifted to start at 0 (from 1-5 to 0-4).
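
The reshape and label shift described above can be tried on synthetic data before loading the real files. This is a minimal sketch: `fake_X` and `fake_y` are hypothetical stand-ins (8 random samples), not the notebook's data.

```python
import numpy as np
import torch

# Hypothetical stand-ins for the real .npy arrays: 8 samples, 384-dim embeddings
rng = np.random.default_rng(0)
fake_X = rng.normal(size=(8, 384)).astype(np.float32)
fake_y = rng.integers(1, 6, size=8)  # labels in 1..5, as in the original files

# Add a sequence axis of length 1: (N, 384) -> (N, 1, 384)
X = torch.tensor(fake_X, dtype=torch.float32).unsqueeze(1)

# Shift labels to 0..4, the range nn.CrossEntropyLoss expects for 5 classes
y = torch.tensor(fake_y - 1, dtype=torch.long)

print(X.shape)  # torch.Size([8, 1, 384])
print(int(y.min()) >= 0 and int(y.max()) <= 4)  # True
```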

In [5]:
# Load data
train_X = np.load('train_all.npy')
train_y = np.load('train_labels.npy')
valid_X = np.load('valid_all.npy')
valid_y = np.load('valid_labels.npy')
test_X = np.load('test_all.npy')
test_y = np.load('test_labels.npy')

# Convert to PyTorch tensors
# Reshape X to (Batch, Sequence Length, Features) -> (N, 1, 384)
X_train_tensor = torch.tensor(train_X, dtype=torch.float32).unsqueeze(1).to(device)
y_train_tensor = torch.tensor(train_y - 1, dtype=torch.long).to(device)  # Subtract 1 so the classes are 0-4

X_valid_tensor = torch.tensor(valid_X, dtype=torch.float32).unsqueeze(1).to(device)
y_valid_tensor = torch.tensor(valid_y - 1, dtype=torch.long).to(device)

X_test_tensor = torch.tensor(test_X, dtype=torch.float32).unsqueeze(1).to(device)
y_test_tensor = torch.tensor(test_y - 1, dtype=torch.long).to(device)

print(f"X_train shape: {X_train_tensor.shape}")
print(f"y_train shape: {y_train_tensor.shape}")

# Create DataLoaders
batch_size = 64
train_dataset = TensorDataset(X_train_tensor, y_train_tensor)
valid_dataset = TensorDataset(X_valid_tensor, y_valid_tensor)
test_dataset = TensorDataset(X_test_tensor, y_test_tensor)

train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
valid_loader = DataLoader(valid_dataset, batch_size=batch_size, shuffle=False)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

X_train shape: torch.Size([1200000, 1, 384])
y_train shape: torch.Size([1200000])


## Model Definition

A simple RNN architecture is defined using PyTorch's `nn.RNN`.
- **Input Size**: 384 (embedding dimension).
- **Hidden Size**: 128 (size of the hidden state).
- **Output Size**: 5 (number of classes).
- **Batch First**: True (so the input is laid out as (Batch, Seq, Feature)).
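
As a quick sanity check on these sizes, the number of trainable parameters can be worked out by hand. The sketch below assumes the default `nn.RNN` configuration (one layer, unidirectional, two bias vectors) followed by a single `nn.Linear` layer; the variable names are illustrative.

```python
input_size, hidden_size, output_size = 384, 128, 5

# nn.RNN (single layer): input-to-hidden weights, hidden-to-hidden weights,
# plus two bias vectors (bias_ih and bias_hh)
rnn_params = hidden_size * input_size + hidden_size * hidden_size + 2 * hidden_size

# nn.Linear: weight matrix plus bias
fc_params = output_size * hidden_size + output_size

total_params = rnn_params + fc_params
print(rnn_params, fc_params, total_params)  # 65792 645 66437
```

On an instantiated model, the same total should be returned by `sum(p.numel() for p in model.parameters())`.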

In [6]:
class SimpleRNNModel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(SimpleRNNModel, self).__init__()
        self.rnn = nn.RNN(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # x shape: (batch, seq_len, input_size)
        # out shape: (batch, seq_len, hidden_size)
        # h_n shape: (num_layers, batch, hidden_size)
        out, h_n = self.rnn(x)
        
        # Take the output of the last time step
        last_out = out[:, -1, :]

        # Fully connected layer for classification
        out = self.fc(last_out)
        return out

# Instantiate the model
input_size = 384
hidden_size = 128
output_size = 5
model = SimpleRNNModel(input_size, hidden_size, output_size).to(device)
print(model)

SimpleRNNModel(
  (rnn): RNN(384, 128, batch_first=True)
  (fc): Linear(in_features=128, out_features=5, bias=True)
)
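
Because every input is a sequence of length 1 and the initial hidden state defaults to zeros, the recurrent layer performs a single step, `h = tanh(W_ih x + b_ih + b_hh)`, and the hidden-to-hidden weights never contribute to the output. The sketch below checks this on random inputs; the batch size of 4 and the seed are arbitrary choices, not taken from the notebook.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
rnn = nn.RNN(input_size=384, hidden_size=128, batch_first=True)
x = torch.randn(4, 1, 384)  # (batch, seq_len=1, features)

with torch.no_grad():
    out, h_n = rnn(x)
    # With h_0 = 0 the recurrent term W_hh @ h_0 vanishes; only its bias remains
    manual = torch.tanh(
        x[:, 0, :] @ rnn.weight_ih_l0.T + rnn.bias_ih_l0 + rnn.bias_hh_l0
    )

same_as_dense = torch.allclose(out[:, -1, :], manual, atol=1e-5)
same_as_h_n = torch.allclose(out[:, -1, :], h_n[-1], atol=1e-5)
print(same_as_dense, same_as_h_n)
```

In this setup the model therefore behaves like a feed-forward network with one `tanh` hidden layer; the recurrence would only matter with sequences longer than one step.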


## Training

The loss function (`CrossEntropyLoss`) and the optimizer (`Adam`, learning rate 0.001) are defined.
The model is trained for 10 epochs, monitoring loss and accuracy on the validation set.
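
To put the loss values in context, it helps to know what `CrossEntropyLoss` returns for a model that predicts all five classes with equal probability. The sketch below uses made-up zero logits and arbitrary targets; the result is ln 5, which serves as a chance-level reference point.

```python
import math
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

# Zero logits correspond to a uniform distribution over the 5 classes
logits = torch.zeros(3, 5)
targets = torch.tensor([0, 2, 4])  # arbitrary class indices in 0..4

uniform_loss = criterion(logits, targets).item()
print(f"{uniform_loss:.4f}")  # 1.6094
print(f"{math.log(5):.4f}")   # 1.6094
```

Note that `CrossEntropyLoss` takes raw logits and integer class indices, which is why the model has no softmax layer and the labels are shifted to 0-4.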

In [7]:
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

num_epochs = 10

for epoch in range(num_epochs):
    model.train()
    train_loss = 0.0
    correct_train = 0
    total_train = 0
    
    for X_batch, y_batch in train_loader:
        optimizer.zero_grad()
        outputs = model(X_batch)
        loss = criterion(outputs, y_batch)
        loss.backward()
        optimizer.step()
        
        train_loss += loss.item()
        predicted = outputs.argmax(dim=1)
        total_train += y_batch.size(0)
        correct_train += (predicted == y_batch).sum().item()
        
    train_acc = 100 * correct_train / total_train
    avg_train_loss = train_loss / len(train_loader)
    
    # Validation
    model.eval()
    valid_loss = 0.0
    correct_valid = 0
    total_valid = 0
    
    with torch.no_grad():
        for X_batch, y_batch in valid_loader:
            outputs = model(X_batch)
            loss = criterion(outputs, y_batch)
            valid_loss += loss.item()
            predicted = outputs.argmax(dim=1)
            total_valid += y_batch.size(0)
            correct_valid += (predicted == y_batch).sum().item()
            
    valid_acc = 100 * correct_valid / total_valid
    avg_valid_loss = valid_loss / len(valid_loader)
    
    print(f'Epoch [{epoch+1}/{num_epochs}], '
          f'Train Loss: {avg_train_loss:.4f}, Train Acc: {train_acc:.2f}%, '
          f'Valid Loss: {avg_valid_loss:.4f}, Valid Acc: {valid_acc:.2f}%')

Epoch [1/10], Train Loss: 1.1183, Train Acc: 50.88%, Valid Loss: 1.0956, Valid Acc: 52.05%
Epoch [2/10], Train Loss: 1.0853, Train Acc: 52.28%, Valid Loss: 1.0768, Valid Acc: 52.86%
Epoch [3/10], Train Loss: 1.0735, Train Acc: 52.79%, Valid Loss: 1.0736, Valid Acc: 52.63%
Epoch [4/10], Train Loss: 1.0672, Train Acc: 53.04%, Valid Loss: 1.0683, Valid Acc: 53.44%
Epoch [5/10], Train Loss: 1.0631, Train Acc: 53.26%, Valid Loss: 1.0682, Valid Acc: 53.07%
Epoch [6/10], Train Loss: 1.0601, Train Acc: 53.39%, Valid Loss: 1.0643, Valid Acc: 53.39%

## Evaluation

The model is evaluated on the test set (`test_all.npy`).
The final accuracy is computed and a detailed classification report is shown.

In [8]:
model.eval()
all_preds = []
all_labels = []

with torch.no_grad():
    for X_batch, y_batch in test_loader:
        outputs = model(X_batch)
        predicted = outputs.argmax(dim=1)
        all_preds.extend(predicted.cpu().numpy())
        all_labels.extend(y_batch.cpu().numpy())

# Compute metrics
accuracy = accuracy_score(all_labels, all_preds)
print(f'Test Accuracy: {accuracy * 100:.2f}%')

print("\nClassification Report:")
print(classification_report(all_labels, all_preds, target_names=['Class 1', 'Class 2', 'Class 3', 'Class 4', 'Class 5']))

Test Accuracy: 53.42%

Classification Report:
              precision    recall  f1-score   support

     Class 1       0.67      0.62      0.65      6000
     Class 2       0.45      0.48      0.46      6000
     Class 3       0.42      0.45      0.43      6000
     Class 4       0.48      0.40      0.44      6000
     Class 5       0.65      0.72      0.69      6000

    accuracy                           0.53     30000
   macro avg       0.54      0.53      0.53     30000
weighted avg       0.54      0.53      0.53     30000
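
Since every class has the same support (6000), overall accuracy equals the unweighted mean of the per-class recalls. The short check below uses the recall values from the report above, which are rounded to two decimals, so the result is only approximate.

```python
# Per-class recall and support, copied from the classification report
recalls = [0.62, 0.48, 0.45, 0.40, 0.72]
supports = [6000] * 5

# Accuracy = correctly classified samples / all samples
correct = sum(r * s for r, s in zip(recalls, supports))
accuracy_from_recall = correct / sum(supports)

# Macro recall = plain average over classes
macro_recall = sum(recalls) / len(recalls)

print(f"{accuracy_from_recall:.3f}")  # 0.534
print(f"{macro_recall:.3f}")          # 0.534
```

Both values agree with the reported test accuracy of 53.42% up to rounding. The per-class recalls also show that the extreme classes (1 and 5) are recognized noticeably better than the middle ones.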

