# AG News Classification: Transformer Comparison
This notebook fulfills the rubric for **Task 2**:
- Loads and splits the AG News dataset (70/15/15)
- Trains RoBERTa, DeBERTa, and ModernBERT
- Computes F1 scores and generates a visual comparison
- Optional bonus (uses `rpp_classified.json` to compare against LLM labels)

In [None]:
# === 0. Dependencies ===
!pip -q install datasets transformers torch scikit-learn matplotlib seaborn
import pandas as pd, numpy as np, torch, matplotlib.pyplot as plt, seaborn as sns
from datasets import load_dataset
from sklearn.metrics import f1_score, classification_report
from transformers import AutoTokenizer, AutoModelForSequenceClassification, TrainingArguments, Trainer

In [None]:
# === 1. AG News dataset ===
dataset = load_dataset('ag_news')
# Hold out 30% of the training split, then halve the holdout into validation and test (70/15/15).
train_val = dataset['train'].train_test_split(test_size=0.3, seed=42)
train = train_val['train']
val_test = train_val['test'].train_test_split(test_size=0.5, seed=42)
val, test = val_test['train'], val_test['test']
print(train.shape, val.shape, test.shape)
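The two-stage split above can be checked with plain arithmetic. The sketch below is only an illustration: `split_sizes` is a hypothetical helper, not part of `datasets`, and it ignores the library's exact rounding rules. It assumes the AG News training split has 120,000 rows, which is the only split this notebook partitions.

```python
def split_sizes(n_rows, holdout_frac=0.3, test_frac_of_holdout=0.5):
    """Return (train, val, test) sizes for a two-stage holdout split.

    Hypothetical helper for illustration; mirrors the two
    train_test_split calls without reproducing their rounding.
    """
    n_holdout = round(n_rows * holdout_frac)           # first call: test_size=0.3
    n_test = round(n_holdout * test_frac_of_holdout)   # second call: test_size=0.5
    n_val = n_holdout - n_test
    n_train = n_rows - n_holdout
    return n_train, n_val, n_test


n_train, n_val, n_test = split_sizes(120_000)
print(n_train, n_val, n_test)
```

Holding out 30% and then halving it gives 15% each for validation and test, which is where the 70/15/15 ratio comes from.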

In [None]:
# === 2. Tokenization ===
def tokenize(batch, tokenizer):
    return tokenizer(batch['text'], padding='max_length', truncation=True, max_length=128)

model_names = [
    'roberta-base',
    'microsoft/deberta-base',
    'answerdotai/ModernBERT-base'  # ModernBERT (needs transformers >= 4.48)
]
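To make the effect of `padding='max_length'` and `truncation=True` concrete, here is a toy sketch of what those two options do to a single sequence. It is not the tokenizer's actual implementation: `pad_or_truncate` is a hypothetical helper, the pad id of `0` is an assumption (each model defines its own), and real tokenizers also preserve special tokens when truncating.

```python
def pad_or_truncate(token_ids, max_length, pad_id=0):
    """Toy stand-in for padding='max_length' with truncation=True."""
    kept = token_ids[:max_length]        # truncation: drop tokens past max_length
    n_pad = max_length - len(kept)       # how many pad tokens are needed
    input_ids = kept + [pad_id] * n_pad
    attention_mask = [1] * len(kept) + [0] * n_pad  # 1 = real token, 0 = padding
    return input_ids, attention_mask


short_ids, short_mask = pad_or_truncate([101, 7592, 102], max_length=5)
long_ids, long_mask = pad_or_truncate(list(range(10)), max_length=4)
print(short_ids, short_mask)
print(long_ids, long_mask)
```

Every example ends up with exactly `max_length` ids, which is why the default collator can stack them into a batch without dynamic padding.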

In [None]:
# === 3. Training and evaluation ===
f1_scores = {}
for model_name in model_names:
    print(f'\n🚀 Training {model_name}...')
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    tokenized_train = train.map(lambda x: tokenize(x, tokenizer), batched=True)
    tokenized_val = val.map(lambda x: tokenize(x, tokenizer), batched=True)
    tokenized_train.set_format('torch', columns=['input_ids','attention_mask','label'])
    tokenized_val.set_format('torch', columns=['input_ids','attention_mask','label'])

    model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=4)
    args = TrainingArguments(output_dir=f'outputs/{model_name}',
                             eval_strategy='epoch',  # called evaluation_strategy before transformers 4.41
                             save_strategy='epoch',
                             learning_rate=2e-5, per_device_train_batch_size=8,
                             per_device_eval_batch_size=8, num_train_epochs=1,
                             weight_decay=0.01, logging_dir='logs', logging_steps=100)

    trainer = Trainer(model=model, args=args, train_dataset=tokenized_train,
                      eval_dataset=tokenized_val, tokenizer=tokenizer)
    trainer.train()

    # Evaluate on the validation split and record the macro-averaged F1.
    preds = trainer.predict(tokenized_val)
    y_true = preds.label_ids
    y_pred = preds.predictions.argmax(-1)  # logits -> predicted class id
    f1 = f1_score(y_true, y_pred, average='macro')
    f1_scores[model_name] = f1
    print(classification_report(y_true, y_pred))

print('\n📊 F1 scores obtained:', f1_scores)
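For readers unfamiliar with `average='macro'`, the following pure-Python sketch shows the idea: compute F1 per class, then take the unweighted mean so every class counts equally. `macro_f1` is a hypothetical helper written for illustration, not scikit-learn's implementation, and it assumes an undefined per-class F1 should count as 0.

```python
def macro_f1(y_true, y_pred, n_classes):
    """Unweighted mean of per-class F1 scores (illustrative sketch)."""
    per_class = []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); assumed 0 when the class never appears.
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / n_classes


toy_true = [0, 0, 1, 1, 2, 2]
toy_pred = [0, 1, 1, 1, 2, 0]
toy_score = macro_f1(toy_true, toy_pred, n_classes=3)
print(round(toy_score, 4))
```

In the toy data the per-class scores are 0.5, 0.8, and 2/3, so the macro average sits between them rather than being dominated by the best class.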

In [None]:
# === 4. Visualization ===
sns.barplot(x=list(f1_scores.keys()), y=list(f1_scores.values()))
plt.title('F1-score comparison (validation)')
plt.ylabel('F1-macro')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

### 🧩 Bonus (comparison with LLM)
You can load `data/rpp_classified.json` and compare the three models' predictions on its 50 news items against the LLM labels to compute a new F1 score.
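One possible way to prepare the LLM labels for that comparison is sketched below. The JSON schema is an assumption: the notebook does not show the file, so the field names `text` and `category` and the helper `parse_llm_labels` are hypothetical and would need adjusting to the real structure. The id mapping follows the AG News label order (0 = World, 1 = Sports, 2 = Business, 3 = Sci/Tech).

```python
import json

# AG News class names mapped to their integer label ids.
AG_NEWS_LABEL_IDS = {"World": 0, "Sports": 1, "Business": 2, "Sci/Tech": 3}


def parse_llm_labels(json_text, text_key="text", label_key="category"):
    """Return (texts, label_ids) from a JSON list of LLM-classified items.

    The keys are assumed field names; change them to match the real file.
    """
    texts, label_ids = [], []
    for record in json.loads(json_text):
        name = record[label_key]
        if name not in AG_NEWS_LABEL_IDS:
            continue  # skip items labeled outside the four AG News classes
        texts.append(record[text_key])
        label_ids.append(AG_NEWS_LABEL_IDS[name])
    return texts, label_ids


# Made-up sample records, only to demonstrate the expected shape.
sample_json = json.dumps([
    {"text": "Team wins the final", "category": "Sports"},
    {"text": "Markets close higher", "category": "Business"},
    {"text": "Unlabeled item", "category": "Other"},
])
texts, llm_labels = parse_llm_labels(sample_json)
print(texts, llm_labels)
```

With the real file, the text would come from reading `data/rpp_classified.json`, the returned texts would be tokenized like the validation set, and `f1_score(llm_labels, y_pred, average='macro')` would give the new score for each model.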