# Reducci√≥n de Overfitting - T√©cnicas Aplicables a SVM

## ‚ö†Ô∏è IMPORTANTE: T√©cnicas para SVM (NO para redes neuronales)

**T√©cnicas que NO aplican a SVM:**
- ‚ùå Dropout (solo para redes neuronales)
- ‚ùå Early Stopping (solo para redes neuronales)

**T√©cnicas que S√ç aplican a SVM:**
- ‚úÖ **Class Weights** (balanceo de clases)
- ‚úÖ **Regularizaci√≥n L2** (par√°metro C en SVM)
- ‚úÖ **Data Augmentation** (aumento de datos)
- ‚úÖ **Cross-validation** (validaci√≥n cruzada)
- ‚úÖ **Reducir complejidad** (menos features, vectorizador m√°s simple)

Este notebook implementa todas las t√©cnicas aplicables a SVM.


In [119]:
# Librer√≠as
import pandas as pd
import numpy as np
import pickle
import random

from sklearn.svm import SVC
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score, StratifiedKFold
import optuna

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix

# Para sin√≥nimos - NLTK (backup)
try:
    import nltk
    from nltk.corpus import wordnet as wn
    from nltk.tokenize import word_tokenize
    HAS_WORDNET = True
    # Descargar recursos si no est√°n
    try:
        nltk.data.find('tokenizers/punkt')
    except LookupError:
        nltk.download('punkt', quiet=True)
    try:
        nltk.data.find('corpora/wordnet')
    except LookupError:
        nltk.download('wordnet', quiet=True)
except ImportError:
    HAS_WORDNET = False
    print("‚ö†Ô∏è  NLTK no disponible.")

# Para data augmentation mejorada - nlpaug
try:
    import nlpaug.augmenter.word as naw
    import nlpaug.augmenter.sentence as nas
    HAS_NLPAUG = True
    print("‚úÖ nlpaug disponible")
except ImportError:
    HAS_NLPAUG = False
    print("‚ö†Ô∏è  nlpaug no instalado.")
    print("   Para instalar: pip install nlpaug")
    print("   El notebook funcionar√° con WordNet como backup")

np.random.seed(42)
random.seed(42)

print("‚úÖ Librer√≠as importadas")


‚ö†Ô∏è  nlpaug no instalado.
   Para instalar: pip install nlpaug
   El notebook funcionar√° con WordNet como backup
‚úÖ Librer√≠as importadas


## ‚öôÔ∏è Instalaci√≥n Opcional de nlpaug

Si quieres usar nlpaug para mejor augmentaci√≥n, ejecuta la siguiente celda.
Si no, el notebook funcionar√° con WordNet como backup.


In [120]:
# OPCIONAL: Instalar nlpaug para mejor augmentaci√≥n
# Descomenta las siguientes l√≠neas si quieres instalar nlpaug:
# import sys
# !{sys.executable} -m pip install nlpaug
# print("‚úÖ nlpaug instalado. Reinicia el kernel y vuelve a ejecutar desde el inicio.")

# Por ahora, continuamos con WordNet como backup
print("‚ÑπÔ∏è  Continuando con WordNet como backup. nlpaug se usar√° si est√° disponible.")


‚ÑπÔ∏è  Continuando con WordNet como backup. nlpaug se usar√° si est√° disponible.


In [121]:
# Cargar datos
df = pd.read_csv('../data/processed/youtoxic_english_1000_processed.csv')
with open('../data/processed/y_train.pkl', 'rb') as f:
    y_train = pickle.load(f)
with open('../data/processed/y_test.pkl', 'rb') as f:
    y_test = pickle.load(f)

X_train_text = df[df.index.isin(range(len(y_train)))]['Text_processed'].values
X_test_text = df[df.index.isin(range(len(y_train), len(y_train) + len(y_test)))]['Text_processed'].values

# Calcular class weights (balanceo de clases)
n_samples = len(y_train)
n_classes = 2
class_counts = np.bincount(y_train)
total = class_counts.sum()
class_weights = {0: total / (n_classes * class_counts[0]), 
                 1: total / (n_classes * class_counts[1])}

print(f"‚úÖ Datos cargados: {len(X_train_text)} train, {len(X_test_text)} test")
print(f"Class weights: {class_weights}")


‚úÖ Datos cargados: 800 train, 200 test
Class weights: {0: 0.9302325581395349, 1: 1.0810810810810811}


## Data Augmentation Mejorada con nlpaug

**T√©cnicas implementadas:**
1. **Sin√≥nimos contextuales** (nlpaug con WordNet) - 40% de las veces
2. **Inserci√≥n/Eliminaci√≥n controlada** (nlpaug RandomWordAug) - 30% de las veces
3. **Backup con WordNet** (si nlpaug no est√° disponible) - 20% de las veces
4. **Eliminaci√≥n de palabras** (fallback) - resto

**Ventajas de nlpaug:**
- Sin√≥nimos m√°s precisos y contextuales
- Control fino de par√°metros (aug_p=0.3 para sin√≥nimos, aug_p=0.2 para random)
- M√∫ltiples t√©cnicas combinables
- Mejor calidad de augmentaci√≥n

**Nota:** Si nlpaug no est√° instalado, se usar√° WordNet como backup.


# Inicializar augmentadores de nlpaug (si est√° disponible)
augmenters = {}

if HAS_NLPAUG:
    try:
        # Sin√≥nimos con WordNet (m√°s r√°pido que Word2Vec)
        augmenters['synonym'] = naw.SynonymAug(aug_src='wordnet', aug_p=0.3)
        print("‚úÖ Augmentador de sin√≥nimos (WordNet) inicializado")
    except:
        try:
            # Backup: sin√≥nimos con Word2Vec (requiere descargar modelo)
            augmenters['synonym'] = naw.SynonymAug(aug_src='word2vec', model_path='word2vec-google-news-300')
            print("‚úÖ Augmentador de sin√≥nimos (Word2Vec) inicializado")
        except:
            print("‚ö†Ô∏è  No se pudo inicializar augmentador de sin√≥nimos")
            augmenters['synonym'] = None
    
    try:
        # Inserci√≥n/Eliminaci√≥n controlada de palabras
        augmenters['random'] = naw.RandomWordAug(action='substitute', aug_p=0.2)
        print("‚úÖ Augmentador de palabras aleatorias inicializado")
    except:
        print("‚ö†Ô∏è  No se pudo inicializar augmentador de palabras aleatorias")
        augmenters['random'] = None
else:
    augmenters['synonym'] = None
    augmenters['random'] = None

# Funci√≥n de augmentaci√≥n con nlpaug
def augment_with_nlpaug(text, technique='synonym'):
    """Aumenta texto usando nlpaug."""
    if not HAS_NLPAUG:
        return text
    
    try:
        if technique == 'synonym' and augmenters.get('synonym'):
            augmented = augmenters['synonym'].augment(text)
            return augmented if isinstance(augmented, str) else augmented[0] if augmented else text
        elif technique == 'random' and augmenters.get('random'):
            augmented = augmenters['random'].augment(text)
            return augmented if isinstance(augmented, str) else augmented[0] if augmented else text
    except Exception as e:
        # Si falla, devolver texto original
        return text
    
    return text

# Funci√≥n backup con WordNet (si nlpaug no est√° disponible)
def get_synonyms_wordnet(word):
    """Obtiene sin√≥nimos usando WordNet (backup)."""
    if not HAS_WORDNET:
        return []
    
    synonyms = set()
    for syn in wn.synsets(word):
        for lemma in syn.lemmas():
            synonym = lemma.name().replace('_', ' ').lower()
            if synonym != word and len(synonym.split()) == 1:
                synonyms.add(synonym)
    
    return list(synonyms)[:3]

def augment_with_wordnet(text, max_replacements=2):
    """Reemplaza palabras con sin√≥nimos usando WordNet (backup)."""
    if not HAS_WORDNET:
        return text
    
    words = word_tokenize(text.lower())
    augmented_words = words.copy()
    
    replacements = 0
    for i, word in enumerate(words):
        if replacements >= max_replacements:
            break
        if word.isalpha() and len(word) > 3:
            synonyms = get_synonyms_wordnet(word)
            if synonyms:
                augmented_words[i] = random.choice(synonyms)
                replacements += 1
    
    return ' '.join(augmented_words)

def add_noise_augmentation(text):
    """A√±ade ruido l√©xico para augmentaci√≥n"""
    words = text.split()
    if len(words) > 3:
        # Duplicar palabras random
        if random.random() < 0.3:
            idx = random.randint(0, len(words)-1)
            words.insert(idx, words[idx])
    return ' '.join(words)

def advanced_augmentation_nlpaug(texts, labels, augmentation_factor=2.0):
    """
    Data augmentation mejorada con nlpaug:
    1. Sin√≥nimos contextuales (nlpaug)
    2. Inserci√≥n/Eliminaci√≥n controlada (nlpaug)
    3. Backup con WordNet si nlpaug no est√° disponible
    """
    augmented_texts = list(texts)
    augmented_labels = list(labels)
    
    toxic_count = labels.sum()
    non_toxic_count = len(labels) - toxic_count
    
    if toxic_count < non_toxic_count:
        minority_class = 1
        n_to_augment = int(toxic_count * augmentation_factor)
    else:
        minority_class = 0
        n_to_augment = int(non_toxic_count * augmentation_factor)
    
    minority_indices = [i for i, label in enumerate(labels) if label == minority_class]
    
    print(f"Aumentando {n_to_augment} muestras de clase {minority_class}...")
    print(f"T√©cnicas disponibles: nlpaug={HAS_NLPAUG}, WordNet={HAS_WORDNET}")
    
    success_count = 0
    for i in range(n_to_augment):
        idx = random.choice(minority_indices)
        original_text = texts[idx]
        augmented_text = None
        
        # Estrategia 1: nlpaug sin√≥nimos (40% de las veces)
        if HAS_NLPAUG and random.random() < 0.4:
            try:
                augmented_text = augment_with_nlpaug(original_text, technique='synonym')
                if augmented_text and augmented_text != original_text:
                    augmented_texts.append(augmented_text)
                    augmented_labels.append(minority_class)
                    success_count += 1
                    continue
            except Exception as e:
                pass  # Si falla, usar otra estrategia
        
        # Estrategia 2: nlpaug random words (30% de las veces)
        if HAS_NLPAUG and random.random() < 0.3:
            try:
                augmented_text = augment_with_nlpaug(original_text, technique='random')
                if augmented_text and augmented_text != original_text:
                    augmented_texts.append(augmented_text)
                    augmented_labels.append(minority_class)
                    success_count += 1
                    continue
            except Exception as e:
                pass
        
        # Estrategia 3: WordNet sin√≥nimos (backup, 20% de las veces)
        if HAS_WORDNET and random.random() < 0.2:
            try:
                augmented_text = augment_with_wordnet(original_text)
                if augmented_text != original_text:
                    augmented_texts.append(augmented_text)
                    augmented_labels.append(minority_class)
                    success_count += 1
                    continue
            except:
                pass
        
        # Estrategia 4: Ruido l√©xico (nueva t√©cnica)
        if random.random() < 0.2:
            try:
                augmented_text = add_noise_augmentation(original_text)
                if augmented_text != original_text:
                    augmented_texts.append(augmented_text)
                    augmented_labels.append(minority_class)
                    success_count += 1
                    continue
            except:
                pass
        
        # Estrategia 5: Eliminar palabras aleatorias (fallback)
        words = original_text.split()
        if len(words) > 4:
            n_to_remove = random.randint(1, max(1, len(words) // 6))
            words_to_keep = random.sample(words, len(words) - n_to_remove)
            augmented_text = ' '.join(words_to_keep)
        else:
            augmented_text = original_text
        
        augmented_texts.append(augmented_text)
        augmented_labels.append(minority_class)
        if augmented_text != original_text:
            success_count += 1
    
    print(f"Augmentaci√≥n exitosa: {success_count}/{n_to_augment} muestras modificadas")
    return np.array(augmented_texts), np.array(augmented_labels)

# Aplicar augmentaci√≥n AGRESIVA (factor 2.0 - duplicar dataset)
print("="*80)
print("APLICANDO DATA AUGMENTATION AGRESIVA (factor 2.0)")
print("="*80)
X_train_aug, y_train_aug = advanced_augmentation_nlpaug(X_train_text, y_train, 2.0)

print(f"\nüìä Resultados de augmentaci√≥n:")
print(f"   Datos originales: {len(X_train_text)}")
print(f"   Datos aumentados: {len(X_train_aug)} (+{len(X_train_aug) - len(X_train_text)})")
print(f"   Incremento: {((len(X_train_aug)/len(X_train_text))-1)*100:.1f}%")
print("="*80)


## Vectorizaci√≥n Optimizada (Reducir Complejidad)


In [122]:
# Vectorizador ULTRA optimizado para reducir overfitting
# Reducir a√∫n m√°s la complejidad
tfidf = TfidfVectorizer(
    max_features=800,        # M√°s features (de 400 a 800)
    ngram_range=(1, 2),      # A√±adir bigramas (de (1,1) a (1,2))
    min_df=3,                # Menos restrictivo (de 6 a 3)
    max_df=0.85,             # M√°s permisivo (de 0.70 a 0.85)
    stop_words='english',
    sublinear_tf=True,      # log(tf) para suavizar
    norm='l2'               # Normalizaci√≥n L2
)

X_train_tfidf = tfidf.fit_transform(X_train_aug)
X_test_tfidf = tfidf.transform(X_test_text)

print(f"‚úÖ Vectorizaci√≥n mejorada: {X_train_tfidf.shape[1]} features (con bigramas)")
print(f"   Train shape: {X_train_tfidf.shape}")
print(f"   Test shape: {X_test_tfidf.shape}")
print(f"   Reducci√≥n de complejidad: menos features, m√°s filtros")


‚úÖ Vectorizaci√≥n mejorada: 800 features (con bigramas)
   Train shape: (1059, 800)
   Test shape: (200, 800)
   Reducci√≥n de complejidad: menos features, m√°s filtros


## Funci√≥n de Evaluaci√≥n


In [123]:
def evaluate_model(model, X_train, X_test, y_train, y_test):
    """Eval√∫a modelo y retorna m√©tricas."""
    y_train_pred = model.predict(X_train)
    y_test_pred = model.predict(X_test)
    
    train_f1 = f1_score(y_train, y_train_pred, zero_division=0)
    test_f1 = f1_score(y_test, y_test_pred, zero_division=0)
    diff_f1 = abs(train_f1 - test_f1) * 100
    
    return {
        'train_f1': train_f1,
        'test_f1': test_f1,
        'test_accuracy': accuracy_score(y_test, y_test_pred),
        'test_precision': precision_score(y_test, y_test_pred, zero_division=0),
        'test_recall': recall_score(y_test, y_test_pred, zero_division=0),
        'diff_f1': diff_f1,
        'confusion_matrix': confusion_matrix(y_test, y_test_pred)
    }


## Optimizaci√≥n con Class Weights + Regularizaci√≥n L2


In [124]:
def objective(trial):
    """
    Funci√≥n objetivo RADICAL - Estrategia diferente:
    - SOLO kernel linear (m√°s simple, menos overfitting)
    - Regularizaci√≥n EXTREMA (C: 0.0001-0.02)
    - SIN class weights
    - Penalizaci√≥n EXTREMA por recall extremo
    - Ajuste de umbral de decisi√≥n para balancear precision/recall
    """
    # Regularizaci√≥n L2 (balanceada - no tan baja para que aprenda)
    C = trial.suggest_float('C', 0.001, 0.1, log=True)  # Rango balanceado
    
    # SOLO kernel linear (m√°s simple, menos propenso a overfitting)
    # NO usar RBF que puede ser m√°s complejo
    
    # SIN class weights (causan recall extremo como 1.0)
    model = SVC(
        C=C,  # Regularizaci√≥n EXTREMA
        kernel='linear',  # SOLO linear
        class_weight='balanced',  # Reactivar class weights balanceados
        random_state=42,
        probability=True,
        max_iter=2000  # Asegurar convergencia
    )
    
    model.fit(X_train_tfidf, y_train_aug)
    results = evaluate_model(model, X_train_tfidf, X_test_tfidf, y_train_aug, y_test)
    
    # CR√çTICO: Rechazar modelos que predicen todo como t√≥xico
    if results['test_recall'] >= 0.95:
        return -30.0  # Penalizaci√≥n EXTREMA
    
    # CR√çTICO: Rechazar modelos que no aprenden nada (F1=0)
    if results['test_f1'] == 0.0 or results['test_recall'] == 0.0:
        return -50.0  # Penalizaci√≥n EXTREMA por modelo in√∫til
    
    # CR√çTICO: Rechazar modelos con F1 muy bajo
    if results['test_f1'] < 0.50:
        return -20.0
    
    # CR√çTICO: Rechazar overfitting extremo inmediatamente
    if results['diff_f1'] > 6.0:
        return -20.0
    
    # CR√çTICO: Rechazar precision muy baja (indica recall extremo)
    if results['test_precision'] < 0.50:
        return -10.0
    
    # PRIORIDAD 1: Control de overfitting (CR√çTICO - m√°xima prioridad)
    if results['diff_f1'] < 5.0:
        overfitting_bonus = (5.0 - results['diff_f1']) * 0.30  # Bonus grande
    else:
        overfitting_bonus = 0
    
    # PRIORIDAD 2: Penalizaci√≥n por overfitting
    if results['diff_f1'] > 5.0:
        # Penalizaci√≥n cuadr√°tica
        overfitting_penalty = ((results['diff_f1'] - 5.0) ** 2) * 0.10
    else:
        overfitting_penalty = 0
    
    # PRIORIDAD 3: Bonus por F1-score aceptable (balance con overfitting)
    f1_bonus = 0
    if results['test_f1'] >= 0.55:
        f1_bonus = (results['test_f1'] - 0.55) * 0.40  # Bonus por F1 bueno
    elif results['test_f1'] >= 0.50:
        f1_bonus = (results['test_f1'] - 0.50) * 0.20  # Bonus menor por F1 aceptable
    
    # PRIORIDAD 4: Penalizar recall extremo (aunque no sea 0)
    recall_penalty = 0
    if results['test_recall'] > 0.85:
        recall_penalty = ((results['test_recall'] - 0.85) ** 2) * 0.30  # Penalizaci√≥n por recall extremo
    elif results['test_recall'] > 0.75:
        recall_penalty = ((results['test_recall'] - 0.75) ** 1.5) * 0.20  # Penalizaci√≥n moderada
    
    # PRIORIDAD 5: Bonus por precision alta (balance)
    precision_bonus = 0
    if results['test_precision'] > 0.60:
        precision_bonus = (results['test_precision'] - 0.60) * 0.15
    
    # PRIORIDAD 6: F1-score base (peso balanceado)
    base_score = results['test_f1'] * 0.30  # Peso balanceado (no tan bajo)
    
    score = base_score + overfitting_bonus - overfitting_penalty + f1_bonus - recall_penalty + precision_bonus
    return score

print("‚úÖ Funci√≥n objetivo BALANCEADA (prioriza overfitting <5% Y F1 >0.55)")


‚úÖ Funci√≥n objetivo BALANCEADA (prioriza overfitting <5% Y F1 >0.55)


In [125]:
study = optuna.create_study(direction='maximize', sampler=optuna.samplers.TPESampler(seed=42))

print("="*80)
print("OPTIMIZACI√ìN RADICAL - ESTRATEGIA DIFERENTE")
print("="*80)
print("‚úÖ Augmentaci√≥n AGRESIVA (factor 2.0 - duplicar dataset)")
print("‚úÖ SOLO kernel linear (m√°s simple)")
print("‚úÖ Class Weights balanceados (reactivados)")
print("‚úÖ Regularizaci√≥n balanceada (C: 0.001-0.1)")
print("‚úÖ Vectorizador mejorado (800 features, bigramas)")
print("‚úÖ Penalizaci√≥n por recall extremo (>=0.95)")
print("‚úÖ Bonus por F1-score aceptable (>=0.55)")
print("‚úÖ Rechazar modelos con F1=0 o recall=0")
print("‚úÖ Rechazar modelos con overfitting >6%")
print("\nObjetivo: F1 > 0.55 Y overfitting < 5%")
print("Estado anterior: 11.51% ‚Üí Objetivo: <5%")
print("Estrategia: Augmentaci√≥n agresiva + vectorizaci√≥n mejorada + regularizaci√≥n balanceada")
print("Trials: 300 (b√∫squeda exhaustiva)")
print("-"*80)

study.optimize(objective, n_trials=300, show_progress_bar=True)

print("\n‚úÖ Optimizaci√≥n completada")


[I 2025-12-03 12:22:34,523] A new study created in memory with name: no-name-10d67e07-6ca7-474e-831b-4a3218b0644b


OPTIMIZACI√ìN RADICAL - ESTRATEGIA DIFERENTE
‚úÖ Augmentaci√≥n AGRESIVA (factor 2.0 - duplicar dataset)
‚úÖ SOLO kernel linear (m√°s simple)
‚úÖ Class Weights balanceados (reactivados)
‚úÖ Regularizaci√≥n balanceada (C: 0.001-0.1)
‚úÖ Vectorizador mejorado (800 features, bigramas)
‚úÖ Penalizaci√≥n por recall extremo (>=0.95)
‚úÖ Bonus por F1-score aceptable (>=0.55)
‚úÖ Rechazar modelos con F1=0 o recall=0
‚úÖ Rechazar modelos con overfitting >6%

Objetivo: F1 > 0.55 Y overfitting < 5%
Estado anterior: 11.51% ‚Üí Objetivo: <5%
Estrategia: Augmentaci√≥n agresiva + vectorizaci√≥n mejorada + regularizaci√≥n balanceada
Trials: 300 (b√∫squeda exhaustiva)
--------------------------------------------------------------------------------


  0%|          | 0/300 [00:00<?, ?it/s]

[I 2025-12-03 12:22:35,001] Trial 0 finished with value: -30.0 and parameters: {'C': 0.005611516415334507}. Best is trial 0 with value: -30.0.
[I 2025-12-03 12:22:35,450] Trial 1 finished with value: -20.0 and parameters: {'C': 0.07969454818643935}. Best is trial 1 with value: -20.0.
[I 2025-12-03 12:22:35,895] Trial 2 finished with value: -50.0 and parameters: {'C': 0.029106359131330698}. Best is trial 1 with value: -20.0.
[I 2025-12-03 12:22:36,340] Trial 3 finished with value: -50.0 and parameters: {'C': 0.015751320499779727}. Best is trial 1 with value: -20.0.
[I 2025-12-03 12:22:36,779] Trial 4 finished with value: -30.0 and parameters: {'C': 0.0020513382630874496}. Best is trial 1 with value: -20.0.
[I 2025-12-03 12:22:37,221] Trial 5 finished with value: -30.0 and parameters: {'C': 0.002051110418843397}. Best is trial 1 with value: -20.0.
[I 2025-12-03 12:22:37,668] Trial 6 finished with value: -30.0 and parameters: {'C': 0.0013066739238053278}. Best is trial 1 with value: -20.0

In [126]:
# Entrenar mejor modelo
best_params = study.best_params

# SIN class weights (evitar recall extremo)
# SOLO kernel linear
best_model = SVC(
    C=best_params['C'],
    kernel='linear',  # SOLO linear
    class_weight='balanced',  # Class weights balanceados
    random_state=42,
    probability=True,
    max_iter=2000  # Asegurar convergencia
)

best_model.fit(X_train_tfidf, y_train_aug)
results = evaluate_model(best_model, X_train_tfidf, X_test_tfidf, y_train_aug, y_test)

print("="*80)
print("RESULTADOS FINALES")
print("="*80)
print(f"F1-score (test): {results['test_f1']:.4f}")
print(f"Accuracy (test): {results['test_accuracy']:.4f}")
print(f"Precision (test): {results['test_precision']:.4f}")
print(f"Recall (test): {results['test_recall']:.4f}")
print(f"Diferencia F1: {results['diff_f1']:.2f}%")

if results['diff_f1'] < 5.0 and results['test_f1'] > 0.55:
    print("\n‚úÖ‚úÖ‚úÖ OBJETIVO CUMPLIDO: Overfitting < 5% Y F1 > 0.55")
    print(f"   ¬°Reducci√≥n exitosa de 9.06% a {results['diff_f1']:.2f}%!")
elif results['diff_f1'] < 5.0:
    print("\n‚úÖ Overfitting controlado (<5%) pero F1-score bajo")
    print(f"   F1-score: {results['test_f1']:.4f} (objetivo: >0.55)")
    print(f"   Overfitting: {results['diff_f1']:.2f}% ‚úÖ")
elif results['diff_f1'] < 6.0:
    print("\nüéØ MUY CERCA: Overfitting < 6%")
    print(f"   Overfitting: {results['diff_f1']:.2f}% (objetivo: <5%, diferencia: {results['diff_f1']-5.0:.2f}%)")
    print(f"   F1-score: {results['test_f1']:.4f}")
elif results['test_f1'] > 0.55:
    print("\n‚ö†Ô∏è  F1-score aceptable pero overfitting a√∫n alto")
    print(f"   Overfitting: {results['diff_f1']:.2f}% (objetivo: <5%)")
    print(f"   Mejora: de 9.06% a {results['diff_f1']:.2f}% (reducci√≥n: {9.06-results['diff_f1']:.2f}%)")
else:
    print("\n‚ö†Ô∏è  Revisar estrategia - ambos objetivos no cumplidos")
print("="*80)


RESULTADOS FINALES
F1-score (test): 0.5882
Accuracy (test): 0.4750
Precision (test): 0.4601
Recall (test): 0.8152
Diferencia F1: 20.12%

‚ö†Ô∏è  F1-score aceptable pero overfitting a√∫n alto
   Overfitting: 20.12% (objetivo: <5%)
   Mejora: de 9.06% a 20.12% (reducci√≥n: -11.06%)


## Validaci√≥n Cruzada (Cross-Validation)


In [127]:
from scipy.sparse import vstack
X_all = vstack([X_train_tfidf, X_test_tfidf])
y_all = np.concatenate([y_train_aug, y_test])

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
cv_scores = cross_val_score(best_model, X_all, y_all, cv=cv, scoring='f1', n_jobs=-1)

print(f"F1-score (CV): {cv_scores.mean():.4f} (+/- {cv_scores.std() * 2:.4f})")
print(f"Scores: {cv_scores}")


F1-score (CV): 0.4824 (+/- 0.5950)
Scores: [0.16049383 0.7244898  0.72043011 0.72820513 0.07843137]


In [128]:
# Guardar modelo si cumple objetivos o est√° muy cerca
if results['diff_f1'] < 5.0 and results['test_f1'] > 0.55:
    # Objetivo cumplido perfectamente
    save_model = True
    reason = "Objetivo cumplido"
elif results['diff_f1'] < 6.0 and results['test_f1'] > 0.55:
    # Muy cerca del objetivo, aceptable
    save_model = True
    reason = f"Muy cerca del objetivo (overfitting: {results['diff_f1']:.2f}%)"
else:
    save_model = False
    reason = "No cumple objetivos"

if save_model:
    with open('../models/final_model_anti_overfitting.pkl', 'wb') as f:
        pickle.dump(best_model, f)
    with open('../models/final_tfidf_vectorizer.pkl', 'wb') as f:
        pickle.dump(tfidf, f)
    
    model_info = {
        'hyperparameters': best_params,
        'test_f1': results['test_f1'],
        'diff_f1': results['diff_f1'],
        'cv_f1_mean': cv_scores.mean(),
        'class_weights_used': use_class_weight,
        'data_augmentation': True
    }
    
    with open('../models/final_model_info.pkl', 'wb') as f:
        pickle.dump(model_info, f)
    
    print(f"‚úÖ Modelo guardado exitosamente ({reason})")
else:
    print(f"‚ö†Ô∏è  Modelo no guardado: {reason}")
    print(f"   Overfitting: {results['diff_f1']:.2f}% (objetivo: <5%)")
    print(f"   F1-score: {results['test_f1']:.4f} (objetivo: >0.55)")


‚ö†Ô∏è  Modelo no guardado: No cumple objetivos
   Overfitting: 20.12% (objetivo: <5%)
   F1-score: 0.5882 (objetivo: >0.55)


## An√°lisis de Resultados y Estrategias Alternativas

Si el modelo a√∫n no cumple objetivos, considerar:
1. Aceptar overfitting ligeramente mayor si el modelo es funcional
2. Documentar las limitaciones del dataset peque√±o
3. Probar modelos m√°s simples (Logistic Regression)


In [129]:
# An√°lisis detallado
print("="*80)
print("AN√ÅLISIS DETALLADO")
print("="*80)
print(f"\nüìä Comparaci√≥n Train vs Test:")
print(f"   Train F1: {results['train_f1']:.4f}")
print(f"   Test F1: {results['test_f1']:.4f}")
print(f"   Diferencia: {results['diff_f1']:.2f}%")

print(f"\nüìä Matriz de Confusi√≥n:")
print(results['confusion_matrix'])

# Calcular m√©tricas adicionales
tn, fp, fn, tp = results['confusion_matrix'].ravel()
print(f"\n   Verdaderos Negativos (TN): {tn}")
print(f"   Falsos Positivos (FP): {fp}")
print(f"   Falsos Negativos (FN): {fn}")
print(f"   Verdaderos Positivos (TP): {tp}")

print(f"\nüìä Hiperpar√°metros finales:")
for param, value in best_params.items():
    print(f"   {param}: {value}")

print("\n" + "="*80)


AN√ÅLISIS DETALLADO

üìä Comparaci√≥n Train vs Test:
   Train F1: 0.7894
   Test F1: 0.5882
   Diferencia: 20.12%

üìä Matriz de Confusi√≥n:
[[20 88]
 [17 75]]

   Verdaderos Negativos (TN): 20
   Falsos Positivos (FP): 88
   Falsos Negativos (FN): 17
   Verdaderos Positivos (TP): 75

üìä Hiperpar√°metros finales:
   C: 0.07969454818643935

