# WEEK 2 - MINIMALIST MLP 1

**Architecture:** Input(15) → Dense(16, ReLU) → Output(1, Sigmoid)  
**Parameters:** 273 (712 training samples / 273 ≈ 2.61 samples per parameter, treated here as safe for a regularized baseline)

## Goals for this week:
1. ✅ Load the preprocessed data (Week 1)
2. ✅ Implement the MLP 1 architecture
3. ✅ Define hyperparameters (learning rate, batch size, epochs)
4. ✅ Train the model with callbacks (EarlyStopping, ModelCheckpoint)
5. ✅ Evaluate performance (accuracy, loss, learning curves)
6. ✅ Save the model and metrics

## Why this architecture?
- **Simple and interpretable** - only 1 hidden layer
- **Low overfitting risk** - 273 parameters for 712 training samples
- **Solid baseline** - captures the main patterns without memorizing noise

## 📦 Import Libraries

In [None]:
import pandas as pd
import numpy as np
import json
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# TensorFlow/Keras
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau
from tensorflow.keras.regularizers import l2

# Plot settings
sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (12, 6)

print(f"TensorFlow version: {tf.__version__}")
print(f"Keras version: {keras.__version__}")

## 🎲 Fix Seeds (Reproducibility)

In [None]:
SEED = 42

# Fix the seeds of Python, NumPy, and TensorFlow
import random
random.seed(SEED)
np.random.seed(SEED)
tf.random.set_seed(SEED)

print("✅ Seeds fixed! SEED =", SEED)

## 📂 Load Data (Week 1)

We load the train/validation splits and the split indices saved in Week 1.

In [None]:
# Load the pre-split datasets
X_train = pd.read_csv('../data/processed/X_train.csv')
X_val = pd.read_csv('../data/processed/X_val.csv')
y_train = pd.read_csv('../data/processed/y_train.csv').values.ravel()
y_val = pd.read_csv('../data/processed/y_val.csv').values.ravel()

# Load the split information
with open('../splits/split_indices.json', 'r') as f:
    split_info = json.load(f)

print("✅ Data loaded:")
print(f"   - X_train: {X_train.shape}")
print(f"   - X_val: {X_val.shape}")
print(f"   - y_train: {y_train.shape}")
print(f"   - y_val: {y_val.shape}")
print(f"\n📊 Distribution (train): Class 0={split_info['train_class_distribution']['class_0']}, Class 1={split_info['train_class_distribution']['class_1']}")
print(f"📊 Distribution (val): Class 0={split_info['val_class_distribution']['class_0']}, Class 1={split_info['val_class_distribution']['class_1']}")
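
Before training, it can help to confirm that the loaded labels agree with the distribution stored in `split_indices.json`. The sketch below is a plain-Python illustration; `matches_distribution` is a hypothetical helper, and it assumes the stored `class_0` / `class_1` values are absolute counts (not proportions), which the notebook does not state.

```python
from collections import Counter


def matches_distribution(labels, expected):
    """Compare label counts with an expected {'class_0': n, 'class_1': m} dict.

    Assumption: the expected values are absolute counts, not fractions.
    """
    counts = Counter(int(label) for label in labels)
    return (counts.get(0, 0) == expected["class_0"]
            and counts.get(1, 0) == expected["class_1"])


# Toy labels, only to exercise the helper
toy_labels = [0, 0, 1, 0, 1]
toy_expected = {"class_0": 3, "class_1": 2}
check_ok = matches_distribution(toy_labels, toy_expected)
print(check_ok)
```

In the notebook this would be called as `matches_distribution(y_train, split_info['train_class_distribution'])`, under the same assumption about counts.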

## 🏗️ Define the MLP 1 Architecture

**Structure:**
```
Input (15) → Dense(16, ReLU) + Dropout(0.2) + L2(0.01) → Output(1, Sigmoid)
```

**Parameter count:**
- Layer 1: 15 × 16 + 16 (bias) = **256 parameters**
- Layer 2: 16 × 1 + 1 (bias) = **17 parameters**
- **TOTAL: 273 parameters** (Dropout adds none)

**Ratio:** 712 samples / 273 params = **2.61 samples per parameter** ✅
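
The count above can be checked with a few lines of plain Python. This is a minimal sketch; `dense_params` is a hypothetical helper introduced only for illustration and is not part of Keras.

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    """Weights plus one bias per unit for a fully connected layer."""
    return n_inputs * n_units + n_units


hidden = dense_params(15, 16)   # 15 * 16 + 16
output = dense_params(16, 1)    # 16 * 1 + 1
total = hidden + output
ratio = 712 / total             # training samples per parameter

print(hidden, output, total, round(ratio, 2))  # 256 17 273 2.61
```

Once the model is built, `model.count_params()` should report the same total.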

In [None]:
def criar_mlp1(input_dim=15, learning_rate=0.001):
    """
    MLP 1 - Minimalist
    Input(15) → Dense(16, ReLU, L2=0.01) → Dropout(0.2) → Output(1, Sigmoid)
    """
    model = Sequential([
        # Explicit Input layer instead of the legacy `input_dim` argument
        keras.Input(shape=(input_dim,)),
        Dense(16, activation='relu',
              kernel_regularizer=l2(0.01),
              name='hidden_layer'),
        Dropout(0.2, name='dropout'),
        Dense(1, activation='sigmoid', name='output_layer')
    ])

    # Compile the model
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
        loss='binary_crossentropy',
        metrics=['accuracy',
                 keras.metrics.Precision(name='precision'),
                 keras.metrics.Recall(name='recall')]
    )

    return model

# Create the model
model = criar_mlp1(input_dim=X_train.shape[1])

# Architecture summary
print("="*70)
print("🏗️ MLP 1 ARCHITECTURE - MINIMALIST")
print("="*70)
model.summary()
print("="*70)

## ⚙️ Configure Hyperparameters and Callbacks

**Chosen hyperparameters:**
- **Learning Rate:** 0.001 (Adam's default, a reasonable starting point in most cases)
- **Batch Size:** 32 (a balance between gradient stability and speed)
- **Epochs:** 200 (an upper bound; Early Stopping can end training sooner to avoid overfitting)

**Callbacks:**
- **EarlyStopping:** stops if val_loss does not improve for 20 epochs and restores the best weights
- **ModelCheckpoint:** saves the best model (lowest val_loss)
- **ReduceLROnPlateau:** halves the learning rate if val_loss stalls for 10 epochs (down to a minimum of 1e-6)
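
To make the patience logic concrete, here is a simplified plain-Python sketch of how the two val_loss-driven callbacks react to a sequence of validation losses. It is an illustration only, not the Keras implementation: it ignores `min_delta` and `cooldown`, and the function names and toy loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch), both 0-indexed."""
    best_loss = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


def plateau_learning_rates(val_losses, lr, factor, patience, min_lr):
    """Return the learning rate in effect after each epoch."""
    best_loss = float("inf")
    wait = 0
    rates = []
    for loss in val_losses:
        if loss < best_loss:
            best_loss, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                lr = max(lr * factor, min_lr)  # never go below min_lr
                wait = 0
        rates.append(lr)
    return rates


# Toy validation losses (made up for the example)
toy_losses = [0.9, 0.8, 0.7, 0.75, 0.72, 0.71, 0.74]
stop_epoch, best_epoch = early_stopping_epoch(toy_losses, patience=3)
rates = plateau_learning_rates(toy_losses, lr=0.001, factor=0.5,
                               patience=2, min_lr=1e-6)
print(stop_epoch, best_epoch)
print(rates)
```

With the notebook's settings (patience 10 for the learning-rate reduction, 20 for stopping), the learning rate can be halved before training is stopped.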

In [None]:
# Hyperparameters
BATCH_SIZE = 32
EPOCHS = 200
LEARNING_RATE = 0.001  # must match the learning_rate the model was compiled with above

# Callbacks
callbacks = [
    EarlyStopping(
        monitor='val_loss',
        patience=20,
        restore_best_weights=True,
        verbose=1
    ),
    ModelCheckpoint(
        filepath='../artifacts/mlp1_best_model.keras',
        monitor='val_loss',
        save_best_only=True,
        verbose=1
    ),
    ReduceLROnPlateau(
        monitor='val_loss',
        factor=0.5,
        patience=10,
        min_lr=1e-6,
        verbose=1
    )
]

print("✅ Hyperparameters configured:")
print(f"   - Batch Size: {BATCH_SIZE}")
print(f"   - Epochs: {EPOCHS}")
print(f"   - Learning Rate: {LEARNING_RATE}")
print("   - Callbacks: EarlyStopping, ModelCheckpoint, ReduceLROnPlateau")

## 🚀 Train the Model

We train with validation after every epoch to monitor overfitting.

In [None]:
print("🚀 Starting training...")
print("="*70)

history = model.fit(
    X_train, y_train,
    batch_size=BATCH_SIZE,
    epochs=EPOCHS,
    validation_data=(X_val, y_val),
    callbacks=callbacks,
    verbose=1
)

print("="*70)
print("✅ Training complete!")

## 📊 Plot Learning Curves

We inspect loss and accuracy to detect overfitting or underfitting.
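
Alongside the plots, the train/validation gap can be summarized numerically. The sketch below is a hypothetical helper (`summarize_history` is not part of Keras) that takes a dict shaped like `history.history` and reports the gap at the epoch with the lowest validation loss. The toy values are made up for the example.

```python
def summarize_history(history: dict) -> dict:
    """Report the train/validation gap at the best val_loss epoch (0-indexed)."""
    val_loss = history["val_loss"]
    best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)
    return {
        "best_epoch": best_epoch,
        # Positive values mean validation is worse than training
        "loss_gap": val_loss[best_epoch] - history["loss"][best_epoch],
        "accuracy_gap": (history["accuracy"][best_epoch]
                         - history["val_accuracy"][best_epoch]),
    }


# Toy history with made-up numbers
toy_history = {
    "loss": [0.70, 0.55, 0.48, 0.44],
    "val_loss": [0.68, 0.56, 0.50, 0.52],
    "accuracy": [0.60, 0.74, 0.79, 0.81],
    "val_accuracy": [0.62, 0.73, 0.78, 0.77],
}
summary = summarize_history(toy_history)
print(summary)
```

Because Dropout is active only during training, the training loss can sit slightly above the validation loss, so a small or even negative gap is not a problem by itself. A gap that keeps growing across epochs is the usual sign of overfitting.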

In [None]:
fig, axes = plt.subplots(1, 2, figsize=(15, 5))

# Loss
axes[0].plot(history.history['loss'], label='Train Loss', linewidth=2)
axes[0].plot(history.history['val_loss'], label='Val Loss', linewidth=2)
axes[0].set_xlabel('Epoch', fontsize=12)
axes[0].set_ylabel('Loss', fontsize=12)
axes[0].set_title('MLP 1 - Loss per Epoch', fontsize=14, fontweight='bold')
axes[0].legend(fontsize=11)
axes[0].grid(True, alpha=0.3)

# Accuracy
axes[1].plot(history.history['accuracy'], label='Train Accuracy', linewidth=2)
axes[1].plot(history.history['val_accuracy'], label='Val Accuracy', linewidth=2)
axes[1].set_xlabel('Epoch', fontsize=12)
axes[1].set_ylabel('Accuracy', fontsize=12)
axes[1].set_title('MLP 1 - Accuracy per Epoch', fontsize=14, fontweight='bold')
axes[1].legend(fontsize=11)
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.savefig('../reports/figures/mlp1_learning_curves.png', dpi=300, bbox_inches='tight')
plt.show()

print("✅ Figure saved to: reports/figures/mlp1_learning_curves.png")

## 🎯 Evaluate Performance on the Validation Set

In [None]:
# Predictions
y_pred_proba = model.predict(X_val)
y_pred = (y_pred_proba > 0.5).astype(int).ravel()  # 0.5 decision threshold

# Metrics
val_accuracy = accuracy_score(y_val, y_pred)

print("="*70)
print("📊 VALIDATION METRICS - MLP 1")
print("="*70)
print(f"\n🎯 Accuracy: {val_accuracy:.4f} ({val_accuracy*100:.2f}%)")
print("\n" + "="*70)
print("📋 Classification Report:")
print("="*70)
print(classification_report(y_val, y_pred, target_names=['Died (0)', 'Survived (1)']))
print("="*70)

## 🔍 Confusion Matrix

In [None]:
cm = confusion_matrix(y_val, y_pred)

plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=['Died (0)', 'Survived (1)'],
            yticklabels=['Died (0)', 'Survived (1)'],
            cbar_kws={'label': 'Count'})
plt.xlabel('Predicted', fontsize=12)
plt.ylabel('Actual', fontsize=12)
plt.title('MLP 1 - Confusion Matrix (Validation)', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.savefig('../reports/figures/mlp1_confusion_matrix.png', dpi=300, bbox_inches='tight')
plt.show()

print("✅ Confusion matrix saved to: reports/figures/mlp1_confusion_matrix.png")
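
The numbers in the classification report for the positive class can be derived directly from the confusion matrix. The sketch below assumes scikit-learn's layout for binary labels 0/1 (rows are true classes, columns are predicted classes, so the matrix reads `[[TN, FP], [FN, TP]]`). `binary_metrics` is a hypothetical helper and the toy matrix is made up.

```python
def binary_metrics(cm):
    """Accuracy, precision, recall and F1 for class 1 from [[TN, FP], [FN, TP]]."""
    (tn, fp), (fn, tp) = cm
    total = tn + fp + fn + tp
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy confusion matrix (made-up counts)
toy_cm = [[50, 10], [5, 35]]
toy_metrics = binary_metrics(toy_cm)
print({name: round(value, 4) for name, value in toy_metrics.items()})
```

For the notebook's matrix, the call would be `binary_metrics(cm.tolist())`.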

## 💾 Save Metrics and Final Model

In [None]:
# Save the training history
history_dict = {
    'loss': [float(x) for x in history.history['loss']],
    'val_loss': [float(x) for x in history.history['val_loss']],
    'accuracy': [float(x) for x in history.history['accuracy']],
    'val_accuracy': [float(x) for x in history.history['val_accuracy']],
    'precision': [float(x) for x in history.history['precision']],
    'val_precision': [float(x) for x in history.history['val_precision']],
    'recall': [float(x) for x in history.history['recall']],
    'val_recall': [float(x) for x in history.history['val_recall']]
}

with open('../reports/mlp1_history.json', 'w') as f:
    json.dump(history_dict, f, indent=4)

# Save the final metrics
metrics = {
    'model_name': 'MLP1_Minimalista',
    'architecture': 'Input(15) -> Dense(16, ReLU) -> Output(1, Sigmoid)',
    'total_params': 273,
    'trainable_params': 273,
    'hyperparameters': {
        'batch_size': BATCH_SIZE,
        'epochs_run': len(history.history['loss']),
        'learning_rate': LEARNING_RATE,
        'dropout': 0.2,
        'l2_regularization': 0.01
    },
    'val_metrics': {
        'accuracy': float(val_accuracy),
        'final_val_loss': float(history.history['val_loss'][-1]),
        'best_val_loss': float(min(history.history['val_loss'])),
        'best_val_accuracy': float(max(history.history['val_accuracy']))
    },
    'confusion_matrix': cm.tolist()
}

with open('../reports/mlp1_metrics.json', 'w') as f:
    json.dump(metrics, f, indent=4)

# Save the final model
model.save('../artifacts/mlp1_final_model.keras')

print("="*70)
print("💾 FILES SAVED")
print("="*70)
print("✅ Model: artifacts/mlp1_best_model.keras (best val_loss)")
print("✅ Model: artifacts/mlp1_final_model.keras (final)")
print("✅ History: reports/mlp1_history.json")
print("✅ Metrics: reports/mlp1_metrics.json")
print("✅ Figures: reports/figures/mlp1_*.png")
print("="*70)
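
The saved metrics file can later be read back to compare architectures. The sketch below is one possible way to do that, not part of the original workflow: `load_metrics` and `best_by_val_loss` are hypothetical helpers, the second file name and model name are invented, and the toy numbers are made up. Only the `model_name` and `val_metrics.best_val_loss` keys come from the metrics dict defined above.

```python
import json
import tempfile
from pathlib import Path


def load_metrics(paths):
    """Read each metrics JSON file into a dict."""
    return [json.loads(Path(p).read_text(encoding="utf-8")) for p in paths]


def best_by_val_loss(all_metrics):
    """Pick the entry with the lowest best_val_loss."""
    return min(all_metrics, key=lambda m: m["val_metrics"]["best_val_loss"])


# Toy files with made-up numbers, only to exercise the helpers
with tempfile.TemporaryDirectory() as tmp:
    toy_files = {
        "mlp1_metrics.json": {"model_name": "MLP1_Minimalista",
                              "val_metrics": {"best_val_loss": 0.45}},
        "mlp2_metrics.json": {"model_name": "MLP2_hypothetical",
                              "val_metrics": {"best_val_loss": 0.48}},
    }
    paths = []
    for name, content in toy_files.items():
        path = Path(tmp) / name
        path.write_text(json.dumps(content), encoding="utf-8")
        paths.append(path)
    best = best_by_val_loss(load_metrics(paths))

print(best["model_name"])
```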

## 🎉 FINAL SUMMARY - MLP 1

In [None]:
print("\n" + "="*70)
print("🎉 WEEK 2 COMPLETE - MINIMALIST MLP 1")
print("="*70)
print(f"""
✅ Architecture implemented: Input(15) → Dense(16) → Output(1)
✅ Total parameters: 273
✅ Samples/parameters ratio: {712/273:.2f}

📊 RESULTS:
   - Validation Accuracy: {val_accuracy*100:.2f}%
   - Best Val Loss: {min(history.history['val_loss']):.4f}
   - Best Val Accuracy: {max(history.history['val_accuracy'])*100:.2f}%
   - Epochs run: {len(history.history['loss'])}

📈 ANALYSIS:
   - Baseline established
   - Simple, interpretable model
   - Overfitting risk is low by design; confirm it with the learning curves

🎯 NEXT STEPS:
   - Week 3: Implement MLP 2 (Moderate) or Ensemble 3
   - Compare performance across architectures
   - Choose the best model for production
""")
print("="*70)