# 🧠 UFRN - Neural Regression Project

**Generalization Analysis of Neural Networks for Regression**

---

## 📋 Project Information

- **Institution:** UFRN - Department of Electrical Engineering
- **Author:** Cauã Vitor Figueredo Silva
- **Student ID:** 20220014216
- **Dataset:** Boston Housing (506 samples, 13 features)
- **Objective:** Implement a complete MLOps pipeline with K-Fold Cross-Validation


---

## 📦 1. IMPORTS AND REPRODUCIBILITY SETUP


In [1]:
# --- MAIN IMPORTS ---
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import requests
from io import StringIO
from typing import Tuple, List, Dict
import warnings

# PyTorch
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader

# Scikit-Learn
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error, r2_score, mean_absolute_error

warnings.filterwarnings('ignore')

print("✅ Imports completed successfully!")
print(f"PyTorch Version: {torch.__version__}")
print(f"Device: {'CUDA' if torch.cuda.is_available() else 'CPU'}")


✅ Imports completed successfully!
PyTorch Version: 2.9.1+cpu
Device: CPU


In [2]:
# --- REPRODUCIBILITY SETUP ---
# Fix the random seeds so that repeated runs draw the same random numbers

import random

SEED = 42

def set_seed(seed: int = 42):
    """Seed the Python, NumPy, and PyTorch RNGs for reproducibility"""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

set_seed(SEED)
print(f"🔒 Seed set: {SEED}")


🔒 Seed set: 42


In [3]:
# --- VISUALIZATION SETUP ---
sns.set_style("whitegrid")
plt.rcParams['figure.figsize'] = (10, 6)
plt.rcParams['font.size'] = 11
plt.rcParams['axes.titlesize'] = 14
plt.rcParams['axes.labelsize'] = 12

print("🎨 Visualization settings applied")


🎨 Visualization settings applied


---

## 📊 2. DATA LOADING (MODULE: src/dataset.py)


In [4]:
# --- BOSTON HOUSING DATASET LOADER ---

def load_boston_data(url: str = "http://lib.stat.cmu.edu/datasets/boston") -> pd.DataFrame:
    """
    Load the Boston Housing dataset directly from its original URL.
    Skips the free-text header and falls back to simulated data if the
    download or parsing fails.

    Args:
        url: URL of the original dataset

    Returns:
        DataFrame with 506 samples and 14 columns (13 features + 1 target)
    """
    try:
        print("🌐 Attempting to download the original dataset...")
        response = requests.get(url, timeout=10)
        response.raise_for_status()

        # Split the raw text into lines (the file starts with a free-text header)
        content = response.text
        lines = content.split('\n')

        # Find the first non-empty line that does not start with a letter
        data_start = 0
        for i, line in enumerate(lines):
            if line.strip() and not line.strip()[0].isalpha():
                data_start = i
                break

        # Collect numeric values; any line with a non-numeric token is skipped
        data_values = []
        for line in lines[data_start:]:
            if line.strip():
                values = line.split()
                if len(values) > 0:
                    try:
                        data_values.extend([float(v) for v in values])
                    except ValueError:
                        continue

        # Reshape the flat list into a matrix (506 x 14)
        n_features = 14
        data_array = np.array(data_values)
        n_samples = len(data_array) // n_features
        data_array = data_array[:n_samples * n_features].reshape(n_samples, n_features)

        # Build the DataFrame
        column_names = [
            'CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE',
            'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT', 'MEDV'
        ]
        df = pd.DataFrame(data_array, columns=column_names)

        print(f"✅ Dataset loaded successfully: {df.shape}")
        return df

    except Exception as e:
        print(f"⚠️ Error loading from URL: {e}")
        print("📦 Using backup (simulated) data...")

        # Fallback: simulated data. Every column, including MEDV, is drawn
        # independently, so the target carries no signal from the features.
        np.random.seed(42)
        n_samples = 506
        df = pd.DataFrame({
            'CRIM': np.random.exponential(3.6, n_samples),
            'ZN': np.random.uniform(0, 100, n_samples),
            'INDUS': np.random.uniform(0, 27, n_samples),
            'CHAS': np.random.binomial(1, 0.07, n_samples),
            'NOX': np.random.uniform(0.3, 0.9, n_samples),
            'RM': np.random.normal(6.3, 0.7, n_samples),
            'AGE': np.random.uniform(0, 100, n_samples),
            'DIS': np.random.exponential(3.8, n_samples),
            'RAD': np.random.choice([1, 2, 3, 4, 5, 6, 7, 8, 24], n_samples),
            'TAX': np.random.uniform(187, 711, n_samples),
            'PTRATIO': np.random.uniform(12, 22, n_samples),
            'B': np.random.uniform(0, 400, n_samples),
            'LSTAT': np.random.exponential(12, n_samples),
            'MEDV': np.random.exponential(22, n_samples)
        })

        print(f"✅ Backup dataset generated: {df.shape}")
        return df


# Load the data
df_boston = load_boston_data()
print("\n📈 First 5 rows:")
df_boston.head()


🌐 Attempting to download the original dataset...
⚠️ Error loading from URL: HTTPConnectionPool(host='lib.stat.cmu.edu', port=80): Max retries exceeded with url: /datasets/boston (Caused by NameResolutionError("<urllib3.connection.HTTPConnection object at 0x00000165873C83B0>: Failed to resolve 'lib.stat.cmu.edu' ([Errno 11001] getaddrinfo failed)"))
📦 Using backup (simulated) data...
✅ Backup dataset generated: (506, 14)

📈 First 5 rows:


|   | CRIM | ZN | INDUS | CHAS | NOX | RM | AGE | DIS | RAD | TAX | PTRATIO | B | LSTAT | MEDV |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.689365 | 91.092718 | 25.489602 | 0 | 0.551276 | 6.967188 | 5.617608 | 1.030274 | 3 | 215.176691 | 21.062335 | 210.178051 | 16.95359 | 4.85431 |
| 1 | 10.836437 | 82.253724 | 1.064524 | 0 | 0.456095 | 6.65916 | 49.109573 | 0.193372 | 1 | 537.74306 | 19.434352 | 43.796389 | 1.201145 | 12.678429 |
| 2 | 4.740284 | 94.979991 | 19.05053 | 0 | 0.738493 | 6.807567 | 92.711063 | 4.900193 | 5 | 308.190682 | 17.32827 | 81.141295 | 0.334259 | 11.561127 |
| 3 | 3.286593 | 72.571951 | 24.981705 | 1 | 0.888778 | 6.661325 | 10.539322 | 0.449636 | 1 | 314.63127 | 20.606678 | 399.327503 | 6.202645 | 12.324931 |
| 4 | 0.61065 | 61.34152 | 4.875534 | 0 | 0.453918 | 5.850963 | 76.444073 | 3.561774 | 1 | 347.368715 | 18.949184 | 216.545055 | 60.576631 | 19.378264 |
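
The log above shows that the download failed and the simulated fallback was used. In that fallback every column is sampled independently, so `MEDV` has no relationship to the features, and a held-out R² near zero (or below) is the expected outcome for any model. The NumPy sketch below illustrates the effect under simplifying assumptions: an ordinary least-squares fit stands in for the MLP, the features are drawn from a standard normal, and `holdout_r2` is a hypothetical helper, not part of the notebook.

```python
import numpy as np

def holdout_r2(X: np.ndarray, y: np.ndarray, n_train: int) -> float:
    """Fit least squares on the first n_train rows and return R^2 on the rest."""
    # Append an intercept column
    A = np.hstack([X, np.ones((len(X), 1))])
    A_train, A_test = A[:n_train], A[n_train:]
    y_train, y_test = y[:n_train], y[n_train:]
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    residual = y_test - A_test @ coef
    ss_res = float(np.sum(residual ** 2))
    ss_tot = float(np.sum((y_test - y_test.mean()) ** 2))
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(42)       # local generator; global NumPy state is untouched
X_sim = rng.normal(size=(506, 13))    # simplified stand-in for the 13 features
y_sim = rng.exponential(22.0, 506)    # target drawn independently of X_sim

r2_noise = holdout_r2(X_sim, y_sim, n_train=404)
print(f"Held-out R^2 with an independent target: {r2_noise:.3f}")
```

Metrics obtained on the fallback data therefore say nothing about performance on the real Boston Housing data.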


In [None]:
# --- QUICK EXPLORATORY ANALYSIS ---

print("📊 Descriptive Statistics:")
print(df_boston.describe())

print("\n🔍 Missing Value Check:")
print(df_boston.isnull().sum())

print("\n📐 Dataset Dimensions:")
print(f"Samples: {df_boston.shape[0]}")
print(f"Features: {df_boston.shape[1] - 1}")
print("Target: MEDV (Median Price)")


---

## 🔧 3. PYTORCH DATASET AND MODEL


In [None]:
# --- PYTORCH DATASET ---

class BostonDataset(Dataset):
    """PyTorch Dataset for Boston Housing"""
    
    def __init__(self, X: np.ndarray, y: np.ndarray):
        self.X = torch.FloatTensor(X)
        self.y = torch.FloatTensor(y)
    
    def __len__(self) -> int:
        return len(self.X)
    
    def __getitem__(self, idx: int) -> Tuple[torch.Tensor, torch.Tensor]:
        return self.X[idx], self.y[idx]


# --- MLP ARCHITECTURE ---

class MLP(nn.Module):
    """
    Multi-Layer Perceptron for Regression
    Architecture: Input (13) -> Hidden1 (64, ReLU) -> Hidden2 (32, ReLU) -> Output (1)
    """
    
    def __init__(self, input_dim: int = 13, hidden_dims: List[int] = [64, 32], output_dim: int = 1):
        super(MLP, self).__init__()
        
        layers = []
        prev_dim = input_dim
        for hidden_dim in hidden_dims:
            layers.append(nn.Linear(prev_dim, hidden_dim))
            layers.append(nn.ReLU())
            prev_dim = hidden_dim
        
        layers.append(nn.Linear(prev_dim, output_dim))
        self.network = nn.Sequential(*layers)
        self._initialize_weights()
    
    def _initialize_weights(self):
        """Xavier (Glorot) uniform initialization for weights, zeros for biases"""
        for module in self.modules():
            if isinstance(module, nn.Linear):
                nn.init.xavier_uniform_(module.weight)
                if module.bias is not None:
                    nn.init.zeros_(module.bias)
    
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.network(x)
    
    def count_parameters(self) -> int:
        return sum(p.numel() for p in self.parameters() if p.requires_grad)


test_model = MLP()
print("✅ MLP architecture defined")
print(f"📐 Trainable Parameters: {test_model.count_parameters():,}")
print("\n🏗️ Model Structure:")
print(test_model)
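
For the default layer sizes, the trainable-parameter count can be checked by hand: each `nn.Linear(in, out)` contributes `in * out` weights plus `out` biases, and `nn.ReLU` has no parameters. A small pure-Python sketch (the helper `count_mlp_params` is hypothetical, not part of the notebook):

```python
from typing import List

def count_mlp_params(layer_dims: List[int]) -> int:
    """Count weights and biases of a fully connected stack.

    layer_dims lists the width of every layer, input first, output last.
    """
    total = 0
    for fan_in, fan_out in zip(layer_dims[:-1], layer_dims[1:]):
        total += fan_in * fan_out  # weight matrix
        total += fan_out           # bias vector
    return total

n_params = count_mlp_params([13, 64, 32, 1])
print(n_params)  # 896 + 2080 + 33 = 3009
```

This should agree with `test_model.count_parameters()` for the default `MLP()`.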


---

## 🎯 4. TRAINING FUNCTIONS AND EARLY STOPPING


In [None]:
# --- TRAINING AND VALIDATION FUNCTIONS ---

def train_epoch(model: nn.Module, dataloader: DataLoader, criterion: nn.Module, 
                optimizer: torch.optim.Optimizer, device: torch.device) -> float:
    """Run one training epoch and return the mean batch loss"""
    model.train()
    total_loss = 0.0
    n_batches = 0
    
    for X_batch, y_batch in dataloader:
        X_batch = X_batch.to(device)
        y_batch = y_batch.to(device).unsqueeze(1)
        
        predictions = model(X_batch)
        loss = criterion(predictions, y_batch)
        
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
        total_loss += loss.item()
        n_batches += 1
    
    return total_loss / n_batches


def validate_epoch(model: nn.Module, dataloader: DataLoader, 
                   criterion: nn.Module, device: torch.device) -> float:
    """Run validation (no gradients) and return the mean batch loss"""
    model.eval()
    total_loss = 0.0
    n_batches = 0
    
    with torch.no_grad():
        for X_batch, y_batch in dataloader:
            X_batch = X_batch.to(device)
            y_batch = y_batch.to(device).unsqueeze(1)
            
            predictions = model(X_batch)
            loss = criterion(predictions, y_batch)
            
            total_loss += loss.item()
            n_batches += 1
    
    return total_loss / n_batches


def get_predictions(model: nn.Module, dataloader: DataLoader, 
                    device: torch.device) -> Tuple[np.ndarray, np.ndarray]:
    """Return (y_true, y_pred) arrays for every sample in the dataloader"""
    model.eval()
    y_true_list = []
    y_pred_list = []
    
    with torch.no_grad():
        for X_batch, y_batch in dataloader:
            X_batch = X_batch.to(device)
            predictions = model(X_batch)
            
            y_true_list.append(y_batch.numpy())
            y_pred_list.append(predictions.cpu().numpy())
    
    y_true = np.concatenate(y_true_list)
    y_pred = np.concatenate(y_pred_list).flatten()
    
    return y_true, y_pred


# --- EARLY STOPPING ---

class EarlyStopping:
    """Early stopping: halt training once val_loss has not improved for `patience` epochs"""
    
    def __init__(self, patience: int = 20, min_delta: float = 0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.counter = 0
        self.best_loss = None
        self.early_stop = False
    
    def __call__(self, val_loss: float) -> bool:
        if self.best_loss is None:
            self.best_loss = val_loss
        elif val_loss > self.best_loss - self.min_delta:
            self.counter += 1
            if self.counter >= self.patience:
                self.early_stop = True
        else:
            self.best_loss = val_loss
            self.counter = 0
        
        return self.early_stop


print("✅ Training, validation, and Early Stopping functions defined")
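
To make the stopping rule concrete: an epoch counts as an improvement only if `val_loss <= best_loss - min_delta`; otherwise the counter increases, and training stops once the counter reaches `patience`. The pure-Python sketch below replays that rule over a made-up loss sequence (the helper `first_stop_epoch` and the numbers are illustrative only):

```python
from typing import List, Optional

def first_stop_epoch(val_losses: List[float], patience: int,
                     min_delta: float = 0.0) -> Optional[int]:
    """Return the 1-based epoch at which the rule would stop, or None."""
    best = None
    counter = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if best is None:
            best = loss                  # the first epoch sets the baseline
        elif loss > best - min_delta:
            counter += 1                 # not a sufficient improvement
            if counter >= patience:
                return epoch
        else:
            best = loss                  # improvement: reset the counter
            counter = 0
    return None

losses = [1.00, 0.80, 0.70, 0.71, 0.72, 0.73]
print(first_stop_epoch(losses, patience=3))  # 6
print(first_stop_epoch(losses, patience=5))  # None: the sequence ends first
```

With `patience=20`, as configured in this notebook, a fold always trains at least 20 epochs past its best validation loss before stopping.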


---

## 📈 5. VISUALIZATION FUNCTIONS


In [None]:
# --- VISUALIZATION FUNCTIONS ---

def plot_learning_curves(train_losses: List[float], val_losses: List[float], save_path: str = None):
    """Plot learning curves (train vs. validation loss)"""
    epochs = range(1, len(train_losses) + 1)
    
    plt.figure(figsize=(10, 6))
    plt.plot(epochs, train_losses, 'b-o', label='Train Loss', linewidth=2, markersize=4)
    plt.plot(epochs, val_losses, 'r-s', label='Validation Loss', linewidth=2, markersize=4)
    
    plt.xlabel('Epoch', fontsize=12, fontweight='bold')
    plt.ylabel('MSE Loss', fontsize=12, fontweight='bold')
    plt.title('Learning Curves - Train vs Validation', fontsize=14, fontweight='bold')
    plt.legend(loc='best', fontsize=11)
    plt.grid(True, alpha=0.3)
    plt.tight_layout()
    
    if save_path:
        plt.savefig(save_path, dpi=300, bbox_inches='tight')
    plt.show()


def plot_predictions(y_true: np.ndarray, y_pred: np.ndarray, save_path: str = None):
    """Scatter plot of actual vs. predicted values"""
    plt.figure(figsize=(8, 8))
    
    plt.scatter(y_true, y_pred, alpha=0.6, edgecolors='k', linewidth=0.5, s=50)
    
    min_val = min(y_true.min(), y_pred.min())
    max_val = max(y_true.max(), y_pred.max())
    plt.plot([min_val, max_val], [min_val, max_val], 'r--', linewidth=2, label='Ideal Prediction (y=x)')
    
    r2 = r2_score(y_true, y_pred)
    
    plt.xlabel('Actual Value (MEDV)', fontsize=12, fontweight='bold')
    plt.ylabel('Predicted Value (MEDV)', fontsize=12, fontweight='bold')
    plt.title(f'Predictions vs Actual Values (R² = {r2:.3f})', fontsize=14, fontweight='bold')
    plt.legend(loc='best', fontsize=11)
    plt.grid(True, alpha=0.3)
    plt.tight_layout()
    
    if save_path:
        plt.savefig(save_path, dpi=300, bbox_inches='tight')
    plt.show()


def plot_kfold_results(fold_results: List[float], save_path: str = None):
    """Plot the K-Fold Cross-Validation results"""
    folds = range(1, len(fold_results) + 1)
    mean_mse = np.mean(fold_results)
    std_mse = np.std(fold_results)
    
    plt.figure(figsize=(10, 6))
    plt.bar(folds, fold_results, alpha=0.7, color='steelblue', edgecolor='black')
    plt.axhline(y=mean_mse, color='r', linestyle='--', linewidth=2, label=f'Mean: {mean_mse:.2f}')
    plt.fill_between([0.5, len(folds) + 0.5], mean_mse - std_mse, mean_mse + std_mse, 
                     alpha=0.2, color='red', label=f'±1 Std Dev: {std_mse:.2f}')
    
    plt.xlabel('Fold', fontsize=12, fontweight='bold')
    plt.ylabel('MSE', fontsize=12, fontweight='bold')
    plt.title('K-Fold Cross-Validation Results', fontsize=14, fontweight='bold')
    plt.legend(loc='best', fontsize=11)
    plt.xticks(folds)
    plt.grid(True, alpha=0.3, axis='y')
    plt.tight_layout()
    
    if save_path:
        plt.savefig(save_path, dpi=300, bbox_inches='tight')
    plt.show()


print("✅ Visualization functions defined")


---

## ⚙️ 6. HYPERPARAMETER CONFIGURATION


In [None]:
# --- HYPERPARAMETERS ---

CONFIG = {
    'seed': 42,
    'k_folds': 5,
    'batch_size': 16,
    'learning_rate': 0.001,
    'max_epochs': 500,
    'patience': 20,
    'hidden_dims': [64, 32],
    'device': torch.device('cuda' if torch.cuda.is_available() else 'cpu')
}

print("⚙️ Experiment Configuration:")
for key, value in CONFIG.items():
    print(f"  {key}: {value}")
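
With these settings, the amount of work per fold follows from simple arithmetic. scikit-learn's `KFold` gives the first `n_samples % n_splits` folds one extra validation sample, and a `DataLoader` without `drop_last` yields `ceil(n_train / batch_size)` batches per epoch. A pure-Python sketch for 506 samples (the helper `fold_budget` is hypothetical):

```python
import math
from typing import List, Tuple

def fold_budget(n_samples: int, k_folds: int, batch_size: int) -> List[Tuple[int, int, int]]:
    """Return (n_train, n_val, batches_per_epoch) for each fold."""
    base, extra = divmod(n_samples, k_folds)
    rows = []
    for fold in range(k_folds):
        # The first `extra` folds receive one additional validation sample
        n_val = base + 1 if fold < extra else base
        n_train = n_samples - n_val
        rows.append((n_train, n_val, math.ceil(n_train / batch_size)))
    return rows

budget = fold_budget(n_samples=506, k_folds=5, batch_size=16)
for fold, (n_train, n_val, n_batches) in enumerate(budget, start=1):
    print(f"Fold {fold}: train={n_train}, val={n_val}, batches/epoch={n_batches}")
```

Each fold therefore runs 26 optimizer steps per epoch, for at most 500 × 26 = 13,000 steps if early stopping never triggers.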


---

## 🔄 7. K-FOLD CROSS-VALIDATION PIPELINE


In [None]:
# --- DATA PREPARATION ---

# Split features and target
X = df_boston.drop('MEDV', axis=1).values
y = df_boston['MEDV'].values

print("📊 Data Shapes:")
print(f"  X (features): {X.shape}")
print(f"  y (target): {y.shape}")


In [None]:
# --- FULL K-FOLD CROSS-VALIDATION ---

kfold = KFold(n_splits=CONFIG['k_folds'], shuffle=True, random_state=CONFIG['seed'])

fold_results = []
fold_histories = []
best_fold_idx = None
best_fold_loss = float('inf')

print("\n🔄 Starting K-Fold Cross-Validation...\n")
print("=" * 80)

for fold_idx, (train_idx, val_idx) in enumerate(kfold.split(X), 1):
    print(f"\n📂 FOLD {fold_idx}/{CONFIG['k_folds']}")
    print("-" * 80)
    
    # Split the data
    X_train, X_val = X[train_idx], X[val_idx]
    y_train, y_val = y[train_idx], y[val_idx]
    
    print(f"  Train: {len(X_train)} samples | Validation: {len(X_val)} samples")
    
    # CRITICAL: fit the scaler on the training fold only to avoid data leakage
    scaler = StandardScaler()
    X_train_scaled = scaler.fit_transform(X_train)
    X_val_scaled = scaler.transform(X_val)
    
    # Create Datasets and DataLoaders
    train_dataset = BostonDataset(X_train_scaled, y_train)
    val_dataset = BostonDataset(X_val_scaled, y_val)
    
    train_loader = DataLoader(train_dataset, batch_size=CONFIG['batch_size'], shuffle=True)
    val_loader = DataLoader(val_dataset, batch_size=CONFIG['batch_size'], shuffle=False)
    
    # Instantiate the model (a fresh one for each fold)
    model = MLP(input_dim=X.shape[1], hidden_dims=CONFIG['hidden_dims'], output_dim=1).to(CONFIG['device'])
    
    # Loss and optimizer
    criterion = nn.MSELoss()
    optimizer = optim.Adam(model.parameters(), lr=CONFIG['learning_rate'])
    
    # Early Stopping
    early_stopping = EarlyStopping(patience=CONFIG['patience'])
    
    # History
    train_losses = []
    val_losses = []
    best_val_loss = float('inf')
    best_model_state = None
    
    # Training loop
    for epoch in range(1, CONFIG['max_epochs'] + 1):
        train_loss = train_epoch(model, train_loader, criterion, optimizer, CONFIG['device'])
        val_loss = validate_epoch(model, val_loader, criterion, CONFIG['device'])
        
        train_losses.append(train_loss)
        val_losses.append(val_loss)
        
        # Model checkpointing: clone the tensors so that later optimizer
        # steps do not overwrite the saved snapshot (dict.copy() is shallow)
        if val_loss < best_val_loss:
            best_val_loss = val_loss
            best_model_state = {k: v.detach().clone() for k, v in model.state_dict().items()}
        
        # Early Stopping
        if early_stopping(val_loss):
            print(f"\n  ⏸️ Early stopping at epoch {epoch}")
            break
        
        # Progress log
        if epoch % 50 == 0 or epoch == 1:
            print(f"  Epoch {epoch:3d} | Train Loss: {train_loss:.4f} | Val Loss: {val_loss:.4f}")
    
    # Restore the best model
    model.load_state_dict(best_model_state)
    
    # Final evaluation
    y_true, y_pred = get_predictions(model, val_loader, CONFIG['device'])
    final_mse = mean_squared_error(y_true, y_pred)
    final_mae = mean_absolute_error(y_true, y_pred)
    final_r2 = r2_score(y_true, y_pred)
    
    print(f"\n  ✅ Fold {fold_idx} Finished:")
    print(f"     MSE: {final_mse:.4f}")
    print(f"     MAE: {final_mae:.4f}")
    print(f"     R²:  {final_r2:.4f}")
    
    # Store the results
    fold_results.append(final_mse)
    fold_histories.append({
        'train_losses': train_losses,
        'val_losses': val_losses,
        'y_true': y_true,
        'y_pred': y_pred,
        'mse': final_mse,
        'mae': final_mae,
        'r2': final_r2
    })
    
    # Track the best fold
    if final_mse < best_fold_loss:
        best_fold_loss = final_mse
        best_fold_idx = fold_idx
    
    print("-" * 80)

print("\n" + "=" * 80)
print("✅ K-Fold Cross-Validation Complete!")
print("=" * 80)
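
One detail of the checkpointing step deserves care: `model.state_dict()` returns references to the live parameter tensors, and optimizers update those tensors in place, so a shallow `dict.copy()` snapshot keeps changing as training continues. The effect can be reproduced without PyTorch. In the sketch below, plain Python lists stand in for tensors (an analogy, not the PyTorch API):

```python
import copy

# A stand-in for a state dict: parameter name -> mutable container
params = {"layer.weight": [0.5, -0.2], "layer.bias": [0.0]}

shallow_snapshot = params.copy()        # new dict, same inner lists
deep_snapshot = copy.deepcopy(params)   # new dict, new inner lists

# Simulate an in-place optimizer update after the snapshot was taken
params["layer.weight"][0] = 9.0

print(shallow_snapshot["layer.weight"][0])  # 9.0: the snapshot changed too
print(deep_snapshot["layer.weight"][0])     # 0.5: the snapshot is preserved
```

For a real model, `copy.deepcopy(model.state_dict())` or cloning each tensor gives an independent snapshot.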


---

## 📊 8. AGGREGATED RESULTS


In [None]:
# --- FINAL STATISTICS ---

mean_mse = np.mean(fold_results)
std_mse = np.std(fold_results)

print("\n📊 FINAL K-FOLD CROSS-VALIDATION RESULTS")
print("=" * 80)
print("\n  MSE per Fold:")
for i, mse in enumerate(fold_results, 1):
    marker = " ⭐" if i == best_fold_idx else ""
    print(f"    Fold {i}: {mse:.4f}{marker}")

print(f"\n  {'='*40}")
print(f"  📈 Mean MSE:        {mean_mse:.4f}")
print(f"  📉 Std Deviation:   {std_mse:.4f}")
print(f"  🏆 Best Fold:       {best_fold_idx} (MSE: {best_fold_loss:.4f})")
print(f"  {'='*40}")

# Build the results DataFrame
results_df = pd.DataFrame({
    'Fold': range(1, CONFIG['k_folds'] + 1),
    'MSE': fold_results,
    'MAE': [h['mae'] for h in fold_histories],
    'R²': [h['r2'] for h in fold_histories]
})

print("\n📋 Results Table:")
print(results_df.to_string(index=False))
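
Note that `np.std` defaults to the population formula (`ddof=0`). With only five folds, the sample standard deviation (`ddof=1`) is noticeably larger, so it helps to state which one is reported. A standard-library sketch with made-up fold values (not results from this notebook):

```python
import statistics

# Hypothetical per-fold MSE values, for illustration only
example_mse = [20.0, 22.0, 19.0, 25.0, 24.0]

mean_mse_example = statistics.fmean(example_mse)
population_std = statistics.pstdev(example_mse)  # divides by n   (like np.std's default)
sample_std = statistics.stdev(example_mse)       # divides by n-1 (like np.std(..., ddof=1))

print(f"mean={mean_mse_example:.2f}")
print(f"population std={population_std:.3f}, sample std={sample_std:.3f}")
```

For five values, the sample estimate is larger by a factor of sqrt(5/4), about 12%.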


---

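## 📈 9. VISUALIZATIONS


In [None]:
# --- PLOT 1: K-FOLD RESULTS ---

# Create the output directory first; plt.savefig does not create missing folders
import os
os.makedirs('../reports/figures', exist_ok=True)

plot_kfold_results(fold_results, save_path='../reports/figures/kfold_results.png')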


In [None]:
# --- PLOT 2: LEARNING CURVES (BEST FOLD) ---

best_fold_history = fold_histories[best_fold_idx - 1]

print(f"\n📊 Showing learning curves for the Best Fold ({best_fold_idx}):\n")

plot_learning_curves(
    best_fold_history['train_losses'],
    best_fold_history['val_losses'],
    save_path='../reports/figures/learning_curves.png'
)


In [None]:
# --- PLOT 3: PREDICTIONS VS ACTUAL VALUES (BEST FOLD) ---

print(f"\n📊 Showing predictions vs actual values for the Best Fold ({best_fold_idx}):\n")

plot_predictions(
    best_fold_history['y_true'],
    best_fold_history['y_pred'],
    save_path='../reports/figures/predictions_scatter.png'
)


---

## 🔍 10. GENERALIZATION ANALYSIS


### 🧪 Evaluation Criteria

To determine whether the model shows **Good Generalization**, **Overfitting**, or **Underfitting**, we analyze:

1. **Train-Validation Gap**: Difference between the training and validation curves
2. **Convergence**: Whether the curves stabilize or diverge
3. **Absolute MSE**: Magnitude of the validation error
4. **R²**: Goodness of fit
5. **Scatter Plot**: Spread of the predictions around the identity line
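
For reference, the quantities behind these criteria can be written out as follows, where $y_i$ are the validation targets, $\hat{y}_i$ the predictions, $\bar{y}$ the mean of the validation targets, and $L_{\text{train}}$, $L_{\text{val}}$ the training and validation MSE losses. The relative gap is the diagnostic computed in the analysis cell that follows. The notation is introduced here for illustration and does not appear in the original notebook.

```latex
\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2,
\qquad
R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2},
\qquad
\text{gap}_{\text{rel}} = \frac{L_{\text{val}} - L_{\text{train}}}{L_{\text{train}}}
```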


In [None]:
# --- AUTOMATED ANALYSIS ---

best_fold = fold_histories[best_fold_idx - 1]

# Compute diagnostic metrics from the last recorded epoch
final_train_loss = best_fold['train_losses'][-1]
final_val_loss = best_fold['val_losses'][-1]
gap = final_val_loss - final_train_loss
gap_ratio = gap / final_train_loss if final_train_loss > 0 else 0

print("\n🔍 GENERALIZATION ANALYSIS")
print("=" * 80)
print("\n  📊 Diagnostic Metrics:")
print(f"     Train Loss (final):      {final_train_loss:.4f}")
print(f"     Validation Loss (final): {final_val_loss:.4f}")
print(f"     Absolute Gap:            {gap:.4f}")
print(f"     Relative Gap:            {gap_ratio*100:.2f}%")
print(f"     R² (validation):         {best_fold['r2']:.4f}")

# Classification
print("\n  🏷️ Classification:")

if gap_ratio > 0.5:  # Gap > 50%
    print("     ⚠️ OVERFITTING DETECTED")
    print("     - The model has likely memorized the training data")
    print("     - Suggestion: add regularization (Dropout, L2)")
elif best_fold['r2'] < 0.6:  # Low R²
    print("     ⚠️ UNDERFITTING DETECTED")
    print("     - The model is too simple to capture the patterns, or the features carry little signal")
    print("     - Suggestion: increase network capacity (more layers/neurons)")
else:
    print("     ✅ GOOD GENERALIZATION")
    print("     - The gap between training and validation is acceptable")
    print("     - R² indicates a good fit to the data")
    print("     - The model balances bias and variance")

print("\n" + "=" * 80)
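
The decision rule above can be packaged as a small pure function, which makes the thresholds easy to test. This is a sketch: the function name is hypothetical, and the keyword defaults (0.5 for the relative gap, 0.6 for R²) simply mirror the constants in the cell above.

```python
def classify_generalization(train_loss: float, val_loss: float, r2: float,
                            max_gap_ratio: float = 0.5, min_r2: float = 0.6) -> str:
    """Label a fold as 'overfitting', 'underfitting', or 'good'."""
    # Relative gap between validation and training loss (0 if train_loss is 0)
    gap_ratio = (val_loss - train_loss) / train_loss if train_loss > 0 else 0.0
    if gap_ratio > max_gap_ratio:
        return "overfitting"
    if r2 < min_r2:
        return "underfitting"
    return "good"

print(classify_generalization(10.0, 16.0, 0.80))  # overfitting (relative gap of 60%)
print(classify_generalization(10.0, 11.0, 0.30))  # underfitting (R² below 0.6)
print(classify_generalization(10.0, 11.0, 0.80))  # good
```

The gap check runs first, so a fold with both a large gap and a low R² is labeled as overfitting.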


---

## 💾 11. SAVING THE BEST MODEL


In [None]:
# --- SAVE CHECKPOINT ---

print(f"\n💾 Saving checkpoint for the Best Fold ({best_fold_idx})...")

checkpoint_path = '../models/best_model_fold.pth'

# Create the directory if it does not exist
import os
os.makedirs('../models', exist_ok=True)

# Checkpoint contents: metrics, configuration, and architecture only.
# No model weights (state_dict) are stored here, because the fold loop
# does not keep each fold's trained parameters.
checkpoint = {
    'fold': best_fold_idx,
    'mse': best_fold_loss,
    'r2': best_fold['r2'],
    'config': CONFIG,
    'architecture': {
        'input_dim': X.shape[1],
        'hidden_dims': CONFIG['hidden_dims'],
        'output_dim': 1
    }
}

torch.save(checkpoint, checkpoint_path)
print(f"✅ Checkpoint saved to: {checkpoint_path}")
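
A checkpoint that can be reloaded for inference usually also carries the model weights. The sketch below shows one common pattern with `torch.save` and `torch.load`. The small `nn.Sequential` network, the file name, and the dictionary keys are placeholders chosen for this example, not the notebook's actual objects.

```python
import os
import tempfile

import torch
import torch.nn as nn

def build_net() -> nn.Sequential:
    """Placeholder network with the same layer sizes as the default MLP."""
    return nn.Sequential(
        nn.Linear(13, 64), nn.ReLU(),
        nn.Linear(64, 32), nn.ReLU(),
        nn.Linear(32, 1),
    )

torch.manual_seed(0)
net = build_net()

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example_checkpoint.pth")
    # Store the weights alongside plain-Python metadata
    torch.save({"model_state_dict": net.state_dict(), "fold": 1}, path)

    # weights_only=True restricts unpickling to tensors and basic types
    loaded = torch.load(path, map_location="cpu", weights_only=True)

# Rebuild the architecture, then load the saved weights into it
restored = build_net()
restored.load_state_dict(loaded["model_state_dict"])
restored.eval()

x = torch.randn(4, 13)
with torch.no_grad():
    same = torch.equal(net(x), restored(x))
print(same)
```

Applying this to the notebook would require keeping the best fold's state dict during the cross-validation loop, which the current loop does not do.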


---

## 📝 12. CONCLUSION


### ✅ Executive Summary

This notebook implements a **complete MLOps pipeline** for neural regression, following common best practices:

#### 🎯 Objectives Achieved

1. **Robust Data Loading**
   - Direct download from the original Boston Housing dataset URL
   - Handling of the free-text header
   - Fallback to simulated data (the recorded run above used this fallback because the download failed)

2. **Data Leakage Prevention**
   - StandardScaler fitted **on the training set only**
   - Validation set transformed with the training statistics
   - Fair evaluation as a result

3. **K-Fold Cross-Validation (K=5)**
   - Robust estimate of the generalization error
   - Lower variance of the metric than a single train/validation split
   - Well suited to small data (506 samples)

4. **Regularization Techniques**
   - **Early Stopping**: Automatic stop when val_loss stagnates
   - **Model Checkpointing**: Keeps the best model (lowest val_loss)
   - Overfitting prevention

5. **Reproducibility**
   - Fixed seed (42) for the NumPy and PyTorch random number generators
   - Repeatable results on the same hardware and library versions
   - Replicable experiments

6. **Professional Visualizations**
   - Learning curves (Train vs Validation)
   - Scatter plot (Actual vs Predicted)
   - K-Fold results

---

### 📚 Key Takeaways

- **Generalization Is Measurable**: K-Fold CV gives a reliable estimate of performance on unseen data
- **Data Leakage Is Critical**: Normalizing the whole dataset before splitting artificially inflates performance
- **Early Stopping Works**: It acts as implicit regularization and limits overfitting without changing the model or the loss; its only settings are `patience` and `min_delta`
- **Visualization Is Essential**: Plots reveal patterns that numeric metrics do not capture
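
The leakage point can be made concrete with a few lines of NumPy. The sketch below standardizes with statistics from the training split only, which is what `StandardScaler.fit_transform` followed by `transform` does in the fold loop. The helper name and the toy numbers are made up for illustration.

```python
import numpy as np

def standardize_train_only(X_train: np.ndarray, X_val: np.ndarray):
    """Scale both splits using the training mean and (population) std."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)             # ddof=0, as StandardScaler uses
    std = np.where(std == 0.0, 1.0, std)  # avoid division by zero for constant columns
    return (X_train - mean) / std, (X_val - mean) / std

X_train_toy = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_val_toy = np.array([[10.0, 100.0]])     # an outlier the scaler must not see

train_scaled, val_scaled = standardize_train_only(X_train_toy, X_val_toy)

# Leaky variant for comparison: statistics computed on train + validation
X_all = np.vstack([X_train_toy, X_val_toy])
leaky_val = (X_val_toy - X_all.mean(axis=0)) / X_all.std(axis=0)

print(train_scaled.mean(axis=0))  # approximately [0, 0]
print(val_scaled)                 # far outside the training range
print(leaky_val)                  # pulled toward zero by the leaked statistics
```

In the leaky variant, the validation point looks far less extreme than it really is, which is how leaked statistics make validation data appear closer to the training distribution.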

---

### 🚀 Next Steps

1. **Explicit Regularization**: Test Dropout (p=0.3) and L2 (weight_decay)
2. **Deeper Architectures**: Try 3-4 hidden layers
3. **Hyperparameter Optimization**: Grid Search or Bayesian Optimization (Optuna)
4. **Baseline Comparison**: Linear Regression, Random Forest, XGBoost
5. **Interpretability**: SHAP value analysis
6. **Deployment**: Build a REST API with FastAPI

---

### 📄 References for the LaTeX Report

The results of this notebook should be inserted into the file `reports/relatorio_final.tex`:

- **K-Fold Results Table**: Replace the placeholders with the values from `results_df`
- **Images**: Use the files saved in `reports/figures/`
  - `learning_curves.png`
  - `predictions_scatter.png`
  - `kfold_results.png`

---

**🎉 Project Completed Successfully!**
