# Anomaly Detection in Credit Card Transactions
## Machine Learning Project - Topic 5: Anomaly Detection (Fraud)

### Introduction

**Problem Context:**
Detecting fraud in credit card transactions is a critical problem in the financial sector. As digital transactions grow, losses caused by fraud can reach billions of dollars per year. The main challenge is the **extreme class imbalance** of the data: fraudulent transactions make up a tiny fraction (< 0.2%) of the total.

**Practical Relevance:**
- **Financial Impact:** Undetected fraud causes significant losses for financial institutions and customers
- **User Experience:** False positives block legitimate transactions and frustrate customers
- **Real-Time Requirements:** Systems must decide within milliseconds whether a transaction is fraudulent

**Objectives:**
1. Perform exploratory analysis and preprocessing of the transaction data
2. Implement and compare 3 anomaly detection algorithms:
   - **Isolation Forest** (randomized tree-ensemble model)
   - **Local Outlier Factor - LOF** (density-based model)
   - **Autoencoder** (Deep Learning model)
3. Evaluate the models with metrics suited to imbalanced classes
4. Run statistical significance tests
5. Discuss the real-world applicability of the solution

**Dataset:** [Credit Card Fraud Detection - Kaggle](https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud)

## 1. Importing the Required Libraries

In [None]:

import numpy as np
import pandas as pd
import warnings
warnings.filterwarnings('ignore')
import matplotlib.pyplot as plt
import seaborn as sns
plt.style.use('seaborn-v0_8-whitegrid')
from sklearn.preprocessing import StandardScaler, RobustScaler
from sklearn.model_selection import train_test_split, StratifiedKFold
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.layers import Dense, Input, Dropout, BatchNormalization
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
from sklearn.metrics import (
    classification_report, confusion_matrix, roc_auc_score,
    precision_score, recall_score, f1_score, roc_curve,
    precision_recall_curve, average_precision_score, accuracy_score
)
from scipy import stats
from scipy.stats import wilcoxon, ttest_rel
import time
from collections import Counter

np.random.seed(42)
tf.random.set_seed(42)

print(f"TensorFlow version: {tf.__version__}")

## 2. Loading and Initial Exploration of the Data

**Dataset Description:**
The dataset contains credit card transactions made by European cardholders in September 2013.
- **284,807 transactions** over 2 days
- **492 frauds** (0.172% of the total)
- Features V1-V28 are the result of a PCA transformation (applied for confidentiality)
- The 'Time' and 'Amount' features were not transformed
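
The class counts quoted above already show why plain accuracy is a poor yardstick for this problem. As a minimal sketch, using only those two figures (the variable names are illustrative), the snippet below scores a trivial detector that labels every transaction as normal:

```python
# Figures quoted in the dataset description.
n_total = 284_807
n_fraud = 492

# A trivial "detector" that never flags anything as fraud.
true_negatives = n_total - n_fraud   # every normal transaction is labeled correctly
false_negatives = n_fraud            # every fraud is missed

accuracy = true_negatives / n_total
recall = 0 / n_fraud                 # no fraud is ever caught

print(f"Accuracy: {accuracy:.4%}")
print(f"Recall:   {recall:.4%}")
```

Such a detector reaches an accuracy above 99.8% while catching no fraud at all, which is why metrics such as recall, F1, and the area under the precision-recall curve carry more weight in the rest of the analysis.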

In [None]:

# Link: https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud

df = pd.read_csv('creditcard.csv')

print("GENERAL DATASET INFORMATION")
print(f"\nDataset shape: {df.shape}")
print(f"   - Total transactions: {df.shape[0]:,}")
print(f"   - Total columns: {df.shape[1]}")
print("\nData types:")
print(df.dtypes.value_counts())
print("\nFirst 5 rows:")
df.head()

In [None]:
print("DESCRIPTIVE STATISTICS")
df.describe()

In [None]:
print("DATA QUALITY")

missing_values = df.isnull().sum()
print("\nMissing values:")
if missing_values.sum() == 0:
    print("   There are no missing values in the dataset!")
else:
    print(missing_values[missing_values > 0])

duplicates = df.duplicated().sum()
print(f"\nDuplicated rows: {duplicates:,}")
if duplicates > 0:
    print(f"   {duplicates} duplicated rows found")
    # Remove duplicated rows
    df = df.drop_duplicates()
    print(f"   Duplicates removed. New shape: {df.shape}")
else:
    print("   There are no duplicated rows!")

print("\nDistribution of the target class (Class):")
print(df['Class'].value_counts())
print(f"\n   Fraud proportion: {df['Class'].mean()*100:.4f}%")

## 3. Exploratory Data Analysis (EDA)

### 3.1 Class Imbalance Analysis

In [None]:
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

class_counts = df['Class'].value_counts()
colors = ['#2ecc71', '#e74c3c']
bars = axes[0].bar(['Normal (0)', 'Fraud (1)'], class_counts.values, color=colors, edgecolor='black')
axes[0].set_title('Class Distribution', fontsize=14, fontweight='bold')
axes[0].set_ylabel('Number of Transactions', fontsize=12)
axes[0].set_xlabel('Class', fontsize=12)

# Annotate each bar with its count
for bar, count in zip(bars, class_counts.values):
    axes[0].text(bar.get_x() + bar.get_width()/2, bar.get_height() + 1000, 
                 f'{count:,}', ha='center', va='bottom', fontsize=11, fontweight='bold')

explode = (0, 0.1)
axes[1].pie(class_counts.values, explode=explode, labels=['Normal', 'Fraud'], 
            colors=colors, autopct='%1.3f%%', shadow=True, startangle=90,
            textprops={'fontsize': 12})
axes[1].set_title('Class Proportions', fontsize=14, fontweight='bold')

plt.tight_layout()
plt.savefig('class_distribution.png', dpi=150, bbox_inches='tight')
plt.show()

print("CLASS IMBALANCE ANALYSIS")
fraud_count = df[df['Class'] == 1].shape[0]
normal_count = df[df['Class'] == 0].shape[0]
print(f"\nNormal transactions: {normal_count:,} ({normal_count/len(df)*100:.3f}%)")
print(f"Fraudulent transactions: {fraud_count:,} ({fraud_count/len(df)*100:.3f}%)")
print(f"Imbalance ratio: 1:{normal_count//fraud_count}")

### 3.2 Analysis of the Time and Amount Features

In [None]:
fig, axes = plt.subplots(2, 2, figsize=(16, 12))

fraud_df = df[df['Class'] == 1]
normal_df = df[df['Class'] == 0]

axes[0, 0].hist(normal_df['Time'], bins=50, alpha=0.7, label='Normal', color='#2ecc71', density=True)
axes[0, 0].hist(fraud_df['Time'], bins=50, alpha=0.7, label='Fraud', color='#e74c3c', density=True)
axes[0, 0].set_title('Distribution of Time by Class', fontsize=14, fontweight='bold')
axes[0, 0].set_xlabel('Time (seconds)')
axes[0, 0].set_ylabel('Density')
axes[0, 0].legend()

axes[0, 1].hist(normal_df['Amount'], bins=50, alpha=0.7, label='Normal', color='#2ecc71', density=True)
axes[0, 1].hist(fraud_df['Amount'], bins=50, alpha=0.7, label='Fraud', color='#e74c3c', density=True)
axes[0, 1].set_title('Distribution of Amount by Class', fontsize=14, fontweight='bold')
axes[0, 1].set_xlabel('Amount ($)')
axes[0, 1].set_ylabel('Density')
axes[0, 1].legend()
axes[0, 1].set_xlim(0, 2000)

df.boxplot(column='Amount', by='Class', ax=axes[1, 0])
axes[1, 0].set_title('Boxplot of Amount by Class', fontsize=14, fontweight='bold')
axes[1, 0].set_xlabel('Class')
axes[1, 0].set_ylabel('Amount ($)')
axes[1, 0].set_ylim(0, 500)
plt.suptitle('')

# 'Time' counts seconds since the first transaction in the dataset, so 'Hour' is the
# hour of day relative to that first transaction, not necessarily the wall-clock hour.
df['Hour'] = (df['Time'] / 3600) % 24
fraud_hours = df[df['Class'] == 1]['Hour']
normal_hours = df[df['Class'] == 0]['Hour']

axes[1, 1].hist(normal_hours, bins=24, alpha=0.7, label='Normal', color='#2ecc71', density=True)
axes[1, 1].hist(fraud_hours, bins=24, alpha=0.7, label='Fraud', color='#e74c3c', density=True)
axes[1, 1].set_title('Distribution by Hour of Day', fontsize=14, fontweight='bold')
axes[1, 1].set_xlabel('Hour of Day')
axes[1, 1].set_ylabel('Density')
axes[1, 1].legend()

plt.tight_layout()
plt.savefig('time_amount_analysis.png', dpi=150, bbox_inches='tight')
plt.show()

print("TIME AND AMOUNT STATISTICS")
print("\nAmount - Normal transactions:")
print(f"   Mean: ${normal_df['Amount'].mean():.2f}")
print(f"   Median: ${normal_df['Amount'].median():.2f}")
print(f"   Standard deviation: ${normal_df['Amount'].std():.2f}")
print(f"   Maximum: ${normal_df['Amount'].max():.2f}")

print("\nAmount - Fraudulent transactions:")
print(f"   Mean: ${fraud_df['Amount'].mean():.2f}")
print(f"   Median: ${fraud_df['Amount'].median():.2f}")
print(f"   Standard deviation: ${fraud_df['Amount'].std():.2f}")
print(f"   Maximum: ${fraud_df['Amount'].max():.2f}")

### 3.3 Analysis of the PCA Features (V1-V28)

In [None]:
v_features = [f'V{i}' for i in range(1, 29)]

fig, axes = plt.subplots(4, 4, figsize=(20, 16))
axes = axes.flatten()

for i, feature in enumerate(v_features[:16]):
    axes[i].hist(normal_df[feature], bins=50, alpha=0.6, label='Normal', color='#2ecc71', density=True)
    axes[i].hist(fraud_df[feature], bins=50, alpha=0.6, label='Fraud', color='#e74c3c', density=True)
    axes[i].set_title(f'Distribution of {feature}', fontsize=10)
    axes[i].legend(fontsize=8)
    
plt.tight_layout()
plt.savefig('v_features_distribution_1.png', dpi=150, bbox_inches='tight')
plt.show()

In [None]:
# V1-V16 were plotted above; the remaining 12 features (V17-V28) fit in a 3x4 grid
fig, axes = plt.subplots(3, 4, figsize=(20, 12))
axes = axes.flatten()

for i, feature in enumerate(v_features[16:]):
    axes[i].hist(normal_df[feature], bins=50, alpha=0.6, label='Normal', color='#2ecc71', density=True)
    axes[i].hist(fraud_df[feature], bins=50, alpha=0.6, label='Fraud', color='#e74c3c', density=True)
    axes[i].set_title(f'Distribution of {feature}', fontsize=10)
    axes[i].legend(fontsize=8)

plt.tight_layout()
plt.savefig('v_features_distribution_2.png', dpi=150, bbox_inches='tight')
plt.show()

### 3.4 Correlation Matrix

In [None]:
plt.figure(figsize=(24, 20))
correlation_matrix = df.corr()

mask = np.triu(np.ones_like(correlation_matrix, dtype=bool))
sns.heatmap(correlation_matrix, mask=mask, annot=False, cmap='RdBu_r', 
            center=0, linewidths=0.5, fmt='.2f',
            cbar_kws={'shrink': 0.8})
plt.title('Feature Correlation Matrix', fontsize=16, fontweight='bold')
plt.tight_layout()
plt.savefig('correlation_matrix.png', dpi=150, bbox_inches='tight')
plt.show()

print("CORRELATION OF THE FEATURES WITH THE TARGET CLASS")
# Reuse the matrix computed above instead of recomputing df.corr()
correlations_with_class = correlation_matrix['Class'].drop('Class').sort_values(key=abs, ascending=False)
print("\n10 strongest correlations (by absolute value):")
print(correlations_with_class.head(10))
print("\n10 weakest correlations:")
print(correlations_with_class.tail(10))

## 4. Data Preprocessing

### 4.1 Feature Scaling

Features V1-V28 are PCA outputs, so they are already centered and on broadly similar scales; **Time** and **Amount** are still in their raw units and need to be rescaled.
We use **RobustScaler** for Amount (it relies on the median and interquartile range, so it is less sensitive to outliers) and **StandardScaler** for Time.
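
To illustrate the difference between the two scalers, and one way to keep held-out statistics out of the preprocessing step, the sketch below fits both scalers on a training split only and reuses the fitted parameters on held-out data. The synthetic `amount` column, the injected outliers, and the split size are assumptions made for illustration; they are not the notebook's data.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import RobustScaler, StandardScaler

rng = np.random.default_rng(42)

# Synthetic, right-skewed "amounts" with a few extreme values (illustrative only).
amount = rng.lognormal(mean=3.0, sigma=1.0, size=10_000).reshape(-1, 1)
amount[:5] = 25_000.0  # inject large outliers

amount_train, amount_test = train_test_split(amount, test_size=0.3, random_state=42)

# Fit on the training split only, then apply the same transform to the held-out split.
robust = RobustScaler().fit(amount_train)
standard = StandardScaler().fit(amount_train)

robust_test = robust.transform(amount_test)
standard_test = standard.transform(amount_test)

# RobustScaler centers on the median and scales by the interquartile range,
# so its parameters are barely affected by the injected outliers.
print("RobustScaler center (median):", robust.center_[0])
print("RobustScaler scale (IQR):    ", robust.scale_[0])
print("StandardScaler mean:         ", standard.mean_[0])
print("StandardScaler std:          ", standard.scale_[0])
```

Fitting on the training split and calling only `transform` on the other splits is the stricter protocol; whether the difference matters in practice depends on how much the splits differ.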

In [None]:
df_processed = df.copy()

if 'Hour' in df_processed.columns:
    df_processed = df_processed.drop('Hour', axis=1)

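robust_scaler = RobustScaler()
standard_scaler = StandardScaler()

# Note: both scalers are fit on the full dataset, before the train/validation/test split,
# so statistics from the held-out data influence the scaling. Fitting them on the
# training split only would give a stricter evaluation.
df_processed['Amount_scaled'] = robust_scaler.fit_transform(df_processed['Amount'].values.reshape(-1, 1))

df_processed['Time_scaled'] = standard_scaler.fit_transform(df_processed['Time'].values.reshape(-1, 1))

# Keep only the scaled versions of Time and Amount
df_processed = df_processed.drop(['Time', 'Amount'], axis=1)

print("Scaling complete!")
print(f"\nShape of the processed dataset: {df_processed.shape}")
print("\nColumns after preprocessing:")
print(df_processed.columns.tolist())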

### 4.2 Splitting the Data into Training, Validation, and Test Sets

We use the following split:
- **Training:** 70% of the data
- **Validation:** 15% of the data
- **Test:** 15% of the data

**Important:** For anomaly detection with a semi-supervised approach, the models are trained only on **normal** transactions.

In [None]:
X = df_processed.drop('Class', axis=1)
y = df_processed['Class']

X_train, X_temp, y_train, y_temp = train_test_split(
    X, y, test_size=0.30, random_state=42, stratify=y
)

X_val, X_test, y_val, y_test = train_test_split(
    X_temp, y_temp, test_size=0.50, random_state=42, stratify=y_temp
)

print("DATA SPLIT")

print("\nTRAINING set:")
print(f"   Total: {len(X_train):,} samples ({len(X_train)/len(X)*100:.1f}%)")
print(f"   Normal: {(y_train == 0).sum():,}")
print(f"   Fraud: {(y_train == 1).sum():,}")

print("\nVALIDATION set:")
print(f"   Total: {len(X_val):,} samples ({len(X_val)/len(X)*100:.1f}%)")
print(f"   Normal: {(y_val == 0).sum():,}")
print(f"   Fraud: {(y_val == 1).sum():,}")

print("\nTEST set:")
print(f"   Total: {len(X_test):,} samples ({len(X_test)/len(X)*100:.1f}%)")
print(f"   Normal: {(y_test == 0).sum():,}")
print(f"   Fraud: {(y_test == 1).sum():,}")

In [None]:

X_train_normal = X_train[y_train == 0]
y_train_normal = y_train[y_train == 0]


print("TRAINING SET FOR THE SEMI-SUPERVISED APPROACH")

print("\nTraining on NORMAL transactions only:")
print(f"   Total: {len(X_train_normal):,} samples")
print("   (This set is used to train the anomaly detection models)")

## 5. Feature Engineering and Feature Selection

### 5.1 Feature Importance Analysis

Features V1-V28 are already the result of PCA, applied to keep the original data confidential. We analyze which features differ most between the two classes to help with interpretation.
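
The effect size computed in the next cell can be written as follows. This is the equal-weight variant of Cohen's d, which averages the two class variances instead of weighting them by sample size; the symbols are introduced here only for illustration.

```latex
d = \frac{\lvert \bar{x}_{\text{fraud}} - \bar{x}_{\text{normal}} \rvert}{s_{\text{pooled}}},
\qquad
s_{\text{pooled}} = \sqrt{\frac{s_{\text{fraud}}^{2} + s_{\text{normal}}^{2}}{2}}
```

Here $\bar{x}$ and $s$ are the per-class mean and standard deviation of a single feature.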

In [None]:
feature_importance = []
features = X.columns.tolist()

for feature in features:
    fraud_mean = df_processed[df_processed['Class'] == 1][feature].mean()
    normal_mean = df_processed[df_processed['Class'] == 0][feature].mean()
    fraud_std = df_processed[df_processed['Class'] == 1][feature].std()
    normal_std = df_processed[df_processed['Class'] == 0][feature].std()
    

    # Equal-weight pooled standard deviation (does not weight by the very different class sizes)
    pooled_std = np.sqrt((fraud_std**2 + normal_std**2) / 2)
    if pooled_std > 0:
        effect_size = abs(fraud_mean - normal_mean) / pooled_std
    else:
        effect_size = 0
    
    feature_importance.append({
        'Feature': feature,
        'Fraud_Mean': fraud_mean,
        'Normal_Mean': normal_mean,
        'Effect_Size': effect_size
    })

importance_df = pd.DataFrame(feature_importance)
importance_df = importance_df.sort_values('Effect_Size', ascending=False)

plt.figure(figsize=(14, 8))
top_features = importance_df.head(15)
bars = plt.barh(top_features['Feature'], top_features['Effect_Size'], color='steelblue')
plt.xlabel('Effect Size (Cohen\'s d)', fontsize=12)
plt.ylabel('Feature', fontsize=12)
plt.title('Top 15 Features by Difference Between Classes (Effect Size)', fontsize=14, fontweight='bold')
plt.gca().invert_yaxis()
plt.tight_layout()
plt.savefig('feature_importance.png', dpi=150, bbox_inches='tight')
plt.show()

print("\nTop 10 features with the largest difference between classes:")
print(importance_df[['Feature', 'Effect_Size']].head(10).to_string(index=False))

### 5.2 Feature Selection

For this project we use **all available features**, because:
1. Features V1-V28 are already principal components, so they are mutually uncorrelated by construction
2. Even features with low individual correlation can contribute in combination
3. The anomaly detection algorithms used here can handle multiple dimensions

If dimensionality had to be reduced, the features with the largest effect size would be prioritized.
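
As a minimal sketch of that fallback, the snippet below keeps the `k` features with the largest effect size. The feature names and effect-size values are made up for illustration; in the notebook they would come from `importance_df`, and the value of `k` is an arbitrary choice.

```python
import pandas as pd

# Hypothetical effect sizes (placeholder names and values, not results from the dataset).
example_importance = pd.DataFrame({
    "Feature": ["feat_a", "feat_b", "feat_c", "feat_d", "feat_e"],
    "Effect_Size": [1.9, 1.4, 1.2, 0.1, 0.05],
})

k = 3  # number of features to keep (illustrative choice)

# nlargest returns the k rows with the highest Effect_Size, already sorted.
selected_features = example_importance.nlargest(k, "Effect_Size")["Feature"].tolist()
print(selected_features)
```

The resulting list could then be used to subset the feature matrix, for example `X[selected_features]`.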

In [None]:
print("FEATURES SELECTED FOR MODELING")

print(f"\nTotal features: {X.shape[1]}")
print("\nFeature list:")
for i, col in enumerate(X.columns, 1):
    print(f"   {i:2d}. {col}")

## 6. Modeling

### 6.1 Helper Functions for Evaluation

In [None]:
def evaluate_model(y_true, y_pred, y_scores=None, model_name="Model"):
    metrics = {
        'Model': model_name,
        'Accuracy': accuracy_score(y_true, y_pred),
        'Precision': precision_score(y_true, y_pred, zero_division=0),
        'Recall': recall_score(y_true, y_pred, zero_division=0),
        'F1-Score': f1_score(y_true, y_pred, zero_division=0),
    }
    
    if y_scores is not None:
        metrics['AUC-ROC'] = roc_auc_score(y_true, y_scores)
        metrics['AUC-PR'] = average_precision_score(y_true, y_scores)
    
    return metrics


def plot_confusion_matrix(y_true, y_pred, model_name="Model", ax=None):
    cm = confusion_matrix(y_true, y_pred)
    
    if ax is None:
        fig, ax = plt.subplots(figsize=(8, 6))
    
    sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', ax=ax,
                xticklabels=['Normal', 'Fraud'],
                yticklabels=['Normal', 'Fraud'])
    ax.set_xlabel('Predicted')
    ax.set_ylabel('Actual')
    ax.set_title(f'Confusion Matrix - {model_name}')
    
    return cm


def print_metrics(metrics):
    print(f"METRICS - {metrics['Model']}")
    print(f"  Accuracy:  {metrics['Accuracy']:.4f}")
    print(f"  Precision: {metrics['Precision']:.4f}")
    print(f"  Recall:    {metrics['Recall']:.4f}")
    print(f"  F1-Score:  {metrics['F1-Score']:.4f}")
    if 'AUC-ROC' in metrics:
        print(f"  AUC-ROC:   {metrics['AUC-ROC']:.4f}")
    if 'AUC-PR' in metrics:
        print(f"  AUC-PR:    {metrics['AUC-PR']:.4f}")

all_results = {}
all_predictions = {}
all_scores = {}
all_times = {}

### 6.2 Model 1: Isolation Forest (Randomized Tree Ensemble)

**Why Isolation Forest?**
- Based on the principle that anomalies are easier to isolate than normal points
- Does not assume a specific data distribution
- Computationally efficient: training costs about O(t ψ log ψ) for t trees built on subsamples of size ψ
- Scales well to data with many features
- Well suited to outlier detection when anomalies are rare

**How it works:**
1. Builds an ensemble of random trees, each split using a randomly chosen feature and split value
2. Anomalies need fewer splits to be isolated
3. The anomaly score is based on the average path length needed to isolate a point (shorter paths mean more anomalous)
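
The steps above can be illustrated on a toy example. The sketch below is not the notebook's model: it fits scikit-learn's `IsolationForest` on synthetic two-dimensional data (an assumption made for illustration) and compares the score of a point at the cluster center with that of a distant point.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Dense cluster of "normal" points plus one far-away query point (synthetic data).
normal_points = rng.normal(loc=0.0, scale=1.0, size=(500, 2))
distant_point = np.array([[8.0, 8.0]])
center_point = np.array([[0.0, 0.0]])

forest = IsolationForest(n_estimators=100, random_state=0).fit(normal_points)

# score_samples returns higher values for more normal points,
# so the sign is flipped to get "higher = more anomalous".
center_score = -forest.score_samples(center_point)[0]
distant_score = -forest.score_samples(distant_point)[0]

print(f"Anomaly score at the cluster center: {center_score:.3f}")
print(f"Anomaly score of the distant point:  {distant_score:.3f}")
```

The distant point is separated from the rest after only a few random splits, so its average path length is short and its anomaly score is higher.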

In [None]:
# Fraud rate observed in the validation set, used as the expected contamination
contamination_rate = y_val.mean()
print(f"Expected contamination rate: {contamination_rate:.4f} ({contamination_rate*100:.2f}%)")

print("\n" + "=" * 60)
print("TRAINING - ISOLATION FOREST")
print("=" * 60)

if_params = {
    'n_estimators': 100,
    'contamination': contamination_rate,
    'max_samples': 'auto',
    'random_state': 42,
    'n_jobs': -1
}

start_time = time.time()
iso_forest = IsolationForest(**if_params)
iso_forest.fit(X_train_normal) 
train_time_if = time.time() - start_time

print(f"Model trained in {train_time_if:.2f} seconds")

start_time = time.time()
if_val_pred_raw = iso_forest.predict(X_val)
if_val_pred = np.where(if_val_pred_raw == -1, 1, 0)  # -1 (outlier) -> 1 (fraud), 1 (inlier) -> 0
if_val_scores = -iso_forest.score_samples(X_val)  # sign flipped: higher = more anomalous
inference_time_if_val = time.time() - start_time

print(f"Inference time (validation): {inference_time_if_val:.2f} seconds")

if_val_metrics = evaluate_model(y_val, if_val_pred, if_val_scores, "Isolation Forest")
print_metrics(if_val_metrics)

# Pass the axes explicitly so that no extra empty figure is created
fig, ax = plt.subplots(figsize=(8, 6))
plot_confusion_matrix(y_val, if_val_pred, "Isolation Forest (Validation)", ax=ax)
plt.tight_layout()
plt.savefig('confusion_matrix_if_val.png', dpi=150, bbox_inches='tight')
plt.show()

#### 6.2.1 Hyperparameter Tuning - Isolation Forest

In [None]:
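print("HYPERPARAMETER TUNING - ISOLATION FOREST")

n_estimators_list = [50, 100, 200]
# Note: 'auto' means min(256, n_samples), so with this training set it is equivalent to 256
max_samples_list = [256, 512, 'auto']
contamination_list = [0.001, 0.002, 0.005, contamination_rate]

# Start below 0 so that a configuration is always recorded, even if every F1-Score is 0
best_if_score = -1.0
best_if_params = {}
if_tuning_results = []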

for n_est in n_estimators_list:
    for max_samp in max_samples_list:
        for contam in contamination_list:
            try:
                model = IsolationForest(
                    n_estimators=n_est,
                    max_samples=max_samp,
                    contamination=contam,
                    random_state=42,
                    n_jobs=-1
                )
                model.fit(X_train_normal)
                
                pred_raw = model.predict(X_val)
                pred = np.where(pred_raw == -1, 1, 0)
                scores = -model.score_samples(X_val)
                
                f1 = f1_score(y_val, pred, zero_division=0)
                auc = roc_auc_score(y_val, scores)
                
                if_tuning_results.append({
                    'n_estimators': n_est,
                    'max_samples': max_samp,
                    'contamination': contam,
                    'F1-Score': f1,
                    'AUC-ROC': auc
                })
                
                if f1 > best_if_score:
                    best_if_score = f1
                    best_if_params = {
                        'n_estimators': n_est,
                        'max_samples': max_samp,
                        'contamination': contam
                    }
            except Exception as e:
                print(f"Error with params {n_est}, {max_samp}, {contam}: {e}")

if_tuning_df = pd.DataFrame(if_tuning_results).sort_values('F1-Score', ascending=False)
print("\nTop 10 hyperparameter combinations:")
print(if_tuning_df.head(10).to_string(index=False))

print("\nBest hyperparameters found:")
print(f"   n_estimators: {best_if_params['n_estimators']}")
print(f"   max_samples: {best_if_params['max_samples']}")
print(f"   contamination: {best_if_params['contamination']}")
print(f"   F1-Score: {best_if_score:.4f}")

In [None]:
print("FINAL MODEL - ISOLATION FOREST")

iso_forest_best = IsolationForest(
    n_estimators=best_if_params['n_estimators'],
    max_samples=best_if_params['max_samples'],
    contamination=best_if_params['contamination'],
    random_state=42,
    n_jobs=-1
)

start_time = time.time()
iso_forest_best.fit(X_train_normal)
train_time_if_best = time.time() - start_time

start_time = time.time()
if_test_pred_raw = iso_forest_best.predict(X_test)
if_test_pred = np.where(if_test_pred_raw == -1, 1, 0)
if_test_scores = -iso_forest_best.score_samples(X_test)
inference_time_if = time.time() - start_time

if_test_metrics = evaluate_model(y_test, if_test_pred, if_test_scores, "Isolation Forest (Best)")
print_metrics(if_test_metrics)
print(f"\nTraining time: {train_time_if_best:.2f}s")
print(f"Inference time (test): {inference_time_if:.4f}s")
print(f"Average time per sample: {inference_time_if/len(X_test)*1000:.4f}ms")

all_results['Isolation Forest'] = if_test_metrics
all_predictions['Isolation Forest'] = if_test_pred
all_scores['Isolation Forest'] = if_test_scores
all_times['Isolation Forest'] = {
    'train': train_time_if_best,
    'inference': inference_time_if,
    'per_sample': inference_time_if/len(X_test)*1000
}

### 6.3 Model 2: Local Outlier Factor - LOF (Density-Based Model)

**Why LOF?**
- An algorithm based on local density
- Detects anomalies by comparing each point with the density of its neighbors
- Captures local anomalies, which can look different in different regions of the feature space
- Does not assume a specific cluster shape

**How it works:**
1. Computes the local density of each point from its k nearest neighbors
2. Compares the density of a point with that of its neighbors
3. Points with a significantly lower density are treated as anomalies
4. A LOF close to 1 means a density similar to that of the neighbors; values well above 1 indicate a potential anomaly
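
A toy example helps to see how the LOF value behaves. The sketch below is not the notebook's model: it fits scikit-learn's `LocalOutlierFactor` in novelty mode on synthetic two-dimensional data (an assumption made for illustration) and scores one point inside the dense region and one far away from it.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
train_points = rng.normal(loc=0.0, scale=1.0, size=(500, 2))  # synthetic "normal" data

# novelty=True allows scoring points that were not part of the training set.
lof_demo = LocalOutlierFactor(n_neighbors=20, novelty=True).fit(train_points)

queries = np.array([[0.0, 0.0],   # inside the dense region
                    [8.0, 8.0]])  # far from every training point

# score_samples returns the negated LOF, so flipping the sign recovers the LOF value.
lof_values = -lof_demo.score_samples(queries)
print(f"LOF inside the cluster: {lof_values[0]:.2f}")
print(f"LOF far from it:        {lof_values[1]:.2f}")
```

The point inside the cluster should have a LOF close to 1, while the distant point, whose neighborhood is much sparser than that of its neighbors, gets a clearly larger value.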

In [None]:

lof_params = {
    'n_neighbors': 20,
    'contamination': contamination_rate,
    'novelty': True,
    'n_jobs': -1
}

start_time = time.time()
lof = LocalOutlierFactor(**lof_params)
lof.fit(X_train_normal)
train_time_lof = time.time() - start_time

print(f"Model trained in {train_time_lof:.2f} seconds")

start_time = time.time()
lof_val_pred_raw = lof.predict(X_val)
lof_val_pred = np.where(lof_val_pred_raw == -1, 1, 0)  # -1 (outlier) -> 1 (fraud), 1 (inlier) -> 0
lof_val_scores = -lof.score_samples(X_val)  # sign flipped: higher = more anomalous
inference_time_lof_val = time.time() - start_time

print(f"Inference time (validation): {inference_time_lof_val:.2f} seconds")

lof_val_metrics = evaluate_model(y_val, lof_val_pred, lof_val_scores, "LOF")
print_metrics(lof_val_metrics)

# Pass the axes explicitly so that no extra empty figure is created
fig, ax = plt.subplots(figsize=(8, 6))
plot_confusion_matrix(y_val, lof_val_pred, "LOF (Validation)", ax=ax)
plt.tight_layout()
plt.savefig('confusion_matrix_lof_val.png', dpi=150, bbox_inches='tight')
plt.show()

#### 6.3.1 Hyperparameter Tuning - LOF

In [None]:

n_neighbors_list = [5, 10, 20, 30, 50]
contamination_list_lof = [0.001, 0.002, 0.005, contamination_rate]

# Start below 0 so that a configuration is always recorded, even if every F1-Score is 0
best_lof_score = -1.0
best_lof_params = {}
lof_tuning_results = []

for n_neigh in n_neighbors_list:
    for contam in contamination_list_lof:
        try:
            model = LocalOutlierFactor(
                n_neighbors=n_neigh,
                contamination=contam,
                novelty=True,
                n_jobs=-1
            )
            model.fit(X_train_normal)
            
            pred_raw = model.predict(X_val)
            pred = np.where(pred_raw == -1, 1, 0)
            scores = -model.score_samples(X_val)
            
            f1 = f1_score(y_val, pred, zero_division=0)
            auc = roc_auc_score(y_val, scores)
            
            lof_tuning_results.append({
                'n_neighbors': n_neigh,
                'contamination': contam,
                'F1-Score': f1,
                'AUC-ROC': auc
            })
            
            if f1 > best_lof_score:
                best_lof_score = f1
                best_lof_params = {
                    'n_neighbors': n_neigh,
                    'contamination': contam
                }
        except Exception as e:
            print(f"Error with params {n_neigh}, {contam}: {e}")

lof_tuning_df = pd.DataFrame(lof_tuning_results).sort_values('F1-Score', ascending=False)
print("\nTop 10 hyperparameter combinations:")
print(lof_tuning_df.head(10).to_string(index=False))

print("\nBest hyperparameters found:")
print(f"   n_neighbors: {best_lof_params['n_neighbors']}")
print(f"   contamination: {best_lof_params['contamination']}")
print(f"   F1-Score: {best_lof_score:.4f}")

In [None]:

print("FINAL MODEL - LOF")

lof_best = LocalOutlierFactor(
    n_neighbors=best_lof_params['n_neighbors'],
    contamination=best_lof_params['contamination'],
    novelty=True,
    n_jobs=-1
)

start_time = time.time()
lof_best.fit(X_train_normal)
train_time_lof_best = time.time() - start_time

start_time = time.time()
lof_test_pred_raw = lof_best.predict(X_test)
lof_test_pred = np.where(lof_test_pred_raw == -1, 1, 0)
lof_test_scores = -lof_best.score_samples(X_test)
inference_time_lof = time.time() - start_time

lof_test_metrics = evaluate_model(y_test, lof_test_pred, lof_test_scores, "LOF (Best)")
print_metrics(lof_test_metrics)
print(f"\nTraining time: {train_time_lof_best:.2f}s")
print(f"Inference time (test): {inference_time_lof:.4f}s")
print(f"Average time per sample: {inference_time_lof/len(X_test)*1000:.4f}ms")

all_results['LOF'] = lof_test_metrics
all_predictions['LOF'] = lof_test_pred
all_scores['LOF'] = lof_test_scores
all_times['LOF'] = {
    'train': train_time_lof_best,
    'inference': inference_time_lof,
    'per_sample': inference_time_lof/len(X_test)*1000
}

### 6.4 Model 3: Autoencoder (Deep Learning)

**Why an Autoencoder?**
- A neural network that learns to reconstruct its input
- Trained only on normal data, it learns a representation of "normality"
- Anomalies tend to have a high reconstruction error because they do not fit the learned pattern
- Can capture complex non-linear relationships between features

**Architecture:**
- **Encoder:** Compresses the data into a latent representation
- **Decoder:** Reconstructs the data from the latent representation
- **Detection:** Based on the reconstruction error (MSE)

**Training:**
- Train only on normal transactions
- Validate on data that includes frauds
- Set the threshold from a percentile of the reconstruction error on normal training data
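
The percentile-threshold idea can be sketched without a neural network. In the example below, PCA stands in for the autoencoder as a linear encoder/decoder pair (a deliberate simplification, not the Keras model built in the following cells), and the synthetic data, the 99th percentile, and all names are assumptions made for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic "normal" data lying close to a 2-D plane inside a 5-D space.
basis = rng.normal(size=(2, 5))

def sample_normal(n):
    return rng.normal(size=(n, 2)) @ basis + 0.05 * rng.normal(size=(n, 5))

train_normal = sample_normal(2000)
test_normal = sample_normal(500)
test_anomalies = rng.normal(scale=3.0, size=(20, 5))  # not confined to the plane

# PCA acts as a linear encoder (transform) and decoder (inverse_transform).
pca = PCA(n_components=2).fit(train_normal)

def reconstruction_error(data):
    reconstructed = pca.inverse_transform(pca.transform(data))
    return np.mean((data - reconstructed) ** 2, axis=1)

# Threshold = 99th percentile of the error on normal training data.
threshold = np.percentile(reconstruction_error(train_normal), 99)

flagged_normal = reconstruction_error(test_normal) > threshold
flagged_anomalies = reconstruction_error(test_anomalies) > threshold
print(f"Normal samples flagged: {flagged_normal.mean():.1%}")
print(f"Anomalies flagged:      {flagged_anomalies.mean():.1%}")
```

By construction, roughly 1% of normal samples exceed a 99th-percentile threshold, so the percentile directly controls the expected false-positive rate on normal data.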

In [None]:
def build_autoencoder(input_dim, encoding_dim=14, hidden_layers=(28, 21)):
    """Build a symmetric dense autoencoder and return (autoencoder, encoder)."""
    input_layer = Input(shape=(input_dim,))
    
    x = input_layer
    for units in hidden_layers:
        x = Dense(units, activation='relu')(x)
        x = BatchNormalization()(x)
        x = Dropout(0.2)(x)
    
    latent = Dense(encoding_dim, activation='relu', name='latent')(x)
    
    x = latent
    for units in reversed(hidden_layers):
        x = Dense(units, activation='relu')(x)
        x = BatchNormalization()(x)
        x = Dropout(0.2)(x)
    
    output_layer = Dense(input_dim, activation='linear')(x)
    
    autoencoder = Model(inputs=input_layer, outputs=output_layer, name='autoencoder')
    encoder = Model(inputs=input_layer, outputs=latent, name='encoder')
    
    return autoencoder, encoder


def get_reconstruction_error(model, data):
    """Return the per-sample mean squared reconstruction error."""
    reconstructed = model.predict(data, verbose=0)
    mse = np.mean(np.power(data - reconstructed, 2), axis=1)
    return mse




In [None]:

input_dim = X_train_normal.shape[1]
encoding_dim = 14
hidden_layers = [28, 21]

autoencoder, encoder = build_autoencoder(input_dim, encoding_dim, hidden_layers)

autoencoder.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.001),
    loss='mse'
)

print("\nAutoencoder architecture:")
autoencoder.summary()

early_stopping = EarlyStopping(
    monitor='val_loss',
    patience=10,
    restore_best_weights=True,
    verbose=1
)

reduce_lr = ReduceLROnPlateau(
    monitor='val_loss',
    factor=0.5,
    patience=5,
    min_lr=1e-6,
    verbose=1
)

print("\nStarting training...")
start_time = time.time()
history = autoencoder.fit(
    X_train_normal.values,
    X_train_normal.values,
    epochs=100,
    batch_size=256,
    validation_split=0.1,
    callbacks=[early_stopping, reduce_lr],
    verbose=1
)
train_time_ae = time.time() - start_time

print(f"\nTraining finished in {train_time_ae:.2f} seconds")

In [None]:
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

axes[0].plot(history.history['loss'], label='Training', linewidth=2)
axes[0].plot(history.history['val_loss'], label='Validation', linewidth=2)
axes[0].set_title('Autoencoder Loss', fontsize=14, fontweight='bold')
axes[0].set_xlabel('Epoch')
axes[0].set_ylabel('MSE Loss')
axes[0].legend()
axes[0].grid(True, alpha=0.3)

# Depending on the Keras version, the learning-rate history is logged as 'lr' or 'learning_rate'
lr_key = next((k for k in ('lr', 'learning_rate') if k in history.history), None)
if lr_key is not None:
    axes[1].plot(history.history[lr_key], linewidth=2, color='green')
    axes[1].set_title('Learning Rate', fontsize=14, fontweight='bold')
    axes[1].set_xlabel('Epoch')
    axes[1].set_ylabel('LR')
    axes[1].grid(True, alpha=0.3)
else:
    # Fallback: show the reconstruction error on the normal training data
    train_errors = get_reconstruction_error(autoencoder, X_train_normal.values)
    axes[1].hist(train_errors, bins=50, alpha=0.7, color='steelblue')
    axes[1].set_title('Reconstruction Error Distribution (Normal Training Data)', fontsize=14, fontweight='bold')
    axes[1].set_xlabel('MSE')
    axes[1].set_ylabel('Frequency')
    axes[1].axvline(np.percentile(train_errors, 95), color='red', linestyle='--', label='95th percentile')
    axes[1].legend()

plt.tight_layout()
plt.savefig('autoencoder_training.png', dpi=150, bbox_inches='tight')
plt.show()

#### 6.4.1 Defining the Detection Threshold

In [None]:

train_normal_errors = get_reconstruction_error(autoencoder, X_train_normal.values)
val_errors = get_reconstruction_error(autoencoder, X_val.values)

val_normal_errors = val_errors[y_val == 0]
val_fraud_errors = val_errors[y_val == 1]

print("\nReconstruction error statistics:")
print("\n   Training (Normal):")
print(f"   - Mean: {np.mean(train_normal_errors):.6f}")
print(f"   - Median: {np.median(train_normal_errors):.6f}")
print(f"   - Std: {np.std(train_normal_errors):.6f}")
print(f"   - 95th percentile: {np.percentile(train_normal_errors, 95):.6f}")
print(f"   - 99th percentile: {np.percentile(train_normal_errors, 99):.6f}")

print("\n   Validation (Normal):")
print(f"   - Mean: {np.mean(val_normal_errors):.6f}")
print(f"   - Median: {np.median(val_normal_errors):.6f}")

print("\n   Validation (Fraud):")
print(f"   - Mean: {np.mean(val_fraud_errors):.6f}")
print(f"   - Median: {np.median(val_fraud_errors):.6f}")

fig, axes = plt.subplots(1, 2, figsize=(16, 5))

axes[0].hist(val_normal_errors, bins=50, alpha=0.7, label='Normal', color='#2ecc71', density=True)
axes[0].hist(val_fraud_errors, bins=50, alpha=0.7, label='Fraud', color='#e74c3c', density=True)
axes[0].set_title('Reconstruction Error Distribution by Class', fontsize=14, fontweight='bold')
axes[0].set_xlabel('MSE (Reconstruction Error)')
axes[0].set_ylabel('Density')
axes[0].legend()
axes[0].set_xlim(0, np.percentile(val_errors, 99))

val_errors_df = pd.DataFrame({
    'Error': val_errors,
    'Class': y_val.values
})
val_errors_df['Class'] = val_errors_df['Class'].map({0: 'Normal', 1: 'Fraud'})
val_errors_df.boxplot(column='Error', by='Class', ax=axes[1])
axes[1].set_title('Boxplot of Reconstruction Error by Class', fontsize=14, fontweight='bold')
axes[1].set_xlabel('Class')
axes[1].set_ylabel('MSE')
axes[1].set_ylim(0, np.percentile(val_errors, 99))
plt.suptitle('')

plt.tight_layout()
plt.savefig('autoencoder_error_distribution.png', dpi=150, bbox_inches='tight')
plt.show()

In [None]:
percentiles = [90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 99.5]
threshold_results = []

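for perc in percentiles:
    # Threshold = percentile of the reconstruction error on normal training data
    threshold = np.percentile(train_normal_errors, perc)
    
    # Flag validation samples whose error exceeds the threshold as fraud
    val_pred = (val_errors > threshold).astype(int)
    
    prec = precision_score(y_val, val_pred, zero_division=0)
    rec = recall_score(y_val, val_pred, zero_division=0)
    f1 = f1_score(y_val, val_pred, zero_division=0)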
    
    threshold_results.append({
        'Percentile': perc,
        'Threshold': threshold,
        'Precision': prec,
        'Recall': rec,
        'F1-Score': f1
    })

threshold_df = pd.DataFrame(threshold_results)
print("\nResults by threshold (percentile):")
print(threshold_df.to_string(index=False))

best_threshold_idx = threshold_df['F1-Score'].idxmax()
best_threshold = threshold_df.loc[best_threshold_idx, 'Threshold']
best_percentile = threshold_df.loc[best_threshold_idx, 'Percentile']

print("\nBest threshold found:")
print(f"   Percentile: {best_percentile}")
print(f"   Threshold: {best_threshold:.6f}")
print(f"   F1-Score: {threshold_df.loc[best_threshold_idx, 'F1-Score']:.4f}")

In [None]:
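# perf_counter is a monotonic, high-resolution clock, better suited to timing short runs
start_time = time.perf_counter()
test_errors = get_reconstruction_error(autoencoder, X_test.values)
inference_time_ae = time.perf_counter() - start_time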

ae_test_pred = (test_errors > best_threshold).astype(int)
ae_test_scores = test_errors 

ae_test_metrics = evaluate_model(y_test, ae_test_pred, ae_test_scores, "Autoencoder (Best)")
print_metrics(ae_test_metrics)
print(f"\nTraining time: {train_time_ae:.2f}s")
print(f"Inference time (test): {inference_time_ae:.4f}s")
print(f"Mean time per sample: {inference_time_ae/len(X_test)*1000:.4f}ms")

plt.figure(figsize=(8, 6))
plot_confusion_matrix(y_test, ae_test_pred, "Autoencoder (Test)")
plt.tight_layout()
plt.savefig('confusion_matrix_ae_test.png', dpi=150, bbox_inches='tight')
plt.show()

all_results['Autoencoder'] = ae_test_metrics
all_predictions['Autoencoder'] = ae_test_pred
all_scores['Autoencoder'] = ae_test_scores
all_times['Autoencoder'] = {
    'train': train_time_ae,
    'inference': inference_time_ae,
    'per_sample': inference_time_ae/len(X_test)*1000
}

## 7. Comparison of Results

### 7.1 Metric Comparison Table

In [None]:
results_df = pd.DataFrame([
    all_results['Isolation Forest'],
    all_results['LOF'],
    all_results['Autoencoder']
])

results_df['Train Time (s)'] = [
    all_times['Isolation Forest']['train'],
    all_times['LOF']['train'],
    all_times['Autoencoder']['train']
]
results_df['Inference Time (s)'] = [
    all_times['Isolation Forest']['inference'],
    all_times['LOF']['inference'],
    all_times['Autoencoder']['inference']
]
results_df['Time/Sample (ms)'] = [
    all_times['Isolation Forest']['per_sample'],
    all_times['LOF']['per_sample'],
    all_times['Autoencoder']['per_sample']
]

results_df['Model'] = ['Isolation Forest', 'LOF', 'Autoencoder']
cols = ['Model', 'Accuracy', 'Precision', 'Recall', 'F1-Score', 'AUC-ROC', 'AUC-PR', 
        'Train Time (s)', 'Inference Time (s)', 'Time/Sample (ms)']
results_df = results_df[cols]

print("\nPerformance metrics:")
print(results_df.to_string(index=False))

results_df.to_csv('model_comparison_results.csv', index=False)
print("\nResults saved to 'model_comparison_results.csv'")

### 7.2 Comparative Visualization

In [None]:

fig, axes = plt.subplots(2, 2, figsize=(16, 14))

metrics_to_plot = ['Precision', 'Recall', 'F1-Score', 'AUC-ROC']
models = ['Isolation Forest', 'LOF', 'Autoencoder']
colors = ['#3498db', '#e74c3c', '#2ecc71']

x = np.arange(len(metrics_to_plot))
width = 0.25

for i, model in enumerate(models):
    values = [results_df[results_df['Model'] == model][m].values[0] for m in metrics_to_plot]
    bars = axes[0, 0].bar(x + i*width, values, width, label=model, color=colors[i])
    
axes[0, 0].set_xlabel('Metric')
axes[0, 0].set_ylabel('Value')
axes[0, 0].set_title('Metric Comparison by Model', fontsize=14, fontweight='bold')
axes[0, 0].set_xticks(x + width)
axes[0, 0].set_xticklabels(metrics_to_plot)
axes[0, 0].legend()
axes[0, 0].set_ylim(0, 1)
axes[0, 0].grid(axis='y', alpha=0.3)

for model, color in zip(models, colors):
    scores = all_scores[model]
    fpr, tpr, _ = roc_curve(y_test, scores)
    auc = roc_auc_score(y_test, scores)
    axes[0, 1].plot(fpr, tpr, label=f'{model} (AUC={auc:.3f})', color=color, linewidth=2)

axes[0, 1].plot([0, 1], [0, 1], 'k--', linewidth=1, label='Random')
axes[0, 1].set_xlabel('False Positive Rate (FPR)')
axes[0, 1].set_ylabel('True Positive Rate (TPR)')
axes[0, 1].set_title('ROC Curves', fontsize=14, fontweight='bold')
axes[0, 1].legend(loc='lower right')
axes[0, 1].grid(True, alpha=0.3)

for model, color in zip(models, colors):
    scores = all_scores[model]
    precision, recall, _ = precision_recall_curve(y_test, scores)
    ap = average_precision_score(y_test, scores)
    axes[1, 0].plot(recall, precision, label=f'{model} (AP={ap:.3f})', color=color, linewidth=2)

axes[1, 0].set_xlabel('Recall')
axes[1, 0].set_ylabel('Precision')
axes[1, 0].set_title('Precision-Recall Curves', fontsize=14, fontweight='bold')
axes[1, 0].legend(loc='upper right')
axes[1, 0].grid(True, alpha=0.3)

times = [all_times[m]['per_sample'] for m in models]
bars = axes[1, 1].bar(models, times, color=colors)
axes[1, 1].set_xlabel('Model')
axes[1, 1].set_ylabel('Time per Sample (ms)')
axes[1, 1].set_title('Inference Time per Sample', fontsize=14, fontweight='bold')
for bar, t in zip(bars, times):
    axes[1, 1].text(bar.get_x() + bar.get_width()/2, bar.get_height(), 
                    f'{t:.4f}ms', ha='center', va='bottom', fontsize=10)
axes[1, 1].grid(axis='y', alpha=0.3)

plt.tight_layout()
plt.savefig('model_comparison.png', dpi=150, bbox_inches='tight')
plt.show()

### 7.3 Confusion Matrix Comparison

In [None]:
fig, axes = plt.subplots(1, 3, figsize=(18, 5))

for ax, model in zip(axes, models):
    plot_confusion_matrix(y_test, all_predictions[model], model, ax)

plt.tight_layout()
plt.savefig('confusion_matrices_comparison.png', dpi=150, bbox_inches='tight')
plt.show()


for model in models:
    cm = confusion_matrix(y_test, all_predictions[model])
    tn, fp, fn, tp = cm.ravel()
    
    print(f"\n{model}:")
    print(f"   True Negatives (TN): {tn:,} - normal transactions classified correctly")
    print(f"   False Positives (FP): {fp:,} - normal transactions flagged as fraud")
    print(f"   False Negatives (FN): {fn:,} - undetected fraud")
    print(f"   True Positives (TP): {tp:,} - detected fraud")
    
    fraud_detection_rate = tp / (tp + fn) * 100 if (tp + fn) > 0 else 0
    false_alarm_rate = fp / (fp + tn) * 100 if (fp + tn) > 0 else 0
    
    print(f"\n   Fraud detection rate (recall): {fraud_detection_rate:.2f}%")
    print(f"   False alarm rate (FPR): {false_alarm_rate:.2f}%")

## 8. Statistical Significance Tests

To check whether the performance differences between the models are statistically significant, we use **McNemar's test**, which compares two classifiers on the same test set using only the samples where exactly one of them is correct.
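
As a minimal illustration of the statistic, the sketch below computes McNemar's test with continuity correction on hypothetical toy predictions (the lists are invented for illustration and are not results from this notebook). It uses only the standard library: for one degree of freedom, the chi-square survival function equals `erfc(sqrt(x / 2))`. The name `mcnemar_statistic` is an assumption of this sketch.

```python
import math

def mcnemar_statistic(y_true, pred_a, pred_b):
    """McNemar's chi-square with continuity correction (1 degree of freedom)."""
    # b: model A correct and model B wrong; c: model A wrong and model B correct
    b = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a == t and p != t)
    c = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a != t and p == t)
    if b + c == 0:
        return b, c, None, None  # no discordant pairs, the test is undefined
    chi2 = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of chi-square with df=1
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return b, c, chi2, p_value

# Hypothetical toy labels and predictions (1 = fraud)
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0]
pred_a = [0, 0, 0, 0, 1, 1, 1, 0, 0, 0]
pred_b = [0, 1, 1, 0, 1, 0, 0, 0, 1, 0]

b, c, chi2, p_value = mcnemar_statistic(y_true, pred_a, pred_b)
print(f"b={b}, c={c}, chi2={chi2:.2f}, p={p_value:.4f}")
```

Only the discordant pairs enter the statistic, so two models with similar accuracy can still differ significantly if their errors fall on different samples. When `b + c` is small, an exact binomial test is usually preferred over the chi-square approximation.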

In [None]:
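def mcnemar_test(y_true, pred1, pred2, model1_name, model2_name):
    """McNemar's test with continuity correction for two classifiers on the same samples.

    model1_name and model2_name are kept for call compatibility and are not used.
    """
    correct1 = (pred1 == y_true)
    correct2 = (pred2 == y_true)
    
    # Discordant pairs: only one of the two models classifies the sample correctly
    b = np.sum(correct1 & ~correct2)  # model 1 correct, model 2 wrong
    c = np.sum(~correct1 & correct2)  # model 1 wrong, model 2 correct
    
    if b + c == 0:
        return None, None, "Could not be computed (b + c = 0)"
    
    chi2 = (abs(b - c) - 1)**2 / (b + c)
    # The survival function avoids the precision loss of 1 - cdf for very small p-values
    p_value = stats.chi2.sf(chi2, df=1)
    
    return chi2, p_value, f"b={b}, c={c}"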

model_pairs = [
    ('Isolation Forest', 'LOF'),
    ('Isolation Forest', 'Autoencoder'),
    ('LOF', 'Autoencoder')
]

significance_results = []

for model1, model2 in model_pairs:
    chi2, p_value, details = mcnemar_test(
        y_test.values,
        all_predictions[model1],
        all_predictions[model2],
        model1, model2
    )
    
    if chi2 is not None:
        significant = "Yes" if p_value < 0.05 else "No"
        significance_results.append({
            'Model 1': model1,
            'Model 2': model2,
            'Chi²': chi2,
            'p-value': p_value,
            'Significant (α=0.05)': significant,
            'Details': details
        })
        
        print(f"\n{model1} vs {model2}:")
        print(f"   Chi² = {chi2:.4f}")
        print(f"   p-value = {p_value:.6f}")
        print(f"   Significant difference: {significant}")
    else:
        print(f"\n{model1} vs {model2}: {details}")

significance_df = pd.DataFrame(significance_results)
print("SUMMARY OF SIGNIFICANCE TESTS")
print(significance_df.to_string(index=False))

In [None]:
def bootstrap_auc(y_true, scores, n_bootstraps=1000, confidence=0.95):
    """Percentile bootstrap confidence interval for AUC-ROC (expects NumPy arrays)."""
    aucs = []
    rng = np.random.RandomState(42)
    
    for _ in range(n_bootstraps):
        # Resample the test set with replacement
        indices = rng.randint(0, len(y_true), len(y_true))
        # AUC is undefined when a resample contains a single class
        if len(np.unique(y_true[indices])) < 2:
            continue
        auc = roc_auc_score(y_true[indices], scores[indices])
        aucs.append(auc)
    
    # Two-sided percentile interval
    alpha = (1 - confidence) / 2
    lower = np.percentile(aucs, alpha * 100)
    upper = np.percentile(aucs, (1 - alpha) * 100)
    mean_auc = np.mean(aucs)
    
    return mean_auc, lower, upper

bootstrap_results = []
for model in models:
    mean_auc, lower, upper = bootstrap_auc(y_test.values, all_scores[model])
    bootstrap_results.append({
        'Model': model,
        'Mean AUC': mean_auc,
        '95% CI Lower': lower,
        '95% CI Upper': upper
    })
    print(f"\n{model}:")
    print(f"   AUC-ROC: {mean_auc:.4f}")
    print(f"   95% CI: [{lower:.4f}, {upper:.4f}]")

bootstrap_df = pd.DataFrame(bootstrap_results)

plt.figure(figsize=(10, 6))
for i, model in enumerate(models):
    row = bootstrap_df[bootstrap_df['Model'] == model].iloc[0]
    plt.errorbar(i, row['Mean AUC'], 
                 yerr=[[row['Mean AUC'] - row['95% CI Lower']], 
                       [row['95% CI Upper'] - row['Mean AUC']]],
                 fmt='o', markersize=10, capsize=5, capthick=2, color=colors[i],
                 label=model)

plt.xticks(range(len(models)), models)
plt.ylabel('AUC-ROC')
plt.title('95% Confidence Intervals for AUC-ROC', fontsize=14, fontweight='bold')
plt.legend()
plt.grid(axis='y', alpha=0.3)
plt.ylim(0.5, 1.0)
plt.tight_layout()
plt.savefig('auc_confidence_intervals.png', dpi=150, bbox_inches='tight')
plt.show()
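
Per-model bootstrap intervals are computed on the same test samples, so overlapping intervals do not by themselves show whether two models differ. One possible complement, sketched here under stated assumptions, is a paired bootstrap of the AUC difference: both models are scored on the same resampled indices in each iteration. The functions `roc_auc` and `paired_bootstrap_auc_diff` and the toy data are hypothetical and written with the standard library only; the quadratic AUC computation is suitable for small examples, not for the full test set.

```python
import random

def roc_auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def paired_bootstrap_auc_diff(y_true, scores_a, scores_b, n_bootstraps=500, seed=42):
    """Approximate 95% percentile interval for AUC(a) - AUC(b) on shared resamples."""
    rng = random.Random(seed)
    n = len(y_true)
    diffs = []
    for _ in range(n_bootstraps):
        idx = [rng.randrange(n) for _ in range(n)]  # same indices for both models
        y = [y_true[i] for i in idx]
        if len(set(y)) < 2:
            continue  # AUC is undefined with a single class
        auc_a = roc_auc(y, [scores_a[i] for i in idx])
        auc_b = roc_auc(y, [scores_b[i] for i in idx])
        diffs.append(auc_a - auc_b)
    diffs.sort()
    lower = diffs[int(0.025 * len(diffs))]
    upper = diffs[min(int(0.975 * len(diffs)), len(diffs) - 1)]
    return lower, upper

# Hypothetical toy data: model A separates the classes perfectly, model B does not
y_true = [0] * 12 + [1] * 8
scores_a = [i / 20 for i in range(20)]
scores_b = [0.2, 0.8, 0.1, 0.5, 0.3, 0.9, 0.4, 0.6, 0.15, 0.7,
            0.25, 0.35, 0.55, 0.45, 0.95, 0.05, 0.65, 0.85, 0.3, 0.75]

lower, upper = paired_bootstrap_auc_diff(y_true, scores_a, scores_b)
print(f"95% interval for AUC difference: [{lower:.3f}, {upper:.3f}]")
```

An interval for the difference that excludes zero suggests the two models rank the test samples differently; in this notebook the inputs would be `y_test.values` and two entries of `all_scores`.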

## 9. Conclusion and Discussion

### 9.1 Summary of Results

In [None]:

best_by_metric = {
    'AUC-ROC': results_df.loc[results_df['AUC-ROC'].idxmax(), 'Model'],
    'F1-Score': results_df.loc[results_df['F1-Score'].idxmax(), 'Model'],
    'Recall': results_df.loc[results_df['Recall'].idxmax(), 'Model'],
    'Precision': results_df.loc[results_df['Precision'].idxmax(), 'Model'],
}

print("\nBest model per metric:")
for metric, model in best_by_metric.items():
    value = results_df[results_df['Model'] == model][metric].values[0]
    print(f"   {metric}: {model} ({value:.4f})")

print("SUMMARY TABLE")
print(results_df.round(4).to_string(index=False))

### 9.2 Model Analysis

#### **Isolation Forest**
**Strengths:**
- Fast to train and to run inference
- Makes no assumption about the data distribution
- Scales well to large datasets
- Good overall performance with little tuning

**Limitations:**
- Sensitive to the `contamination` parameter, which sets the score cutoff for the predicted labels
- May miss complex local patterns

#### **Local Outlier Factor (LOF)**
**Strengths:**
- Takes the local density of the data into account
- Can detect anomalies in clusters of different densities
- Interpretable (based on neighbors)

**Limitations:**
- Computationally more expensive at inference time
- Sensitive to the choice of k (number of neighbors)
- Does not scale as well to very large datasets

#### **Autoencoder**
**Strengths:**
- Captures complex non-linear relationships
- Flexible architecture
- Learns useful latent representations
- Can be adapted to different types of data

**Limitations:**
- Requires more training time
- Requires choosing a decision threshold
- Can overfit
- Needs more data to train adequately

### 9.3 Discussion of Real-World Applicability

**Production Scenario:**
1. **Latency:** For real-time systems, Isolation Forest offers the best inference time (compare the per-sample times in Section 7.1)
2. **Cost of Errors:**
   - False negatives (undetected fraud) = financial loss
   - False positives (legitimate transactions blocked) = customer dissatisfaction
3. **Precision-Recall Trade-off:**
   - Prioritize Recall if the cost of undetected fraud is very high
   - Balance it against Precision to avoid blocking too many legitimate transactions
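
The trade-off above can be made explicit by assigning a cost to each error type and choosing the threshold that minimizes the total. The sketch below is a minimal illustration with hypothetical scores, labels, and costs; `expected_cost` and `cheapest_threshold` are names introduced here, not functions from this notebook.

```python
def expected_cost(y_true, scores, threshold, cost_fn, cost_fp):
    """Total cost when every sample scoring above the threshold is flagged as fraud."""
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s <= threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s > threshold)
    return fn * cost_fn + fp * cost_fp

def cheapest_threshold(y_true, scores, cost_fn, cost_fp):
    """Search the observed scores for the threshold with the lowest total cost."""
    candidates = sorted(set(scores))
    return min(candidates,
               key=lambda t: expected_cost(y_true, scores, t, cost_fn, cost_fp))

# Hypothetical anomaly scores (higher = more anomalous) and labels (1 = fraud)
y_true = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.1, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.6, 0.7, 0.9]

# Missed fraud is expensive: the threshold drops to catch every fraud
recall_oriented = cheapest_threshold(y_true, scores, cost_fn=100.0, cost_fp=1.0)
# Blocking legitimate customers is expensive: the threshold rises
precision_oriented = cheapest_threshold(y_true, scores, cost_fn=1.0, cost_fp=5.0)
print(recall_oriented, precision_oriented)
```

In practice the costs would come from business data (for example, average fraud amount versus the cost of a blocked transaction), and the threshold should be selected on validation data rather than on the test set.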

**Recommendations:**
1. **For a high detection rate:** Choose the model with the highest Recall
2. **To minimize false alarms:** Choose the model with the highest Precision
3. **For balance:** Use F1-Score as the main metric
4. **Ensemble:** Combine multiple models for greater robustness
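
One simple way to combine detectors whose scores live on different scales is to convert each model's scores to ranks and average them. The sketch below assumes that higher scores mean "more anomalous" for every model; the function names and toy values are hypothetical. In this notebook the inputs would be the arrays stored in `all_scores`.

```python
def rank_normalize(scores):
    """Map scores to [0, 1] by rank; tied scores share their average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied scores starting at position i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    denom = max(len(scores) - 1, 1)
    return [r / denom for r in ranks]

def ensemble_scores(score_lists):
    """Average the rank-normalized scores of several detectors, sample by sample."""
    normalized = [rank_normalize(s) for s in score_lists]
    return [sum(col) / len(col) for col in zip(*normalized)]

# Hypothetical scores for four transactions from three detectors
iso_scores = [0.10, 0.40, 0.35, 0.90]
lof_scores = [1.0, 1.1, 5.0, 9.0]
ae_scores = [0.001, 0.002, 0.004, 0.050]

combined = ensemble_scores([iso_scores, lof_scores, ae_scores])
print([round(c, 3) for c in combined])
```

Rank averaging ignores how far apart the raw scores are, which makes it robust to scale differences but discards magnitude information; a weighted average or a stacked model are alternatives.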

### 9.4 Limitations of the Study

1. **Dataset:**
   - The original features are anonymized (PCA), which limits interpretability
   - The data covers only 2 days, which may not capture seasonal patterns
   
2. **Methodology:**
   - The semi-supervised approach assumes the proportion of fraud is known
   - A fixed threshold may not be optimal across different scenarios
   
3. **Evaluation:**
   - Offline metrics may not reflect performance in production
   - Concept drift was not considered
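
A lightweight way to watch for the fixed-threshold and concept-drift issues listed above is to monitor the alert rate. If the threshold is a percentile of the reconstruction error on normal data (say the 95th), roughly 5% of normal traffic is expected to exceed it, and a sustained jump in that fraction hints that the error distribution has shifted. The sketch below is a hypothetical illustration; `alert_rate`, `drift_suspected`, the tolerance factor, and the toy windows are assumptions, not part of this study.

```python
def alert_rate(errors, threshold):
    """Fraction of samples whose reconstruction error exceeds the threshold."""
    return sum(1 for e in errors if e > threshold) / len(errors)

def drift_suspected(recent_errors, threshold, expected_rate, tolerance=2.0):
    """Flag possible drift when the recent alert rate exceeds tolerance x expected."""
    return alert_rate(recent_errors, threshold) > tolerance * expected_rate

# Hypothetical windows of reconstruction errors, threshold at the 95th percentile
threshold = 1.0
expected_rate = 0.05

stable_window = [0.5] * 95 + [1.5] * 5    # 5% of samples above the threshold
shifted_window = [0.5] * 70 + [1.5] * 30  # 30% of samples above the threshold

print(drift_suspected(stable_window, threshold, expected_rate))
print(drift_suspected(shifted_window, threshold, expected_rate))
```

A flag from this check does not distinguish a real fraud wave from a shift in legitimate behavior; it only signals that the threshold, and possibly the model, should be re-examined.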

### 9.5 Future Work

1. **Model Ensemble:** Combine the predictions of the 3 models
2. **Online Learning:** Adapt the models over time
3. **Additional Features:** Include temporal and behavioral patterns
4. **Interpretability:** Use techniques such as SHAP to explain decisions
5. **Deep Learning:** Explore more complex architectures (LSTM, Transformer)
6. **Federated Learning:** Train distributed models while preserving privacy

In [None]:
import joblib

import os
os.makedirs('models', exist_ok=True)

joblib.dump(iso_forest_best, 'models/isolation_forest_best.joblib')

joblib.dump(lof_best, 'models/lof_best.joblib')

autoencoder.save('models/autoencoder_best.h5')

joblib.dump(robust_scaler, 'models/robust_scaler.joblib')
joblib.dump(standard_scaler, 'models/standard_scaler.joblib')

with open('models/autoencoder_threshold.txt', 'w') as f:
    f.write(str(best_threshold))

print("Models saved successfully to ./models/")
print("   - isolation_forest_best.joblib")
print("   - lof_best.joblib")
print("   - autoencoder_best.h5")
print("   - robust_scaler.joblib")
print("   - standard_scaler.joblib")
print("   - autoencoder_threshold.txt")

## 10. References

1. **Dataset:**
   - Machine Learning Group - ULB. (2013). Credit Card Fraud Detection. Kaggle.
   - https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud

2. **Isolation Forest:**
   - Liu, F. T., Ting, K. M., & Zhou, Z. H. (2008). Isolation forest. In 2008 Eighth IEEE International Conference on Data Mining (pp. 413-422). IEEE.

3. **Local Outlier Factor:**
   - Breunig, M. M., Kriegel, H. P., Ng, R. T., & Sander, J. (2000). LOF: Identifying density-based local outliers. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data (pp. 93-104).

4. **Autoencoders for Anomaly Detection:**
   - An, J., & Cho, S. (2015). Variational autoencoder based anomaly detection using reconstruction probability. Special Lecture on IE, 2(1), 1-18.

5. **Metrics for Imbalanced Classes:**
   - He, H., & Garcia, E. A. (2009). Learning from imbalanced data. IEEE Transactions on Knowledge and Data Engineering, 21(9), 1263-1284.

---

**Course:** Machine Learning  
**Institution:** Centro de Informática - UFPE  
