# üîí FraudGuard Training Notebook

**AD-RL-GNN Fraud Detection** | Full training pipeline with mini-batch processing

This notebook trains the FraudGuard model on the IEEE-CIS fraud detection dataset using:
- **NeighborLoader** for memory-efficient mini-batch training
- **FAISS** for similarity graph construction (GPU if available, CPU fallback)
- **FocalLoss** for class-imbalanced learning

**Target Metrics:**
- Specificity: 98.72%
- G-Means Improvement: 18.11%
- P95 Latency: <100ms

## 1Ô∏è‚É£ Setup Environment

In [1]:
# Mount Google Drive for data storage
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [2]:
# Clone repository
!git clone https://github.com/govind104/fraudguard.git
%cd fraudguard

fatal: destination path 'fraudguard' already exists and is not an empty directory.
/content/fraudguard


In [3]:
# Install dependencies
# Note: faiss-gpu may not be available on Python 3.12
# The code will fallback to faiss-cpu automatically
# GNN training STILL runs on GPU - only graph building uses CPU FAISS
!pip install -q torch torch-geometric pandas numpy scikit-learn pyyaml structlog

# Try faiss-gpu first, fallback to faiss-cpu
import subprocess
result = subprocess.run(['pip', 'install', '-q', 'faiss-gpu'], capture_output=True)
if result.returncode != 0:
    print('‚ö†Ô∏è faiss-gpu not available, using faiss-cpu')
    print('   (Graph building on CPU, but GNN training still runs on GPU!)')
    !pip install -q faiss-cpu
else:
    print('‚úì faiss-gpu installed')

import torch

# 1. Get exact versions
pt_version = torch.__version__.split('+')[0]  # e.g., 2.5.1
cuda_version = "cu" + torch.version.cuda.replace('.', '')  # e.g., cu124
wheel_url = f"https://data.pyg.org/whl/torch-{pt_version}+{cuda_version}.html"

print(f"PyTorch: {pt_version}, CUDA: {cuda_version}")
print(f"Downloading from: {wheel_url}")

# 2. Install with visible output (force reinstall to fix broken partial installs)
!pip install --force-reinstall torch-scatter torch-sparse -f $wheel_url

# Install repo in editable mode
!pip install -e .

print('\n‚úì Environment setup complete')

‚ö†Ô∏è faiss-gpu not available, using faiss-cpu
   (Graph building on CPU, but GNN training still runs on GPU!)
PyTorch: 2.9.0, CUDA: cu126
Downloading from: https://data.pyg.org/whl/torch-2.9.0+cu126.html
Looking in links: https://data.pyg.org/whl/torch-2.9.0+cu126.html
Collecting torch-scatter
  Using cached torch_scatter-2.1.2-cp312-cp312-linux_x86_64.whl
Collecting torch-sparse
  Using cached torch_sparse-0.6.18-cp312-cp312-linux_x86_64.whl
Collecting scipy (from torch-sparse)
  Using cached scipy-1.17.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (62 kB)
Collecting numpy<2.7,>=1.26.4 (from scipy->torch-sparse)
  Using cached numpy-2.4.1-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (6.6 kB)
Using cached scipy-1.17.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (35.0 MB)
Using cached numpy-2.4.1-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (16.4 MB)
Installing collected packages: torch-scatter, numpy, sci

Obtaining file:///content/fraudguard
  Installing build dependencies ... [?25l[?25hdone
  Checking if build backend supports build_editable ... [?25l[?25hdone
  Getting requirements to build editable ... [?25l[?25hdone
  Preparing editable metadata (pyproject.toml) ... [?25l[?25hdone
Collecting numpy<2.0.0,>=1.24.0 (from fraudguard==0.1.0)
  Using cached numpy-1.26.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (61 kB)
Using cached numpy-1.26.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.0 MB)
Building wheels for collected packages: fraudguard
  Building editable for fraudguard (pyproject.toml) ... [?25l[?25hdone
  Created wheel for fraudguard: filename=fraudguard-0.1.0-py3-none-any.whl size=2801 sha256=abd00cd285315796bfb55d9faae784f5c5711c89a41847e0d54c9e072446ded1
  Stored in directory: /tmp/pip-ephem-wheel-cache-2bu4zjn6/wheels/c6/29/62/fb6d8d095576e7e3efddf4fdcb7dfc799af71ace273f1ee84c
Successfully built fraudguard
Installing col


‚úì Environment setup complete


In [3]:
import torch
try:
    import torch_scatter
    import torch_sparse
    import fraudguard
    print("‚úÖ Success! Libraries are installed and loaded.")
except ImportError as e:
    print(f"‚ùå Still missing libraries: {e}")
    # Only if you see this error should you go back and install again.

‚úÖ Success! Libraries are installed and loaded.


## 2Ô∏è‚É£ Configuration

In [4]:
import os

# ==============================================
# CONFIGURATION - UPDATE THESE PATHS AS NEEDED
# ==============================================

# Data paths - Point to your Google Drive folders
DATA_DIR = "/content/drive/MyDrive/ieee-fraud-detection"
MODELS_DIR = "/content/drive/MyDrive/fraudguard-models"
LOGS_DIR = "/content/drive/MyDrive/fraudguard-logs"

# Training parameters
SAMPLE_FRAC = 1.0      # Use full dataset (1.0 = 100%)
MAX_EPOCHS = 30
BATCH_SIZE = 4096      # Reduce to 2048 or 1024 if OOM
NUM_NEIGHBORS = [25, 10]  # 2-hop neighborhood sampling

# Create directories
os.makedirs(MODELS_DIR, exist_ok=True)
os.makedirs(LOGS_DIR, exist_ok=True)

print(f"Data: {DATA_DIR}")
print(f"Models: {MODELS_DIR}")
print(f"Logs: {LOGS_DIR}")
print(f"\nBatch size: {BATCH_SIZE}")
print(f"Sample fraction: {SAMPLE_FRAC*100:.0f}%")

Data: /content/drive/MyDrive/ieee-fraud-detection
Models: /content/drive/MyDrive/fraudguard-models
Logs: /content/drive/MyDrive/fraudguard-logs

Batch size: 4096
Sample fraction: 100%


## 3Ô∏è‚É£ Verify GPU and FAISS

In [5]:
import torch
import faiss

print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")

if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"VRAM: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")
    print("\n‚úì GNN training will run on GPU")
else:
    print("\n‚ö†Ô∏è WARNING: No GPU detected. Go to Runtime > Change runtime type > GPU")

# Check FAISS GPU
faiss_gpus = faiss.get_num_gpus() if hasattr(faiss, 'get_num_gpus') else 0
print(f"\nFAISS GPUs: {faiss_gpus}")
if faiss_gpus == 0:
    print("   (Using CPU FAISS for graph building - this is OK)")

PyTorch version: 2.9.0+cu126
CUDA available: True
GPU: Tesla T4
VRAM: 15.8 GB

‚úì GNN training will run on GPU

FAISS GPUs: 0
   (Using CPU FAISS for graph building - this is OK)


## 4Ô∏è‚É£ Load and Preprocess Data

In [6]:
import sys
sys.path.insert(0, '/content/fraudguard')

from pathlib import Path
from src.data.loader import FraudDataLoader
from src.data.preprocessor import FeaturePreprocessor
from src.data.graph_builder import GraphBuilder
from src.utils.config import load_data_config
from src.utils.device_utils import set_seed, get_device

set_seed(42)
device = get_device()
print(f"Using device: {device}")

# Load config and override path with notebook variable
data_cfg = load_data_config()
data_cfg.paths.raw_data_dir = Path(DATA_DIR)

# Load data with corrected path
loader = FraudDataLoader(config=data_cfg)
df = loader.load_train_data(sample_frac=SAMPLE_FRAC)
train_df, val_df, test_df = loader.create_splits(df)

print(f"\nData loaded:")
print(f"  Train: {len(train_df):,}")
print(f"  Val: {len(val_df):,}")
print(f"  Test: {len(test_df):,}")
print(f"  Fraud rate: {df['isFraud'].mean()*100:.2f}%")

Loading faiss with CPU support (no GPU detected).
Using device: cuda

Data loaded:
  Train: 354,324
  Val: 118,108
  Test: 118,108
  Fraud rate: 3.50%


## 5Ô∏è‚É£ Build or Load Graph

In [7]:
import torch
import gc
import os
from pathlib import Path
from src.data.graph_builder import GraphBuilder
from src.data.preprocessor import FeaturePreprocessor
from src.utils.config import load_model_config, load_data_config

# =======================================================
# 1. MISSING STEP: Preprocess Data to create X_full
# =======================================================
print("‚öôÔ∏è Preprocessing features to create X_full...")
# Initialize Preprocessor
preprocessor = FeaturePreprocessor(data_config=load_data_config(), model_config=load_model_config())

# Fit on Train, Transform Val/Test
X_train = preprocessor.fit_transform(train_df)
X_val = preprocessor.transform(val_df)
X_test = preprocessor.transform(test_df)

# Create the global feature matrix
X_full = torch.cat([X_train, X_val, X_test])
print(f"‚úì Feature Matrix created: {X_full.shape}")

# =======================================================
# 2. Load or Build Graph
# =======================================================
GRAPH_CACHE = f"{MODELS_DIR}/edges_full.pt"

if os.path.exists(GRAPH_CACHE):
    print(f"Loading cached graph from {GRAPH_CACHE}...")
    edge_index = torch.load(GRAPH_CACHE)
    print(f"Loaded {edge_index.shape[1]:,} edges")

    # Move to device
    edge_index = edge_index.to(device)
    X_full = X_full.to(device)
else:
    print("üöÄ Starting Memory-Optimized Graph Build (Directed)...")

    # Configure GraphBuilder
    model_cfg = load_model_config()
    model_cfg.graph.similarity_threshold = 0.90
    model_cfg.graph.max_neighbors = 50
    model_cfg.graph.batch_size = 50000

    builder = GraphBuilder(config=model_cfg)

    # Note: We use the tensors we just created above
    # Train -> Train
    print("  Phase 1: Train -> Train...")
    builder.fit(X_train)

    # Val/Test -> Train
    print("  Phase 2: Val/Test -> Train...")
    # Concatenate Val and Test for the transform step
    X_val_test = torch.cat([X_val, X_test])

    # Use the length of X_train (n_train) to ensure correct indexing
    n_train = len(X_train)
    edge_index = builder.transform(X_val_test, train_size=n_train)

    # Verify
    builder.verify_no_leakage(edge_index, train_size=n_train)

    # Save
    torch.save(edge_index, GRAPH_CACHE)
    print(f"‚úì Saved to {GRAPH_CACHE}")

    # Cleanup builder to free RAM
    del builder
    gc.collect()

    # Move to device
    X_full = X_full.to(device)
    edge_index = edge_index.to(device)

print(f"\nFinal Graph ready on {device}")

‚öôÔ∏è Preprocessing features to create X_full...
‚úì Feature Matrix created: torch.Size([590540, 69])
Loading cached graph from /content/drive/MyDrive/fraudguard-models/edges_full.pt...
Loaded 28,972,713 edges

Final Graph ready on cuda


## 6Ô∏è‚É£ Setup Mini-Batch Training

In [8]:
from torch_geometric.loader import NeighborLoader
from torch_geometric.data import Data
from pathlib import Path
import torch.nn.functional as F
import gc
import sys

# 1. Reload Data briefly to get Labels & Lengths (since we deleted them)
# We need to re-import loader components if they were lost
sys.path.insert(0, '/content/fraudguard')
from src.data.loader import FraudDataLoader
from src.utils.config import load_data_config

print("Reloading data to extract labels...")
data_cfg = load_data_config()
data_cfg.paths.raw_data_dir = Path(DATA_DIR) # Ensure pointing to Drive
loader = FraudDataLoader(config=data_cfg)

# Load and split
df_temp = loader.load_train_data(sample_frac=SAMPLE_FRAC) # Use same sample_frac!
train_df, val_df, test_df = loader.create_splits(df_temp)

# 2. Extract Labels & Sizes
print("Extracting labels...")
train_labels = torch.tensor(train_df["isFraud"].values, dtype=torch.long)
val_labels = torch.tensor(val_df["isFraud"].values, dtype=torch.long)
test_labels = torch.tensor(test_df["isFraud"].values, dtype=torch.long)

n_train = len(train_df)
n_val = len(val_df)
n_test = len(test_df)

# 3. Aggressive Cleanup (Free RAM immediately)
del df_temp, train_df, val_df, test_df
gc.collect()
print("Dataframes deleted to free RAM.")

# 4. Prepare Masks & Labels
all_labels = torch.cat([train_labels, val_labels, test_labels]).to(device)

n_total = n_train + n_val + n_test
train_mask = torch.zeros(n_total, dtype=torch.bool)
val_mask = torch.zeros(n_total, dtype=torch.bool)
test_mask = torch.zeros(n_total, dtype=torch.bool)

# Set masks using the calculated lengths
train_mask[:n_train] = True
val_mask[n_train : n_train + n_val] = True
test_mask[n_train + n_val :] = True

print(f"Masks created: Train={train_mask.sum()}, Val={val_mask.sum()}, Test={test_mask.sum()}")

# 5. Create PyG Data object
# Ensure X_full and edge_index are on the correct device
if X_full.device != device:
    X_full = X_full.to(device)
if edge_index.device != device:
    edge_index = edge_index.to(device)

data = Data(x=X_full, edge_index=edge_index, y=all_labels)
data.train_mask = train_mask
data.val_mask = val_mask
data.test_mask = test_mask

# 6. Create NeighborLoaders
print(f"Initializing NeighborLoaders (Batch Size: {BATCH_SIZE})...")

train_loader = NeighborLoader(
    data,
    num_neighbors=NUM_NEIGHBORS,  # [25, 10]
    batch_size=BATCH_SIZE,
    input_nodes=train_mask,
    shuffle=True
)

val_loader = NeighborLoader(
    data,
    num_neighbors=NUM_NEIGHBORS,
    batch_size=BATCH_SIZE,
    input_nodes=val_mask,
    shuffle=False
)

print(f"‚úì Train batches: {len(train_loader)}")
print(f"‚úì Val batches: {len(val_loader)}")

Reloading data to extract labels...
Extracting labels...
Dataframes deleted to free RAM.
Masks created: Train=354324, Val=118108, Test=118108
Initializing NeighborLoaders (Batch Size: 4096)...


  neighbor_sampler = NeighborSampler(


‚úì Train batches: 87
‚úì Val batches: 29


## 7Ô∏è‚É£ Train Model

In [None]:
from src.models import FraudGNN, FocalLoss, compute_class_weights
from sklearn.metrics import confusion_matrix, f1_score
import numpy as np
import time

# Initialize model
model = FraudGNN(in_channels=X_full.shape[1]).to(device)

# 1. LOWER LEARNING RATE (Crucial for Stability)
# Reduced from 0.01 to 0.001 to prevent the "Panic" collapse
optimizer = torch.optim.Adam(model.parameters(), lr=0.001, weight_decay=1e-4)

# 2. SET BALANCED WEIGHTS
# 15 was too low, 50 was too high.
# The natural ratio is ~28. We use 30 to slightly favor recall.
print("‚öñÔ∏è Applied Balanced Class Weights: [1.0, 30.0]")
weights = torch.tensor([1.0, 30.0]).to(device)
criterion = torch.nn.CrossEntropyLoss(weight=weights)

# Training config
best_gmeans = 0
patience_ratio = 0.20
patience = max(5, int(MAX_EPOCHS * patience_ratio))
patience_counter = 0
history = []

print("Starting training...\n")
print(f"Dynamic Patience: {patience} epochs")
print(f"{'Epoch':>5} | {'Loss':>8} | {'Spec':>8} | {'Recall':>8} | {'F1':>8} | {'G-Means':>8}")
print("-" * 65)

start_time = time.time()

for epoch in range(MAX_EPOCHS):
    model.train()
    total_loss = 0
    for batch in train_loader:
        batch = batch.to(device)
        optimizer.zero_grad()
        out = model(batch.x, batch.edge_index)
        loss = criterion(out[:batch.batch_size], batch.y[:batch.batch_size])
        loss.backward()
        optimizer.step()
        total_loss += loss.item()

    avg_loss = total_loss / len(train_loader)

    # Validation EVERY EPOCH
    if epoch % 1 == 0:
        model.eval()
        all_preds, all_true = [], []
        with torch.no_grad():
            for batch in val_loader:
                batch = batch.to(device)
                out = model(batch.x, batch.edge_index)
                pred = out[:batch.batch_size].argmax(dim=1)
                all_preds.extend(pred.cpu().numpy())
                all_true.extend(batch.y[:batch.batch_size].cpu().numpy())

        cm = confusion_matrix(all_true, all_preds, labels=[0, 1])
        tn, fp, fn, tp = cm.ravel()
        tpr = tp / (tp + fn) if (tp + fn) > 0 else 0
        tnr = tn / (tn + fp) if (tn + fp) > 0 else 0
        gmeans = np.sqrt(tpr * tnr)
        f1 = f1_score(all_true, all_preds, zero_division=0)

        history.append({'epoch': epoch, 'loss': avg_loss, 'spec': tnr, 'recall': tpr, 'f1': f1, 'gmeans': gmeans})

        print(f"{epoch+1:>5} | {avg_loss:>8.4f} | {tnr*100:>7.2f}% | {tpr*100:>7.2f}% | {f1*100:>7.2f}% | {gmeans*100:>7.2f}%")

        if gmeans > best_gmeans:
            best_gmeans = gmeans
            patience_counter = 0
            torch.save(model.state_dict(), f"{MODELS_DIR}/best_model.pt")
        else:
            patience_counter += 1
            if patience_counter >= patience:
                print(f"\nEarly stopping at epoch {epoch+1}")
                break

train_time = time.time() - start_time
print(f"\nTraining complete in {train_time/60:.1f} minutes")
print(f"Best validation G-Means: {best_gmeans*100:.2f}%")

‚öñÔ∏è Applied Balanced Class Weights: [1.0, 30.0]
Starting training...

Dynamic Patience: 6 epochs
Epoch |     Loss |     Spec |   Recall |       F1 |  G-Means
-----------------------------------------------------------------
    1 |   0.8462 |   29.75% |   72.78% |    7.65% |   46.53%


## 8Ô∏è‚É£ Evaluate on Test Set

In [10]:
import time

# Load best model
model.load_state_dict(torch.load(f"{MODELS_DIR}/best_model.pt"))
model.eval()

# Full neighborhood for test evaluation
test_loader = NeighborLoader(
    data,
    num_neighbors=[-1, -1],  # Full neighborhood
    batch_size=BATCH_SIZE,
    input_nodes=test_mask,
    shuffle=False,
)

all_preds, all_true = [], []
latencies = []

with torch.no_grad():
    for batch in test_loader:
        batch = batch.to(device)
        start = time.perf_counter()
        out = model(batch.x, batch.edge_index)
        latencies.append((time.perf_counter() - start) * 1000)
        pred = out[:batch.batch_size].argmax(dim=1)
        all_preds.extend(pred.cpu().numpy())
        all_true.extend(batch.y[:batch.batch_size].cpu().numpy())

# Compute metrics
cm = confusion_matrix(all_true, all_preds, labels=[0, 1])
tn, fp, fn, tp = cm.ravel()
tpr = tp / (tp + fn) if (tp + fn) > 0 else 0
tnr = tn / (tn + fp) if (tn + fp) > 0 else 0
gmeans = np.sqrt(tpr * tnr)
f1 = f1_score(all_true, all_preds, zero_division=0)

print("=" * 60)
print("FINAL TEST RESULTS")
print("=" * 60)
print(f"\nConfusion Matrix:")
print(f"  TP: {tp:,}  |  FN: {fn:,}")
print(f"  FP: {fp:,}  |  TN: {tn:,}")
print(f"\nPerformance:")
print(f"  Specificity:  {tnr*100:.2f}%  (CV target: 98.72%)")
print(f"  Recall:       {tpr*100:.2f}%")
print(f"  F1 Score:     {f1*100:.2f}%")
print(f"  G-Means:      {gmeans*100:.2f}%")
print(f"\nLatency:")
print(f"  Mean: {np.mean(latencies):.1f}ms")
print(f"  P95:  {np.percentile(latencies, 95):.1f}ms  (CV target: <100ms)")
print(f"  P99:  {np.percentile(latencies, 99):.1f}ms")

  neighbor_sampler = NeighborSampler(


FINAL TEST RESULTS

Confusion Matrix:
  TP: 318  |  FN: 3,746
  FP: 7,120  |  TN: 106,924

Performance:
  Specificity:  93.76%  (CV target: 98.72%)
  Recall:       7.82%
  F1 Score:     5.53%
  G-Means:      27.09%

Latency:
  Mean: 2.1ms
  P95:  2.4ms  (CV target: <100ms)
  P99:  2.5ms


## 9Ô∏è‚É£ Save Final Model

In [11]:
# Save final model with metrics
torch.save({
    "model_state_dict": model.state_dict(),
    "config": {
        "in_channels": X_full.shape[1],
        "specificity": tnr,
        "recall": tpr,
        "gmeans": gmeans,
        "f1": f1,
    },
    "history": history,
}, f"{MODELS_DIR}/fraudguard_final.pt")

print(f"‚úì Model saved to {MODELS_DIR}/fraudguard_final.pt")
print(f"‚úì Best model saved to {MODELS_DIR}/best_model.pt")

‚úì Model saved to /content/drive/MyDrive/fraudguard-models/fraudguard_final.pt
‚úì Best model saved to /content/drive/MyDrive/fraudguard-models/best_model.pt


## üîü CV Claims Comparison

In [12]:
# CV Claims comparison
CV_CLAIMS = {
    "specificity": 98.72,
    "gmeans_improvement": 18.11,
    "p95_latency_ms": 100,
}

achieved_spec = tnr * 100
p95_latency = np.percentile(latencies, 95)

print("=" * 60)
print("CV CLAIMS COMPARISON")
print("=" * 60)
print(f"| {'Metric':<20} | {'Achieved':>12} | {'CV Claim':>12} | {'Status':>6} |")
print(f"|{'-'*22}|{'-'*14}|{'-'*14}|{'-'*8}|")

# Specificity
status_spec = "‚úì" if abs(achieved_spec - CV_CLAIMS['specificity']) <= 3 else "‚úó"
print(f"| {'Specificity':<20} | {achieved_spec:>11.2f}% | {CV_CLAIMS['specificity']:>11.2f}% | {status_spec:>6} |")

# Latency
status_lat = "‚úì" if p95_latency < CV_CLAIMS['p95_latency_ms'] else "‚úó"
print(f"| {'P95 Latency':<20} | {p95_latency:>10.1f}ms | {'<100':>10}ms | {status_lat:>6} |")

print("=" * 60)
print("\n‚úì = PASS (within tolerance) | ‚úó = INVESTIGATE")

CV CLAIMS COMPARISON
| Metric               |     Achieved |     CV Claim | Status |
|----------------------|--------------|--------------|--------|
| Specificity          |       93.76% |       98.72% |      ‚úó |
| P95 Latency          |        2.4ms |       <100ms |      ‚úì |

‚úì = PASS (within tolerance) | ‚úó = INVESTIGATE
