# 03_evaluation_for_imbalanced_data.ipynb

## Objective

Evaluate fraud detection models under **extreme class imbalance**, focusing on business-aligned metrics instead of misleading accuracy.


## Why Standard Metrics Fail

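* Accuracy hides the fraud miss rate: if fraud is 0.5% of transactions, a model that flags nothing is still 99.5% accurate
* ROC-AUC can look strong while precision, or recall at a workable alert volume, is poor, because the false positive rate is computed over the very large legitimate class
* Class imbalance distorts threshold-based decisions: scores for the rare class tend to be low, so a default cutoff can flag few or no fraud cases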
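The accuracy point can be checked with a few lines of arithmetic. This is a minimal sketch on synthetic labels; the 0.5% fraud rate and the sample size are assumptions chosen for illustration, not figures from a real dataset.

```python
import numpy as np

# Synthetic labels (assumed): 10,000 transactions, 50 of them fraud (0.5%).
n_total, n_fraud = 10_000, 50
y_true = np.zeros(n_total, dtype=int)
y_true[:n_fraud] = 1

# A "model" that never flags anything.
y_pred = np.zeros(n_total, dtype=int)

accuracy = (y_pred == y_true).mean()
fraud_recall = y_pred[y_true == 1].mean()  # share of fraud cases flagged

print(f"accuracy={accuracy:.3f}, fraud recall={fraud_recall:.3f}")
# prints: accuracy=0.995, fraud recall=0.000
```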



## Key Evaluation Metrics

### 1. Precision, Recall, F1

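* Precision: share of alerts that are actual fraud; low precision means review time is spent on false positives
* Recall: share of actual fraud that is flagged; low recall means missed fraud
* F1: harmonic mean of precision and recall; it weights both equally, which rarely matches fraud costs, so use it cautiously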


```python
from sklearn.metrics import classification_report
```
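A possible usage of `classification_report` on a tiny hand-made example is sketched below. The labels and the class names `legit` and `fraud` are hypothetical; in the notebook they would come from the held-out set and the model's thresholded predictions.

```python
import numpy as np
from sklearn.metrics import classification_report

# Hypothetical toy labels: 1 = fraud, 0 = legitimate.
y_true = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
y_pred = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 0])

# Human-readable report, one row per class.
print(classification_report(y_true, y_pred, target_names=["legit", "fraud"],
                            digits=3, zero_division=0))

# Same numbers as a dict, convenient for logging the fraud row only.
report = classification_report(y_true, y_pred, target_names=["legit", "fraud"],
                               zero_division=0, output_dict=True)
fraud_row = report["fraud"]
print(fraud_row["precision"], fraud_row["recall"], fraud_row["f1-score"])
```

Here one fraud case is caught, one is missed, and one legitimate transaction is flagged, so precision and recall for the fraud class are both 0.5 even though overall accuracy is 80%.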

### 2. Precision–Recall Curve

Preferred for rare-event detection because both axes describe the fraud class and ignore true negatives.

```python
from sklearn.metrics import precision_recall_curve, average_precision_score
```

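* PR-AUC (average precision) summarizes fraud detection quality across thresholds; a random ranker scores roughly the fraud prevalence, so compare against that baseline rather than 0.5
* The curve visualizes the trade-off between investigation load (precision) and fraud capture (recall)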
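As a rough illustration, the two functions imported above could be used as follows. The scores are synthetic draws from beta distributions, an assumption made only so the example runs on its own; real scores would come from the model's `predict_proba` or equivalent.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve, average_precision_score

# Synthetic data (assumed): legitimate traffic scores low, fraud scores higher.
rng = np.random.default_rng(0)
n_legit, n_fraud = 5000, 50
y_true = np.concatenate([np.zeros(n_legit, dtype=int), np.ones(n_fraud, dtype=int)])
y_score = np.concatenate([rng.beta(2, 8, n_legit), rng.beta(5, 3, n_fraud)])

# One (precision, recall) point per candidate threshold, plus a final (1, 0) point.
precision, recall, thresholds = precision_recall_curve(y_true, y_score)

pr_auc = average_precision_score(y_true, y_score)
baseline = y_true.mean()  # roughly what a random ranker would achieve

print(f"PR-AUC={pr_auc:.3f} vs prevalence baseline={baseline:.4f}")
```

The `precision` and `recall` arrays can be passed directly to a plotting library to draw the curve.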

### 3. ROC Curve (Secondary)

Useful for comparing how well models rank transactions, **not for choosing a deployment threshold**.

```python
from sklearn.metrics import roc_auc_score
```
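To see why ROC-AUC is treated as secondary, it may help to compute it next to PR-AUC on the same scores. The sketch below reuses assumed synthetic beta-distributed scores; the gap it prints is specific to this made-up data.

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score

# Synthetic data (assumed): about 1% fraud, reasonably well-separated scores.
rng = np.random.default_rng(0)
n_legit, n_fraud = 5000, 50
y_true = np.concatenate([np.zeros(n_legit, dtype=int), np.ones(n_fraud, dtype=int)])
y_score = np.concatenate([rng.beta(2, 8, n_legit), rng.beta(5, 3, n_fraud)])

roc_auc = roc_auc_score(y_true, y_score)
pr_auc = average_precision_score(y_true, y_score)

# ROC-AUC rewards ranking against the huge legitimate class;
# PR-AUC is pulled down by every false positive among the alerts.
print(f"ROC-AUC={roc_auc:.3f}, PR-AUC={pr_auc:.3f}")
```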

## Threshold Analysis

### Why 0.5 Is Wrong

* Fraud models output a risk score, not a verdict
* The threshold must reflect business constraints such as review capacity and the cost of missed fraud

```python
import numpy as np
```

Analyze:

* Recall vs threshold
* Precision vs threshold
* Fraud caught vs alerts generated
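The three views listed above can be produced with a simple threshold sweep. This is a sketch on assumed synthetic scores; the threshold grid and the score distributions are illustrative choices.

```python
import numpy as np

# Synthetic data (assumed): about 1% fraud.
rng = np.random.default_rng(0)
n_legit, n_fraud = 5000, 50
y_true = np.concatenate([np.zeros(n_legit, dtype=int), np.ones(n_fraud, dtype=int)])
y_score = np.concatenate([rng.beta(2, 8, n_legit), rng.beta(5, 3, n_fraud)])

rows = []  # (threshold, alerts, fraud_caught, precision, recall)
for t in np.linspace(0.1, 0.9, 9):
    flagged = y_score >= t
    alerts = int(flagged.sum())
    caught = int((flagged & (y_true == 1)).sum())
    precision = caught / alerts if alerts else 0.0  # no alerts: report 0 by convention
    recall = caught / n_fraud
    rows.append((float(t), alerts, caught, precision, recall))

for t, alerts, caught, precision, recall in rows:
    print(f"t={t:.1f}  alerts={alerts:5d}  caught={caught:3d}  "
          f"precision={precision:.3f}  recall={recall:.3f}")
```

Raising the threshold can only shrink the set of alerts, so alert volume and recall fall together while precision typically rises.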

## Cost-Sensitive Evaluation

Define costs:

* False Negative (missed fraud)
* False Positive (manual review)

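```python
# Example cost function: fn and fp are the error counts at a given threshold;
# cost_fn and cost_fp are the per-case costs defined above.
expected_cost = fn * cost_fn + fp * cost_fp
```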

Select the threshold that minimizes expected cost.
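One way to carry out this selection is a grid search over thresholds. In the sketch below the cost values `COST_FN` and `COST_FP`, the helper name `expected_cost_at`, and the synthetic scores are all assumptions for illustration; real costs would come from the business.

```python
import numpy as np

COST_FN = 500.0  # assumed cost of one missed fraud case
COST_FP = 5.0    # assumed cost of one manual review

def expected_cost_at(y_true, y_score, threshold, cost_fn=COST_FN, cost_fp=COST_FP):
    """Total cost fn * cost_fn + fp * cost_fp when flagging scores >= threshold."""
    flagged = y_score >= threshold
    fn = int((~flagged & (y_true == 1)).sum())
    fp = int((flagged & (y_true == 0)).sum())
    return fn * cost_fn + fp * cost_fp

# Synthetic data (assumed): about 1% fraud.
rng = np.random.default_rng(0)
n_legit, n_fraud = 5000, 50
y_true = np.concatenate([np.zeros(n_legit, dtype=int), np.ones(n_fraud, dtype=int)])
y_score = np.concatenate([rng.beta(2, 8, n_legit), rng.beta(5, 3, n_fraud)])

thresholds = np.linspace(0.0, 1.0, 101)
costs = np.array([expected_cost_at(y_true, y_score, t) for t in thresholds])
best_threshold = float(thresholds[costs.argmin()])

print(f"best threshold={best_threshold:.2f}, cost={costs.min():.0f}, "
      f"cost at 0.5={expected_cost_at(y_true, y_score, 0.5):.0f}")
```

With a missed fraud priced far above a review, the cost-minimizing cutoff usually differs from 0.5, which is the point of the analysis.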



## Confusion Matrix at Multiple Thresholds

```python
from sklearn.metrics import confusion_matrix
```

Compare operational impact at different cutoffs.
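A minimal sketch of that comparison, assuming synthetic scores and three arbitrary cutoffs (0.3, 0.5, 0.7):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Synthetic data (assumed): about 1% fraud.
rng = np.random.default_rng(0)
n_legit, n_fraud = 5000, 50
y_true = np.concatenate([np.zeros(n_legit, dtype=int), np.ones(n_fraud, dtype=int)])
y_score = np.concatenate([rng.beta(2, 8, n_legit), rng.beta(5, 3, n_fraud)])

results = {}
for t in (0.3, 0.5, 0.7):
    y_pred = (y_score >= t).astype(int)
    # labels=[0, 1] fixes the layout to [[tn, fp], [fn, tp]] even if a class is never predicted.
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    results[t] = {"tn": int(tn), "fp": int(fp), "fn": int(fn), "tp": int(tp)}
    print(f"t={t:.1f}  reviews={tp + fp:5d}  fraud caught={tp:3d}  fraud missed={fn:3d}")
```

Reading `tp + fp` as the review queue and `fn` as missed fraud turns each matrix into an operational statement.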

## Model Comparison Summary Table

Track:

* PR-AUC
* Recall @ fixed FP rate
* Expected cost
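The tracked quantities could be assembled along these lines. The helper names `recall_at_fpr` and `min_expected_cost`, the two candidate models, the 1% false positive budget, and the cost values are all hypothetical; the scores are synthetic.

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_curve

def recall_at_fpr(y_true, y_score, max_fpr):
    """Highest recall (TPR) reachable without exceeding max_fpr."""
    fpr, tpr, _ = roc_curve(y_true, y_score, drop_intermediate=False)
    return float(tpr[fpr <= max_fpr].max())

def min_expected_cost(y_true, y_score, cost_fn, cost_fp):
    """Lowest fn * cost_fn + fp * cost_fp over a grid of thresholds."""
    costs = []
    for t in np.linspace(0.0, 1.0, 101):
        flagged = y_score >= t
        fn = int((~flagged & (y_true == 1)).sum())
        fp = int((flagged & (y_true == 0)).sum())
        costs.append(fn * cost_fn + fp * cost_fp)
    return min(costs)

# Synthetic data (assumed): two hypothetical models scored on the same labels.
rng = np.random.default_rng(0)
n_legit, n_fraud = 5000, 50
y_true = np.concatenate([np.zeros(n_legit, dtype=int), np.ones(n_fraud, dtype=int)])
candidates = {
    "model_a": np.concatenate([rng.beta(2, 8, n_legit), rng.beta(5, 3, n_fraud)]),
    "model_b": np.concatenate([rng.beta(2, 6, n_legit), rng.beta(4, 4, n_fraud)]),
}

summary = {
    name: {
        "pr_auc": average_precision_score(y_true, scores),
        "recall_at_1pct_fpr": recall_at_fpr(y_true, scores, max_fpr=0.01),
        "expected_cost": min_expected_cost(y_true, scores, cost_fn=500.0, cost_fp=5.0),
    }
    for name, scores in candidates.items()
}

for name, row in summary.items():
    print(f"{name}: PR-AUC={row['pr_auc']:.3f}  "
          f"recall@1%FPR={row['recall_at_1pct_fpr']:.3f}  "
          f"cost={row['expected_cost']:.0f}")
```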

## Production Checklist

* Always evaluate on a **time-based split** (train on earlier transactions, test on later ones)
* Lock the threshold before deployment
* Monitor precision drift after release
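A time-based split can be as simple as sorting by timestamp and cutting once. The sketch below assumes each transaction has a numeric timestamp; the 90-day window and the 80/20 cut are illustrative choices.

```python
import numpy as np

# Hypothetical timestamps: seconds since the start of a 90-day window.
rng = np.random.default_rng(0)
n = 1000
timestamps = rng.integers(0, 90 * 24 * 3600, size=n)

# Sort chronologically, then train on the earliest 80% and test on the latest 20%.
order = np.argsort(timestamps, kind="stable")
cutoff = int(n * 0.8)
train_idx, test_idx = order[:cutoff], order[cutoff:]

print(f"train ends at t={timestamps[train_idx].max()}, "
      f"test starts at t={timestamps[test_idx].min()}")
```

Unlike a random split, no test transaction precedes any training transaction, so the evaluation mimics scoring future traffic.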

## Output of This Notebook

* Selected operating threshold
* Approved evaluation metrics
* Go/No-Go decision for deployment

---

## Next Notebook

- [04_pipeline_integration_for_imbalanced.ipynb](04_pipeline_integration_for_imbalanced.ipynb)

> End-to-end pipeline: preprocessing → features → model → threshold → inference