# Isolation Forest

Isolation Forest is used here as a semi-supervised risk ranker: it is trained on normal (non-fraud) behavior only and evaluated with Precision@K under capacity constraints. The focus is on ranking stability and failure modes rather than classification accuracy. The method is best suited to finding global anomalies.


In [1]:
import pandas as pd
import numpy as np

from sklearn.ensemble import IsolationForest

In [4]:
X = pd.read_csv("data/processed/X_train_1.csv")
y = pd.read_csv("data/processed/y_train_1.csv")["isFraud"]

print(X.shape, y.shape)
print("Fraud rate:", y.mean())

(590540, 432) (590540,)
Fraud rate: 0.03499000914417313


## Why Isolation Forest?

Isolation Forest identifies anomalies by recursively partitioning feature space using random splits. Points that are isolated quickly (with fewer splits) are considered anomalous.

Unlike density-based or distance-based methods, Isolation Forest:
- Does not assume a specific data distribution
- Scales well to high-dimensional data
- Naturally produces a ranking of anomaly scores

This makes it suitable for risk ranking under weak labels, where the goal is to prioritize suspicious transactions rather than produce calibrated probabilities.
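
As a rough illustration of the isolation idea, the sketch below fits a forest on a synthetic 2-D Gaussian cluster and then scores one point placed far from it. The data, the names `X_normal`, `X_toy`, `iso_toy` and `toy_scores`, and the parameter values are assumptions made for this example only; they are not part of the fraud pipeline.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# 500 synthetic "normal" points in a tight 2-D Gaussian cluster (illustrative only)
X_normal = rng.normal(loc=0.0, scale=1.0, size=(500, 2))
# One point placed far from the cluster, appended as the last row
X_toy = np.vstack([X_normal, np.array([[8.0, 8.0]])])

# Fit on the normal points only, mirroring the semi-supervised setup
iso_toy = IsolationForest(n_estimators=100, max_samples=256, random_state=0)
iso_toy.fit(X_normal)

# score_samples is higher for more normal points, so negate it
toy_scores = -iso_toy.score_samples(X_toy)

print("Far point score:    ", toy_scores[-1])
print("Median normal score:", np.median(toy_scores[:-1]))
```

The far point needs fewer random splits to be separated from the rest, so its negated score should sit well above the median of the cluster.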


In [5]:
# Use only non-fraud for training (semi-supervised setup)
X_train_if = X[y == 0]

print("Training samples:", X_train_if.shape)

Training samples: (569877, 432)


In [6]:
iso = IsolationForest(
    n_estimators=200,
    max_samples=256,
    contamination="auto",
    random_state=42,
    n_jobs=-1
)

iso.fit(X_train_if)

parameter,value
n_estimators,200
max_samples,256
contamination,'auto'
max_features,1.0
bootstrap,False
n_jobs,-1
random_state,42
verbose,0
warm_start,False


In [7]:
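# score_samples returns higher values for more normal points;
# negate it so that higher score = more anomalous.
# Note: X contains the non-fraud rows used for fitting plus the fraud rows.
anomaly_score = -iso.score_samples(X)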

df_scores = pd.DataFrame({
    "anomaly_score": anomaly_score,
    "isFraud": y.values
})

df_scores.head()

,anomaly_score,isFraud
0,0.321438,0
1,0.32506,0
2,0.323418,0
3,0.360741,0
4,0.384998,0


In [8]:
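def precision_at_k(scores, labels, k_pct):
    """Fraction of fraud among the top k_pct percent of rows ranked by score."""
    # Convert the percentage into a row count (the review capacity); keep at least 1 row
    k = max(1, int(len(scores) * k_pct / 100))
    top_k = scores.nlargest(k).index
    return labels.loc[top_k].mean()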

for k in [0.1, 0.5, 1.0]:
    p_at_k = precision_at_k(
        df_scores["anomaly_score"],
        df_scores["isFraud"],
        k
    )
    print(f"Precision@{k}%: {p_at_k:.4f}")

Precision@0.1%: 0.0000
Precision@0.5%: 0.1497
Precision@1.0%: 0.2403


In [9]:
np.random.seed(42)
df_scores["random_score"] = np.random.rand(len(df_scores))

for k in [0.1, 0.5, 1.0]:
    p_rand = precision_at_k(
        df_scores["random_score"],
        df_scores["isFraud"],
        k
    )
    print(f"Random Precision@{k}%: {p_rand:.4f}")

Random Precision@0.1%: 0.0271
Random Precision@0.5%: 0.0274
Random Precision@1.0%: 0.0293
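
One way to read the two result sets together is as lift over the base fraud rate, since a random ranking is expected to reach roughly the base rate at any K. The sketch below only reuses the figures printed above; the names `base_rate`, `if_precision` and `lift` are introduced here for illustration.

```python
# Figures copied from the outputs above
base_rate = 0.03499000914417313                   # overall fraud rate
if_precision = {0.1: 0.0000, 0.5: 0.1497, 1.0: 0.2403}  # Isolation Forest Precision@K%

# Lift = precision at K divided by the base rate (1.0 means no better than random)
lift = {k: p / base_rate for k, p in if_precision.items()}

for k, value in lift.items():
    print(f"Lift@{k}%: {value:.2f}")
```

On these numbers the ranker is well above the base rate at 0.5% and 1.0%, but contributes nothing at 0.1%.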


In [10]:
scores = []

for seed in [0, 1, 2]:
    iso_tmp = IsolationForest(
        n_estimators=200,
        max_samples=256,
        contamination="auto",
        random_state=seed,
        n_jobs=-1
    )
    iso_tmp.fit(X_train_if)
    scores.append(-iso_tmp.score_samples(X))

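# Stack into shape (n_seeds, n_samples), then average the per-sample
# standard deviation of the anomaly score across seeds
scores = np.vstack(scores)
np.std(scores, axis=0).mean()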

np.float64(0.003600540668596486)
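
The value above measures how much raw scores move between seeds. Because the evaluation is rank-based, a complementary check could compare the top-K sets directly. The following is a minimal sketch of that idea on synthetic scores; `top_k_jaccard` is a hypothetical helper, not something used elsewhere in this notebook.

```python
import numpy as np

def top_k_jaccard(scores_a, scores_b, k):
    """Jaccard overlap between the top-k row indices of two score vectors."""
    top_a = set(np.argsort(scores_a)[-k:])
    top_b = set(np.argsort(scores_b)[-k:])
    return len(top_a & top_b) / len(top_a | top_b)

# Synthetic check: a slightly perturbed ranking vs. a fully shuffled one
rng = np.random.default_rng(0)
base = rng.random(1000)
noisy = base + rng.normal(scale=0.001, size=1000)
shuffled = rng.permutation(base)

print("same ranking:    ", top_k_jaccard(base, base, 50))
print("small noise:     ", top_k_jaccard(base, noisy, 50))
print("shuffled ranking:", top_k_jaccard(base, shuffled, 50))
```

Applied to the stacked `scores` array from the seed loop, a call such as `top_k_jaccard(scores[0], scores[1], 5000)` would give the overlap between two seeds at a given review capacity.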

In [11]:
# Label distribution among the 1,000 highest-scoring transactions
top_idx = df_scores["anomaly_score"].nlargest(1000).index
df_scores.loc[top_idx, "isFraud"].value_counts(normalize=True)

isFraud
0    1.0
Name: proportion, dtype: float64

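The 1,000 highest-scoring transactions contain no labeled fraud: Isolation Forest prioritizes a subset of transactions that are highly anomalous but not labeled as fraud. These may represent:
- Novel fraud patterns
- Label noise
- Legitimate but rare behavior

This highlights a failure mode of anomaly-based risk ranking: the very top of the list can be dominated by false positives, so the fraud that is caught sits further down the ranking.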

In [14]:
# Fraud rate among the top-k transactions for absolute review capacities
for k in [1000, 5000, 10000]:
    top_idx = df_scores["anomaly_score"].nlargest(k).index
    print(k, df_scores.loc[top_idx, "isFraud"].mean())

1000 0.0
5000 0.2254
10000 0.2534


Isolation Forest prioritizes extreme global outliers that are often not fraud. This motivated the use of LOF (Local Outlier Factor) to capture local density anomalies, where fraud is more likely to reside.
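
For orientation, here is a minimal sketch of how a local-density score differs, assuming scikit-learn's `LocalOutlierFactor` in novelty mode so that it can be fit on normal data and then score unseen rows. The synthetic clusters, `n_neighbors=20`, and all variable names are assumptions for this example; the LOF configuration actually used is not shown in this section.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)

# Two synthetic "normal" clusters with very different densities (illustrative only)
dense = rng.normal(loc=0.0, scale=0.2, size=(300, 2))
sparse = rng.normal(loc=6.0, scale=1.5, size=(300, 2))
X_normal = np.vstack([dense, sparse])

# novelty=True allows scoring points that were not part of the fit
lof = LocalOutlierFactor(n_neighbors=20, novelty=True)
lof.fit(X_normal)

X_query = np.array([
    [0.0, 0.0],  # center of the dense cluster
    [1.5, 1.5],  # just outside the dense cluster: far from it relative to its spread
])

# score_samples is higher for more normal points, so negate it
lof_scores = -lof.score_samples(X_query)
print(lof_scores)
```

The second query point is not extreme in a global sense, yet it lies far from the dense cluster relative to that cluster's spread, which is the kind of case a local-density method is meant to flag.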

In [12]:
df_scores

,anomaly_score,isFraud,random_score
0,0.321438,0,0.374540
1,0.325060,0,0.950714
2,0.323418,0,0.731994
3,0.360741,0,0.598658
4,0.384998,0,0.156019
...,...,...,...
590535,0.329527,0,0.275460
590536,0.319891,0,0.497783
590537,0.318928,0,0.674382
590538,0.387144,0,0.029167


ROC-AUC evaluates ranking across all thresholds, many of which are operationally irrelevant. In fraud detection, decisions are made under fixed review capacity, so Precision@K directly reflects the quality of the ranked list that analysts actually inspect.
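
To make the contrast concrete, the sketch below builds a synthetic ranking in which a handful of benign extreme outliers occupy the very top and all positives follow directly after. The data and the helper `precision_at_top` are hypothetical and exist only for this example.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Synthetic ranking: 1000 items, 50 positives, 10 benign extreme outliers
n, n_pos, n_extreme = 1000, 50, 10
scores = np.linspace(0.0, 1.0, n)   # ascending, so the last index ranks first
labels = np.zeros(n, dtype=int)

# The 10 highest scores are negatives; the next 50 are the positives
labels[n - n_extreme - n_pos : n - n_extreme] = 1

def precision_at_top(scores, labels, k):
    """Fraction of positives among the k highest-scoring items."""
    top = np.argsort(scores)[::-1][:k]
    return labels[top].mean()

auc = roc_auc_score(labels, scores)
p10 = precision_at_top(scores, labels, 10)
p60 = precision_at_top(scores, labels, 60)

print(f"ROC-AUC:      {auc:.4f}")
print(f"Precision@10: {p10:.2f}")
print(f"Precision@60: {p60:.2f}")
```

ROC-AUC is close to 1 here even though a reviewer limited to the first 10 items would find no positives, which is the situation Precision@K is meant to expose.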

In [13]:
df_scores.to_csv("data/processed/if_scores.csv", index=False)