# AI for Anti-Money Laundering (AML) using Graph Analytics

**Copyright (c) 2026 Shrikara Kaudambady. All rights reserved.**

This notebook demonstrates how to use graph-based analysis and unsupervised machine learning to detect potential money laundering activities. We will model financial transactions as a network, extract behavioral features for each account using graph algorithms, and then use an Isolation Forest model to identify the most anomalous accounts.

### 1. Setup and Library Imports

In [None]:
import pandas as pd
import numpy as np
import networkx as nx
from sklearn.ensemble import IsolationForest
import matplotlib.pyplot as plt
import seaborn as sns

sns.set_theme(style="white")

### 2. Data Simulation

We'll generate a synthetic dataset of financial transactions. The dataset will consist primarily of 'normal' transactions, but we will deliberately inject a small, tightly knit cluster of accounts that represents a potential money laundering ring.

In [None]:
np.random.seed(42)
n_transactions = 2000
n_normal_accounts = 500

# Generate Normal Transactions
normal_accounts = [f'ACC_{i}' for i in range(n_normal_accounts)]
transactions = []
for _ in range(n_transactions):
    from_acc, to_acc = np.random.choice(normal_accounts, 2, replace=False)
    amount = np.random.uniform(10, 5000)
    transactions.append({'from': from_acc, 'to': to_acc, 'amount': amount})

# Inject a Money Laundering Ring (Anomaly)
ring_accounts = ['RING_A', 'RING_B', 'RING_C', 'RING_D']
# High-frequency transfers between randomly chosen pairs of ring members
for _ in range(50):
    from_acc, to_acc = np.random.choice(ring_accounts, 2, replace=False)
    amount = np.random.uniform(5000, 9900)  # Large amounts kept under 10,000 to mimic structuring
    transactions.append({'from': from_acc, 'to': to_acc, 'amount': amount})

df = pd.DataFrame(transactions)

print(f"Generated {len(df)} transactions.")
print("Sample of transactions:")
df.head()
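
The injected ring is characterized by repeated transfers between the same accounts, so it can help to look at how often each ordered pair transacts. The block below is a minimal sketch on a hand-made toy table; `toy_df` and `pair_summary` are illustrative names that are not part of the notebook, and the amounts are made up.

```python
import pandas as pd

# Hypothetical toy transactions (not the simulated dataset above)
toy_df = pd.DataFrame([
    {'from': 'RING_A', 'to': 'RING_B', 'amount': 9000.0},
    {'from': 'RING_A', 'to': 'RING_B', 'amount': 8500.0},
    {'from': 'RING_B', 'to': 'RING_A', 'amount': 7000.0},
    {'from': 'ACC_1', 'to': 'ACC_2', 'amount': 120.0},
])

# One row per ordered (from, to) pair with its transfer count and total amount
pair_summary = (
    toy_df.groupby(['from', 'to'])['amount']
    .agg(n_transactions='count', total_amount='sum')
    .reset_index()
)
print(pair_summary)
```

Running the same `groupby` on `df` would give a per-pair view of the simulated data, where ring pairs should stand out by count and total amount.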

### 3. Graph Construction and Feature Engineering
We'll build a directed graph using `networkx` and then compute graph-based features for each account (node).

In [None]:
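# Create a directed graph from the transactions.
# Note: a DiGraph keeps one edge per ordered (from, to) pair, so repeated transfers
# collapse into a single edge and 'amount' is not stored on the edges.
G = nx.from_pandas_edgelist(df, 'from', 'to', create_using=nx.DiGraph())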

print(f"Graph created with {G.number_of_nodes()} nodes and {G.number_of_edges()} edges.")

# Engineer graph-based features for each node
features = pd.DataFrame(index=G.nodes())
features['degree'] = pd.Series(nx.degree_centrality(G))
features['in_degree'] = pd.Series(nx.in_degree_centrality(G))
features['out_degree'] = pd.Series(nx.out_degree_centrality(G))
features['clustering_coeff'] = pd.Series(nx.clustering(G.to_undirected()))  # Undirected projection: triangles ignore edge direction
features['pagerank'] = pd.Series(nx.pagerank(G))

print("\nGraph features computed. Sample:")
features.loc[ring_accounts].head()

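The 'RING' accounts should show a much higher clustering coefficient than typical accounts. Nearly every pair of ring members transacts directly, so the ring forms a tight-knit community, while normal accounts pick their counterparties at random from 500 accounts and rarely close triangles. Dense, closed groups like this are a common red flag in AML analysis.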
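
To make the clustering coefficient concrete, here is a small sketch on a toy graph rather than on `G`. The graph `toy` and its node names are assumptions made for illustration: four accounts that all transact with each other, plus an unrelated three-account chain.

```python
import networkx as nx

ring = ['RING_A', 'RING_B', 'RING_C', 'RING_D']

toy = nx.DiGraph()
# Every ordered pair of ring members transacts at least once
toy.add_edges_from((u, v) for u in ring for v in ring if u != v)
# A simple chain of ordinary accounts with no closed triangles
toy.add_edges_from([('ACC_0', 'ACC_1'), ('ACC_1', 'ACC_2')])

# Same approach as the notebook: clustering on the undirected projection
toy_clustering = nx.clustering(toy.to_undirected())

for node in ['RING_A', 'ACC_1']:
    print(node, toy_clustering[node])
```

Each ring member has three neighbors that are all connected to one another, so its coefficient is 1.0. `ACC_1` has two neighbors that never transact with each other, so its coefficient is 0.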

### 4. Anomaly Detection with Isolation Forest

Now, we use an Isolation Forest model to find the accounts with the most unusual combination of these graph features.

In [None]:
model = IsolationForest(n_estimators=100, contamination='auto', random_state=42)

print("Training Isolation Forest model...")
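# Fix the input columns so the model outputs added below are never fed back into the model
feature_cols = ['degree', 'in_degree', 'out_degree', 'clustering_coeff', 'pagerank']
model.fit(features[feature_cols])

# decision_function: lower scores are more anomalous; predict: -1 = anomaly, 1 = normal
features['anomaly_score'] = model.decision_function(features[feature_cols])
features['is_anomaly'] = model.predict(features[feature_cols])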

print("Anomaly detection complete.")
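
The relationship between `decision_function` and `predict` is easier to see on toy data. The sketch below is not the notebook's data: `X_toy` is an assumed 2-D Gaussian cloud with one far-away point appended, and all `toy_*` names are illustrative.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# 200 ordinary points around the origin plus one obvious outlier at (8, 8)
X_toy = np.vstack([rng.normal(0, 1, size=(200, 2)), [[8.0, 8.0]]])

toy_model = IsolationForest(n_estimators=100, contamination='auto', random_state=0)
toy_model.fit(X_toy)

toy_scores = toy_model.decision_function(X_toy)  # lower = more anomalous
toy_labels = toy_model.predict(X_toy)            # -1 = anomaly, 1 = normal

print("Index of lowest score:", toy_scores.argmin())
print("Points labelled -1:", int((toy_labels == -1).sum()))
```

`predict` returns -1 exactly for the points whose `decision_function` score is negative, so sorting by score (as the next section does) ranks flagged accounts from most to least anomalous.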

### 5. Analysis and Visualization
Let's identify the accounts flagged as anomalies and then visualize the entire graph to see our findings.

In [None]:
anomalies = features[features['is_anomaly'] == -1].sort_values(by='anomaly_score')

print(f"Detected {len(anomalies)} suspicious accounts.")
print("\n--- Top Suspicious Accounts ---")
display(anomalies.head(10))

# Visualize the graph
plt.figure(figsize=(15, 15))
pos = nx.spring_layout(G, k=0.15, iterations=20, seed=42)  # Fixed seed for a reproducible layout

# Draw nodes
node_colors = ['red' if node in anomalies.index else 'skyblue' for node in G.nodes()]
nx.draw_networkx_nodes(G, pos, node_size=50, node_color=node_colors)

# Draw edges
nx.draw_networkx_edges(G, pos, alpha=0.1, edge_color='gray')

# Highlight anomaly labels
labels = {node: node for node in G.nodes() if node in anomalies.index}
nx.draw_networkx_labels(G, pos, labels, font_size=10, font_color='red')

plt.title("Financial Transaction Graph with Detected Anomalies in Red")
plt.show()

print("\nIf detection worked as intended, the ring accounts appear as a separate cluster of red nodes. Any other red nodes are normal accounts that the model also scored as unusual.")
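
Since the ring was injected deliberately, the visual impression can be checked against the known ring members. The helper below is a hypothetical sketch (the name `flagged_fraction` is not from the notebook) that reports what share of a known set of accounts was flagged.

```python
def flagged_fraction(flagged_accounts, known_ring):
    """Return the share of known ring accounts that appear among the flagged ones."""
    flagged = set(flagged_accounts)
    ring = set(known_ring)
    if not ring:
        return 0.0
    return len(flagged & ring) / len(ring)


# Toy example: two of four ring accounts flagged, plus one unrelated account
example = flagged_fraction(
    ['RING_A', 'RING_B', 'ACC_7'],
    ['RING_A', 'RING_B', 'RING_C', 'RING_D'],
)
print(example)
```

In this notebook the call would be `flagged_fraction(anomalies.index, ring_accounts)`. Comparing `len(anomalies)` with the ring size also shows how many normal accounts were flagged alongside the ring.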