# Fraud Investigation with LouieAI Agents

Detect and investigate fraud patterns using specialized agents for graph analysis and anomaly detection.

**Topics covered:**
- Transaction data analysis
- Network analysis for fraud rings
- Statistical anomaly detection
- Interactive investigation dashboards
- Risk assessment and scoring

## Setup and Authentication

In [1]:
from datetime import datetime, timedelta

import numpy as np
import pandas as pd

# For this demonstration, we'll use locally generated data
# To use with LouieAI, ensure you have proper credentials set
print("🔒 Running fraud detection demonstration")
DEMO_MODE = True

🔒 Running fraud detection demonstration


## Generate Transaction Data with Fraud Patterns

In [2]:
# Generate sample transaction data with embedded fraud patterns
np.random.seed(42)
base_time = datetime(2024, 1, 1)

# Create normal and fraudulent transactions
transactions = []
for i in range(500):
    is_fraud = np.random.random() < 0.05  # 5% fraud rate

    if is_fraud:
        # Fraudulent patterns
        amount = np.random.choice(
            [
                np.random.uniform(5000, 10000),  # Unusually high
                0.01,  # Testing transaction
                999.99,  # Just under reporting limit
            ]
        )
        merchant = np.random.choice(["ATM_Foreign", "Online_Casino", "Wire_Transfer"])
        velocity = np.random.randint(1, 10)  # Intended burst size; not stored in the record or used below
    else:
        # Normal patterns
        amount = np.random.lognormal(3.5, 1.5)
        merchant = np.random.choice(
            ["Grocery_Store", "Gas_Station", "Restaurant", "Online_Shop"]
        )
        velocity = 1

    user_id = f"USER_{np.random.randint(1, 101):03d}"
    timestamp = base_time + timedelta(minutes=i * 10 + np.random.randint(-5, 5))

    transactions.append(
        {
            "transaction_id": f"T{i + 1:04d}",
            "user_id": user_id,
            "amount": round(amount, 2),
            "merchant": merchant,
            "timestamp": timestamp,
            "is_suspicious": is_fraud,
        }
    )

transactions_df = pd.DataFrame(transactions)
print(
    f"✅ Generated {len(transactions_df)} transactions with {transactions_df['is_suspicious'].sum()} suspicious patterns"
)
print("\nSample data:")
transactions_df.head()

✅ Generated 500 transactions with 28 suspicious patterns

Sample data:


| | transaction_id | user_id | amount | merchant | timestamp | is_suspicious |
|---|---|---|---|---|---|---|
| 0 | T0001 | USER_087 | 6.25 | Restaurant | 2024-01-01 00:02:00 | False |
| 1 | T0002 | USER_024 | 53.43 | Online_Shop | 2024-01-01 00:07:00 | False |
| 2 | T0003 | USER_064 | 0.01 | Online_Casino | 2024-01-01 00:19:00 | True |
| 3 | T0004 | USER_042 | 34.24 | Restaurant | 2024-01-01 00:27:00 | False |
| 4 | T0005 | USER_064 | 17.43 | Online_Shop | 2024-01-01 00:43:00 | False |


## Basic Statistical Analysis

In [3]:
# Analyze transaction patterns
if DEMO_MODE:
    # Local analysis
    stats = (
        transactions_df.groupby("user_id")
        .agg({"amount": ["count", "sum", "mean", "std"], "is_suspicious": "sum"})
        .round(2)
    )

    # Find top suspicious users
    suspicious_users = (
        stats[stats[("is_suspicious", "sum")] > 0]
        .sort_values(("is_suspicious", "sum"), ascending=False)
        .head(10)
    )

    print("🔍 Top users with suspicious activity:")
    print(suspicious_users)
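else:
    # Use LouieAI for analysis.
    # NOTE: `lui` is not created anywhere in this notebook; this branch assumes
    # a LouieAI client was set up and authenticated beforehand.
    lui("Analyze transaction patterns and identify suspicious users", transactions_df)
    if lui.df is not None:
        print(f"Analysis complete: {lui.df.shape}")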

🔍 Top users with suspicious activity:
         amount                            is_suspicious
          count      sum     mean      std           sum
user_id                                                 
USER_081      4  7350.98  1837.74  3520.41             2
USER_062      5  1222.61   244.52   425.04             2
USER_019      6  1621.20   270.20   387.46             1
USER_022      3   135.60    45.20    74.52             1
USER_023      6  9433.40  1572.23  3249.14             1
USER_025      7  1883.85   269.12   606.36             1
USER_031      1     0.01     0.01      NaN             1
USER_038      8   204.72    25.59    40.79             1
USER_006      5    83.78    16.76    23.97             1
USER_018      2  8116.36  4058.18  5627.92             1
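
The two-level column header in the output above comes from passing a dict of lists to `agg`. A minimal sketch of the same per-user summary with pandas named aggregation, which produces flat column names, is shown here. The `sample` frame and the output column names (`n_transactions`, `n_suspicious`, and so on) are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd

# Tiny stand-in for transactions_df, using the same column names.
sample = pd.DataFrame(
    {
        "user_id": ["USER_001", "USER_001", "USER_002"],
        "amount": [10.0, 30.0, 5.0],
        "is_suspicious": [False, True, False],
    }
)

# Named aggregation: each keyword becomes a flat output column.
user_stats = (
    sample.groupby("user_id")
    .agg(
        n_transactions=("amount", "count"),
        total_amount=("amount", "sum"),
        mean_amount=("amount", "mean"),
        std_amount=("amount", "std"),
        n_suspicious=("is_suspicious", "sum"),
    )
    .round(2)
)

# Keep only users with at least one suspicious transaction.
flagged = user_stats[user_stats["n_suspicious"] > 0].sort_values(
    "n_suspicious", ascending=False
)
print(flagged)
```

As in the output above (see `USER_031`), a user with a single transaction gets `NaN` for the standard deviation, because the sample standard deviation is undefined for one observation.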


## Anomaly Detection with Z-Scores

In [4]:
# Calculate Z-scores for anomaly detection without scipy
# Z-score = (value - mean) / std_dev
mean_amount = transactions_df["amount"].mean()
std_amount = transactions_df["amount"].std()

# Calculate Z-scores for amounts
transactions_df["amount_zscore"] = np.abs(
    (transactions_df["amount"] - mean_amount) / std_amount
)

# Flag outliers (Z-score > 3)
transactions_df["is_outlier"] = transactions_df["amount_zscore"] > 3

# Show top anomalies
anomalies = (
    transactions_df[transactions_df["is_outlier"]]
    .sort_values("amount_zscore", ascending=False)
    .head(10)
)

print(f"🚨 Found {transactions_df['is_outlier'].sum()} outlier transactions")
print("\nTop 10 anomalous transactions:")
anomalies[["transaction_id", "user_id", "amount", "amount_zscore", "merchant"]]

🚨 Found 7 outlier transactions

Top 10 anomalous transactions:


| | transaction_id | user_id | amount | amount_zscore | merchant |
|---|---|---|---|---|---|
| 106 | T0107 | USER_080 | 9849.39 | 9.792456 | Wire_Transfer |
| 206 | T0207 | USER_098 | 9153.27 | 9.083782 | Online_Casino |
| 21 | T0022 | USER_023 | 8182.05 | 8.095047 | Online_Casino |
| 63 | T0064 | USER_087 | 8139.47 | 8.051699 | Online_Casino |
| 5 | T0006 | USER_018 | 8037.72 | 7.948114 | ATM_Foreign |
| 14 | T0015 | USER_081 | 7117.01 | 7.0108 | Wire_Transfer |
| 32 | T0033 | USER_053 | 6725.36 | 6.612087 | Wire_Transfer |
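
The Z-score above uses the mean and standard deviation of all amounts, and both are pulled upward by the very transactions being hunted. One possible alternative, not used in the original notebook, is the modified Z-score built on the median and the median absolute deviation (MAD). The sketch below assumes the commonly cited scaling constant 0.6745 and cutoff 3.5; the function name `modified_zscore` and the sample amounts are illustrative.

```python
import numpy as np


def modified_zscore(values):
    """Return 0.6745 * (x - median) / MAD for each value."""
    values = np.asarray(values, dtype=float)
    median = np.median(values)
    mad = np.median(np.abs(values - median))
    if mad == 0:
        # At least half of the values equal the median, so the scale is zero;
        # return zeros rather than divide by zero.
        return np.zeros_like(values)
    return 0.6745 * (values - median) / mad


# Illustrative amounts: five ordinary purchases and one very large transfer.
amounts = np.array([12.0, 15.0, 14.0, 13.0, 16.0, 9000.0])
scores = modified_zscore(amounts)
is_outlier = np.abs(scores) > 3.5  # assumed cutoff, tune per dataset
print(is_outlier)
```

Because the median and MAD barely move when a few extreme values are added, the threshold stays meaningful even when fraud makes up a larger share of the data.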


## Velocity Analysis

In [5]:
# Detect rapid-fire transactions (velocity attacks)
transactions_df["timestamp"] = pd.to_datetime(transactions_df["timestamp"])

# Calculate time between transactions for each user
velocity_check = []
for user in transactions_df["user_id"].unique():
    user_trans = transactions_df[transactions_df["user_id"] == user].sort_values(
        "timestamp"
    )
    if len(user_trans) > 1:
        user_trans["time_diff"] = (
            user_trans["timestamp"].diff().dt.total_seconds() / 60
        )  # in minutes
        rapid = user_trans[user_trans["time_diff"] < 5]  # Less than 5 minutes
        if len(rapid) > 0:
            velocity_check.append(
                {
                    "user_id": user,
                    "rapid_transactions": len(rapid),
                    "total_rapid_amount": rapid["amount"].sum(),
                }
            )

if velocity_check:
    velocity_df = (
        pd.DataFrame(velocity_check)
        .sort_values("rapid_transactions", ascending=False)
        .head(5)
    )
    print("⚡ Users with rapid transaction patterns:")
    print(velocity_df)
else:
    print("No rapid transaction patterns detected")

⚡ Users with rapid transaction patterns:
    user_id  rapid_transactions  total_rapid_amount
0  USER_036                   1              547.65
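
Only one user is reported because the generator spaces transactions roughly 10 minutes apart and assigns user IDs at random, so two transactions by the same user within 5 minutes are rare. The per-user loop can also be written without explicit iteration. The following is a sketch under the assumption that the frame has the `transaction_id`, `user_id`, `amount`, and `timestamp` columns used above; the function name `rapid_transactions` and the `sample` data are hypothetical.

```python
import pandas as pd


def rapid_transactions(df, max_gap_minutes=5):
    """Count, per user, transactions that follow the previous one too quickly."""
    ordered = df.sort_values(["user_id", "timestamp"]).copy()
    # Gap to the same user's previous transaction; NaT for each user's first row.
    gap = ordered.groupby("user_id")["timestamp"].diff()
    ordered["time_diff"] = gap.dt.total_seconds() / 60
    rapid = ordered[ordered["time_diff"] < max_gap_minutes]
    return (
        rapid.groupby("user_id")
        .agg(
            rapid_transactions=("transaction_id", "count"),
            total_rapid_amount=("amount", "sum"),
        )
        .reset_index()
        .sort_values("rapid_transactions", ascending=False)
    )


sample = pd.DataFrame(
    {
        "transaction_id": ["T1", "T2", "T3", "T4"],
        "user_id": ["USER_001", "USER_001", "USER_002", "USER_001"],
        "amount": [20.0, 35.0, 10.0, 5.0],
        "timestamp": pd.to_datetime(
            [
                "2024-01-01 00:00",
                "2024-01-01 00:03",
                "2024-01-01 00:04",
                "2024-01-01 01:00",
            ]
        ),
    }
)
result = rapid_transactions(sample)
print(result)
```

Like the loop above, this counts only the later transaction of each rapid pair, so a burst of three quick transactions yields a count of two.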


## Fraud Ring Detection

In [6]:
# Identify potential fraud rings (users sharing suspicious merchants)
fraud_merchants = transactions_df[transactions_df["is_suspicious"]]["merchant"].unique()
fraud_connections = []

for merchant in fraud_merchants:
    users_at_merchant = transactions_df[transactions_df["merchant"] == merchant][
        "user_id"
    ].unique()

    if len(users_at_merchant) > 1:
        fraud_connections.append(
            {
                "merchant": merchant,
                "connected_users": len(users_at_merchant),
                "users": ", ".join(users_at_merchant[:5]),  # Show first 5
            }
        )

if fraud_connections:
    connections_df = pd.DataFrame(fraud_connections).sort_values(
        "connected_users", ascending=False
    )
    print("🕸️ Potential fraud rings detected:")
    print(connections_df)
else:
    print("No fraud rings detected")

🕸️ Potential fraud rings detected:
        merchant  connected_users  \
1    ATM_Foreign               12   
0  Online_Casino                9   
2  Wire_Transfer                7   

                                              users  
1  USER_018, USER_006, USER_057, USER_093, USER_038  
0  USER_064, USER_023, USER_087, USER_050, USER_084  
2  USER_081, USER_053, USER_069, USER_080, USER_062  
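
Grouping by merchant alone yields very large groups here, since the synthetic data has only three fraud-associated merchant categories. A somewhat finer signal is how many suspicious merchants each pair of users has in common. The sketch below is one way to count that with the standard library and pandas; the function name `shared_merchant_pairs` and the `sample` frame are assumptions for illustration, and shared merchants on their own are weak evidence of a ring.

```python
from collections import Counter
from itertools import combinations

import pandas as pd


def shared_merchant_pairs(df):
    """Count, for each user pair, the merchants where both have suspicious activity."""
    suspicious = df[df["is_suspicious"]]
    pair_counts = Counter()
    for _, users in suspicious.groupby("merchant")["user_id"]:
        # Sort so that each pair is always stored as (smaller_id, larger_id).
        for user_a, user_b in combinations(sorted(users.unique()), 2):
            pair_counts[(user_a, user_b)] += 1
    return pair_counts


sample = pd.DataFrame(
    {
        "user_id": ["USER_001", "USER_002", "USER_001", "USER_002", "USER_003", "USER_004"],
        "merchant": [
            "Online_Casino",
            "Online_Casino",
            "Wire_Transfer",
            "Wire_Transfer",
            "Wire_Transfer",
            "Grocery_Store",
        ],
        "is_suspicious": [True, True, True, True, True, False],
    }
)
pairs = shared_merchant_pairs(sample)
for pair, n_shared in pairs.most_common():
    print(pair, n_shared)
```

Pairs with a high count could then be passed to a graph tool as weighted edges for visual inspection.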


## Risk Scoring

In [7]:
# Calculate comprehensive risk scores
risk_scores = []

for user in transactions_df["user_id"].unique():
    user_trans = transactions_df[transactions_df["user_id"] == user]

    # Risk factors
    suspicious_count = user_trans["is_suspicious"].sum()
    outlier_count = (
        user_trans["is_outlier"].sum() if "is_outlier" in user_trans.columns else 0
    )
    high_amounts = (user_trans["amount"] > 5000).sum()
    unique_merchants = user_trans["merchant"].nunique()

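    # Calculate risk score, capped at 100. With only seven merchant categories
    # in this dataset, the merchant-diversity term below is always 0.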
    risk_score = min(
        100,
        suspicious_count * 30
        + outlier_count * 20
        + high_amounts * 10
        + max(0, unique_merchants - 10) * 5,
    )

    if risk_score > 0:
        risk_scores.append(
            {
                "user_id": user,
                "risk_score": risk_score,
                "suspicious_transactions": suspicious_count,
                "outliers": outlier_count,
                "high_value_transactions": high_amounts,
            }
        )

if risk_scores:
    risk_df = (
        pd.DataFrame(risk_scores).sort_values("risk_score", ascending=False).head(10)
    )
    print("📊 User Risk Assessment (Top 10):")
    print(risk_df)
    print(
        f"\n✅ Fraud investigation complete! Analyzed {len(transactions_df)} transactions"
    )
else:
    print("✅ No high-risk users identified")

📊 User Risk Assessment (Top 10):
     user_id  risk_score  suspicious_transactions  outliers  high_value_transactions
6   USER_081          90                        2         1                        1
0   USER_087          60                        1         1                        1
8   USER_023          60                        1         1                        1
2   USER_018          60                        1         1                        1
5   USER_062          60                        2         0                        0
3   USER_053          60                        1         1                        1
13  USER_098          60                        1         1                        1
11  USER_080          60                        1         1                        1
1   USER_064          30                        1         0                        0
4   USER_006          30                        1         0                        0

✅ Fraud investigation complete! Analyzed 500 transactions
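
The scoring rule above adds 30 points per suspicious transaction, 20 per outlier, 10 per transaction over 5000, and 5 per distinct merchant beyond ten, capped at 100. For example, `USER_081` scores 2 × 30 + 1 × 20 + 1 × 10 = 90. The same rule can be sketched as a single grouped aggregation; the function name `score_users` and the `sample` frame are hypothetical, and the weights are copied from the cell above.

```python
import pandas as pd


def score_users(df, high_amount=5000):
    """Apply the notebook's additive risk rule per user, capped at 100."""
    per_user = df.groupby("user_id").agg(
        suspicious_transactions=("is_suspicious", "sum"),
        outliers=("is_outlier", "sum"),
        high_value_transactions=("amount", lambda s: int((s > high_amount).sum())),
        unique_merchants=("merchant", "nunique"),
    )
    raw_score = (
        per_user["suspicious_transactions"] * 30
        + per_user["outliers"] * 20
        + per_user["high_value_transactions"] * 10
        + (per_user["unique_merchants"] - 10).clip(lower=0) * 5
    )
    per_user["risk_score"] = raw_score.clip(upper=100)
    # Drop users with no risk signal and rank the rest.
    return per_user[per_user["risk_score"] > 0].sort_values(
        "risk_score", ascending=False
    )


sample = pd.DataFrame(
    {
        "user_id": ["USER_A", "USER_A", "USER_B", "USER_C", "USER_C", "USER_C", "USER_C"],
        "amount": [7000.0, 0.01, 25.0, 999.99, 999.99, 0.01, 0.01],
        "merchant": ["Wire_Transfer", "Online_Casino", "Restaurant"] + ["ATM_Foreign"] * 4,
        "is_suspicious": [True, True, False, True, True, True, True],
        "is_outlier": [True, False, False, False, False, False, False],
    }
)
scores = score_users(sample)
print(scores[["risk_score", "suspicious_transactions", "outliers"]])
```

Keeping the weights in one expression makes it easier to see how the cap interacts with them: four suspicious transactions alone already reach the maximum score.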

## Summary

### Investigation Results:
- **Total Transactions**: 500 records analyzed
- **Suspicious Patterns**: 28 transactions (5.6%) labeled suspicious; 7 flagged as amount outliers (Z-score > 3)
- **Detection Methods**:
  - Statistical anomaly detection (Z-scores)
  - Velocity analysis (rapid transactions)
  - Network analysis (shared merchants)
  - Comprehensive risk scoring

### Next Steps:
1. Review high-risk users for manual investigation
2. Set up real-time monitoring for flagged patterns
3. Adjust detection thresholds based on false positive rates
4. Implement automated blocking for confirmed fraud patterns
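
Step 3 needs a way to measure how a detection flag compares with known labels. A minimal sketch, assuming a boolean flag column (such as `is_outlier`) and a boolean label column (such as `is_suspicious`), is given here. The helper name `flag_quality` and the sample values are illustrative, and in production the labels would come from confirmed case outcomes rather than a synthetic column.

```python
import pandas as pd


def flag_quality(df, flag_col, label_col):
    """Compare a detection flag with ground-truth labels."""
    flagged = df[flag_col].astype(bool)
    actual = df[label_col].astype(bool)
    true_pos = int((flagged & actual).sum())
    false_pos = int((flagged & ~actual).sum())
    false_neg = int((~flagged & actual).sum())
    # Precision and recall are undefined when their denominator is zero.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else float("nan")
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else float("nan")
    return {
        "true_positives": true_pos,
        "false_positives": false_pos,
        "false_negatives": false_neg,
        "precision": precision,
        "recall": recall,
    }


sample = pd.DataFrame(
    {
        "is_outlier": [True, True, False, False, False],
        "is_suspicious": [True, False, True, True, False],
    }
)
quality = flag_quality(sample, "is_outlier", "is_suspicious")
print(quality)
```

Recomputing these numbers while varying a threshold (for example the Z-score cutoff of 3) shows the trade-off between missed fraud and false alarms.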