Wallet Risk Scoring – Overview


This notebook calculates a risk score (0–1000) for each wallet address based on its lending and borrowing behavior. A higher score indicates a wallet that is more likely to be liquidated or that shows riskier borrowing patterns.

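Wallet addresses are read from the file "Wallet id.xlsx". All activity data is synthetic, generated with a fixed random seed to simulate Compound V2/V3 activity: borrowed amount, supplied amount, repaid amount, liquidation count, volatile asset ratio, and days since the wallet was last active.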

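Five features drive the score: Borrow to Supply Ratio, Repay Ratio (inverted, so lower repayment raises the score), Liquidation Count, Volatile Asset Ratio, and Wallet Activity Level (days since the wallet was last active). Each feature is min-max normalized to the range 0 to 1, and the score is their weighted sum, with weights of 0.30, 0.25, 0.20, 0.15, and 0.10 respectively, scaled to 0–1000.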
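
To make the normalization and weighting concrete, here is a minimal sketch on three made-up wallets. The feature values and the names toy, WEIGHTS, and toy_score are illustrative assumptions and are not taken from the notebook's data; only the weights and the min-max formula follow the scoring cell of this notebook.

```python
import pandas as pd

# Hypothetical, already-clipped feature values for three wallets (illustrative only).
toy = pd.DataFrame({
    "borrow_to_supply_ratio": [0.2, 0.8, 1.5],
    "repay_ratio": [1.2, 0.6, 0.0],
    "liquidation_count": [0, 1, 4],
    "volatile_asset_ratio": [0.1, 0.5, 0.9],
    "last_active_days_ago": [5, 60, 170],
})

# Weights from the notebook's scoring formula; they sum to 1.0.
WEIGHTS = {
    "borrow_to_supply_ratio": 0.30,
    "repay_ratio": 0.25,
    "liquidation_count": 0.20,
    "volatile_asset_ratio": 0.15,
    "last_active_days_ago": 0.10,
}


def normalize(col):
    # Min-max scale a column to [0, 1]; the epsilon avoids division by zero.
    return (col - col.min()) / (col.max() - col.min() + 1e-9)


norm = toy.apply(normalize)                    # normalize each column
norm["repay_ratio"] = 1 - norm["repay_ratio"]  # invert: low repayment = high risk

# Weighted sum across features, scaled to 0-1000 and rounded.
toy_score = ((norm * pd.Series(WEIGHTS)).sum(axis=1) * 1000).round().astype(int)
print(toy_score.tolist())  # [0, 422, 1000]
```

The first wallet has the lowest-risk value on every feature and scores 0, the last has the highest-risk value on every feature and scores 1000, and the middle wallet lands in between. Because normalization is relative to the minimum and maximum in the dataset, a wallet's score depends on which other wallets are scored alongside it.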

The final output is a CSV file containing each wallet_id and its corresponding risk score, which can help identify high-risk borrowers. This approach can be extended to real on-chain data for production use.

In [4]:
import pandas as pd
import numpy as np

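# Load the wallet addresses to score (expects a "wallet_id" column).
wallet_df = pd.read_excel("Wallet id.xlsx")

num_wallets = len(wallet_df)
np.random.seed(42)  # fixed seed so the synthetic data is reproducible

# Synthetic per-wallet activity; total_borrows is generated but not used in the score.
data = pd.DataFrame({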
    "wallet_id": wallet_df["wallet_id"],
    "total_borrows": np.random.poisson(3, num_wallets),
    "borrowed_amount": np.random.uniform(100, 5000, num_wallets),
    "supplied_amount": np.random.uniform(100, 6000, num_wallets),
    "repaid_amount": np.random.uniform(50, 5000, num_wallets),
    "liquidation_count": np.random.poisson(1, num_wallets),
    "volatile_asset_ratio": np.random.uniform(0, 1, num_wallets),
    "last_active_days_ago": np.random.randint(1, 180, num_wallets)
})

# Derived ratios; the +1 in each denominator guards against division by zero.
data["borrow_to_supply_ratio"] = data["borrowed_amount"] / (data["supplied_amount"] + 1)
data["repay_ratio"] = data["repaid_amount"] / (data["borrowed_amount"] + 1)

# Cap extreme ratios so outliers do not dominate the normalization.
data["borrow_to_supply_ratio"] = data["borrow_to_supply_ratio"].clip(0, 1.5)
data["repay_ratio"] = data["repay_ratio"].clip(0, 1.2)


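def normalize(col):
    # Min-max scale a column to [0, 1]; the epsilon avoids 0/0 on constant columns.
    return (col - col.min()) / (col.max() - col.min() + 1e-9)

# A higher normalized value means higher risk. The repay ratio is inverted
# because wallets that repay more of what they borrow are less risky.
data["bsr_norm"] = normalize(data["borrow_to_supply_ratio"])
data["repay_norm"] = 1 - normalize(data["repay_ratio"])
data["liq_norm"] = normalize(data["liquidation_count"])
data["vol_norm"] = normalize(data["volatile_asset_ratio"])
data["activity_norm"] = normalize(data["last_active_days_ago"])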


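# Weighted sum of the normalized features. The weights sum to 1.0,
# so the result lies in [0, 1] before it is scaled to 0-1000.
data["score"] = (
    0.3 * data["bsr_norm"] +
    0.25 * data["repay_norm"] +
    0.2 * data["liq_norm"] +
    0.15 * data["vol_norm"] +
    0.1 * data["activity_norm"]
) * 1000

# Round to whole-number scores.
data["score"] = data["score"].round().astype(int)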

# Keep only the wallet identifier and its score, then write the CSV.
final_df = data[["wallet_id", "score"]]
final_df.to_csv("wallet_risk_scores.csv", index=False)

print("Risk Scoring Complete! CSV saved as 'wallet_risk_scores.csv'")
print(final_df.head())


Risk Scoring Complete! CSV saved as 'wallet_risk_scores.csv'
                                    wallet_id  score
0  0x0039f22efb07a647557c7c5d17854cfd6d489ef3    448
1  0x06b51c6882b27cb05e712185531c1f74996dd988    506
2  0x0795732aacc448030ef374374eaae57d2965c16c    546
3  0x0aaa79f1a86bc8136cd0d1ca0d51964f4e3766f9    676
4  0x0fe383e5abc200055a7f391f94a5f5d1f844b9ae    687
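
One way the resulting scores might be used to identify high-risk borrowers is to bucket them into tiers. The sketch below is an assumption-laden example: the tier boundaries (below 500 low, 500 to 649 medium, 650 and above high) and the names scores and riskiest are hypothetical, since the notebook does not define any thresholds. It uses the five rows printed above instead of reading wallet_risk_scores.csv so that it runs on its own.

```python
import pandas as pd

# The five rows shown in the notebook output above.
scores = pd.DataFrame({
    "wallet_id": [
        "0x0039f22efb07a647557c7c5d17854cfd6d489ef3",
        "0x06b51c6882b27cb05e712185531c1f74996dd988",
        "0x0795732aacc448030ef374374eaae57d2965c16c",
        "0x0aaa79f1a86bc8136cd0d1ca0d51964f4e3766f9",
        "0x0fe383e5abc200055a7f391f94a5f5d1f844b9ae",
    ],
    "score": [448, 506, 546, 676, 687],
})

# Hypothetical tier boundaries (not defined by the notebook):
# 0-499 low, 500-649 medium, 650-1000 high.
scores["tier"] = pd.cut(
    scores["score"],
    bins=[0, 499, 649, 1000],
    labels=["low", "medium", "high"],
    include_lowest=True,
)

# The two highest-scoring wallets, riskiest first.
riskiest = scores.nlargest(2, "score")

print(scores[["score", "tier"]])
print(riskiest["wallet_id"].tolist())
```

With real data, the boundaries would need to be calibrated against observed liquidations rather than chosen by hand.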
