**Setup**

In [28]:
import pandas as pd
import numpy as np
import string


**Read Wallet Data**

In [29]:
df = pd.read_csv("/content/wallet.csv")


In [30]:
df.head(5)

Unnamed: 0,wallet_id
0,0x0039f22efb07a647557c7c5d17854cfd6d489ef3
1,0x06b51c6882b27cb05e712185531c1f74996dd988
2,0x0795732aacc448030ef374374eaae57d2965c16c
3,0x0aaa79f1a86bc8136cd0d1ca0d51964f4e3766f9
4,0x0fe383e5abc200055a7f391f94a5f5d1f844b9ae


**Feature Functions**

In [31]:
from collections import Counter

def address_entropy(address):
    """Shannon entropy (in bits) of the address characters."""
    chars = address[2:].lower()  # remove '0x'
    counts = Counter(chars)
    probs = [count / len(chars) for count in counts.values()]
    entropy = -sum(p * np.log2(p) for p in probs)
    return entropy

def hex_diversity(address):
    """Count unique hex characters used in address."""
    return len(set(address[2:].lower()))

def repeating_patterns(address):
    """Check for repeating characters or suspiciously simple patterns."""
    addr = address[2:].lower()
    # Every character is identical, e.g. all zeros
    if all(c == addr[0] for c in addr):
        return 1
    # More than 20 of the 40 hex characters are 'f' or '0'
    if addr.count('f') > 20 or addr.count('0') > 20:
        return 1
    return 0

**Applying Feature Engineering**

In [32]:
df["entropy"] = df["wallet_id"].apply(address_entropy)
df["diversity"] = df["wallet_id"].apply(hex_diversity)
df["is_suspicious"] = df["wallet_id"].apply(repeating_patterns)

**Normalization (Min-Max Scaling)**

In [33]:
df["entropy_norm"] = (df["entropy"] - df["entropy"].min()) / (df["entropy"].max() - df["entropy"].min())
df["diversity_norm"] = (df["diversity"] - df["diversity"].min()) / (df["diversity"].max() - df["diversity"].min())


**Final Score (Weighted)**

In [34]:
df["score"] = (
    0.5 * df["entropy_norm"] +
    0.4 * df["diversity_norm"] -
    0.9 * df["is_suspicious"]
)


**Scale to 0-1000**

In [35]:
df["score"] = (df["score"] - df["score"].min()) / (df["score"].max() - df["score"].min())
df["score"] = (df["score"] * 1000).astype(int)

In [37]:
df.head(56)

Unnamed: 0,wallet_id,entropy,diversity,is_suspicious,entropy_norm,diversity_norm,score
0,0x0039f22efb07a647557c7c5d17854cfd6d489ef3,3.856198,16,0,0.887961,1.0,937
1,0x06b51c6882b27cb05e712185531c1f74996dd988,3.728213,15,0,0.581942,0.666667,619
2,0x0795732aacc448030ef374374eaae57d2965c16c,3.725071,15,0,0.574428,0.666667,615
3,0x0aaa79f1a86bc8136cd0d1ca0d51964f4e3766f9,3.747085,15,0,0.627066,0.666667,644
4,0x0fe383e5abc200055a7f391f94a5f5d1f844b9ae,3.715957,15,0,0.552638,0.666667,603
5,0x104ae61d8d487ad689969a17807ddc338b445416,3.594588,14,0,0.262439,0.333333,293
6,0x111c7208a7e2af345d36b6d4aace8740d61a3078,3.768454,15,0,0.67816,0.666667,673
7,0x124853fecb522c57d9bd5c21231058696ca6d596,3.737326,16,0,0.603732,1.0,779
8,0x13b1c8b0e696aff8b4fee742119b549b605f3cbc,3.739823,15,0,0.609702,0.666667,635
9,0x1656f1886c5ab634ac19568cd571bc72f385fdf7,3.622574,14,0,0.329354,0.333333,331


**Final Output**


In [36]:
# Final Output
final_df = df[["wallet_id", "score"]]
final_df.to_csv("wallet_risk_scores.csv", index=False)

print("✅ Done! File saved as wallet_risk_scores.csv")

✅ Done! File saved as wallet_risk_scores.csv


**Step-by-Step Explanation of the Procedure Above**

# Data Collection:
All wallet IDs were read from a single CSV file; no APIs or blockchain metadata were used.
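
The `Unnamed: 0` column in the `df.head()` output suggests the CSV was saved together with its row index. A minimal sketch of loading such a file without that extra column, assuming the header's first cell is empty (the in-memory `csv_text` below is a stand-in for `wallet.csv`, not the real file):

```python
import io
import pandas as pd

# Two rows copied from the notebook output. The leading empty header cell is
# an assumption about how wallet.csv was written (index saved as a column).
csv_text = (
    ",wallet_id\n"
    "0,0x0039f22efb07a647557c7c5d17854cfd6d489ef3\n"
    "1,0x06b51c6882b27cb05e712185531c1f74996dd988\n"
)

# index_col=0 makes the first column the row index instead of "Unnamed: 0".
wallets = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(wallets.columns.tolist())
```

With the real file, the same call would be `pd.read_csv("/content/wallet.csv", index_col=0)`.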

# Feature Selection:
We needed a small set of high-value features that can be computed from the address string alone, so we chose three:

- Address entropy: the Shannon entropy of the hex characters, which measures how random the address looks.
- Hex character diversity: the number of distinct hex characters used (at most 16).
- Pattern detection: a 0/1 flag for suspicious formats, such as an address made up entirely of one character or containing more than 20 `0`s or `f`s.
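
To make the three features concrete, here is a small self-contained sketch that restates the notebook's functions (using `math.log2` instead of NumPy) and applies them to the first address in the data and to a hypothetical all-zero address, which is not part of the dataset:

```python
import math
from collections import Counter

def address_entropy(address):
    """Shannon entropy (bits) of the hex characters after '0x'."""
    chars = address[2:].lower()
    counts = Counter(chars)
    return -sum((c / len(chars)) * math.log2(c / len(chars)) for c in counts.values())

def hex_diversity(address):
    """Number of distinct hex characters after '0x'."""
    return len(set(address[2:].lower()))

def repeating_patterns(address):
    """1 if the address is one repeated character or is dominated by 'f'/'0'."""
    addr = address[2:].lower()
    if all(c == addr[0] for c in addr):
        return 1
    if addr.count("f") > 20 or addr.count("0") > 20:
        return 1
    return 0

first_wallet = "0x0039f22efb07a647557c7c5d17854cfd6d489ef3"  # row 0 of the data
all_zero = "0x" + "0" * 40  # hypothetical address, for illustration only

for addr in (first_wallet, all_zero):
    print(round(address_entropy(addr), 6), hex_diversity(addr), repeating_patterns(addr))
```

The first address uses all 16 hex characters and is not flagged, while the all-zero address has zero entropy, a diversity of 1, and is flagged.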

# Normalization:
We applied Min-Max scaling to the entropy and diversity features so that each lies in the range 0 to 1. The suspicious-pattern flag is already 0 or 1, so it is left unscaled.
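
Min-Max scaling divides by `max - min`, which is zero when every wallet has the same feature value. The sketch below shows one possible guard; `min_max_scale` is a hypothetical helper, not part of the notebook, and returning all zeros for a constant column is an assumption about the desired behavior:

```python
import pandas as pd

def min_max_scale(series: pd.Series) -> pd.Series:
    """Scale a numeric series to [0, 1]; a constant series maps to all zeros."""
    span = series.max() - series.min()
    if span == 0:
        # Assumed fallback: avoid 0/0 -> NaN when all values are equal.
        return pd.Series(0.0, index=series.index)
    return (series - series.min()) / span

diversity = pd.Series([16, 15, 15, 14])  # small illustrative sample
scaled = min_max_scale(diversity)
constant = min_max_scale(pd.Series([15, 15, 15]))
print(scaled.tolist(), constant.tolist())
```

The scaled values depend on the minimum and maximum of the column being scaled, so a small sample like this one will not reproduce the `diversity_norm` values computed over the full dataset.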

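# Scoring:
The normalized features are combined in a single weighted formula: 0.5 × entropy plus 0.4 × diversity, minus a 0.9 penalty when the suspicious-pattern flag is set. The result is Min-Max scaled again and multiplied by 1000 to give an integer score from 0 to 1000, so flagged or low-entropy addresses receive the lowest scores.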
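
As a worked illustration of the scoring arithmetic, the sketch below applies the same weights to two rows taken from the notebook output plus one hypothetical flagged row. Because the final rescaling uses the minimum and maximum of whatever rows are present, these scores differ from the ones in the full run:

```python
import pandas as pd

sample = pd.DataFrame({
    # Rows 0 and 1 come from the notebook output; row 2 is a hypothetical
    # flagged address added only to show the effect of the penalty.
    "entropy_norm": [0.887961, 0.581942, 0.0],
    "diversity_norm": [1.0, 0.666667, 0.0],
    "is_suspicious": [0, 0, 1],
})

# Same weights as the notebook: reward entropy and diversity, penalize the flag.
raw = (
    0.5 * sample["entropy_norm"]
    + 0.4 * sample["diversity_norm"]
    - 0.9 * sample["is_suspicious"]
)

# Rescale to 0-1000 and truncate to integers, as in the notebook.
scores = ((raw - raw.min()) / (raw.max() - raw.min()) * 1000).astype(int)
print(scores.tolist())
```

The flagged row has a raw score of -0.9 and lands at 0, which also stretches the range and pushes the unflagged rows toward the top of the scale.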

# Justification:
These indicators serve as rough estimates of whether a wallet address looks algorithmically generated or otherwise suspicious, without requiring any transaction data.