A full-stack AML detection platform that combines a trained machine learning fraud classifier with an Ethereum smart contract to flag and immutably log suspicious financial transactions. Built on Django with Web3.py integration for on-chain interaction.
This system mirrors how modern financial institutions are beginning to augment compliance operations with AI. It combines two independent detection mechanisms:
- ML-based fraud classifier — A Random Forest model trained on real Ethereum transaction data that scores wallet behavior and predicts fraudulent activity
- On-chain smart contract: A Solidity contract deployed to the Ethereum network that monitors live transfers and emits `SuspiciousTransaction` events when threshold violations are detected
Both layers feed into a Django backend that serves predictions and maintains a local compliance record.
Ethereum Network
│
▼
AntiMoneyLaundering.sol ──── SuspiciousTransaction events
│
▼
Web3.py Integration (Django)
│
├── ML Inference Layer (finalized_model.sav)
│ └── Random Forest Classifier
│ └── Trained on Ethereum Fraud Detection Dataset
│
└── Django REST API
└── SQLite / PostgreSQL
- Random Forest fraud classifier trained on 10,000+ Ethereum wallet transactions with feature engineering, outlier removal, and correlation analysis
- Solidity smart contract with deposit, withdraw, and transfer functions; emits on-chain events for transactions exceeding a configurable ETH threshold
- Smart contract compilation and deployment pipeline using `py-solc-x` and Web3.py
- Django backend serving ML predictions via REST API
- Model comparison — evaluated Logistic Regression, SVM (RBF kernel with GridSearchCV), and Random Forest; Random Forest selected as final model based on superior recall and F1 on fraud class
- PCA experimentation — dimensionality reduction tested against full feature set; full features retained for better performance
- ROC curve analysis for threshold tuning
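
The ROC-based threshold tuning mentioned in the last bullet can be sketched as follows. This is a minimal illustration on synthetic data, not the notebook's code: the `make_classification` dataset, the validation split, and the use of Youden's J statistic to choose the cut-off are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_curve
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for the wallet features (assumption).
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.8, 0.2], random_state=101
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=101
)

model = RandomForestClassifier(n_estimators=25, max_features=5, random_state=101)
model.fit(X_train, y_train)

# Probability of the positive (fraud) class for each validation sample.
fraud_scores = model.predict_proba(X_val)[:, 1]

# One (FPR, TPR) point per candidate threshold.
fpr, tpr, thresholds = roc_curve(y_val, fraud_scores)

# Youden's J = TPR - FPR; take the threshold that maximises it (assumed criterion).
best_idx = int(np.argmax(tpr - fpr))
best_threshold = thresholds[best_idx]

# Apply the tuned threshold instead of the default 0.5 cut-off.
y_pred = (fraud_scores >= best_threshold).astype(int)
```

Choosing the threshold on a validation split rather than the final test set keeps the reported test metrics unbiased. A criterion that weights recall more heavily may suit AML screening better than Youden's J.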
| Layer | Technology |
|---|---|
| ML & Data | Python, scikit-learn, Pandas, NumPy, Matplotlib, Seaborn, Plotly |
| Blockchain | Solidity, Web3.py, py-solc-x, Infura |
| Backend | Django 4.1, Django REST Framework, Web3.py |
| Database | SQLite (dev), PostgreSQL-ready |
| Model Persistence | Pickle |
- Source: Ethereum Fraud Detection Dataset (Kaggle)
- Target: `FLAG`, a binary label indicating fraudulent wallet activity
- Dropped zero-variance features and highly correlated feature pairs (threshold > 0.9)
- Mean imputation for numeric nulls; mode imputation for categorical nulls
- Label encoding for ERC20 token type categorical features
- MinMax normalization (0–1 range) applied to train/test splits independently
- Outlier removal on extreme values in `Time Diff` and `Avg min between received tnx`
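
Several of the preprocessing steps above can be sketched with pandas. This is a rough illustration under stated assumptions, not the notebook's implementation: the helper names and the toy columns are hypothetical, only four of the listed steps are covered (zero-variance drop, correlation filter, imputation, min-max scaling), and the scaling is shown on a single frame rather than on separate train/test splits.

```python
import numpy as np
import pandas as pd


def drop_zero_variance(df):
    """Drop numeric columns whose value never changes."""
    numeric = df.select_dtypes(include="number")
    constant = [c for c in numeric.columns if numeric[c].nunique(dropna=True) <= 1]
    return df.drop(columns=constant)


def drop_highly_correlated(df, threshold=0.9):
    """For each pair with |corr| > threshold, drop the later column."""
    corr = df.select_dtypes(include="number").corr().abs()
    # Keep only the upper triangle so each pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [c for c in upper.columns if (upper[c] > threshold).any()]
    return df.drop(columns=to_drop)


def impute(df):
    """Fill numeric nulls with the mean and other nulls with the mode."""
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_numeric_dtype(out[col]):
            out[col] = out[col].fillna(out[col].mean())
        else:
            out[col] = out[col].fillna(out[col].mode().iloc[0])
    return out


def min_max(df):
    """Scale numeric columns to the 0-1 range."""
    out = df.copy()
    numeric = out.select_dtypes(include="number").columns
    low = out[numeric].min()
    span = (out[numeric].max() - low).replace(0, 1)  # avoid division by zero
    out[numeric] = (out[numeric] - low) / span
    return out


# Toy frame with hypothetical column names.
toy = pd.DataFrame({
    "sent_tnx": [1.0, 2.0, 3.0, 4.0],
    "sent_tnx_copy": [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with sent_tnx
    "constant": [0.0, 0.0, 0.0, 0.0],       # zero variance
    "avg_value": [10.0, np.nan, 30.0, 5.0],
    "token_type": ["A", None, "A", "B"],
})

clean = min_max(impute(drop_highly_correlated(drop_zero_variance(toy))))
```

On the toy frame, `constant` and `sent_tnx_copy` are removed, the missing `avg_value` is filled with the column mean, and the missing `token_type` is filled with the most frequent value.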
| Model | Notes |
|---|---|
| Logistic Regression | Baseline; lower recall on fraud class |
| SVM (RBF, C=9) | GridSearchCV tuned; slower, marginal gains |
| Random Forest | Selected — best F1 and recall on fraud class |
`RandomForestClassifier(n_estimators=25, max_features=5, random_state=101)`

- Persisted to `finalized_model.sav` via Pickle
- Loaded at runtime by Django for inference
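
The persist-and-load cycle can be illustrated with a short round trip. This is a sketch under assumptions: the training data is random toy data, the file is written to a temporary directory, and the Django view that would call `predict` is not shown because its code is not part of this README.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Toy training data standing in for the engineered wallet features (assumption).
rng = np.random.default_rng(101)
X = rng.random((200, 8))
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)  # arbitrary toy label

# Same hyperparameters as listed above.
model = RandomForestClassifier(n_estimators=25, max_features=5, random_state=101)
model.fit(X, y)

# Persist the fitted model; a temporary directory keeps the example side-effect free.
model_path = Path(tempfile.mkdtemp()) / "finalized_model.sav"
with open(model_path, "wb") as fh:
    pickle.dump(model, fh)

# Later, for example once at backend start-up, load the model for inference.
with open(model_path, "rb") as fh:
    loaded = pickle.load(fh)

sample = X[:5]
predictions = loaded.predict(sample)                      # 0 = normal, 1 = fraud
fraud_probabilities = loaded.predict_proba(sample)[:, 1]  # score per wallet
```

Pickle files execute code on load, so the model file should only come from a trusted source, and it is safest to load it with the same scikit-learn version that produced it.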
File: Anti-Money-Laundering.sol
contract AntiMoneyLaundering {
address public admin;
uint public threshold; // Default: 100 ETH
mapping(address => uint) public balances;
function transfer(address to, uint amount) public { ... }
event SuspiciousTransaction(address from, address to, uint amount);
}

The contract emits a `SuspiciousTransaction` event whenever a sender's post-transfer balance exceeds the configured threshold, creating an immutable on-chain audit trail that cannot be altered or deleted.
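
The flagging rule described above can be mirrored off-chain, for instance to reason about expected events before deploying. The following Python model is only a sketch: the body of `transfer` is elided in the excerpt, so the `LedgerSketch` class, the insufficient-balance check, and the use of wei as the unit are assumptions, and the rule is implemented exactly as the paragraph states it.

```python
from dataclasses import dataclass, field

WEI_PER_ETH = 10**18


@dataclass
class LedgerSketch:
    """Hypothetical in-memory stand-in for the contract's state."""

    threshold_wei: int = 100 * WEI_PER_ETH  # default threshold: 100 ETH
    balances: dict = field(default_factory=dict)
    events: list = field(default_factory=list)

    def deposit(self, account, amount_wei):
        self.balances[account] = self.balances.get(account, 0) + amount_wei

    def transfer(self, sender, recipient, amount_wei):
        # Assumed guard; the real contract's checks are not shown in the excerpt.
        if self.balances.get(sender, 0) < amount_wei:
            raise ValueError("insufficient balance")
        self.balances[sender] -= amount_wei
        self.balances[recipient] = self.balances.get(recipient, 0) + amount_wei
        # Rule as stated in the text: compare the sender's balance after the
        # transfer against the threshold and record an event if it is exceeded.
        if self.balances[sender] > self.threshold_wei:
            self.events.append(
                ("SuspiciousTransaction", sender, recipient, amount_wei)
            )


ledger = LedgerSketch()
ledger.deposit("alice", 150 * WEI_PER_ETH)
ledger.transfer("alice", "bob", 10 * WEI_PER_ETH)  # alice keeps 140 ETH: flagged
ledger.transfer("alice", "bob", 60 * WEI_PER_ETH)  # alice keeps 80 ETH: not flagged
```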
Compiled using py-solc-x (Solidity 0.8.0) and deployed via Web3.py to Infura.
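
A compilation step along these lines might look like the sketch below. It is not the notebook's code: the helper names `build_solc_input` and `compile_contract` are hypothetical, and the output selection is an assumption. `install_solc` and `compile_standard` are py-solc-x functions, imported lazily so the input builder can be used without the compiler installed.

```python
def build_solc_input(file_name, source):
    """Build a Solidity standard-JSON input that requests ABI and bytecode."""
    return {
        "language": "Solidity",
        "sources": {file_name: {"content": source}},
        "settings": {
            "outputSelection": {"*": {"*": ["abi", "evm.bytecode.object"]}}
        },
    }


def compile_contract(file_name, contract_name, source, solc_version="0.8.0"):
    """Compile one contract and return (abi, bytecode_hex)."""
    # Lazy import: only needed when actually compiling.
    from solcx import compile_standard, install_solc

    install_solc(solc_version)  # downloads the compiler if it is missing
    output = compile_standard(
        build_solc_input(file_name, source), solc_version=solc_version
    )
    contract = output["contracts"][file_name][contract_name]
    return contract["abi"], contract["evm"]["bytecode"]["object"]


# Building the compiler input needs no network access or compiler binary.
example_input = build_solc_input("Example.sol", "contract Example {}")
```

Calling `compile_contract("Anti-Money-Laundering.sol", "AntiMoneyLaundering", source)` with the contract source as a string would return the ABI and bytecode, which can then be passed to Web3.py's `w3.eth.contract(abi=..., bytecode=...)` for deployment.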
├── model_training.ipynb # Full ML pipeline: EDA → preprocessing → training → evaluation
├── AML_Smart_Contract.ipynb # Smart contract compilation and Ethereum deployment
├── Anti-Money-Laundering.sol # Solidity contract source
├── compiled_code.json # ABI + bytecode output
├── finalized_model.sav # Trained Random Forest model (Pickle)
├── manage.py # Django entry point
├── requirements.txt # Python dependencies
└── db.sqlite3 # Local development database
- Python 3.9+
- Node.js (optional, for local Ethereum node)
- An Infura account for Ethereum RPC access
pip install -r requirements.txt
python manage.py migrate
python manage.py runserver

Open AML_Smart_Contract.ipynb and run all cells. Make sure to update the Infura provider URL with your own project key before deploying.
Open model_training.ipynb and run all cells. The final cell saves finalized_model.sav to disk.
- End-to-end ML pipeline — raw data ingestion through to a production-persisted model
- Blockchain + AI integration — two independent fraud detection mechanisms working in tandem
- On-chain immutability — suspicious transaction events written permanently to the Ethereum ledger
- Smart contract development — Solidity authoring, compilation, ABI extraction, and deployment via Python
- Compliance-aware design — threshold-based flagging mirrors real-world AML regulatory frameworks (FATF, FinCEN)
Vagif A. — Ethereum Fraud Detection Dataset on Kaggle.
MIT License