This repository focuses on clustering wallet addresses for cryptocurrencies. The project explores two distinct approaches based on different types of cryptocurrencies:
- UTXO-based cryptocurrencies
- Account-based cryptocurrencies
Work has started on the account-based approach, using Ethereum data from a public BigQuery dataset.
🚧 Under Construction 🚧
This repository is a work in progress. The development is ongoing, and contributions or suggestions are welcome.
The primary goal of this project is to explore and demonstrate methods for clustering wallet addresses in the crypto ecosystem, providing insights into transaction patterns, user behavior, and network analysis.
- BigQuery: For data extraction from the public Ethereum dataset.
- Python: For data processing, analysis, and modeling.
- Google Cloud CLI: Used for authentication and interaction with Google services. Documentation available here.
The Ethereum data used in this project is sourced from the public dataset available on BigQuery. Access the dataset here.
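As a rough illustration of the extraction step, the sketch below builds a query for a date window and optionally runs it. It makes several assumptions that this README does not confirm: the table name `bigquery-public-data.crypto_ethereum.transactions`, the selected columns, and the helper names `build_transactions_query` and `fetch_transactions`, which are hypothetical. Running the query also assumes the `google-cloud-bigquery` package is installed and that credentials have been set up through the Google Cloud CLI.

```python
from datetime import date

# Assumption: the README does not name the dataset; this table is a guess
# at the public Ethereum dataset it refers to.
ASSUMED_TABLE = "bigquery-public-data.crypto_ethereum.transactions"


def build_transactions_query(start_date: str, end_date: str, limit: int = 1000) -> str:
    """Return SQL selecting basic transfer fields for an inclusive date window."""
    # Parsing the dates rejects malformed input before it reaches the SQL text.
    start = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)
    if start > end:
        raise ValueError("start_date must not be after end_date")
    if limit <= 0:
        raise ValueError("limit must be positive")
    return (
        "SELECT from_address, to_address, value, block_timestamp\n"
        f"FROM `{ASSUMED_TABLE}`\n"
        f"WHERE DATE(block_timestamp) BETWEEN '{start.isoformat()}' AND '{end.isoformat()}'\n"
        "  AND to_address IS NOT NULL\n"
        f"LIMIT {int(limit)}"
    )


def fetch_transactions(start_date: str, end_date: str, limit: int = 1000):
    """Run the query and return rows as dictionaries (hypothetical helper)."""
    # Imported lazily so the query builder works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()
    sql = build_transactions_query(start_date, end_date, limit)
    return [dict(row.items()) for row in client.query(sql).result()]


if __name__ == "__main__":
    print(build_transactions_query("2023-01-01", "2023-01-07", limit=100))
```

Keeping the date window and `LIMIT` small while experimenting helps limit the amount of data each query scans.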
- account-based-clustering/: Directory containing the clustering process for account-based cryptocurrencies
- utxo-based-clustering/: Directory containing the clustering process for UTXO-based cryptocurrencies
- requirements.txt: Python requirements
- Data Preprocessing: Cleaning and preparing data from the Ethereum dataset.
- Feature Engineering: Creating variables to improve the clustering model.
- Clustering: Implementation of the K-means algorithm. The Elbow method will be used to determine the optimal number of clusters.
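The clustering and Elbow steps above can be sketched in plain Python. This is a minimal illustration, not the project's implementation: the function names `kmeans` and `elbow_curve` and the toy feature vectors are hypothetical, and real per-address features would normally be scaled before clustering.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50, seed=0):
    """Lloyd's algorithm. Returns (centroids, labels, inertia)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct positions in the data.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    # Inertia: total squared distance from each point to its centroid.
    inertia = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, inertia


def elbow_curve(points, k_values, seed=0):
    """Map each candidate k to the inertia reached by k-means."""
    return {k: kmeans(points, k, seed=seed)[2] for k in k_values}


# Toy feature vectors standing in for engineered per-address features.
toy_points = [
    (0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
    (5.0, 5.0), (5.1, 5.2), (5.2, 5.1),
    (10.0, 0.0), (10.1, 0.2), (10.2, 0.1),
]

if __name__ == "__main__":
    for k, inertia in elbow_curve(toy_points, range(1, 6)).items():
        print(k, round(inertia, 3))
```

The Elbow method then looks for the value of k after which inertia stops dropping sharply. Because the result depends on the random initialisation, it is common to repeat the run with several seeds and keep the lowest inertia for each k.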
- Finalize feature engineering to enhance the K-means clustering model.
- Implement the Elbow method to identify the optimal number of clusters.
- Develop clustering heuristics for UTXO-based cryptocurrencies.
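For the UTXO side, one widely used starting point is the common-input-ownership heuristic: addresses that appear together as inputs of the same transaction are assumed to belong to the same entity. The sketch below is only an illustration of that idea using a union-find structure. It is not necessarily the heuristic this project will adopt, and the function name `cluster_by_common_inputs` and the input format (one list of input addresses per transaction) are hypothetical.

```python
def cluster_by_common_inputs(transactions):
    """Group addresses that co-occur as inputs of a transaction.

    `transactions` is an iterable of lists of input addresses (assumed format).
    Returns a sorted list of clusters, each a sorted list of addresses.
    """
    parent = {}

    def find(addr):
        # Register unseen addresses as their own root.
        parent.setdefault(addr, addr)
        while parent[addr] != addr:
            parent[addr] = parent[parent[addr]]  # path halving
            addr = parent[addr]
        return addr

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for inputs in transactions:
        if not inputs:
            continue
        first = inputs[0]
        find(first)  # make sure single-input transactions are recorded
        for other in inputs[1:]:
            union(first, other)

    clusters = {}
    for addr in list(parent):
        clusters.setdefault(find(addr), set()).add(addr)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    example = [["a", "b"], ["b", "c"], ["d"], ["e", "f"]]
    print(cluster_by_common_inputs(example))
```

The heuristic is known to produce false merges for collaborative transactions such as CoinJoin, so any real use would need additional filtering.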
Contributions are welcome! Please feel free to open an issue or submit a pull request with your suggestions or improvements.
- BigQuery Public Datasets: For providing comprehensive and accessible blockchain data.
This project is licensed under the MIT License. See the LICENSE file for details.
Note: This README will be updated regularly as the project progresses.