CCD is a tool to detect Type I (Exact Clone), Type II (Renamed Clone), and Type III (Near-Miss Clone) code clones across smart contracts written in the Solidity programming language. CCD can handle complete as well as incomplete code (i.e., code snippets). This repository includes the code, data, tools, and evaluation results from our paper "Analyzing the Impact of Copying-and-Pasting Vulnerable Solidity Code Snippets from Question-and-Answer Websites".
The figure above depicts the overall architecture of CCD. It generates fingerprints of Solidity source code snippets using ssdeep as its piecewise hashing function. It then follows a hybrid approach to match similar code fragments: it first retrieves similar fingerprints, indexed in an Elasticsearch database, by n-gram similarity, and then computes an order-independent similarity score on the returned records to match code snippets to indexed smart contracts.
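The two-stage matching described above can be illustrated with a minimal Python sketch. It is not CCD's implementation: the fingerprints are made-up strings rather than ssdeep output, a plain dictionary stands in for the Elasticsearch index, and the function names, the `.` chunk separator, and the scoring formula are all assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical toy index: contract name -> made-up fingerprint.
# Each "."-separated chunk stands in for the hash of one code fragment.
TOY_INDEX = {
    "TokenA": "abcd.efgh.ijkl",
    "TokenB": "ijkl.abcd.efgh",  # same chunks as TokenA, different order
    "Other": "mnop.qrst.uvwx",
}


def ngrams(text, n=3):
    """Return the set of character n-grams of a string."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def ngram_candidates(query_fp, index, n=3, min_shared=2):
    """Stage 1 (stand-in for the Elasticsearch lookup): keep every indexed
    fingerprint that shares at least `min_shared` n-grams with the query."""
    query_grams = ngrams(query_fp, n)
    return [
        name
        for name, fp in index.items()
        if len(query_grams & ngrams(fp, n)) >= min_shared
    ]


def order_independent_score(fp_a, fp_b, sep="."):
    """Stage 2: compare fingerprint chunks as multisets, so that reordering
    the fragments does not change the score (assumed formula)."""
    chunks_a = Counter(fp_a.split(sep))
    chunks_b = Counter(fp_b.split(sep))
    shared = sum((chunks_a & chunks_b).values())
    total = max(sum(chunks_a.values()), sum(chunks_b.values()))
    return shared / total if total else 0.0


def match(query_fp, index, threshold=0.5):
    """Run both stages and return (name, score) pairs, best score first."""
    results = []
    for name in ngram_candidates(query_fp, index):
        score = order_independent_score(query_fp, index[name])
        if score >= threshold:
            results.append((name, score))
    return sorted(results, key=lambda r: r[1], reverse=True)


if __name__ == "__main__":
    # A snippet containing two of the three fragments, in swapped order.
    for name, score in match("efgh.abcd", TOY_INDEX):
        print(name, round(score, 2))
```

The cheap n-gram filter narrows the search to a few candidates, so the more expensive pairwise score only runs on records that already look similar. In this sketch, `TokenA` and `TokenB` receive the same score for the same query because the second stage ignores chunk order.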
A container image with all the dependencies is available on Docker Hub as christoftorres/contract-clone-detector.
To run the container, install Docker and run:
    docker pull christoftorres/contract-clone-detector && docker run -it christoftorres/contract-clone-detector

To build and run the image locally instead:

    docker build -t contract-clone-detector .
    docker run -it contract-clone-detector:latest

To install the dependencies manually instead, install ssdeep, ANTLR, and Elasticsearch.

ssdeep with Homebrew:

    brew install ssdeep

ssdeep with apt:

    sudo apt-get install build-essential libffi-dev python3 python3-dev python3-pip libfuzzy-dev
    sudo apt-get install ssdeep

ANTLR with Homebrew:

    brew install antlr

ANTLR with apt:

    sudo apt-get install antlr4

Elasticsearch with Homebrew:

    brew tap elastic/tap
    brew install elastic/tap/elasticsearch-full

Elasticsearch with apt:

    curl -fsSL https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elastic.gpg
    echo "deb [signed-by=/usr/share/keyrings/elastic.gpg] https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-7.x.list
    sudo apt-get update
    sudo apt-get install elasticsearch

Python dependencies:

    cd CCD
    python3 -m pip install -r requirements.txt

    # Example generate fingerprint
    python3 CCD.py -g example.sol

    # Example store fingerprint
    service elasticsearch start
    python3 CCD.py -s example.sol --elasticsearch-index test

    # Example match fingerprint
    service elasticsearch start
    python3 CCD.py -m example.sol --elasticsearch-index test

    cd evaluation

    # Install SmartEmbed and Python dependencies
    docker pull christoftorres/smartembed
    python3 -m pip install -r requirements.txt

    # Evaluate SmartEmbed
    python3 evaluate_smartembed.py

    # Evaluate CCD
    python3 evaluate_ccd.py

    # Compare results
    python3 compare_results.py

    # Compare parameters
    python3 compare_parameters.py

If using this repository for research, please cite as:

@inproceedings{
copypastesolidity,
address={Madrid, Spain},
title={Analyzing the Impact of Copying-and-Pasting Vulnerable Solidity Code Snippets from Question-and-Answer Websites},
ISBN={979-8-4007-0592-2},
DOI={10.1145/3646547.3688437},
booktitle={Proceedings of the 2024 ACM Internet Measurement Conference (IMC '24)},
publisher={Association for Computing Machinery},
author={Weiss, Konrad and Ferreira Torres, Christof and Wendland, Florian},
year={2024}
}
