A study on two EVM-based blockchains, namely Ethereum (ETH) and Binance Smart Chain (BSC). It explores the existence of vulnerabilities in smart contracts deployed on these two chains, using automated analysis tools for EVM bytecode. Our codebase artifact is encapsulated in the Centaur framework. The framework also extends SmartBugs to analyse the dataset of smart contract bytecodes with multiple analysis tools, and it is easily extensible to support other EVM chains.
- Prerequisites
- Installation
- Step-by-Step Analysis Procedure
- Running the SmartBugs Framework
- Parsing the Analysis Tools Results
- Centaur Usage
- Analysis Tools
- Vulnerability Categorisation
- Using the SQLite3 Database
- Experiment Setup
- License
Before you begin, ensure you have met the following requirements:
- You have installed all the required Python and Shell dependencies with:
pip install -r requirements.txt
and
apt-get install -y cowsay figlet
- You are using Python >= 3.8 and Golang == 1.17
- You have created an account on Etherscan and BscScan and generated an API key on both
- You have installed Docker and docker-compose
- You are using a UNIX-like OS
Note: We recommend using Centaur via its Docker image (Option 2) as it encapsulates all the required dependencies and allows running the framework without needing to install anything on your system.
Option 1: Once all the above prerequisites are met, you can clone this repository with:
git clone https://github.com/mchara01/centaur.git
Option 2: Use our Docker image
docker pull mchara01/centaur:1.0
Once installed, the Centaur CLI framework will be available for use.
The following sections constitute the steps required to replicate the process of analysing the EVM bytecode of smart contracts.
- Prepare the local MariaDB database running in a Docker container:
- Create the files db_password.txt and db_root_password.txt containing the passwords for a normal user and root respectively.
- Then, start the container using:
docker-compose -f build/database/docker-compose.yaml up -d
- Afterwards, create the two tables in the database, where the collected data will be inserted, with:
docker exec -it <CONTAINER_ID> mysql -u root -p'<ROOT_PASSWORD>' -P 3306 -h 127.0.0.1 < scripts/database/schema.sql
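To verify that the container is up and the schema was applied, you can run a quick connectivity check from Python. This is a minimal sketch assuming the pymysql package is available; the database name `centaur` is a hypothetical placeholder — use the name defined in scripts/database/schema.sql.

```python
# Sanity check that the MariaDB container is reachable and the tables exist.
# Assumes `pymysql`; replace "centaur" with the database name from schema.sql.
import pymysql

conn = pymysql.connect(
    host="127.0.0.1",
    port=3306,
    user="root",
    password=open("db_root_password.txt").read().strip(),
    database="centaur",  # hypothetical name; check scripts/database/schema.sql
)
with conn.cursor() as cur:
    cur.execute("SHOW TABLES")
    for (table,) in cur.fetchall():
        print(table)
conn.close()
```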
- Perform random sampling on the blocks of the desired EVM chain (ETH or BSC). The generated block numbers are stored in a file for the crawler to read from. Sampling size and output location are passed as arguments (a sketch of the sampling idea follows the command):
python scripts/utils/blockNumberGenerator.py --size 1000 --chain eth --output blockNumbersEth.txt
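For intuition, the core of the sampling step can be expressed in a few lines of Python. This is a sketch of the idea, not the repository script itself; the upper block bound shown is a hypothetical constant — in practice it should be the current chain head.

```python
# Sketch of the sampling idea behind blockNumberGenerator.py: draw `size`
# distinct block numbers uniformly at random and write them one per line.
import random

LATEST_ETH_BLOCK = 15_000_000  # hypothetical upper bound; use the chain head

def sample_blocks(size: int, latest: int, output: str) -> None:
    blocks = random.sample(range(1, latest + 1), size)
    with open(output, "w") as f:
        f.write("\n".join(str(b) for b in sorted(blocks)))

sample_blocks(1000, LATEST_ETH_BLOCK, "blockNumbersEth.txt")
```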
- Run the blockchain crawling script, which connects to the Ethereum and BSC archive nodes (their IPs and ports are declared as constants in the scripts) and extracts the contract addresses and bytecodes from the transactions of the blocks provided. The client (eth or bsc), the input file, and whether to use the tracer are provided as arguments (a conceptual sketch follows the commands):
go mod tidy
go run go-src/*.go --client eth --input data/block_samples/<latest_date>/blockNumbersEth.txt --tracer
To check only the connection to the archive node and the local database, execute:
go run go-src/*.go --client eth --check
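Conceptually, the crawler does the following for each sampled block, illustrated here in Python with web3.py rather than the repository's Go code; the node URL is a placeholder for your archive node.

```python
# Conceptual sketch of the crawl: for each sampled block, find the
# contract-creation transactions and fetch the deployed bytecode.
from web3 import Web3

w3 = Web3(Web3.HTTPProvider("http://<ARCHIVE_NODE_IP>:8545"))

def contracts_in_block(number: int):
    block = w3.eth.get_block(number, full_transactions=True)
    for tx in block.transactions:
        if tx["to"] is None:  # contract-creation transaction
            receipt = w3.eth.get_transaction_receipt(tx["hash"])
            address = receipt["contractAddress"]
            bytecode = w3.eth.get_code(address)
            yield address, bytecode.hex()

for address, code in contracts_in_block(15_000_000):
    print(address, code[:20], "...")
```

Note that contracts deployed by other contracts (internal calls) are not visible in this simple walk; that is likely where transaction tracing, enabled with the --tracer flag, comes in.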
- Crawl Etherscan or BscScan to gather any other missing data for the given smart contract addresses. An API key must be provided for this script to work (an illustrative query follows the command):
python scripts/crawl/mainCrawl.py --chain eth --apikey <ENTER_API_KEY_HERE> --output data/logs/results_eth.json --invalid data/logs/exceptions_eth.json
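As an illustration of the kind of request the crawler issues, the snippet below queries a contract's balance through the standard Etherscan API. The exact fields mainCrawl.py collects may differ; the address shown is a dummy value.

```python
# Illustrative Etherscan query; mainCrawl.py may collect different fields.
import requests

API = "https://api.etherscan.io/api"  # use https://api.bscscan.com/api for BSC

def fetch_balance(address: str, apikey: str) -> int:
    params = {"module": "account", "action": "balance",
              "address": address, "tag": "latest", "apikey": apikey}
    resp = requests.get(API, params=params, timeout=30)
    resp.raise_for_status()
    return int(resp.json()["result"])

print(fetch_balance("0x" + "00" * 20, "<ENTER_API_KEY_HERE>"))
```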
- At this point, the database is populated with all the required data. If you wish to back up the database, execute the following command (mysqldump must be installed first):
bash scripts/database/backup/db_backup.sh
If you need to restore the backup, use:
bash scripts/database/backup/db_restore.sh
Before using the above two scripts, make sure you first change the DB_BACKUP_PATH variable to match the location on your local file system.
- Extract the bytecodes from the database and write them to files on the file system. The smart contracts whose bytecodes are selected must satisfy at least one of the following conditions (a sketch of the selection follows the command):
- a balance > 0 or
- number of transactions > 0 or
- number of token transfers > 0
Execute the script that does this with:
python scripts/utils/bytecodeToFileCreator.py --chain eth
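The selection logic amounts to a single query plus a file write per contract, sketched below. Table and column names here are hypothetical, as is the database name; adapt them to scripts/database/schema.sql.

```python
# Sketch of bytecodeToFileCreator.py: select contracts with a balance,
# transactions, or token transfers, and dump each bytecode to a .hex file.
import pymysql

conn = pymysql.connect(host="127.0.0.1", port=3306, user="root",
                       password=open("db_root_password.txt").read().strip(),
                       database="centaur")  # hypothetical database name
with conn.cursor() as cur:
    cur.execute("""
        SELECT address, bytecode FROM contracts  -- hypothetical table/columns
        WHERE balance > 0 OR nr_transactions > 0 OR nr_token_transfers > 0
    """)
    for address, bytecode in cur.fetchall():
        with open(f"{address}.hex", "w") as f:
            f.write(bytecode)
conn.close()
```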
After finishing the above steps successfully, we have everything ready to run the SmartBugs framework and execute the EVM bytecode analysis tools on the bytecodes we have written to the local file system. We can do this using:
python smartbugs_bytecode/smartBugs.py --tool all --dataset eth_bc --bytecode
Please check the official repository of SmartBugs for more details on how to run the framework.
Note: Bear in mind that SmartBugs will execute 9 tools on every single contract from the corpus you provide. Thus, this particular step may take a significant amount of time to complete (in our case, approximately three days for 334 contracts). We recommend using a tool such as tmux, which keeps a session alive for long periods even after logging out of the machine running the framework.
Once SmartBugs has finished, a result.json is created for every contract in the smartbugs_bytecode/results/<TOOL_NAME> directory. To parse these results, we use the parser.py file in the scripts directory. This file is the main entry point for executing the per-tool result parsers that reside in the scripts/result_parsing directory.
To parse a tool's results, use:
python3 scripts/parser.py -t <TOOL_OF_CHOICE> -d <RESULT_DIRECTORY>
You can replace the <TOOL_OF_CHOICE> placeholder with all if you want to parse the results of every tool and print them on the screen.
The time each tool took to process all contracts can be found on the last line of results/logs/SmartBugs_<DATE>.log.
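To show what walking the result tree involves, here is a minimal sketch of locating and loading each result.json. The structure inside each file varies per tool, which is why the repository keeps one parser per tool under scripts/result_parsing.

```python
# Sketch: iterate over every result.json produced for a given tool.
import json
from pathlib import Path

def iter_results(tool: str, results_dir: str = "smartbugs_bytecode/results"):
    for path in Path(results_dir, tool).rglob("result.json"):
        with open(path) as f:
            yield path.parent.name, json.load(f)

for contract, result in iter_results("mythril"):
    print(contract, list(result.keys()))
```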
To make the above step-by-step procedure easier, we created the Centaur framework, which executes all of the above steps automatically while printing relevant messages. The easiest way to run Centaur is with Docker. To do this, we must first make sure we have the respective image, either by pulling it (see Option 2) or by building it with:
docker build --no-cache -t centaur:1.0 -f Dockerfile .
Then, we can run the Centaur script with:
docker run centaur ./run_main.sh <API_KEY>
Before running the above command, make sure you have set the desired values for the constants in the CONSTANTS DECLARATION section of the config file, as this file is sourced by the main script.
We gathered information about many smart contract security analysis tools, but only a subset could be included in our study, as the tools had to fulfil certain criteria. More specifically, we wanted tools that work on EVM bytecode (not only source code), are open source, and can execute without human interaction (e.g. no GUI). The tools that meet these requirements, along with their open-source repositories and paper links, are the following:
| # | Analysis Tool | Paper |
|---|---------------|-------|
| 1 | Conkas | link |
| 2 | HoneyBadger | link |
| 3 | MadMax | link |
| 4 | Maian | link |
| 5 | Mythril | link |
| 6 | Osiris | link |
| 7 | Oyente | link |
| 8 | Securify | link |
| 9 | Vandal | link |
For categorising the vulnerabilities found by the smart contract analysis tools, we extended the DASP10 taxonomy, replacing category Unknown Unknowns (10), which includes any vulnerabilities that do not fall into any other category, with each of these uncategorised vulnerabilities. The Short Address Attack category (9) is not discovered by any of the tools used in this study.
To enrich the vulnerability categorisation, we have also used the Smart Contract Weakness Classification (SWC Registry) to map the vulnerabilities found to SWC IDs, which can help users learn more (e.g. description, remediation, code examples) about a specific vulnerability.
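To illustrate the shape of this mapping, the snippet below pairs a few common tool findings with a DASP10 category and an SWC ID. It is an illustrative excerpt, not the exhaustive mapping used in the study; the finding labels are hypothetical examples of tool output names.

```python
# Illustrative (not exhaustive) mapping of tool findings to a DASP10
# category and an SWC ID, in the spirit of the study's categorisation.
FINDING_TO_CATEGORY = {
    "reentrancy":           ("Reentrancy (DASP 1)",                  "SWC-107"),
    "tx_origin":            ("Access Control (DASP 2)",              "SWC-115"),
    "integer_overflow":     ("Arithmetic (DASP 3)",                  "SWC-101"),
    "unchecked_call":       ("Unchecked Low Level Calls (DASP 4)",   "SWC-104"),
    "timestamp_dependency": ("Time Manipulation (DASP 8)",           "SWC-116"),
}
```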
For convenience, we migrated the data to an SQLite database file located in the database directory.
The schema of the database can be seen in the schema.pdf file. You can open the database from the command line with:
sqlite3 analysis.db
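The database can of course also be queried from Python instead of the sqlite3 shell. In this sketch the table and column names are hypothetical; consult schema.pdf for the actual schema.

```python
# Example of querying the SQLite database from Python.
import sqlite3

conn = sqlite3.connect("database/analysis.db")
for tool, count in conn.execute(
    "SELECT tool, COUNT(*) FROM findings GROUP BY tool"  # hypothetical table
):
    print(f"{tool}: {count} findings")
conn.close()
```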
We also created a script that automates the creation of the SQLite database file from the SmartBugs results. The script is located at scripts/database/create_db.py. To create the database file, follow the steps below:
rm -rf database/analysis.db
sqlite3 -init database/schema.sql database/analysis.db .quit
rm -rf csvs
python scripts/database/create_db.py csvs \
smartbugs_bytecode/results \
build/database/03_Jul_2022/sqlite/run1.sqlite3 \
build/database/02_Aug_2022/sqlite/run2.sqlite3
sqlite3 database/analysis.db < csvs/populate.sql
This artefact has been tested on a 64-bit Ubuntu 20.04 machine and an Apple M1 Mac mini running macOS 12.3.1, both with 8 cores and 16 GB of RAM. However, our Docker image can be used on any machine that has Docker installed.
This project is licensed under the terms of the MIT license, which can be found in the LICENSE file.
This license applies to the whole codebase except for the SmartBugs framework and the .hex and .sol files found in the data directory, which are publicly available and retain their original licenses.