
Solidity Benchmark Suites for Evaluating EVM Code-Analysis tools, and Code to Run them

Table of Contents

  • Introduction
  • Cloning
  • Project setup
  • Analysers setup
  • About Python Code to Run Benchmarks and Create Reports
  • Adding an additional analyser to benchmark
  • See also

Introduction

This repo aims to be a collection of benchmark suites for evaluating the precision of EVM code-analysis tools.

If you just want to see the reports created by running the various analyzers over the benchmark suites, you can find them here.

It started out as a fork of Suhabe Bugara's excellent benchmark suite, and this link shows the results of running some tools on these benchmarks as of May 2018.

Another benchmark suite we add as a git submodule is Trail of Bits' (Not So) Smart Contracts.

Reports from running runner/run.py and runner/report.py are here.

Cloning

Since this repository contains git submodules, clone it using the --recurse-submodules option. For example:

$ git clone --recurse-submodules https://github.com/EthereumAnalysisBenchmarks/evm-analyzer-bench-suites.git

Forgot to --recurse-submodules on clone?

If you forgot to pass --recurse-submodules to git clone, do the following:

$ git submodule init
Submodule 'benchmarks/Suhabe' (https://github.com/ConsenSys/evm-analyzer-benchmark-suite.git) registered for path 'benchmarks/Suhabe'
Submodule 'benchmarks/nssc' (https://github.com/trailofbits/not-so-smart-contracts.git) registered for path 'benchmarks/nssc'
$ git submodule update
Cloning into '/src/external-vcs/github/EthereumAnalysisBenchmarks/evm-analyzer-bench-suites/benchmarks/Suhabe'...
Cloning into '/src/external-vcs/github/EthereumAnalysisBenchmarks/evm-analyzer-bench-suites/benchmarks/nssc'...
...

Benchmarks have changed?

If benchmarks change and you want to pull in the new benchmark code, use git submodule update.

Project setup

The report programs are written in Python 3.6 or better. To install the Python packages they depend on, run:

$ pip install -r requirements.txt

Analysers setup

Analysers are not part of the project dependencies, so they must be installed manually. This keeps setup from failing when an analyser fails to install (there might be some failures) and lets the user select which specific analysers to benchmark instead of installing all of them.

Below is a list of supported analysers with installation instructions and known bugs that prevent installation or make an analyser unusable.

Mythril

Available on PyPI:

$ pip install mythril

Manticore

Available on PyPI:

$ pip install manticore

Known bugs:

  • ValueError: not allowed to raise maximum limit

    • Description: The latest version on PyPI (0.2.0) fails during analyser execution.
    • Workaround: The source code on the master branch already contains a fix. Until a new version is released on PyPI, Manticore must be installed from source:
    $ git clone https://github.com/trailofbits/manticore.git
    $ cd manticore/
    $ pip install .
  • Installation fails on macOS

    • Description: trailofbits/manticore#1075
    • Workaround: n/a. Installation has succeeded on some macOS systems, so it is worth attempting first.

About Python Code to Run Benchmarks and Create Reports

We assume the benchmark suite repositories are set up in git via the --recurse-submodules switch described above. With this in place, two Python programs are run in sequence to:

  • run an analyzer over a benchmark suite, and
  • generate HTML reports for a benchmark suite from the data gathered in the previous step

runner/run.py

Executes the specified benchmark suite. Input arguments:

  • -s, --suite Benchmark suite name. Default Suhabe. Currently supported: Suhabe, nssc
  • -a, --analyser Analyser to benchmark. If not set all supported analysers will be benchmarked. Currently supported: Mythril, Manticore
  • -v, --verbose More verbose output; use twice for the most verbose output
  • -t, --timeout Maximum time allowed on any single benchmark. Default 7 seconds
  • --files Print list of files in benchmark and exit

Description: The first program, runner/run.py, takes a number of command-line arguments; one of them is the name of a benchmark suite. From that it reads two YAML configuration files for the benchmark. The first YAML file has information about the benchmark suite: the names of the files in the benchmarks, whether each benchmark is supposed to succeed or fail with a vulnerability, and possibly other information. An example of such a YAML file is benchconf/Suhabe.yaml. The other YAML input configuration file is specific to the analyzer; for Mythril on the Suhabe benchmark, it is called benchconf/Suhabe-Mythril.yaml.

For each new benchmark suite, these two YAML files need to exist. The second one can start out as an empty file.
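If it helps, here is a minimal sketch (not part of this repo) of bootstrapping the analyzer-specific file for a hypothetical new suite; the suite and analyser names below are placeholders:

# Minimal sketch, assuming it is run from the repository root.
# "MySuite" and "Mythril" follow the benchconf/<suite>-<analyser>.yaml pattern.
from pathlib import Path

suite, analyser = "MySuite", "Mythril"            # hypothetical names
conf_dir = Path("benchconf")
conf_dir.mkdir(exist_ok=True)                     # benchconf/ already exists in the repo
(conf_dir / f"{suite}-{analyser}.yaml").touch()   # an empty file is a valid start
# The suite config itself, benchconf/MySuite.yaml, has to be written by hand,
# using benchconf/Suhabe.yaml as a template.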

The output is a YAML file stored in the benchdata folder, in a subfolder named after the benchmark suite. For example, the output of run.py for the Suhabe benchmark suite with the Mythril analyser will be a file called benchdata/Suhabe/Mythril.yaml.
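That data file is ordinary YAML, so it can also be inspected outside of report.py. A minimal sketch, assuming PyYAML is installed and run.py has already been run for the Suhabe suite with Mythril; it makes no assumptions about the file's exact schema:

# Minimal sketch: load the data file written by runner/run.py and show its shape.
import yaml

with open("benchdata/Suhabe/Mythril.yaml") as f:
    data = yaml.safe_load(f)

if isinstance(data, dict):
    print(sorted(data.keys()))      # top-level keys, whatever run.py emitted
elif isinstance(data, list):
    print(f"{len(data)} entries")
else:
    print(type(data).__name__)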

runner/report.py

Takes the aforementioned data YAML files and creates an HTML report from them. Input arguments:

  • -s, --suite Benchmark suite name. Default Suhabe.

Here is an example of complete report generation using Mythril on the Suhabe benchmark suite, giving Mythril at most 5 minutes to analyze any single benchmark:

$ python runner/run.py --timeout 300 --suite Suhabe --analyser Mythril
$ python runner/report.py --suite Suhabe

Adding an additional analyser to benchmark

Source code related to analysers is located in the runner/analysers/ module. To add support for a new analyser:

  • Implement a new class that inherits from BaseAnalyser (a hypothetical sketch follows this list)
  • Import the new class in the analysers module's __init__.py
  • Create configuration files with the expected output in benchconf/. Please check the existing analysers as examples.
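For orientation, here is a hypothetical sketch of such a class. The import path, the name attribute, and the run method below are assumptions made purely for illustration; the real BaseAnalyser interface should be copied from the existing Mythril and Manticore classes in runner/analysers/:

# Hypothetical sketch only -- the real BaseAnalyser interface may differ.
from runner.analysers.base import BaseAnalyser    # assumed module path


class MyToolAnalyser(BaseAnalyser):
    """Benchmarks the (hypothetical) MyTool analyser."""

    name = "MyTool"    # assumed attribute: the value passed to -a/--analyser

    def run(self, solidity_file, timeout):
        # Assumed hook: run the external tool on one benchmark file within
        # the given timeout and return findings for report.py to render.
        raise NotImplementedError("invoke the real analyser here")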

See also

Pull requests and suggestions are welcome.

Please create a new issue to discuss ideas.

The wiki has some commentary around the benchmarks.

See also Building an Ethereum security benchmark.
