This document describes the artifact for our OSDI '20 paper on Agamotto, a symbolic-execution based approach for systematically finding bugs in persistent memory applications and libraries. The remainder of this document describes how to run Agamotto and reproduce the key results from our paper.
If you are looking for some guidance on how to test new systems with Agamotto, look at the nvmbugs/testing_examples
README.
- A Google Sheet which contains our two appendices; the results of the bug study and the bugs we found in our evaluation.
- A link to the paper on the OSDI webpage.
VM Image: Already has all the dependencies installed. Runs on KVM QEMU.
- Username:
reviewer
- Password:
osdi2020_agamotto
(you should probably change this after you log in)
An easy method of downloading is to use gdown
(reference):
sudo -H pip3 install gdown
cd agamotto/artifact/vm_scripts
gdown 1UG29UOcxE6INndZqMX0G_9VSBrCd-xUg
targets/
: This directory contains the scripts required to run all the tests needed to reproduce the main results from the paper.
After running experiments, the results will be placed into the results/
directory
results/
: This directory contains the scripts required to parse the results generated from the main experiments.
vm_scripts/
: This directory contains scripts for building and running the evaluation VM.
Agamotto is open-source and is available at https://github.com/efeslab/agamotto.git.
We now provide an overview of how to build and run Agamotto. For a guide on how to compile applications to run on Agamotto, see Klee's tutorial on building coreutils.
The procedure for building Agamotto is substantially similar to building KLEE (the instructions for building KLEE can be found here.) The notable difference is that Agamotto requires LLVM 8 rather than LLVM 9.
sudo apt-get install libguestfs-tools
./vm_scripts/install-vm.sh
# configure
mkdir -p mnt; mkdir -p boot
sudo guestmount -a agamotto.qcow2 -i -o allow_other mnt
cp mnt/vmlinuz boot
cp mnt/initrd.img boot
sudo guestunmount mnt
# run and login
./run-vm.sh
# --- from inside the VM
cd ~/
git clone https://github.com/efeslab/agamotto.git agamotto
git submodule init
git submodule update
mkdir -p build
source build.env
cd build
# Proceed to the instructions on "Compiling" below
sudo apt-get install libguestfs-tools
# Download agamotto.qcow2 from the Google drive link
mv agamotto.qcow2 artifact/vm_scripts
# This is necessary to extract the vmlinuz and initrd.img for the run-vm.sh script
mkdir -p mnt; mkdir -p boot
sudo guestmount -a agamotto.qcow2 -i -o allow_other mnt
sudo cp mnt/vmlinuz boot
sudo cp mnt/initrd.img boot
sudo guestunmount mnt
./run-vm.sh
# The repo is in ~/agamotto
All the dependencies should be installed. You can then update the version of Agamotto and rebuild it by simply running:
cd ~/agamotto
source build.env
git checkout <version>
cd build/
make -j$(nproc)
run-vm.sh
will run the QEMU VM (use ^a-X
to kill QEMU). You can login directly
at the login prompt, but the rendering of the terminal may behave strangely.
To login using ssh (which is recommend), run the following:
ssh reviewer@localhost -p 5000 # this port can be configured in run-vm.sh
The following instructions are heavily adapted from the http://klee.github.io/build-llvm9/
# Install prerequisite packages:
sudo apt install -y python3-pip build-essential curl libcap-dev git cmake \
libncurses5-dev python-minimal python-pip unzip libtcmalloc-minimal4 \
libgoogle-perftools-dev libsqlite3-dev doxygen python3-pip libselinux1-dev \
clang-8 llvm-8 llvm-8-dev llvm-8-tools pandoc
sudo -H pip3 install wllvm tabulate lit
# Install STP (https://github.com/stp/stp)
sudo apt install -y cmake bison flex libboost-all-dev python perl minisat
git clone https://github.com/stp/stp
cd stp
git submodule init && git submodule update
mkdir build
cd build
cmake ..
cmake --build .
sudo cmake --install .
# Clone Agamotto, set up build directory
git clone https://github.com/efeslab/agamotto.git -b artifact_eval_osdi20
cd agamotto
git submodule init && git submodule update
mkdir -p build
# uCLibc
cd klee-uclibc
./configure --make-llvm-lib --with-llvm-config=$(which llvm-config-8)
make -j$(nproc)
cd ..
# LibCXX
LLVM_VERSION=8 SANITIZER_BUILD= BASE=$(realpath ./build/) REQUIRES_RTTI=1 \
DISABLE_ASSERTIONS=1 ENABLE_DEBUG=0 ENABLE_OPTIMIZED=1 ./scripts/build/build.sh libcxx
# Finally, build Agamotto
# -- build.env contains environment variables required for WLLVM
source build.env
cd build
cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo \
-DENABLE_CRC32_SUPPORT=ON \
-DENABLE_SOLVER_STP=ON \
-DENABLE_POSIX_RUNTIME=ON \
-DENABLE_KLEE_UCLIBC=ON \
-DKLEE_UCLIBC_PATH=$(realpath ../klee-uclibc) \
-DENABLE_UNIT_TESTS=OFF \
-DLLVM_CONFIG_BINARY=$(which llvm-config-8) \
-DLLVMCC=$(which clang-8) \
-DLLVMCXX=$(which clang++-8) \
-DCMAKE_C_COMPILER=$(which wllvm) \
-DCMAKE_CXX_COMPILER=$(which wllvm++) \
-DENABLE_KLEE_LIBCXX=ON \
-DKLEE_LIBCXX_DIR=$(realpath .)/libc++-install-8/ \
-DKLEE_LIBCXX_INCLUDE_DIR=$(realpath .)/libc++-install-8/include/c++/v1/ \
..
make -j$(nproc)
Running Agamotto is very similar
to running Klee (see this tutorial for how to run general programs on Klee). We provide the following extensions via command line arguments below (these arguments are
visible via klee --help
)
General arguments:
--search=nvm --nvm-heuristic-type=static
: This causes Klee to use Agamotto's PM-aware heuristic search.--custom-checkers=true
: This enables the use of semantic bug oracles.
POSIX runtime arguments (part of the symbolic environment model):
-sym-pmem <FILE> <NBYTES>
: Provide a symbolic persistent memory file of the specified size. The application-under-test can then callmmap(FILE)
to access a region of persistment memory. The file contents are symbolic on initialization and can represent any value.-sym-pmem-zeroed <FILE> <NBYTES>
: Same as-sym-pmem
, but the initial values are all 0.-sym-pmem-delay <FILE> <NBYTES>
: Same as the above, but tricks the application into thinking that it createdFILE
.-sym-pmem-init-from <INIT_FILE_PATH>
: Initialize the symbolic persistent memory file (named the same asINIT_FILE_PATH
) to concrete values based on values from the real file at the specified path.
There are four main results from Agamotto:
- The number of new bugs found.
- The performance of Agamotto's search strategy compared to Klee's default search strategy.
- The overhead of Agamotto's static analysis.
- The number of bugs reproduced from prior work.
First, we run all of the test cases that we used to symbolically explore our test applications, as we did for the evaluation in our paper.
The run_*_eval.sh
scripts are wrappers which execute Agamotto
with the appropriate command line parameters required.
cd artifact/targets
./run_pmdk_eval.sh
./run_nvm_direct_eval.sh
./run_memcached_eval.sh
./run_recipe_eval.sh
./run_redis_eval.sh
This creates Klee output files under artifact/results
. These are the same as running standard Klee tests, but additionally outputs all_pmem_err.csv
and all.pmem.err
, which provide tabular and textual descriptions of all
encountered PM bugs, respectively. In the artifact/results
directory, we provide a number of scripts
to parse Agamotto's output automatically.
For identifying new bugs, we run this script:
cd artifact/results
sudo -H pip3 install -r requirements.txt
./parse_bugs_and_perf.py
This should provide the following output:
{ 'memcached': {'Correctness': 1, 'Performance': 20, 'Total': 21},
'nvm-direct': {'Correctness': 7, 'Performance': 16, 'Total': 23},
'pmdk': {'Correctness': 2, 'Performance': 11, 'Total': 13},
'recipe': {'Correctness': 1, 'Performance': 13, 'Total': 14},
'redis': {'Correctness': 3, 'Performance': 1, 'Total': 4}}
Overall:
Total: 75
Correctness: 14
Performance: 21
Transient: 40
Run the experiments as performed for finding new bugs. If already run, there is no need to re-run them.
Then, run:
cd artifact/results
sudo -H pip3 install -r requirements.txt
./graph_performance.py pmdk
./graph_performance.py recipe
./graph_performance.py memcached
./graph_performance.py nvm-direct
./graph_performance.py redis
This generates {pmdk,recipe,memcached,nvm-direct,redis}.pdf
in the same directory. These
graphs represent the performance of Agamotto's search strategy
versus Klee's default search strategy, and should be
similar to the results shown in the evaluation in our paper.
Note that this experiment is dependent on the underlying CPU for timing and may vary.
Run the experiments as performed for finding new bugs. If already run, there is no need to re-run them.
Then, run:
cd artifact/results
sudo -H pip3 install -r requirements.txt
./get_offline_overhead.py
This should provide output similar to the following:
memcached-pm 2.198278
redis-pmem 19.661336
nvm-direct 0.021961
recipe 0.549605
pmdk 0.596972
Name: Offline Overhead (minutes), dtype: float64
Note that this experiment is dependent on the underlying CPU for timing and may vary. In addition, we have modified our alias analysis algorithm from our original submission, so the precise numbers may vary.
We run a different set of test cases to reproduce the bugs found by prior work (specifically PMTest and XFDetector).
cd artifact/targets
./run_prior_art_repro.sh
Then, run the script to count the bugs that were reproduced.
cd artifacts/results
sudo -H pip3 install -r requirements.txt
./count_reproduced_bugs.py
This should provide output similar to the following:
btree_map_insert_empty at nvmbugs/000_pmdk_btree_map/btree_map_buggy.c:150
btree_map_insert_empty at nvmbugs/000_pmdk_btree_map/btree_map_buggy.c:150
btree_map_insert_item at nvmbugs/000_pmdk_btree_map/btree_map_buggy.c:267
create_hashmap at nvmbugs/hashmap_atomic/hashmap_atomic.c:135
create_hashmap at nvmbugs/hashmap_atomic/hashmap_atomic.c:132
hm_atomic_insert at nvmbugs/hashmap_atomic/hashmap_atomic.c:263
create_hashmap at nvmbugs/hashmap_atomic/hashmap_atomic.c:137
create_hashmap at nvmbugs/hashmap_atomic/hashmap_atomic.c:138
initPersistentMemory at server.c:4029
initPersistentMemory at server.c:4029
Number of unique reproduced bugs: 5
If you would like to inspect
For example, if we want to look at the bottom for each stack frame for all bugs found in memcached-pmem
, we could run the following commands (assumes you have already run the memcached-pmem
test):
cd artifacts/results
python3
> import pandas as pd
> df = pd.read_csv('klee-memcached-static/all_pmem_errs.csv')
> df['StackFrame0'].unique()
array(['do_item_link_q at items.c:423', 'lru_pull_tail at items.c:1176',
'do_item_unlink_q at items.c:469', 'do_item_update at items.c:626',
'assoc_delete at assoc.c:197', 'do_item_remove at items.c:591',
'do_item_link at items.c:538',
'pmem_clwb at ../../src/../src/libpmem2/x86_64/flush.h:48', # This is the perf bug, which is at StackFrame4 at "do_item_link at items.c:516"
'do_slabs_free at slabs.c:540', 'do_slabs_free at slabs.c:550',
'do_item_unlink_q at items.c:470', 'do_item_link at items.c:541',
'do_item_link_q at items.c:421', 'assoc_insert at assoc.c:167',
'do_item_get at items.c:1023'], dtype=object)