SymSan (Symbolic Sanitizer) is an efficient concolic execution engine based on the Data-Floow Sanitizer (DFSan) framework. By modeling forward symbolic execution as a dynamic data-flow analysis and leveraging the time and space efficient data-flow tracking infrastructure from DFSan, SymSan imposes much lower runtime overhead than previous symbolic execution engines.
Similar to other compilation-based symbolic executor like SymCC,
SymSan uses compile-time instrumentation to insert symbolic execution logic into
the target program, and a runtime supporting library to maintain symbolic states
during execution.
To learn more, checkout our paper at USENIX Security 2022.
Because SymSan leverages the shadow memory implementation from LLVM's sanitizers, it has more strict dependency on the LLVM version. Right now only LLVM 12 is tested.
- Linux-amd64 (Tested on Ubuntu 24.04)
- LLVM 14.0.6: clang, libc++, libc++abi
Create a build directory and execute the following commands in it:
$ CC=clang-14 CXX=clang++-14 cmake -DCMAKE_INSTALL_PREFIX=/path/to/install -DCMAKE_BUILD_TYPE=Release /path/to/symsan/source
$ make
$ make installdocker build -t symsan .
The repo contains instrumented libc++ and libc++abi to support C++ programs.
To rebuild these libraries from source, execute the rebuild.sh script in the
libcxx directory.
NOTE: because the in-process solving module (solver/z3.cpp) uses Z3's C++ API
and STL containers, so itself depends on the C++ libs. Due to such dependencies,
you'll see linking errors when building C++ targets when using this module.
Though it's possible to resolve these errors by not instrumenting the dependencies
(adding them to the ABIList,
then rebuild the C++ libs), we don't recommend using it for C++ targets.
Instead, it's much cleaner to use ann out-of-process solving module like Fastgen.
To verify the code works, try some simple tests (forked from Angora, adapted by @insuyun to lit):
$ pip install lit
$ cd your_build_dir
$ lit tests
-
KO_CCspecifies the clang to invoke, if the default version isn't clang-12, set this variable to allow the compiler wrapper to find the correct clang. -
KO_CXXspecifies the clang++ to invoke, if the default version isn't clang++-12, set this variable to allow the compiler wrapper to find the correct clang++. -
KO_USE_Z3enables the in-process Z3-based solver. By default, it is disabled, so SymSan will only perform symbolic constraint collection without solving. SymSan also supports out-of-process solving, which provides better compatiblility. Check FastGen. -
KO_USE_NATIVE_LIBCXXenables using the native uninstrumented libc++ and libc++abi. -
KO_DONT_OPTIMIZEdon't override the optimization level toO3.
SymSan can emit CTWM-friendly metadata and runtime traces. Configure these features at CMake time or per compilation:
SYMSAN_CTWM_ENABLE_INDEX(defaultON) builds the LLVMCTWMIndexPass, which records basic-block IDs and the SymSan branch IDs found by__taint_trace_cond. Each compile producesctwm_index.jsonbeside the compiler’s working directory. Override the output path withKO_CTWM_INDEX_PATH=/tmp/foo.json.SYMSAN_CTWM_ENABLE_BB_TRACE(defaultOFF) injects calls to__ctwm_trace_bbon every basic block. Per-build overrides are available viaKO_CTWM_ENABLE_BB_TRACEandKO_CTWM_DISABLE_BB_TRACE.- Use
KO_CTWM_ENABLE_INDEX/KO_CTWM_DISABLE_INDEXwhen you need to toggle the index pass without reconfiguring CMake. - Runtime traces are written by the new
ctwm_traceruntime component. SetSYMSAN_CTWM_TRACE_PATH=/tmp/trace.bin(defaults toctwm_trace.log) before running the instrumented binary. The log is a binary stream ofint32_tbasic-block IDs and aligns with the IDs stored in the JSON index. - When experimenting from the build tree,
ko-clangautomatically prefers the freshly builtinstrumentation/TaintPass.so, so you don’t need to runmake installjust to regeneratectwm_index.json. - For diagnostics you can export
SYMSAN_CTWM_DEBUG=1to have the pass report where it writes the index and whether trace hooks were injected.
SymSan needs a driver to perform hybrid fuzzing, like FastGen. It could also be used as a custom mutator for AFL++ (check the plugin readme).
Still under construction, unfortunately. DeepWiki seems okay.
To cite SymSan in scientific work, please use the following BibTeX:
@inproceedings {chen2022symsan,
author = {Ju Chen and Wookhyun Han and Mingjun Yin and Haochen Zeng and
Chengyu Song and Byoungyong Lee and Heng Yin and Insik Shin},
title = {SymSan: Time and Space Efficient Concolic Execution via Dynamic Data-Flow Analysis},
booktitle = {{USENIX} Security Symposium (Security)},
year = 2022,
url = {https://www.usenix.org/conference/usenixsecurity22/presentation/chen-ju},
publisher = {{USENIX} Association},
month = aug,
}