For LLVM see: compiler/llvm/LICENSE.txt
For our code: this code is dual-licensed under the LLVM license (University of Illinois/NCSA Open Source License) and the GPLv3. Choose whatever fits your requirements.
Building the Compiler
1. Clone this repo
Clone this repo into
There are a lot of build scripts to glue everything together and make the build reproducible. The scripts all assume you cloned in the above directory.
    cd ~
    mkdir research
    cd research
    git clone firstname.lastname@example.org:HexHive/datashield.git
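Because the build scripts assume a fixed location, it can help to confirm the layout before going further. The following is a minimal sketch, not part of the repository: `DS_ROOT` and `check_layout` are hypothetical names, and the subdirectories it looks for are the ones this README refers to (`compiler`, `libc`, `libcxx`, `bin`).

```shell
#!/bin/sh
# Assumption: the repo lives at $HOME/research/datashield, as in the steps above.
# DS_ROOT is a hypothetical variable used only in this sketch.
DS_ROOT="${DS_ROOT:-$HOME/research/datashield}"

check_layout() {
    # Report each expected subdirectory; return non-zero if any is missing.
    rc=0
    for sub in compiler libc libcxx bin; do
        if [ -d "$DS_ROOT/$sub" ]; then
            echo "ok: $DS_ROOT/$sub"
        else
            echo "missing: $DS_ROOT/$sub"
            rc=1
        fi
    done
    return $rc
}

check_layout || echo "expected the repo under $DS_ROOT"
```

If any directory is reported missing, the clone most likely landed somewhere other than the location the scripts expect.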
2. Build the compiler
The compiler is built using the normal LLVM build process. Consult the LLVM documentation if you have trouble.
You may use my build scripts or change the options if you know what you're doing.
DataShield has 3 different configurations:
- debug - debug info, unoptimized, with instrumentation
- baseline - optimized, no instrumentation
- release - optimized, with instrumentation
If you just want to experiment with DataShield, debug is probably the best choice:
    cd ~/research/datashield/compiler
    mkdir build-debug
    cd build-debug
    ../lto_cmake_debug.sh
    ninja install
If you care about compile times, build release instead, since the compiler itself is optimized in that configuration:
    cd ~/research/datashield/compiler
    mkdir build-release
    cd build-release
    ../lto_cmake_release.sh
    ninja install
Baseline is the same as release for the compiler since the compiler itself is not instrumented, but you need to build it if you want a baseline comparison for benchmarking:
    cd ~/research/datashield/compiler
    mkdir build-baseline
    cd build-baseline
    ../lto_cmake_baseline.sh
    ninja install
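The three compiler builds differ only in the build directory name and the cmake wrapper script, so the pattern can be captured in one place. The sketch below is a dry run under that assumption: `ds_build_cmds` is a hypothetical helper that only prints the commands for a configuration and builds nothing.

```shell
#!/bin/sh
# Hypothetical helper: print the build commands for one configuration.
# It only echoes the commands (a dry run); nothing is built.
ds_build_cmds() {
    config="$1"
    case "$config" in
        debug|baseline|release) ;;
        *) echo "unknown configuration: $config" >&2; return 1 ;;
    esac
    echo "cd ~/research/datashield/compiler"
    echo "mkdir build-$config"
    echo "cd build-$config"
    echo "../lto_cmake_$config.sh"
    echo "ninja install"
}

ds_build_cmds debug
```

Printing the commands first makes it easy to review them before running a build by hand.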
3. Build libc
To build a configuration, run the matching script:
    cd $HOME/research/datashield/libc
    ./build-debug.py
    ./build-baseline.py
    ./build-release.py
The scripts create a different install directory for each configuration, so you don't have to rebuild every time you want to test a different configuration. They are:
    $HOME/research/datashield/ds_sysroot_debug
    $HOME/research/datashield/ds_sysroot_baseline
    $HOME/research/datashield/ds_sysroot_release
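Since each configuration has its own install directory, a quick way to see which ones have been built is to test for those directories. This is a small sketch; `ds_sysroot` is a hypothetical helper that maps a configuration name to the paths listed above.

```shell
#!/bin/sh
# Hypothetical helper: map a configuration name to its install directory.
ds_sysroot() {
    case "$1" in
        debug|baseline|release)
            echo "$HOME/research/datashield/ds_sysroot_$1" ;;
        *)
            echo "usage: ds_sysroot debug|baseline|release" >&2
            return 1 ;;
    esac
}

# Report which configurations already have an install directory.
for config in debug baseline release; do
    dir=$(ds_sysroot "$config")
    if [ -d "$dir" ]; then
        echo "$config: built ($dir)"
    else
        echo "$config: not built yet"
    fi
done
```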
4. Build libcxx
Building libcxx is basically the same as building libc. It has the same three configurations. Running build.py <config> builds everything. Otherwise, run build.py with no arguments for a help message.
    cd $HOME/research/datashield/libcxx
    ./build.py <config>
Compiling Instrumented Programs
You need a lot of options to be able to build with our custom libc, libcxx, and various protections. There are scripts in $HOME/research/datashield/bin that make this much easier.
The scripts directory ($HOME/research/datashield/bin) needs to be in your PATH for the scripts to work.
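One way to put the scripts directory on your PATH for the current shell session is sketched below. `DS_BIN` is a hypothetical variable name, and the path assumes the repo was cloned to $HOME/research/datashield.

```shell
#!/bin/sh
# Assumption: the repo was cloned to $HOME/research/datashield.
DS_BIN="$HOME/research/datashield/bin"

# Prepend the directory only if it is not already on PATH.
case ":$PATH:" in
    *":$DS_BIN:"*) ;;
    *) PATH="$DS_BIN:$PATH"; export PATH ;;
esac
```

To make the change persistent, the same lines can go in your shell's startup file.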
Build Hello World
First, you should build a "Hello World" program to make sure your build is sane.
    cd $HOME/research/datashield/test/hand-written/hello_world
    make
    ./test
It should print out a whole bunch of log information and "hello world" and "good bye." If not, something is seriously wrong and you should create a GitHub issue.
If you look at $HOME/research/datashield/test/hand-written/hello_world/Makefile, you will see that there are multiple options for the variable CC. To choose which set of options you want, change CC to one of the scripts in $HOME/research/datashield/bin. They all start with
Build Hello World in C++
Building C++ (versus C) is basically the same, but you need to use the C++ scripts. It's a good idea to build "Hello World" in both C and C++ as a sanity check.
    cd $HOME/research/tests/hand-written/hello_worldxx
    make
    ./test
You should use the scripts, but if you need to change something and know what you're doing, you can give the options manually. These are the options that the scripts set up:
-datashield-lto: enables the datashield pass (required)
-T../../linker/linker_script.lds: this passes our script to the linker (required)
-datashield-debug-mode: prints debug logs at runtime
-datashield-save-module-after: saves the compiled module to a file after datashield's transformation
-datashield-save-module-before: saves the compiled module to a file before datashield's transformation
-debug-only=datashield: prints debug logs at compile time
The following are mutually exclusive:
-datashield-use-mask: use the software mask coarse bounds check options
-datashield-use-prefix-check: give this option if you want prefix or MPX
The following are mutually exclusive:
-datashield-use-prefix: give this option if you want prefix or MPX
-datashield-use-late-mpx: give this option if you want prefix or MPX
Must be used with
-datashield-intergity-only-mode: only protect stores
-datashield-confidentiality-only-mode: only protect loads
-datashield-separation-mode: basic arithmetic does not propagate sensitivity
Two options for compiling system libraries:
-datashield-library-mode: for compiling libraries with sandboxing only
-datashield-modular: run the pass without LTO
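If you do assemble options by hand, it is easy to pass two options from the same mutually exclusive pair by accident. The sketch below checks a flag list against the two pairs listed above. `at_most_one` and `check_flags` are hypothetical helpers, not part of DataShield, and the sketch does not show how the flags reach the compiler.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if at most one of the first two
# arguments (a mutually exclusive pair) appears among the remaining ones.
at_most_one() {
    a="$1"; b="$2"; shift 2
    seen_a=0; seen_b=0
    for flag in "$@"; do
        if [ "$flag" = "$a" ]; then seen_a=1; fi
        if [ "$flag" = "$b" ]; then seen_b=1; fi
    done
    [ $((seen_a + seen_b)) -le 1 ]
}

# Check both mutually exclusive pairs named in this README.
check_flags() {
    at_most_one -datashield-use-mask -datashield-use-prefix-check "$@" &&
    at_most_one -datashield-use-prefix -datashield-use-late-mpx "$@"
}

if check_flags -datashield-lto -datashield-use-mask; then
    echo "flags ok"
else
    echo "conflicting flags"
fi
```

The check only looks for conflicts; it does not verify that required options such as -datashield-lto are present.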