This package includes OCCAM software, available from: https://github.com/ashish-gehani/OCCAM/
The following terms apply to OCCAM:
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the name of SRI International nor the names of its contributors may
be used to endorse or promote products derived from this software without
specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS
IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Instructions for installing the OCCAM toolchain and building a bitcode version of the FreeBSD base distribution. The following instructions have been tested with FreeBSD 10.0 amd64.
The ideal way to start is with a clean install of FreeBSD 10.0 with sources installed. The simplest way to do this is to install from the Release 10.0 ISO image and and on the "Distribution Select" screen select just the following:
[*] ports Ports tree
[*] src System source code
If you are on an existing system that has either an old version of the source tree or is missing source, you can follow the instructions in the FreeBSD Handbook Chapter 24 to get the relevant sources.
Upgrade the ports collection (as 'root'):
su -
portsnap fetch
portsnap extract
cd /usr/ports/ports-mgmt/portupgrade
make -DBATCH install clean
portupgrade -a --batch
Install the following ports using the BSD port tree:
bash git subversion python27 protobuf py-protobuf sudo wget
(See the FreeBSD Handbook Chapter 5 for instructions.) The quick way to do this is:
su -
cd /usr/ports
cd shells/bash && make -DBATCH install clean && \
cd ../../devel/git && make -DBATCH install clean && \
cd ../../devel/subversion && make -DBATCH install clean && \
cd ../../devel/protobuf && make -DBATCH install clean && \
cd ../../devel/py-protobuf && make -DBATCH install clean && \
cd ../../security/sudo && make -DBATCH install clean && \
cd ../../ftp/wget && make -DBATCH install clean
(Python27 is installed as a prerequisite of Subversion.)
You will need to do the following to make Python work:
su -
cd /usr/local/bin
ln -s python2 python
Below we assume the shell being used is bash, that is:
chsh -s /usr/local/bin/bash
has been run. If you want to use another shell, replace bash-isms like 'export' with the appropriate equivalent.
We suggest installing 'sudo' and setting up your account as a sudoer, to make installing programs easier. You can do this, or modify the commands that use 'sudo' below.
Install LLVM and Clang version 3.5. (These instructions adapted from http://llvm.org/docs/GettingStarted.html) Decide where you want to install LLVM. If you have 'root' access, you can use the default '/usr/local', though any location is fine. You may then wish to add this to your shell startup (in '~/.profile' for bash):
export LLVM_HOME=/usr/local
Get LLVM and Clang version 3.5:
svn co http://llvm.org/svn/llvm-project/llvm/branches/release_35 llvm
cd llvm/tools
svn co http://llvm.org/svn/llvm-project/cfe/branches/release_35 clang
cd ../projects
svn co http://llvm.org/svn/llvm-project/compiler-rt/branches/release_35 compiler-rt
cd ../..
Now finish the build and install:
cd llvm
mkdir build
cd build
../configure --prefix=$LLVM_HOME --enable-assertions \
--enable-targets=host-only --enable-optimized
gmake -j8
sudo gmake install
Note that the FreeBSD 10.0 base includes Clang 3.3 (but does not include the complete LLVM framework).
git clone https://github.com/ashish-gehani/OCCAM.git
cd OCCAM
export OCCAM_SRC=`pwd`/OCCAM
With the prerequisites satisfied, we can install OCCAM. Decide where you want to install the tool. Below we assume it will be in your home directory. We recommend adding the below exports to your shell startup.
export CC=clang
export CXX=clang++
export CPP='clang -E'
export OCCAM_HOME=~/occam
Also, adjust the shell PATH so the LLVM tools built in step 3 above (that are in $LLVM_HOME/bin) will be picked up ahead of the system ones (typically in /usr/bin).
export PATH=${LLVM_HOME}/bin:${PATH}
cd $OCCAM_SRC
Edit the Makefile and uncomment the lines exporting LLVM_HOME and OCCAM_HOME at the top of the file (if they are not set in your environment). Change the paths to your choices. Now build and install the tool.
gmake
gmake install #Note: 'sudo' is used for non-root Python setup
If you make modifications to the Python scripts and want to install the new ones without rebuilding the C++ library, use "gmake install-occam".
Create an alias to the 'occam' toolchain wrapper. For bash, add the following to to your shell startup:
alias occam=$OCCAM_HOME/bin/occam
Part of the parallel build driver in the toolchain reads from the /proc filesystem. This isn't mounted by default on FreeBSD. We'll need to mount it before using the tool. You can mount it with the following command:
sudo mount -t procfs proc /proc
Alternatively, you can add this filesystem to '/etc/fstab':
proc /proc procfs rw 0 0
To build a lot of useful libraries and executables in a hurry, we'll first run the FreeBSD 'buildworld' process using the OCCAM toolchain. For details of the 'buildworld' process, see Chapter 24.6 of the FreeBSD Handbook.
First, we need to patch one of the 'buildworld' make files to prevent it from removing the OCCAM toolchain from the path after it compiles fresh versions of Clang and the rest of the FreeBSD toolchain. We'll also need to tell 'make' to use Clang (which our toolchain will masquerade as) by changing '/etc/src.conf'. While we're at it, we'll turn off some parts of the build to reduce how much time it takes.
cd /usr/src
sudo patch -p1 < $OCCAM_SRC/patches/Makefile.inc1-occam.patch
sudo cp $OCCAM_SRC/patches/src.conf /etc/src.conf
Before starting the build, you can choose to record logs from the OCCAM tool by setting the following variables:
export OCCAM_LOGFILE={absolute path to log location}
export OCCAM_LOGLEVEL={INFO, WARNING, or ERROR}
Now kick off the build. You will want to use -jN for some N, as this process takes a long time without the OCCAM toolchain, and even longer with (i.e. hours).
mkdir $OBJ_DIR {wherever you want to build everything}
MAKEOBJDIRPREFIX=$OBJ_DIR ; export MAKEOBJDIRPREFIX
occam make buildworld
This should go smoothly, but occasionally IOErrors related to reading the /proc file system may occur. It should be safe to continue the build in this case:
occam make -DNO_CLEAN buildworld
At this point, you have a few options. We have tested option 1 below, but the others should also work.
To be really safe, we'll build a new kernel with this source as well. You may be able to skip this step, along with 'mergemaster', but we haven't tried that. For more information, use 'man mergemaster'. The "-iU" options try to make this process as automatic as possible.
!!! You'll need non-ssh access to the box for this.
If you don't want to rebuild the kernel, only perform the third step of the below five:
cd /usr/src
make buildkernel
sudo shutdown now # drops into single user mode
make installkernel # don't use -j
mergemaster -p
Switching to single user mode results in a new shell, where environment variables and aliases must be set again:
export OCCAM_HOME=~/occam
export OCCAM_LOGFILE=/tmp/occam.log
export OCCAM_LOGLEVEL=INFO
export MAKEOBJDIRPREFIX=$OBJ_DIR
alias occam=$OCCAM_HOME/bin/occam
The new world can now be installed:
cd /usr/src
occam make installworld # don't use -j
mergemaster -iU
shutdown -r now
When the machine comes back up, you should have a full FreeBSD system with bitcode versions of all of the core libraries and executables.
development environment.
Choose somewhere to install everything to $WORLD_DIR.
occam make installworld DESTDIR=$WORLD_DIR
!Note, you cannot (according to the Handbook) use -j when using installworld. chroot into $WORLD_DIR and reinstall LLVM, Clang, and OCCAM.
Choose a folder to store all of your libraries, $LIB_DIR. Extract all the libraries you built into it.
find /usr/obj/usr/src/lib -name "*.bc.a" -exec cp {} $LIB_DIR \;
You can also extract manifests, partially built applications (pre-library linking), bitcode-compiled applications, and native applications compiled from the bitcode versions. The manifests may contain absolute paths to libraries that may need changing, however.
find /usr/obj/usr/src/usr.* -name "*.manifest" -exec cp {} $DEST \;
find /usr/obj/usr/src/usr.* -name "*_main.bc" -exec cp {} $DEST \;
find /usr/obj/usr/src/usr.* -name "*.bc" -exec cp {} $DEST \;
When trying to use the OCCAM toolchain with these libraries, you'll need to add $LIB_DIR to $OCCAM_LIB_PATH (described further below). When specializing or using "occam2 build" with a manifest, you'll need to add $LIB_DIR to the "search" entry in the manifest.
Once you have your base system installed and ready to go, you should be ready to compile and install other libraries or applications.
First, set the following environmental variables. (You may want to set these in
CC=clang
CXX=clang++
CPP='clang -E'
You should now be able to compile most well-behaved projects with:
occam ./configure <options>
occam make
occam make install
Occam has a few configuration options, set via environmental variables.
OCCAM_LOGFILE
Absolute path of file to log toOCCAM_LOGLEVEL
Logging detail, may be one of INFO, WARNING, or ERROROCCAM_LIB_PATH
Additional directories to include in the search path for finding bitcode versions of libraries. Not necessary if libraries are installed next to their native versions (e.g., /path/to/libs:/morelibs)OCCAM_PROTECT_PATH
Set/unset flag. If set, OCCAM will remove all non-OCCAM items from the path and intercept calls to common executables so that they can be logged at level INFO. Can be useful for debugging a build that isn't working properly.
Occam leverages WLLVM bitcode generation for building bitcode equivalents for libraries and executables. Install WLLVM:
https://github.com/SRI-CSL/whole-program-llvm/blob/master/wllvm
Make sure the WLLVM tools are added to the PATH variable.
To compile as much code as possible to bitcode, you should choose any available configure options for building or using static libraries rather than shared ones.
For apache, you'll first need to compile and install libpcre, libapr, and libapr-util. The configurations I used for each of these are:
- libpcre: built from pcre-8.31.tar.gz
occam ./configure --enable-shared=no
- libapr: built from apr-1.4.6.tar.gz
occam ./configure --enable-shared=no
- libapr-util:
occam ./configure --with-apr=/usr/local/apr --disable-util-dso --with-crypto
You may be able to drop --with-crypto, which could fix the configure issue for apache that I describe below. For apache, I used the following configure:
occam ./configure --prefix=$SOMEPREFIX --enable-modules=none \
--enable-mods-static=most --enable-static-support \
--with-apr=/usr/local/apr --with-apr-util=/usr/local/apr \
--with-pcre=/usr/local
You then need to patch build/config_vars.mk because something in the configure script seems to be detecting crypto options wrong. I'll debug this, but the workaround for now is to fix the following variables:
CRYPT_LIBS = -lcrypt
SSL_LIBS = -lssl -lcrypto -lcrypt -lpthread
To run apache using command line options (to make it easy to previrtualize), you'll need to set up a basic ServerRoot directory somewhere with the following configuration.
$SERVERROOT
|- conf
|- mime.types (an empty text file)
|- logs (an empty directory)
|- www
|- index.html (something to serve...)
|- httpd.conf (an empty text file)
The following set of arguments should then work to get apache serving files:
-d $SERVERROOT
-f httpd.conf
-C 'Listen 8181'
-C 'LogLevel debug'
-C 'User apache' -C 'Group apache'
-C 'ServerName localhost'
-C 'ErrorLog /home/moore/root/error_log'
-C 'DocumentRoot /home/moore/root/www'
-C 'AddType text/plain .html'
-k start
Once you have a manifest file and a bundle, you can use it to build a final bitcode version of an application or specialize it. To build from the manifest, just execute:
occam build $MANIFEST_FILE $NAME_FOR_EXECUTABLE
To specialize the application, add a "args": ["arg1", "arg2", ...] entry to the manifest file and call the previrt tool:
occam previrt --work-dir=$WORKDIR $MANIFEST_FILE
As before, you can set OCCAM configuration variables to control logging.
-- General -------------------------------------------------------------------
[ ] Dead code elimination on all of the repositories to clean up unused passes,
targets, #ifdef 0-ed regions, etc, etc, etc
-- Toolchain -------------------------------------------------------------------
[ ] Save object files off when bundling. The bundling process uses llvm-ld which
does not link ELF files into the resulting bitcode. We should save off the
necessary ELF object files from the inputs or library to link in later at
build or previrtualization time. (This causes the manifest build for at
least clang during the buildworld process, and probably others)
[ ] Link in said object bundles during occam-build or occam-previrt
-- Previrt -------------------------------------------------------------------
[ ] Debug previrtualized version of httpd. The previrtualized version of httpd
does not start correctly. It seems likely this is due to interactions with
pthreads (possibly some incorrectly eliminated symbols?).
[ ] Figure out how to avoid needing to relink in libc, etc. to get the symbols
we need for the C runtime. (and C++/pthreads?)
[ ] Add support for better call-graphs. Either improve Gregory's call graph,
choose a better one that's already in llvm, or port over the DSA call-graph
stuff in poolalloc.
[ ] Support for previrtualizing un-named functions
[ ] Support for previrtualizing internal function pointers accross module
boundaries for call backs.
[ ] To avoid creating multiple clones of the same specialized function, we check
if it already exists by looking it up in the module by name. This check
should probably be more robust. This is related to support for un-named
functions because we need to make sure we're using unique specializations
for the different functions.
[ ] For ease of understanding, give created globals for strings, etc., names so
they don't show up as unnamed in the symbol table.
[ ] Extract statistics about the previrtualization process.