A smarter tool to automatically generate FFI bindings for C libraries



This is a smarter automatic binding generator: it uses program analysis to infer most of the annotations that a tool like SWIG would normally require. The analysis frontend, IIGlue, takes LLVM bitcode as its input. This bitcode can be produced by either clang or the dragonegg plugin for gcc. My whole-program-llvm scripts, set up in the installation steps below, are useful for generating bitcode for an entire library.

The analysis currently automatically identifies:

  • Output parameters
  • Array parameters
  • Allocator and finalizer functions
  • Nullable pointers

Many details of the analysis can be found in the PLDI 2009 paper.

This repository contains only the analysis libraries. Tools built on top of the analysis are available in the iiglue repository.

Dependencies / Building

The analysis and code generator are written in Haskell. They currently require GHC >= 7.2 (tested only with 7.4 and 7.6) and LLVM 3.0-3.2, and llvm-config must be in your PATH. The build system is currently set up to use the LLVM shared libraries (instead of the static libraries). The whole-program LLVM compiler wrapper depends on Python.
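Before building, it can help to confirm that the required tools are reachable. The following is a minimal sketch rather than part of the project: `missing_tools` is a hypothetical helper, and the tool list is an assumption drawn from the requirements in this README (the interpreter may be called `python3` on your system).

```shell
# Print the name of every given tool that cannot be found on PATH.
# missing_tools is a hypothetical helper, not something shipped with the project.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tool list assumed from this README: GHC and cabal for the build,
# llvm-config and opt from LLVM, and Python for the compiler wrapper.
missing=$(missing_tools ghc cabal llvm-config opt python)
if [ -n "$missing" ]; then
    echo "Missing from PATH:" $missing >&2
fi
```

This only checks that the tools exist; it does not verify the GHC or LLVM versions.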

A few dependencies are not on Hackage:

  • llvm-base-types
  • llvm-data-interop
  • llvm-analysis
  • archive-inspection
  • hbgl
  • ifscs
  • itanium-abi

These are currently available only from my GitHub account.

Installation would look something like:


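# Download the repositories
# (names assumed from the dependency list above, plus this repository and iiglue)
REPOSITORIES="llvm-base-types llvm-data-interop llvm-analysis archive-inspection hbgl ifscs itanium-abi foreign-inference iiglue"
for REPO in $REPOSITORIES
do
    git clone git://github.com/travitch/$REPO.git
done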

# Add ./ prefixes to each repository (for cabal)
TOINSTALL=`echo ./$REPOSITORIES  | sed 's: : ./:g'`

# Build the tools along with dependencies
cabal install $TOINSTALL

# Set up the whole-program LLVM wrapper
git clone git://github.com/travitch/whole-program-llvm.git
export PATH=$PATH:`pwd`/whole-program-llvm

This will put the IIGlue and IIGen binaries in ~/.cabal/bin. Binaries for i386 and x86_64 Linux are available on my research page.

Usage

As an example, the following commands analyze the GNU Scientific Library (gsl-1.15) and generate Python bindings for it:



tar xf gsl-1.15.tar.gz
cd gsl-1.15

# Compile using the whole-program llvm wrapper
CC=wllvm CFLAGS="-g" ./configure --prefix=`pwd`/root
make && make install
extract-bc root/lib/libgsl.so.0.16.0

# Analyze
IIGlue --repository=`pwd` root/lib/libgsl.so.0.16.0.bc

# Generate FFI binding
IIGen libgsl.so.0.16.0.json > gsl.py

First, the library must be converted to LLVM bitcode. This example uses my whole-program LLVM wrapper to compile the library twice: once into a normal binary and once into bitcode (which extract-bc then extracts from the binary).

Next, the bitcode is analyzed with IIGlue. The --repository flag tells the tool both where to put the summary for the input module and where to find the summaries of its dependencies. The output is a summary of the input module (in this case, libgsl.so.0.16.0.json). Note that IIGlue requires the LLVM opt executable to be in your PATH.
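When scripting the two steps together, the summary location has to be computed from the bitcode file name. The sketch below assumes, based only on the gsl example above, that the summary is named after the bitcode file with `.bc` replaced by `.json` and is written directly into the repository directory. `summary_path` is a hypothetical helper.

```shell
# Guess where IIGlue writes the summary for a given bitcode file.
# Assumption (from the gsl example only): <repository>/<bitcode name minus .bc>.json
summary_path() {
    repo=$1
    bitcode=$2
    printf '%s/%s.json\n' "$repo" "$(basename "$bitcode" .bc)"
}

# Example: the path that the gsl walkthrough would then hand to IIGen.
summary_path "$PWD" root/lib/libgsl.so.0.16.0.bc
```

If the naming assumption does not hold for your library, check the repository directory after the analysis finishes.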

Finally, the summary is fed to IIGen to produce a Python module that defines wrappers to call library functions.


The full set of options for IIGlue is:

 --dependency=DEPENDENCY   A dependency of the library being analyzed.

This option allows dependencies of a library to be specified. The summaries for dependencies are loaded and improve the precision of the analysis. Dependencies are specified by the name of the shared library that the input depends on (e.g., libcblas.so.0.0.0 is a dependency for gsl).
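Because dependency summaries are loaded from the repository, the dependency presumably has to be analyzed before the library that uses it. The sketch below prints (rather than runs) the two invocations in that order. `plan_analysis` is a hypothetical helper, and the assumption that the dependency name is the bitcode file name without `.bc` follows the libcblas example above.

```shell
# Repository directory for summaries; the fallback path is a hypothetical default.
REPO="${INFERENCE_REPOSITORY:-$PWD/summaries}"

# Print the IIGlue invocations for one dependency and the library using it.
# The dependency goes first so that its summary exists when the library is analyzed.
plan_analysis() {
    dep_bc=$1    # bitcode of the dependency, e.g. libcblas.so.0.0.0.bc
    lib_bc=$2    # bitcode of the library being analyzed
    dep_name=$(basename "$dep_bc" .bc)
    printf 'IIGlue --repository=%s %s\n' "$REPO" "$dep_bc"
    printf 'IIGlue --repository=%s --dependency=%s %s\n' "$REPO" "$dep_name" "$lib_bc"
}

plan_analysis libcblas.so.0.0.0.bc libgsl.so.0.16.0.bc
```

Printing the commands first makes it easy to review the order before running anything.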

 --repository=DIRECTORY    The directory containing dependency summaries.
                           The summary of the input library will be stored
                           here. (Default: consult environment)

The repository is required, but it can be supplied through the environment variable INFERENCE_REPOSITORY instead of this option. The analysis looks in this directory for dependency summaries and also writes the summary for the input module there.
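Setting the variable once avoids repeating the flag on every run. This is a minimal sketch: the directory is a hypothetical location, creating it up front is a precaution rather than a documented requirement, and the analysis is skipped when IIGlue or the bitcode file is not present.

```shell
# Hypothetical summary directory; any writable path should work.
INFERENCE_REPOSITORY="${TMPDIR:-/tmp}/iiglue-summaries"
export INFERENCE_REPOSITORY
mkdir -p "$INFERENCE_REPOSITORY"

bitcode=root/lib/libgsl.so.0.16.0.bc
if command -v IIGlue >/dev/null 2>&1 && [ -f "$bitcode" ]; then
    # No --repository flag here: the directory comes from the environment.
    IIGlue "$bitcode"
else
    echo "IIGlue or $bitcode not available; skipping analysis" >&2
fi
```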

 --diagnostics=DIAGNOSTIC  The level of diagnostics to show (Debug, Info,
                           Warning, Error).  Default: Warning

Controls how much diagnostic output the analysis prints.

 --source=FILE             The source for the library being analyzed
                           (tarball or zip archive).  If provided, a report
                           will be generated

Tell the analysis the source tarball that was used to build the input library. This is used to provide detailed reports about the analysis results, including source lines that led the analysis to report each of its annotations. This option is only used if --reportDir is also specified.

 --reportDir=DIRECTORY     The directory in which the summary report will
                           be produced.  Defaults to the REPOSITORY.

Write an HTML report describing the inferred annotations to the given directory. If --source is also specified, the report includes hyperlinked breakdowns of individual functions.
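The two reporting options are typically used together. The sketch below only prints the resulting command so it can be checked first. `report_command` is a hypothetical helper, and the paths in the example call are assumptions based on the gsl walkthrough, where the tarball sits one directory above the build tree.

```shell
# Print an IIGlue command that also produces an HTML report.
# Arguments: repository, source archive, report directory, bitcode file.
report_command() {
    repo=$1
    archive=$2
    report_dir=$3
    bitcode=$4
    printf 'IIGlue --repository=%s --source=%s --reportDir=%s %s\n' \
        "$repo" "$archive" "$report_dir" "$bitcode"
}

# Example paths are assumptions; adjust them to your own layout.
report_command "$PWD" ../gsl-1.15.tar.gz "$PWD/report" root/lib/libgsl.so.0.16.0.bc
```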

 --annotations=FILE        An optional file containing annotations for
                           the library being analyzed

If the analysis cannot infer all of the annotations you think it should, you can use this option to provide it with some manual annotations that it will include in its output and analysis. This is most useful to annotate complicated allocation functions that the analysis cannot prove are actually allocators (due to memory pooling, for example).

The annotation file uses the same format as the summary that the analysis produces as output.

 -? --help                    Display help message

IIGen currently does not support any options. It always writes its output to standard output. The modules it generates work with both Python 2.x and Python 3.x.