What is Cubix?
Cubix is a framework for language-parametric program transformation, i.e.: defining a single source-to-source program transformation tool that can be run on multiple languages. In Cubix, you can write transformations with a type signature like "This transformation works for any language that has assignments, loops, and a name-binding analysis," and instantly get separate tools for C, Java, etc. The goal is to radically reduce the cost of building sophisticated whole-program refactoring tools by allowing each tool to be built for a much larger market.
Cubix is based on the idea of incremental parametric syntax, a technique for defining families of representations of programming languages which share common components, and for defining them as a small modification to a pre-existing syntax definition. The name "Cubix" comes from the 2000s television show "Cubix: Robots for Everyone;" in that show, "Cubix" is a robot composed of modular pieces that can be reassembled for many purposes.
The Cubix system itself, and the general incremental parametric syntax approach, is described in the OOPSLA 2018 paper:
- One Tool, Many Languages: Language-Parametric Transformation with Incremental Parametric Syntax*; James Koppel et al
We also recommend reading the following papers to get the necessary background in generic programming to understand Cubix:
- Data Types à la Carte, Wouter Swierstra
- Compositional Data Types, Patrick Bahr and Thomas Hvitved
Transformations in Cubix use our
compstrat library of strategy
combinators. To understand strategy combinators, we recommend the
- The Essence of Strategic Programming, Ralf Lämmel et al
What Cubix is not
Cubix is the world's first framework that can build language-parametric source-to-source transformations. As the first of its kind, it often gets mistaken for a solution to more familiar problems. In particular, it is not:
- A tool for translating one language into another. Cubix allows you to create a single tool that can transform C programs into better C programs and Java programs into better Java programs. It is not designed for building tools that can transform C programs into Java programs. Indeed, much of its power comes from its ability to preserve all the information of the original program.
- A collection of ready-to-use refactoring tools. Thus far, all transformations built on Cubix are tech demos. While a couple are theoretically useful, they have not undergone the amount of UX engineering needed to actually be time-savers.
- A framework for writing multi-language program analyses. Cubix transformations may consume results provided by other analyses, which may be written in either a single-language or multi-language fashion.
- A framework for analysis/transformation of polyglot programs, i.e.: programs (or single source files) written in multiple languages.
However, Cubix's generic-programming capabilities make it a powerful tool for building all kinds of programming tools, even for only one language. We are particularly excited about the potential of extending Cubix to transform polyglot programs.
To build Cubix:
First, download the sub-libraries
git submodule update
Second, build Cubix:
stack build --ghc-options='-O0 -j +RTS -A256m -n2m -RTS'
You may be prompted for your Github credentials to download the third-party frontends.
You are now ready to run Cubix transformations:
stack exec examples-multi java hoist input-files/java/Foo.java
Cubix has many dependencies, several of which are not on Hackage/Stackage, including some forks of Hackage libraries whose changes have not been merged upstream. This makes it more difficult to create a new package which depends on Cubix. For an example of how to do this, see https://github.com/jkoppel/using-cubix-example.
Because of performance problems in GHC, the full Cubix will not build with -O1 or -O2. We've tried on a server with 64GB RAM; the server ran out of memory. Instead, use this command:
alias stackfastbuild="stack build --ghc-options='-O0 -j +RTS -A256m -n2m -RTS'"
This builds Cubix in parallel with minimal optimization, and sets the initial GHC heap to larger than usual.
We found the following two minimal sets of compilation flags that mitigate this blowup and make compilation manageable:
1: -fno-cse -fno-full-laziness 2: -fno-specialize -funfolding-creation-threshold=0
If disable everything except CSE and specialise, blow-up still occurs. Remains true with -O1
Adding "--flag cubix:only-one-language" to the build command will turn on a compile flag that disables building support for all languages except Lua, the smallest language. This greatly speeds compilation times, to the point where we are able to compile with -O2. Some of the performance reports in cubix/benchmarks/reports were compiled with this flag.
A Docker image containing the version of Cubix submitted to OOPSLA 2018 along with runs of all the experiments, including the human study, is available from https://zenodo.org/record/1413855 .
Overview of directories (corresponding to the top-level of the zip file, and the /cubix directory on the Docker image):
/stack.yaml # High-level build description /package.yaml # Low-level build description /examples # Example transformations built with Cubix /examples/multi # The main driver for the multi-language transformations /comptrans # Source code for the comptrans library /compstrat # Source code for compstrat, our library for Strategic Programming with Compositional Datatypes /input-files # Small test inputs in each of the 5 languages /scripts # Scripts for running the transformations over compiler test suites
Running the built-in transformations and analyses
Use this command from the top-level directory (the one that contains "stack.yaml"):
stack exec examples-multi
You will be give the following help:
Usage: examples-multi <language> <transform> <file>* examples-multi <language> <analysis> <file>* Transforms available: debug,id,cfg,elementary-hoist,hoist,tac,testcov,ipt Analyses available: anal-triv-call
Note that only the IPT transformation can be run on multiple files.
It can be faster to run the executable directory, without using
stack exec. On the first author's laptop, this executable is located at
Using the Interprocedural Plumbing Transformation (IPT) Tool
We recommend also using the
rlwrap command to enable command history.
rlwrap stack exec examples-multi java ipt input-files/java/ipt/*.java
In C and Java, you will additionally be prompted for the type of the
parameter to add. Give this type as an AST for
(PrimType IntT), not
Running the language test suites
Cubix comes with scripts for running any semantics-preserving
and Python, in the files
scripts/test_java.rb and similar. To do so:
- Follow the below instructions to install the language tests.
- Each .rb file contains a constant at the top like
JAVA_TESTSpointing to the language implementation and tests directory. Modify these appropriately.
./scripts/test_<lang>.rb <name of transformation>
These scripts will run the transformation on all tests, run the transformed tests, and report the final pass/fail counts.
For all transformations except
id, they will first run the identity
transformation, and discard any tests that fail. This rules out tests
that trigger bugs in the 3rd party parsers/pretty-printers, most (but
not all) self-referential tests, and tests that the original language
They also come with a special
count_loc parameter that counts the
total lines of code in all relevant tests.
Installing GCC and GCC torture
Note: The C instructions are still being updated.
In some directory:
git clone https://github.com/gcc-mirror/gcc.git cd gcc # If you want the same revision of gcc-torture as in the paper git reset --hard f72de674726c5d054b9d99b0a4db09dfb52bf494 cd .. mkdir gcc_build cd gcc_build ../gcc-mirror/configure make -j8 make install
Installing the K-Java test suite
In some directory:
git clone https://github.com/kframework/java-semantics # If you want the same revision as in the paper cd java-semantics git reset --hard c202266304340a2a4be81fa21ee4fe36b3117ee3
In some directory:
The Lua tests
As described in the paper, we had to make several modifications to the Lua test suite to get them to run with Cubix. In particular, we removed several overly self-referential tests, and modified the test suite to report the number of passing/total assertions, rather than aborting the entire suite on the first failure.
These tests are in the
Installing the CPython tests
In some directory:
git clone https://github.com/python/cpython.git cd cpython git reset --hard 7bd4afec86849a57b48f375a9c4e0c32f0539dad # Only if you want the same revision as in the paper ./configure make