NIMBLE compiler from part of R to C

perrydv edited this page Mar 10, 2015 · 4 revisions

Providing automatic differentiation to R via NIMBLE

Summary: Extending the compiler of the NIMBLE package to generate code for automatic differentiation using a C++ library, most likely CppAD.

Description: NIMBLE contains a compiler that generates C++ from a subset of R syntax, compiles it, loads it into R, and provides the user with an R object that interfaces with the C++ functions. The compiler can generate code that uses C++ libraries, thereby allowing R programmers to use them without knowing any C++. For example, NIMBLE compiles vectorized arithmetic and linear algebra by generating C++ code for the Eigen library. Automatic differentiation (AD) is a method for computing derivatives of functions written as code. It applies the chain rule to each elementary operation, so the results are exact up to floating-point rounding, unlike finite-difference approximations. Implementations in C++ libraries can be very efficient. In this project the student will extend the NIMBLE compiler to generate C++ code that uses an automatic differentiation library such as CppAD. This will make efficient AD available from R.
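To make the idea of AD concrete, here is a minimal sketch of forward-mode AD by operator overloading. It is only an illustration under stated assumptions: the `Dual` type and the functions `f`, `df`, `matches_analytic`, and `self_check` are hypothetical names invented here. This is not CppAD's API and not code that NIMBLE generates.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical minimal forward-mode AD scalar (illustration only, not CppAD):
// each value carries its derivative with respect to one chosen input.
struct Dual {
    double val;  // function value
    double der;  // derivative of val with respect to the seeded input
};

// Sum rule: (a + b)' = a' + b'
inline Dual operator+(Dual a, Dual b) {
    return {a.val + b.val, a.der + b.der};
}

// Product rule: (a * b)' = a' * b + a * b'
inline Dual operator*(Dual a, Dual b) {
    return {a.val * b.val, a.der * b.val + a.val * b.der};
}

// Chain rule for exp: (exp(a))' = exp(a) * a'
inline Dual exp(Dual a) {
    double e = std::exp(a.val);
    return {e, e * a.der};
}

// Example function f(x) = x * exp(x) + x, written once in terms of Dual.
inline Dual f(Dual x) {
    return x * exp(x) + x;
}

// Derivative of f at x, obtained by seeding the input derivative with 1.
inline double df(double x) {
    Dual seeded{x, 1.0};
    return f(seeded).der;
}

// Compare against the hand-derived f'(x) = (1 + x) * exp(x) + 1.
inline bool matches_analytic(double x) {
    double expected = (1.0 + x) * std::exp(x) + 1.0;
    return std::fabs(df(x) - expected) <= 1e-12 * (1.0 + std::fabs(expected));
}

// Small self-check that can be called from main().
inline void self_check() {
    assert(matches_analytic(0.0));
    assert(matches_analytic(1.5));
}
```

Calling `df(0.0)` evaluates the derivative in the same pass as the function value, with no symbolic manipulation and no step size to tune. A production library such as CppAD handles many more operations, many inputs, and reverse mode, which this sketch does not attempt.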

Related work: There was an R GSoC project in 2010 to build automatic differentiation in R. We have not been able to determine whether the result of that work is still supported, but if so it is unlikely to be as efficient or general as what could be built by harnessing CppAD. There is related work in Template Model Builder (TMB, an update to AD Model Builder), which uses CppAD with Eigen, the same combination needed for NIMBLE, so learning from that success will be part of this project. Other AD libraries will also be considered before a final choice is made. There are other packages that facilitate work in C++ and R, notably Rcpp, but NIMBLE takes a different approach: it compiles R-like code (formally a domain-specific language embedded within R) via C++ without the developer needing to think about C++ at all. There are other compiler-based approaches for speeding up R, including the byte-code compiler and Rllvm, but we think NIMBLE is the only one that generates C++ code and hence is suited to harnessing C++ libraries that rely on compile-time mechanisms, such as the types and operator overloading used by AD libraries.
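The point about compile-time types and operator overloading can be sketched as follows. The same templated source is instantiated once with `double` and once with a scalar type that records each operation it performs. The names `Rec`, `poly`, and `record_poly` are hypothetical and exist only for this illustration. AD libraries rely on this kind of type substitution to observe a computation, but the sketch is not CppAD's API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical recording scalar (illustration only, not CppAD):
// every arithmetic operation appends its name to a shared log.
struct Rec {
    double val;                     // numeric value
    std::vector<std::string>* log;  // where operations are recorded
};

inline Rec operator+(Rec a, Rec b) {
    a.log->push_back("add");
    return {a.val + b.val, a.log};
}

inline Rec operator*(Rec a, Rec b) {
    a.log->push_back("mul");
    return {a.val * b.val, a.log};
}

// Generic code of the kind a code generator could emit:
// one source text, compiled separately for each scalar type.
template <class Scalar>
Scalar poly(Scalar x, Scalar y) {
    return x * x + x * y;
}

// Runs poly on recording scalars and returns the recorded operation names.
inline std::vector<std::string> record_poly(double x, double y) {
    std::vector<std::string> log;
    Rec rx{x, &log};
    Rec ry{y, &log};
    poly(rx, ry);
    assert(log.size() == 3);  // two multiplications and one addition
    return log;
}
```

Because the choice of scalar type is resolved when the C++ is compiled, a tool that does not emit C++ source has no direct way to trigger this substitution, which is the distinction the paragraph above draws.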

Potential tasks:

  • Get up to speed on CppAD (or other selected library) by writing a suite of use cases.
  • Determine syntax for expressing AD requests from R.
  • Add keyword processing template(s) to the NIMBLE compiler for AD calls.
  • Add type inference and C++ code generation to the NIMBLE compiler for AD types and C++ syntax.
  • Write unit tests of new functionality.
  • Write documentation.

Skills required / desired: R, C++, and compiler concepts. In more detail:

  • Experience treating code as an object in R, such as eval(substitute(...)) idioms.
  • Experience interfacing R and C.
  • Basic knowledge of GitHub.
  • Desired: Knowledge of the Eigen linear algebra library and/or automatic differentiation.
  • Desired: Knowledge of compiler concepts such as a parse tree / abstract syntax tree and symbol table.
  • The last two skills are a bit specialized so please apply if you are interested and willing to learn.

Test: Write an R function that takes a call object as an argument and returns the same call with "2" appended to every variable name, leaving function names unchanged. For example, GSoCtest(quote(foo(a + hw(b + exp(c * 5))))) would return a call object containing foo(a2 + hw(b2 + exp(c2 * 5))).

Mentor: Perry de Valpine (pdevalpine {at} berkeley {dot} edu), with Chris Paciorek (paciorek {at} stat {dot} berkeley {dot} edu) as backup mentor