Proof of Concept. Experimental with an interface subject to change.
"Compiles" a subset of R into machine code so that expressions composed with
that subset can be applied repeatedly on varying data without interpreter
overhead. {r2c}
provides speed ups of up to 100x for iterated statistics,
with R semantics, and without the challenges of directly compilable languages.
{r2c} "compiles" R expressions or functions composed of basic binary
operators, statistics, and a few other select functions. "Compile" is in quotes
because {r2c} generates an equivalent C program, and compiles that. To
compute the slope of a single-variable regression we might use:
```r
library(r2c)
slope <- function(x, y) sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
r2c_slope <- r2cf(slope)
with(iris, r2c_slope(Sepal.Width, Sepal.Length))
## [1] -0.2233611
with(iris, slope(Sepal.Width, Sepal.Length))
## [1] -0.2233611
```
While "r2c_fun" functions can be called in the same way as normal R functions as
shown above, there is limited value in doing so. The primary use case of
{r2c}
functions is iteration.
For a full listing of supported R functions see ?"r2c-supported-funs"
.
{r2c} is fast because it avoids the R interpreter overhead otherwise required
for each iteration. There are currently two iteration mechanisms available:

- `group_exec`: compute on disjoint groups in data (a.k.a. split-apply-combine).
- `roll*_exec`: compute on (possibly) overlapping sequential windows in data.
For example, to iterate the slope function by groups, we could use:

```r
with(iris, group_exec(r2c_slope, list(Sepal.Width, Sepal.Length), Species))
##     setosa versicolor  virginica
##  0.6904897  0.8650777  0.9015345
```
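For comparison, the same per-group computation in plain base R requires one
R-level function call per group, which is exactly the per-iteration interpreter
overhead `group_exec` avoids. A minimal sketch using only base R (no {r2c}):

```r
# Per-group slope via split/vapply: each group triggers a full R-level
# call, the per-iteration overhead that group_exec eliminates.
slope <- function(x, y) sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
res <- vapply(
  split(seq_len(nrow(iris)), iris$Species),
  function(i) slope(iris$Sepal.Width[i], iris$Sepal.Length[i]),
  numeric(1)
)
res
##     setosa versicolor  virginica
##  0.6904897  0.8650777  0.9015345
```

The results match `group_exec` above; the difference is purely in how many
times the R interpreter is invoked.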
I have not found good alternatives for the general[^1] use case of {r2c}, as
can be seen from the timings of computing group and window slopes on larger
data sets[^2]:
{r2c} is substantially faster, primarily because it does not require calling
an R function for each iteration. {collapse} does well with group statistics,
but it requires knowing how to reformulate the expressions into a form
compatible with its semantics[^3] (i.e. you can't just use the R expression).
{FastR} is also interesting, but has other drawbacks including the need for
its own runtime application, and multiple warm-up runs before reaching fast
timings (* times shown are after 4-5 runs).
For the special case of a simple statistic many packages provide dedicated
pre-compiled alternatives, some of which are faster than {r2c}:
Even for the simple statistic case {r2c} is competitive with dedicated
compiled alternatives like those provided by {RcppRoll}, and {data.table}'s
`frollsum` in "exact" mode. Implementations that re-use overlapping window
sections, such as {data.table}'s `frollsum` in "fast" mode, {roll}, and
{slider}, will outperform {r2c}, particularly for larger windows.
{data.table} and {roll} use "on-line" algorithms, and {slider} uses a
"segment tree" algorithm, each with varying speed and precision trade-offs[^4].
See Related Work and benchmark details.
To summarize: for iterated calculations on numeric data, {r2c} is fastest at
complex expressions, and competitive with specialized pre-compiled alternatives
for simple expressions. Additionally, {r2c} observes base R semantics for the
expressions it evaluates; if you know R you can easily use {r2c}.
The first caveat is that {r2c} requires compilation. I have not included that
step in timings[^5] under the view that the compilation time will be amortized
over many calculations. The facilities for this don't exist yet, but the plan
is to have {r2c} maintain a local library of pre-compiled user-defined
functions, and for packages to compile {r2c} functions at install-time.
More importantly, we cannot compile and execute arbitrary R expressions:

- Only {r2c} implemented counterpart functions may be used[^6] (currently:
  basic arithmetic/relational/comparison operators, statistics, `{`, `<-`, and
  some more; see `?"r2c-supported-funs"`).
- Primary numeric inputs may optionally be named, but must otherwise be
  attribute-less numerics; any `.numeric` methods defined will be ignored[^7].
Within these constraints {r2c} is flexible. For example, it is possible to have
arbitrary R objects for secondary parameters, as well as to reference
iteration-constant data:

```r
w <- c(1, NA, 2, 3)
u <- c(-1, 1, 0)
h <- rep(1:2, each=2)
r2c_fun <- r2cq(sum(x, na.rm=TRUE) * y)
group_exec(r2c_fun, data=list(x=w), groups=h, MoreArgs=list(y=u))
##  1  1  1  2  2  2
## -1  1  0 -5  5  0
```

Notice the `na.rm`, and that the `u` in `list(y=u)` is re-used in full for each
group, setting the output size to 3.
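To make the recycling behavior concrete, here is the same computation sketched
in plain base R (no {r2c}): the group sum is a scalar, so multiplying it by the
length-3 iteration constant produces three values per group:

```r
w <- c(1, NA, 2, 3)
u <- c(-1, 1, 0)
h <- rep(1:2, each = 2)
# For each group, sum(x, na.rm=TRUE) is a scalar; multiplying by the
# length-3 constant u yields 3 outputs per group, 6 in total.
res <- unname(unlist(lapply(split(w, h), function(x) sum(x, na.rm = TRUE) * u)))
res
## [1] -1  1  0 -5  5  0
```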
With the exception of `ifelse` and the implemented control structures, the C
counterparts to the R functions are intended to produce identical outputs, but
have different implementations. As such, it is possible that for a particular
set of inputs on a particular platform the results might diverge.
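Such divergence is ordinary floating point behavior, not specific to {r2c}:
re-associating or re-ordering operations can change the last bits of a result.
A base R illustration:

```r
# Floating point addition is not associative, so two implementations that
# accumulate in different orders can produce slightly different results.
a <- (0.1 + 0.2) + 0.3
b <- 0.1 + (0.2 + 0.3)
a == b
## [1] FALSE
abs(a - b)  # tiny, but nonzero
## [1] 1.110223e-16
```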
In addition to cleaning up the existing code, there are many extensions that could be built on this proof of concept. Some are listed below. How many I end up working on will depend on some combination of external interest and my own.
- Expand the set of R functions that can be translated.
- Improve interface:
  - Group specification (character grouping vars, multiple grouping vars, pre-grouped data).
  - More consistent output format.
  - Re-order arguments to support piping?
- Make all error messages intelligible.
- Library for previously "compiled" functions.
- Get on CRAN.
- Nested "r2c_fun" functions (but probably no recursion).
- API to allow other native code to invoke {r2c} functions.
- Additional runners (e.g. an `apply` analogue).
This package is not available on CRAN yet. To install:

```r
f.dl <- tempfile()
f.uz <- tempfile()
github.url <- 'https://github.com/brodieG/r2c/archive/main.zip'
download.file(github.url, f.dl)
unzip(f.dl, exdir=f.uz)
install.packages(file.path(f.uz, 'r2c-main'), repos=NULL, type='source')
unlink(c(f.dl, f.uz))
```
Or if you have {remotes}:

```r
remotes::install_github("brodieg/r2c")
```
FastR is an implementation of R that can JIT-compile R code to run on the
Graal VM. It requires a different runtime (i.e. you can't just run your
normal R installation) and has other trade-offs, including warm-up cycles and
compatibility limitations[^8]. But otherwise you type in what you would have
in normal R and see some impressive speed-ups.
The Ř virtual machine is an academic project that is superficially similar to
FastR (its thesis explains the differences). Additionally, renjin appears to
offer similar capabilities and trade-offs as FastR. I have tried neither Ř
nor renjin.
Closer to {r2c}, there are at least four packages that operate on the
principle of translating R code into C (or C++), compiling that, and providing
access to the resulting native code from R:

- {Odin}, specialized for differential equation solving problems.
- {ast2ast}, also targeting ODE solving and optimization.
- {armacmp}, a DSL for formulating linear algebra code in R that is translated into C++.
- {nCompiler}, a tool for generating C++ and interfacing it with R.
Most of these seem capable of computing iterated statistics in some form, and experienced users can likely achieve it with some work, but it would be difficult for someone familiar only with R.
Finally, {inline} and {Rcpp} allow you to write code in C/C++ and easily
interface it with R.
I am unaware of any packages that compile R expressions to avoid interpreter overhead in applying them over groups or windows of data. The closest are packages that recognize expressions for which they have equivalent pre-compiled code they run instead. This is limited to simple statistics:

- {data.table}'s Gforce (see `?data.table::datatable.optimize`).
- In theory {dplyr}'s Hybrid Eval is similar to Gforce, but AFAICT it was quietly dropped, and despite suggestions it might return for v1.1 I see no trace of it in the most recent 1.1 candidate development versions (as of 2022-07-03).
Additionally, there is {collapse}, which provides specialized group
statistic functions. These are quite fast, particularly for simple statistics,
but you have to be familiar with {collapse} semantics to compose complex
statistics from simple ones.
Several packages provide fast dedicated functions for a small set of simple rolling window statistics:

- `base::filter` for weighted rolling sums / means.
- {data.table}'s `froll*` functions.
- {slider}'s `slide_<stat>` and `slide_index_<stat>`.
- {roll}.
- {zoo}'s `roll<stat>`.
- {RcppRoll}.
- {runner}.
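As a quick illustration of the first item, `stats::filter` computes weighted
rolling sums and means directly in base R; with uniform weights it is a
centered moving average:

```r
x <- c(1, 2, 3, 4, 5)
# Centered width-3 moving average; endpoints are NA where the window
# does not fully overlap the data.
res <- as.numeric(stats::filter(x, rep(1 / 3, 3), sides = 2))
res
## [1] NA  2  3  4 NA
```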
- R Core for developing and maintaining such a wonderful language.
- Matt Dowle and Arun Srinivasan for contributing {data.table}'s radix sort to R.
- Sebastian Krantz for the idea of pre-computing group meta data for possible re-use (taken from `collapse::GRP`).
- Achim Zeileis et al. for `rollapply` in {zoo}, from the design of which `roll*_exec` borrows elements.
- David Vaughan for ideas on window functions, including the index concept (`position` in the `roll*_exec` functions, borrowed from {slider}).
- Byron Ellis and Peter Danenberg for the inspiration behind `lcurry` (see `functional::CurryL`), used in tests.
- Hadley Wickham and Peter Danenberg for roxygen2.
- Tomas Kalibera for rchk and the accompanying vagrant image, and rcnst, to help detect errors in compiled code. Tomas also worked on the precursor to the Oracle FastR.
- Winston Chang for the r-debug docker container, in particular because of the valgrind level 2 instrumented version of R.
- Hadley Wickham et al. for ggplot2.
```r
g.clp <- GRP(g) # pre-sort/group
fmean2 <- function(x, cg) fmean(x, cg, na.rm=FALSE, TRA="replace_fill")
slope.clp <-
  fsum((x - fmean2(x, g.clp)) * (y - fmean2(y, g.clp)), g.clp, na.rm=FALSE) /
  fsum((x - fmean2(x, g.clp))^2, g.clp, na.rm=FALSE)
```
Footnotes

[^1]: It turns out there is `roll::roll_lm` that can compute slopes, but it cannot handle the general case of composing arbitrary statistics from the ones it implements.
[^2]: These timings do not include the `reuse_calls` optimization added in 0.2.0.
[^3]: Fast {collapse} implementations can be done with the use of `TRA="replace_fill"`, as in the code block preceding these footnotes.
[^4]: The "segment tree" algorithm will have better precision than the "on-line" algorithm, and while it is slower than the "on-line" algorithm (see the {roll} README for an explanation), it will begin to outperform {r2c} at window sizes larger than 100 as its performance scales with the logarithm of window size. The "on-line" algorithm is most susceptible to precision issues, but at least on systems with 80-bit long double accumulators, it seems likely that the "on-line" algorithm will be sufficiently precise for most applications.
[^5]: The first compilation can be quite slow as it requires loading the compiler, etc. Subsequent compilations run in tenths of seconds.
[^6]: Except that it is possible to embed arbitrary R expressions provided they are found to be iteration constant at runtime (see `?"r2c-expression-types"`).
[^7]: E.g. don't expect S3 dispatch to work if you define `mean.numeric`, although why one would do that for functions covered by {r2c} is unclear.
[^8]: My limited experience with {FastR} is that it is astonishing, but also frustrating. What it does is amazing, but the compatibility limitations are real (e.g. with the current (c.a. Summer 2022) version neither {data.table} nor {ggplot2} install out of the box, and more), and performance is volatile (e.g. package installation and some other tasks are painfully slow, some expressions will hiccup after the initial warm-up). At this point it does not seem like a viable drop-in replacement to R. It likely excels at running scalar operations in loops and similar, something that R itself struggles at.