
[R] Improve evaluation of R functions from C++ #32447

@asfimport

We currently call R code from C++ in a few places, and ARROW-16444 and ARROW-16703 will add more. In some of these, the overhead of calling into R may be greater than the time it takes to actually evaluate the function, or the function will be called in a tight loop.

The current approach uses cpp11::function. This is totally fine and safe, but it generates some ugly backtraces on error and is potentially slower than the lean-and-mean approach of purrr (whose entire job is to call R functions in a loop, and which has been heavily optimized). The purrr approach is to construct the call and the calling environment in advance and then just run Rf_eval(call, env) in the loop. This is both faster (fewer R API calls) and generates better backtraces (e.g., Error in fun(arg1, arg2) instead of Error in (function(a, b) { ...the whole content of the function... })(every, deparsed, argument)).
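To make the idea concrete, here is a rough sketch of the "build the call once, evaluate it in a loop" pattern written directly against R's C API. It is not taken from purrr or from the arrow package: the helper name `call_in_loop`, the bound names `fun`, `arg1`, `arg2`, and the fixed two-argument shape are all assumptions for illustration, and `R_NewEnv()` assumes R >= 4.1.0.

```cpp
#include <Rinternals.h>

// Hypothetical helper (illustration only): evaluates fun(arg1, arg2) `n`
// times and collects the results in a list. The caller is assumed to keep
// `fun`, `arg1`, `arg2`, and `parent_env` protected for the duration.
SEXP call_in_loop(SEXP fun, SEXP arg1, SEXP arg2, SEXP parent_env, int n) {
  // Calling environment, created once. R_NewEnv() requires R >= 4.1.0.
  SEXP env = PROTECT(R_NewEnv(parent_env, FALSE, 3));

  // Bind the function and its arguments to short names so that the call
  // (and therefore any error message) refers to symbols, not deparsed values.
  SEXP fun_sym = Rf_install("fun");
  SEXP arg1_sym = Rf_install("arg1");
  SEXP arg2_sym = Rf_install("arg2");
  Rf_defineVar(fun_sym, fun, env);
  Rf_defineVar(arg1_sym, arg1, env);
  Rf_defineVar(arg2_sym, arg2, env);

  // The call `fun(arg1, arg2)`, also constructed once.
  SEXP call = PROTECT(Rf_lang3(fun_sym, arg1_sym, arg2_sym));

  SEXP out = PROTECT(Rf_allocVector(VECSXP, n));
  for (int i = 0; i < n; i++) {
    // The only R API work per iteration is the evaluation itself.
    SET_VECTOR_ELT(out, i, Rf_eval(call, env));
  }

  UNPROTECT(3);
  return out;
}
```

One caveat with this sketch: `Rf_eval()` performs a longjmp on error, so C++ destructors in the calling frames would be skipped unless the call is wrapped, for example with `cpp11::unwind_protect()`. Whether that wrapping gives back some of the savings is the kind of thing a benchmark would need to answer.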

Before optimizing that heavily we should of course benchmark to see exactly how much that matters!

Reporter: Dewey Dunnington / @paleolimbot

Note: This issue was originally created as ARROW-17148. Please see the migration documentation for further details.
