Add ALTREP view support #1725

DavisVaughan · 2024-07-01T19:49:14Z

More tests?
Hear back about DATAPTR() usage in character implementation
Extract_subset implementation? Or is the fallback good enough? (New approach is much faster even without Extract_subset due to faster Elt access with cached pointers, we really want base R to handle all the intricacies of this.)
Use #defines to section off code that doesn't work in older R versions, possibly bump minimal rlang R version
Decide on r_init_library_with_dll() helper with Lionel
Support R >=3.6?

Then

Follow up PR to rename r_vec_resize() to r_vec_grow() and remove support for shrinking
- Use ALTREP views in r_dyn_unwrap() in this PR

Adds support for ALTREP contiguous views, i.e. vec_view(x, start, size), for the major R types

A few notes:

x gets marked as not mutable on the way in
No attributes are allowed on x right now
The ALTREP classes are registered at load time but need the dll, so I had to introduce a new r_init_library_with_dll() helper that goes in your R_init_<pkg>() function (good to know if you are utilize the rlang C API). The ALTREP classes are registered with the package name, so if vctrs uses the rlang C API then it will register its own ALTREP view classes under the "vctrs" package name. I think this is a good thing.
Serialization will force materialization, which I think is nice because it avoids us ever having to worry about "old" ALTREP view objects if we ever change the internal structure.

Here are important notes on the internal structure:

/*
Structure of a `view`:
- `data1` is:
  - the original vector, before materialization
  - the materialized vector, after materialization
- `data2` is a RAWSXP holding a `r_view_metadata`
*/

struct r_view_metadata {
  // Whether or not the ALTREP view has been materialized.
  bool materialized;

  // The offset into the original data to start at.
  r_ssize start;

  // The size of the view.
  r_ssize size;

  // A read only pointer into the data, to save indirection costs. We typically
  // set this upon view creation, unless `x` is ALTREP, in which case we delay
  // setting it until the first `DATAPTR_RO()` request, to be friendly to
  // ALTREP types that we wrap. After materialization, it is always set.
  const void* v_data_read;

  // A write only pointer into the data, to save indirection costs.
  // Always `NULL` before materialization, and it set at materialization time.
  // Never set for lists or character vectors, as write pointers are unsafe.
  void* v_data_write;
};

Benchmarking - The most notable thing in the benchmarks is that slicing with a large slice can be up to 2x slower than native code due to ExtractSubset() in base R using the *_Elt() ALTREP method to extract out each element. I think it could be much more efficient by using Dataptr_or_null() to first see if it could get a cheap readonly dataptr to the altrep data (it can here, it's a view) and then using that directly. I plan to submit that patch or talk to Luke about it, then this difference would poof go away entirely.

Also note that the difference is mostly apparent only when a contiguous slice is made with ExtractSubset(), where native objects probably have 0 cache misses, but we have to go through an ALTREP method each time. When you extract a random slice of the same size, the timings actually tend to converge and cache misses become more important.

Otherwise I think it's pretty darn good!

# R-devel at the time
getRversion()
#> [1] '4.5.0'

# load rlang
devtools::load_all(path = "~/files/r/packages/rlang/")

# Identical test
# Currently `identical()` forces materialization due to usage of `INTEGER()`
# rather than `INTEGER_RO()`, booooo.
x <- c(1, 2, 3, 4)
y <- vec_view(x, start = 1, size = 4)
identical(x, y)
#> [1] TRUE
view_is_materialized(y)
#> [1] TRUE

# Object size test
# I think lobstr is more accurate here, there is a small fixed amount of overhead
# from the metadata. Nicely neither materialize the object.
x <- c(1, 2, 3, 4)
y <- vec_view(x, start = 1, size = 4)
object.size(x)
#> 80 bytes
object.size(y)
#> 80 bytes
lobstr::obj_size(x)
#> 80 B
lobstr::obj_size(y)
#> 776 B
view_is_materialized(y)
#> [1] FALSE


# Small view, 100 elements
base <- sample(1e6)
x <- base[1:100]
y <- vec_view(base, start = 1L, size = 100L)

is_altrep(y)
#> [1] TRUE
view_is_materialized(y)
#> [1] FALSE

# One element
bench::mark(
  x[[25L]],
  y[[25L]],
  iterations = 1000000
)
#> # A tibble: 2 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 x[[25L]]          0     41ns 23280892.        0B     69.8
#> 2 y[[25L]]          0     41ns 18932329.        0B     56.8

# A slice
bench::mark(
  x[10:60],
  y[10:60],
  iterations = 1000000
)
#> # A tibble: 2 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 x[10:60]      123ns    246ns  3228295.      512B     25.8
#> 2 y[10:60]      287ns    369ns  2216498.      512B     15.5

# Still not materialized!
view_is_materialized(y)
#> [1] FALSE

# Duplication currently forces materialization, I think this is something
# we can fix in base R. They use `INTEGER()` where I think they should use
# `INTEGER_RO()`. Returned object is not ALTREP.
z <- duplicate(y)
view_is_materialized(y)
#> [1] TRUE

is_altrep(z)
#> [1] FALSE

# A slice after materialization - still takes the same amount of time as before materialization
# (in both cases we use a cached readonly pointer to the data)
bench::mark(
  x[10:60],
  y[10:60],
  iterations = 1000000
)
#> # A tibble: 2 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 x[10:60]      123ns    246ns  3180695.      512B     22.3
#> 2 y[10:60]      287ns    369ns  2222617.      512B     13.3

# Large view, 1,000,000 elements
base <- sample(1e7)
x <- base[seq(from = 1000, length.out = 1000000)]
y <- vec_view(base, start = 1000, size = 1000000)

# Large slice (still roughly 2x slower, same as with small slice)
bench::mark(
  x[1:500000],
  y[1:500000],
  iterations = 1000
)
#> # A tibble: 2 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 x[1:5e+05] 909.54µs   1.17ms      822.    3.81MB     24.5
#> 2 y[1:5e+05]   2.19ms   2.42ms      413.    3.81MB     10.2

# Large slice, random index
# The difference goes away as cache misses tend to dominate instead
idx <- sample(1e7, 1e6)
bench::mark(
  x[idx],
  y[idx],
  iterations = 1000
)
#> # A tibble: 2 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 x[idx]       2.19ms    2.4ms      408.    3.81MB    10.5 
#> 2 y[idx]       2.47ms   2.69ms      357.    3.81MB     8.78

# Still not materialized!
view_is_materialized(y)
#> [1] FALSE

This typically makes most accesses 30-40% faster than the previous approach, due to less indirection. This puts us at just 2x slower than non-ALTREP accesses, rather than previously where we were 3-4x slower. Seems pretty good.

DavisVaughan · 2024-08-05T17:08:25Z

Try using this from r_init_library() to get at the DLLInfo

> unclass(unclass(getLoadedDLLs())$rlang)
$name
[1] "rlang"

$path
[1] "/Users/davis/Library/R/arm64/4.4/library/rlang/libs/rlang.so"

$dynamicLookup
[1] FALSE

$handle
<pointer: 0xa3302e10>
attr(,"class")
[1] "DLLHandle"

$info
<pointer: 0x14488b710>
attr(,"class")
[1] "DLLInfoReference"

$forceSymbols
[1] FALSE

DavisVaughan added 5 commits July 1, 2024 13:45

First implementation of ALTREP views on lgl/int/dbl/cpl

3154926

Add chr/list support

ce1da3b

Add raw support

ad22706

Cache pointers for performance optimizations

acdf0ab

This typically makes most accesses 30-40% faster than the previous approach, due to less indirection. This puts us at just 2x slower than non-ALTREP accesses, rather than previously where we were 3-4x slower. Seems pretty good.

Make ALTLIST view support conditional on R >=4.3.0

8ae0fee

DavisVaughan mentioned this pull request Jul 12, 2024

Making R internals more ALTREP friendly r-devel/r-dev-day#23

Open

DavisVaughan mentioned this pull request Jul 26, 2024

Avoid non-API calls when truncating overallocated vectors r-lib/cpp11#362

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add ALTREP view support #1725

Add ALTREP view support #1725

DavisVaughan commented Jul 1, 2024 •

edited

Loading

DavisVaughan commented Aug 5, 2024

Add ALTREP view support #1725

Are you sure you want to change the base?

Add ALTREP view support #1725

Conversation

DavisVaughan commented Jul 1, 2024 • edited Loading

DavisVaughan commented Aug 5, 2024

DavisVaughan commented Jul 1, 2024 •

edited

Loading