  ________
 /\ mori  \
/  \       \
\  /  森   /
 \/_______/
Shared Memory for R Objects
→ share() writes an R object into shared memory and returns a shared
version
→ ALTREP serialization hooks — shared objects serialize compactly and
work transparently with serialize() and mirai()
→ ALTREP-backed lazy access — a 100-column data frame is one mmap;
columns materialize on first touch
→ OS-level shared memory (POSIX / Win32) — pure C, no external dependencies; read-only in other processes, preventing corruption of shared data
→ Automatic cleanup — shared memory is freed when the R object is garbage collected
install.packages("mori")

share() writes an R object once into shared memory and returns a
zero-copy ALTREP view. Shared objects serialize compactly via ALTREP
serialization hooks, working transparently with mirai and any R
serialization path. Shared memory is automatically freed when the object
is garbage collected.
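For a sense of what compact serialization saves, the sketch below measures the baseline in base R: a plain double vector of one million elements serializes to a little over 8 MB. The comparison against a shared object only runs if mori is installed; the variable names are illustrative, and the shared size is printed rather than assumed.

```r
# Baseline: base R serializes the full payload (8 bytes per double plus a
# small header).
n <- 1e6
plain <- rnorm(n)
plain_bytes <- length(serialize(plain, NULL))
cat("plain vector:", plain_bytes, "bytes\n")

# Optional comparison, only when mori is available. The object is assigned
# to a variable so the shared region stays alive while it is measured.
if (requireNamespace("mori", quietly = TRUE)) {
  shared <- mori::share(plain)
  shared_bytes <- length(serialize(shared, NULL))
  cat("shared vector:", shared_bytes, "bytes\n")
}
```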
library(mori)
# Share a vector — returns an ALTREP-backed object
x <- share(rnorm(1e6))
mean(x)
#> [1] 0.0005982035
# Serialized form is ~100 bytes, not ~8 MB
x |> serialize(NULL) |> length()
#> [1] 124

shared_name() extracts the SHM name from a shared object.
map_shared() opens a shared region by name — useful for accessing the
same data from another process without serialization:
x <- share(1:1e6)
# Extract the SHM name
nm <- shared_name(x)
nm
#> [1] "/mori_4c0f_1"
# Another process can map the same region by name
y <- map_shared(nm)
identical(x[], y[])
#> [1] TRUE

Shared objects can be sent to local daemons: the ALTREP serialization hooks ensure only the SHM name crosses the wire, and the daemon maps the same physical memory.
library(lobstr)
library(mirai)
daemons(1)
x <- share(rnorm(1e6))
# Worker maps the same shared memory — 0 bytes copied
m <- mirai(list(mean = mean(x), size = lobstr::obj_size(x)), x = x)
m[]
#> $mean
#> [1] 0.0008675476
#>
#> $size
#> 840 B
daemons(0)

Elements of a shared list also serialize compactly. Each element travels as a reference to its position in the parent shared region, not as the full data:
daemons(3)
# Share a list — all 3 vectors in a single shared region
x <- list(a = rnorm(1e6), b = rnorm(1e6), c = rnorm(1e6)) |> share()
# Each element is sent as (parent_name, index) — zero-copy on the worker
mirai_map(x, \(v) lobstr::obj_size(v) |> format())[.flat]
#>       a       b       c
#> "840 B" "840 B" "840 B"
daemons(0)

Parallel computing multiplies memory. When 8 workers each need the same 210 MB dataset, that is 1.7 GB of serialization, transfer, and deserialization, with 8 separate copies consuming RAM.
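The arithmetic behind these figures can be checked in a few lines. This is only a back-of-the-envelope sketch using the numbers quoted in this section (8 workers, a 210 MB dataset, and a reference of roughly 300 bytes); the variable names are illustrative.

```r
workers <- 8
dataset_mb <- 210        # size of the dataset each worker needs
reference_bytes <- 300   # approximate size of a shared-memory reference

# Total data serialized and copied without sharing, in GB (decimal units)
total_gb <- workers * dataset_mb / 1000
cat("total copied:", total_gb, "GB\n")   # 1.68, i.e. about 1.7 GB

# How many times smaller the reference is than the full dataset
ratio <- dataset_mb * 1e6 / reference_bytes
cat("payload ratio:", format(ratio, big.mark = ",", scientific = FALSE), "\n")
```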
mori eliminates all of it. share() writes data into shared memory
once. Each worker maps the same physical pages, receiving a reference of
~300 bytes instead of the full dataset — a payload ~700,000 times
smaller, which translates into a significant saving in memory usage as
well as total runtime:
daemons(8)
# 200 MB data frame — 5 columns × 5M rows
df <- as.data.frame(matrix(rnorm(25e6), ncol = 5))
shared_df <- share(df)
boot_mean <- \(i, data) colMeans(data[sample(nrow(data), replace = TRUE), ])
# Without mori — each daemon deserializes a full copy
mirai_map(1:8, boot_mean, data = df)[] |> system.time()
#>    user  system elapsed
#>   2.135  38.222   5.823
# With mori — each daemon maps the same shared memory
mirai_map(1:8, boot_mean, data = shared_df)[] |> system.time()
#>    user  system elapsed
#>   1.377  27.121   3.949
daemons(0)

All atomic vector types and lists / data frames are written directly
into shared memory, with attributes preserved end-to-end. Pairlists are
coerced to lists. share() returns ALTREP wrappers that point into the
shared pages — no deserialization, no per-process memory allocation.
All other R objects (environments, closures, language objects) are
returned unchanged by share() — no shared memory region is created.
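A small sketch of the two behaviours described above: a data frame keeps its attributes when shared, and a closure comes back unchanged. To keep the snippet runnable without mori, `share_fn` is a hypothetical helper that falls back to `identity()` when the package is not installed; in that case nothing is actually shared.

```r
# Use mori::share() when available; otherwise fall back to identity() so the
# sketch still runs (the fallback places nothing in shared memory).
share_fn <- if (requireNamespace("mori", quietly = TRUE)) mori::share else identity

# A data frame: class and names are expected to survive sharing
df <- data.frame(id = 1:3, score = c(0.5, 1.5, 2.5))
shared_df <- share_fn(df)
print(class(shared_df))
print(names(shared_df))

# A closure is not a shareable type, so it should be returned as-is
f <- function(x) x + 1
same_closure <- identical(share_fn(f), f)
print(same_closure)
```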
A data frame with 10 columns lives in a single shared region. A task that touches 3 columns pays for 3. Character strings are accessed lazily per element.
Shared memory is managed by R’s garbage collector. The SHM region stays alive as long as the shared object (or any element extracted from it) is referenced in R. When no references remain, the garbage collector frees the shared memory automatically.
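The general idea of tying an external resource to an R object's lifetime can be illustrated in base R with `reg.finalizer()`. This is an analogy only: it does not use mori and makes no claim about how mori is implemented. The `handle` and `state` environments are hypothetical stand-ins for a shared object and its shared memory region.

```r
# 'state' records whether the stand-in resource has been released
state <- new.env()
state$freed <- FALSE

# 'handle' stands in for an object that owns an external resource
handle <- new.env()
reg.finalizer(handle, function(e) assign("freed", TRUE, envir = state))

# While 'handle' is still referenced, a collection does not run the finalizer
invisible(gc())
freed_while_referenced <- state$freed
print(freed_while_referenced)

# Once the last reference is dropped, the next collection can release it
rm(handle)
invisible(gc())
freed_after_release <- state$freed
print(freed_after_release)
```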
Important: Always assign the result of share() to a variable. The
shared memory is kept alive by the R object reference — if the result is
used temporarily (not assigned), the garbage collector may free the
shared memory before a consumer process has mapped it.
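The difference between the risky and the safe pattern can be sketched as follows. The snippet uses only the functions shown in this document (`share()`, `shared_name()`, `map_shared()`) and is skipped when mori is not installed; the variable names are illustrative.

```r
has_mori <- requireNamespace("mori", quietly = TRUE)

if (has_mori) {
  # Risky: the shared object is a temporary. Nothing references it after this
  # line, so the region behind 'nm_risky' may be freed at any collection.
  nm_risky <- mori::shared_name(mori::share(rnorm(10)))

  # Safer: keep the shared object in a variable for as long as any consumer
  # may need to map the region by name.
  x <- mori::share(rnorm(10))
  nm <- mori::shared_name(x)
  y <- mori::map_shared(nm)
  print(identical(x[], y[]))
} else {
  message("mori is not installed; skipping the sketch")
}
```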
Shared data is mapped read-only. Mutations are always local — R’s copy-on-write mechanism ensures other processes continue reading the original shared data:
- Structural changes to a list or data frame (adding, removing, or reordering elements) produce a regular R list. The shared region is unaffected.
- Modifying values within a shared vector (e.g., X[1] <- 0) materializes just that vector into a private copy. Other vectors in the same shared region stay zero-copy.
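Both cases above can be observed from the R side: the modified copy changes while the original keeps its values. In this sketch `share_fn` is a hypothetical helper that falls back to `identity()` when mori is not installed, in which case the snippet simply demonstrates R's ordinary copy-on-modify semantics.

```r
# Use mori::share() when available; otherwise fall back to identity() so the
# sketch still runs (the fallback places nothing in shared memory).
share_fn <- if (requireNamespace("mori", quietly = TRUE)) mori::share else identity

# Modifying a value: 'y' becomes a private copy, 'x' is left untouched
x <- share_fn(c(1, 2, 3))
y <- x
y[1] <- 0
print(x[1])
print(y[1])

# Structural change: adding an element affects only the modified list
lst <- share_fn(list(a = c(1, 2), b = c(3, 4)))
lst2 <- lst
lst2$c <- 5
print(names(lst))
print(names(lst2))
```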
Please note that the mori project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.