Improve unchop() and unnest() performance #1127

@mgirlich

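unnest() and unchop() can be sped up quite a bit. In the benchmark below, the median time for unnest() drops from 96.3ms to 33.8ms (roughly 2.8x faster).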

Using the same reprex as in #751:

library(tidyr)
n <- 100000

df <- tibble(
  g = 1:n,
  y = rep(list(tibble(x = 1:5)), n)
)

bench::mark(
  unnest(df, y),
  unnest_legacy(df, y),
  check = TRUE
)


# before
#> Warning: Some expressions had a GC in every iteration; so filtering is disabled.
#> # A tibble: 2 x 13
#>   expression                min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc
#>   <bch:expr>           <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>
#> 1 unnest(df, y)          90.6ms   96.3ms     10.1     2.85MB     20.3     6    12
#> 2 unnest_legacy(df, y)  134.8ms  141.2ms      7.08    4.15MB     23.0     4    13

# after
#> Warning: Some expressions had a GC in every iteration; so filtering is disabled.
#> # A tibble: 2 x 13
#>   expression                min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc
#>   <bch:expr>           <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>
#> 1 unnest(df, y)          30.2ms   33.8ms     29.6     4.26MB     13.8    15     7
#> 2 unnest_legacy(df, y)  120.6ms  124.1ms      7.66     3.6MB     26.8     4    14
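
The reprex above only times unnest(). A similar sketch for unchop() might look like the following. It assumes a list-column of plain integer vectors rather than tibbles; the name `df_chopped` and the input shape are made up for illustration, and no timings are shown because none were measured here.

```r
library(tidyr)

n <- 100000

# Hypothetical input: each row holds an integer vector of length 5,
# which is the kind of list-column unchop() expands.
df_chopped <- tibble(
  g = 1:n,
  y = rep(list(1:5), n)
)

# Sanity check before timing: every row should expand into 5 rows.
out <- unchop(df_chopped, y)
stopifnot(nrow(out) == 5 * n)

# Time unchop() the same way the reprex times unnest().
bench::mark(
  unchop(df_chopped, y)
)
```

Running this before and after a change would give a comparison for unchop() in the same style as the unnest() tables above.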

Labels: feature (a feature request or enhancement), rectangling 🗄️ (converting deeply nested lists into tidy data frames)