R crashes when using tidyr::unnest_wider #1348

mattnolan001 · 2022-04-11T05:53:14Z

When I run tidyr::unnest_wider as part of a pipe, R crashes and reports the following error:

[91205:91206:20220410,071753.955164:ERROR file_io_posix.cc:148] open /home/matt/.r/crashpad_database/pending/dc2183c4-0851-4c62-908e-7d4e41a2702e.lock: File exists (17)
[91205:91205:20220410,071753.957703:ERROR process_memory_range.cc:86] read out of range
[91205:91205:20220410,071753.957712:ERROR elf_image_reader.cc:558] missing nul-terminator
[91205:91205:20220410,071753.957794:ERROR elf_dynamic_array_reader.h:61] tag not found
[91205:91205:20220410,071753.960132:ERROR elf_dynamic_array_reader.h:61] tag not found
[91205:91205:20220410,071753.960189:ERROR elf_dynamic_array_reader.h:61] tag not found
[91205:91205:20220410,071753.960236:ERROR elf_dynamic_array_reader.h:61] tag not found
[91205:91205:20220410,071753.960281:ERROR elf_dynamic_array_reader.h:61] tag not found

Further description and discussion here: https://stackoverflow.com/questions/71820155/r-crashes-when-using-tidyrunnest-wider

Here is a reproducible example (with thanks to Ben Bolker):

library(tidyverse)
f <- function(n) tibble(neuron=0:(n-1), r.squared = rnorm(n),
     slope = rnorm(n), p.value = rnorm(n))
df <- replicate(1000, f(1000), simplify = FALSE)

dff <- tibble(x=df)
for (i in 1:100) { 
   cat(i, "\n")
   unnest_wider(dff, x)
}

mattnolan001 · 2022-04-12T20:07:44Z

@lionel- A more reliable reprex is:

library(tidyverse)
f <- function(n) {
  df <- tibble(neuron=0:(n-1), r.squared = rnorm(n),
                        slope = rnorm(n), p.value = rnorm(n))
  df$p.value[2] <- NA
  df
}
df <- replicate(1000, f(1000), simplify = FALSE)

dff <- tibble(x=df)
for (i in 1:100) { 
  cat(i, "\n")
  unnest_wider(dff, x)
}

I've tested this with tidyr from CRAN and with tidyr 1.2.0.9000. It reliably crashes with both.
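
Because the crash takes down the whole R session, it can be convenient to run a reprex like the one above in a separate process and only inspect its exit status. The following is a minimal sketch using base R only; `run_isolated()` is a hypothetical helper name (not part of tidyr or vctrs), and it assumes an `Rscript` executable is available under `R.home("bin")`.

```r
# Hypothetical helper: evaluate R code in a fresh Rscript process so that a
# crash there does not take down the interactive session.
run_isolated <- function(code) {
  script <- tempfile(fileext = ".R")
  on.exit(unlink(script), add = TRUE)
  writeLines(code, script)

  rscript <- file.path(R.home("bin"), "Rscript")

  # With stdout/stderr discarded, system2() returns the child's exit status;
  # anything other than 0 means the child failed or was killed.
  system2(rscript, shQuote(script), stdout = FALSE, stderr = FALSE)
}

# Usage (not run here): save the reprex to a string and check the status.
# status <- run_isolated(reprex_code)
# if (status != 0) message("child R process exited abnormally: ", status)
```

A non-zero status only says that the child process did not finish cleanly; it does not identify the cause.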

DavisVaughan · 2022-04-12T20:13:09Z

r-lib/vctrs#1553 should fix this

mattnolan001 · 2022-04-12T20:15:35Z

Is r-lib/vctrs#1553 included in v1.2.0.9000? That is the version number reported for the build I just pulled from the main GitHub repo, and it still crashes.

DavisVaughan · 2022-04-12T20:19:20Z

You'd have to run devtools::install_github("r-lib/vctrs#1553") and then restart R. After that, both CRAN tidyr and dev tidyr should work.

mattnolan001 · 2022-04-12T20:41:21Z

Thanks! That works now.

lionel- · 2022-04-13T10:37:39Z

The fix is now on CRAN, thanks for the reprex and report @mattnolan001 @bbolker!
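
To check whether a given installation already includes the fix, the installed vctrs version can be compared against 0.4.1, the release whose changelog entry references this issue. This is a minimal sketch; `has_unnest_fix()` is a hypothetical helper name, not a function from vctrs or tidyr.

```r
# Hypothetical helper: TRUE if the given vctrs version is at least 0.4.1.
# The default argument is only evaluated when no version is supplied, so the
# function can also be called with a version string directly.
has_unnest_fix <- function(version = utils::packageVersion("vctrs")) {
  package_version(as.character(version)) >= "0.4.1"
}

# Usage (requires vctrs to be installed):
# if (!has_unnest_fix()) install.packages("vctrs")
```

After updating, R needs to be restarted so that the new vctrs build is the one actually loaded.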

# vctrs 0.4.1

* OOB errors with `character()` indexes use "that don't exist" instead of "past the end" (#1543).
* Fixed memory protection issues related to common type determination (#1551, tidyverse/tidyr#1348).

# vctrs 0.4.0

* New experimental `vec_locate_sorted_groups()` for returning the locations of groups in sorted order. This is equivalent to, but faster than, calling `vec_group_loc()` and then sorting by the `key` column of the result.
* New experimental `vec_locate_matches()` for locating where each observation in one vector matches one or more observations in another vector. It is similar to `vec_match()`, but returns all matches by default (rather than just the first), and can match on binary conditions other than equality. The algorithm is inspired by data.table's very fast binary merge procedure.
* The `vec_proxy_equal()`, `vec_proxy_compare()`, and `vec_proxy_order()` methods for `vctrs_rcrd` are now applied recursively over the fields (#1503).
* Lossy cast errors now inherit from incompatible type errors.
* `vec_is_list()` now returns `TRUE` for `AsIs` lists (#1463).
* `vec_assert()`, `vec_ptype2()`, `vec_cast()`, and `vec_as_location()` now use `caller_arg()` to infer a default `arg` value from the caller. This may result in unhelpful arguments being mentioned in error messages. In general, you should consider snapshotting vctrs error messages thrown in your package and supply `arg` and `call` arguments if the error context is not adequately reported to your users.
* `vec_ptype_common()`, `vec_cast_common()`, `vec_size_common()`, and `vec_recycle_common()` gain `call` and `arg` arguments for specifying an error context.
* `vec_compare()` can now compare zero column data frames (#1500).
* `new_data_frame()` now errors on negative and missing `n` values (#1477).
* `vec_order()` now correctly orders zero column data frames (#1499).
* vctrs now depends on cli to help with error message generation.
* New `vec_check_list()` and `list_check_all_vectors()` input checkers, and an accompanying `list_all_vectors()` predicate.
* New `vec_interleave()` for combining multiple vectors together, interleaving their elements in the process (#1396).
* `vec_equal_na(NULL)` now returns `logical(0)` rather than erroring (#1494).
* `vec_as_location(missing = "error")` now fails with `NA` and `NA_character_` in addition to `NA_integer_` (#1420, @krlmlr).
* Starting with rlang 1.0.0, errors are displayed with the contextual function call. Several vctrs operations gain a `call` argument that makes it possible to report the correct context in error messages. This concerns:

  - `vec_cast()` and `vec_ptype2()`
  - `vec_default_cast()` and `vec_default_ptype2()`
  - `vec_assert()`
  - `vec_as_names()`
  - `stop_` constructors like `stop_incompatible_type()`

  Note that default `vec_cast()` and `vec_ptype2()` methods automatically support this if they pass `...` to the corresponding `vec_default_` functions. If you throw a non-internal error from a non-default method, add a `call = caller_env()` argument in the method and pass it to `rlang::abort()`.
* If `NA_character_` is specified as a name for `vctrs_vctr` objects, it is now automatically repaired to `""` (#780).
* `""` is now an allowed name for `vctrs_vctr` objects and all its subclasses (`vctrs_list_of` in particular) (#780).
* `list_of()` is now much faster when many values are provided.
* `vec_as_location()` evaluates `arg` only in case of error, for performance (#1150, @krlmlr).
* `levels.vctrs_vctr()` now returns `NULL` instead of failing (#1186, @krlmlr).
* `vec_assert()` produces a more informative error when `size` is invalid (#1470).
* `vec_duplicate_detect()` is a bit faster when there are many unique values.
* `vec_proxy_order()` is described in `vignette("s3-vectors")` (#1373, @krlmlr).
* `vec_chop()` now materializes ALTREP vectors before chopping, which is more efficient than creating many small ALTREP pieces (#1450).
* New `list_drop_empty()` for removing empty elements from a list (#1395).
* `list_sizes()` now propagates the names of the list onto the result.
* Name repair messages are now signaled by `rlang::names_inform_repair()`. This means that the messages are now sent to stdout by default rather than to stderr, resulting in prettier messages. Additionally, name repair messages can now be silenced through the global option `rlib_name_repair_verbosity`, which is useful for testing purposes. See `?names_inform_repair` for more information (#1429).
* `vctrs_vctr` methods for `na.omit()`, `na.exclude()`, and `na.fail()` have been added (#1413).
* `vec_init()` is now slightly faster (#1423).
* `vec_set_names()` no longer corrupts `vctrs_rcrd` types (#1419).
* `vec_detect_complete()` now computes completeness for `vctrs_rcrd` types in the same way as data frames, which means that if any field is missing, the entire record is considered incomplete (#1386).
* The `na_value` argument of `vec_order()` and `vec_sort()` now correctly respect missing values in lists (#1401).
* `vec_rep()` and `vec_rep_each()` are much faster for `times = 0` and `times = 1` (@mgirlich, #1392).
* `vec_equal_na()` and `vec_fill_missing()` now work with integer64 vectors (#1304).
* The `xtfrm()` method for vctrs_vctr objects no longer accidentally breaks ties (#1354).
* `min()`, `max()` and `range()` no longer throw an error if `na.rm = TRUE` is set and all values are `NA` (@gorcha, #1357). In this case, and where an empty input is given, it will return `Inf`/`-Inf`, or `NA` if `Inf` can't be cast to the input type.
* `vec_group_loc()`, used for grouping in dplyr, now correctly handles vectors with billions of elements (up to `.Machine$integer.max`) (#1133).

lionel- added a commit to r-lib/vctrs that referenced this issue Apr 12, 2022

Fix counters protection issue

70f9793

Closes tidyverse/tidyr#1348

lionel- mentioned this issue Apr 12, 2022

Fix protection issue in reduce-common-type counters r-lib/vctrs#1553

Merged

lionel- closed this as completed in r-lib/vctrs#1553 Apr 13, 2022
