Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reproducible segmentation fault (with tidyr and openxlsx), bisect included #244

Closed
bersbersbers opened this issue Oct 18, 2021 · 9 comments · Fixed by #245
Closed

Reproducible segmentation fault (with tidyr and openxlsx), bisect included #244

bersbersbers opened this issue Oct 18, 2021 · 9 comments · Fixed by #245

Comments

@bersbersbers
Copy link

bersbersbers commented Oct 18, 2021

Open R and copy-and-paste this:

x <- data.frame()
invisible(gctorture2(50))

fun1 <- function() {
    as.data.frame(tidyr:::simplifyPieces(
        list(c("A", "A"), c("A", "B"), c("B", "A"), c("B", "B")), 2, FALSE
    )$strings)
}

fun2 <- function() {
    openxlsx::write.xlsx(as.data.frame(1), tempfile())
}

y <- fun1() # try(fun1) does not repro!
fun2()
while (TRUE) {
    print(fun1())
}

gives

 *** caught segfault ***
address 0x55bb00000004, cause 'memory not mapped'

Traceback:
 1: print.data.frame(fun1())
 2: print(fun1())

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection: `

I can repro the problem on five OSes: 1 x OpenSUSE Leap 15.2, 2 x Windows 10, and 2 x Ubuntu 20.04 in Windows 10 WSL.

This issue has a long history with me (tidyverse/tidyr#1163, ycphs/openxlsx#267, tidyverse/ggplot2#4635, rstudio/rmarkdown#2229, thomasp85/patchwork#278).

After bisecting CRAN releases of tidyr, I finally bisected cpp11 from GitHub and found 087861f as the first bad commit. More details below.

Using R -e 'devtools::install_version("cpp11", "0.4.0"); install.packages("tidyr")' && R # repro!
Using R -e 'devtools::install_version("cpp11", "0.3.1"); install.packages("tidyr")' && R # no repro!

Using R -e 'devtools::install(); install.packages("tidyr")' && R
29ce41de04a6cc884f0217009c9e00b8c611d0c4 (master) bad
5a26bdb8e83d96e78c49bdb14024409309825a62 (v0.3.1) bad (must have been a mistake)
7d9195f47b9c653f05ff888eac4a6ef6d3389439 (v0.3.0) good
73e1acb15bcaac6b9c2c72804b1c0e4fdaeab9c0 good
5a26bdb8e83d96e78c49bdb14024409309825a62 good
5a26bdb8e83d96e78c49bdb14024409309825a62 (v0.3.1) good
1c84e03239dd759c8e8f52d185d9acf3c1251041 bad
9df63865765284ded32ab49bb44deb3c48d385f5 good
ac7374e81bb9e8fe35b21424c53e21b91b00c9e4 good
cde1872c8000b79e008420f5923dbfe2eb51e953 bad
087861f9ee2383d717255ecb0487e4b297808c49 bad
b9ec7eb4448b3eeea47e65df917f69034fda926c good
087861f9ee2383d717255ecb0487e4b297808c49 first bad!

I am still not convinced that cpp11 is the culprit, but developers of openxlsx and tidyr, while being able to reproduce the issue, don't find the source in their code base, either. So maybe people here have an idea.

@DavisVaughan
Copy link
Member

It is also generally worth noting that I can't seem to reproduce this with an old version of tidyr that used Rcpp, but soon after the switch to cpp11 this starts happening. It has proved to be difficult to isolate though

@jimhester
Copy link
Member

I so far have been unable to reproduce this either on macOS or in a Ubuntu Docker container.

The behavior looks like either stack corruption or a protection failure, but it is going to be difficult to track down unless I can reproduce it myself.

@bersbersbers
Copy link
Author

I had hoped that an example that fails on Windows and Linux is universal, but it seems not. But I can provider some further information.

Windows is Windows 10 21H2, R 4.1.1
WSL is Ubuntu 20.04, R 4.1.1.
Linux is OpenSUSE Leap 15.2

Here's a session info from WSL (before the loop):

> sessionInfo()
R version 4.1.1 (2021-08-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8
 [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8
 [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C
[10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.7       tidyr_1.1.4      fansi_0.5.0      utf8_1.2.2
 [5] crayon_1.4.1     dplyr_1.0.7      R6_2.5.1         lifecycle_1.0.1
 [9] magrittr_2.0.1   zip_2.2.0        pillar_1.6.4     stringi_1.7.5
[13] rlang_0.4.12     vctrs_0.3.8      generics_0.1.0   ellipsis_0.3.2
[17] openxlsx_4.2.4   tools_4.1.1      glue_1.4.2       purrr_0.3.4
[21] compiler_4.1.1   pkgconfig_2.0.3  tidyselect_1.1.1 tibble_3.1.5

What else could be helpful for me to provide?

@DavisVaughan
Copy link
Member

I've managed to reproduce this without tidyr

  • CRAN cpp11 (0.4.0)
  • CRAN openxlsx (4.2.4)
  • Mac Mojave
  • R 4.1.1

Open up an R Console and run this and for me it segfaults most of the time (running this in RStudio won't immediately crash it, but it often makes further processing hang and eventually segfault somewhere)

invisible(gctorture2(50))

cpp11::cpp_function(
  '
  cpp11::list fn_cpp() {
    cpp11::writable::list out(4);
    return out;
  }
  '
)

fun2 <- function() {
  openxlsx::write.xlsx(as.data.frame(1), tempfile())
}

y <- fn_cpp()
fun2()

while (TRUE) {
  print(fn_cpp())
}

@bersbersbers
Copy link
Author

bersbersbers commented Oct 26, 2021

@DavisVaughan great job. I had to install.packages("decor"), but now I could confirm the segmentation fault on WSL (pasting the code into the R console on the shell) on the first try (not on the second, though). On native Linux, there is not segfault either as far as I have tried (a few times), but I guess the most important point is to have one less package to care about. I wonder when we find/if there is an example without openxlsx.

jimhester added a commit that referenced this issue Oct 26, 2021
It seems relying on non-local static objects can cause memory issues,
that are resolved by moving to use a local static objects instead.

Fixes #244
@jimhester
Copy link
Member

Ok, I have finally be able to reproduce this, I think the problem was I was using R 4.1.0 rather than 4.1.1+, and I didn't encounter the crash with that version.

#245 resolves this for me, @bersbersbers could you try that with your more full examples and verify it fixes the problem there as well?

@DavisVaughan
Copy link
Member

I ran my example above with gctorture2(5) and it seems to be working, so I think that was it?

@bersbersbers
Copy link
Author

bersbersbers commented Oct 26, 2021

Test log:

  • Code in initial post repros
  • devtools::install_github("r-lib/cpp11")
  • Code in initial post repros
  • install.packages("tidyr")
  • Code in initial post repros
  • devtools::install_github("https://github.com/r-lib/cpp11/pull/245")
  • Code in initial post repros
  • install.packages("tidyr")
  • Code in initial post repros (not!!!)
  • install.packages("cpp11")
  • Code in initial post repros (not)
  • install.packages("tidyr")
  • Code in initial post repros

This looks pretty consistent in that installing #245 and reinstalling tidyr fixes the issue until tidyr is rebuilt with another version of cpp11 not including #245. Looks like a weeks-long saga comes to an end, thanks to everyone involved 🚀

@bersbersbers
Copy link
Author

Tests using my original code took a bit longer, so here is the condensed log:

  • install.packages("cpp11"); install.packages("tidyr")
  • My original code repros (1st iteration)
  • devtools::install_github("https://github.com/r-lib/cpp11/pull/245"); install.packages("tidyr")
  • My original code repros (not - tested 3 iterations)
  • devtools::install_github("r-lib/cpp11"); install.packages("tidyr")
  • My original code repros (1st iteration)

Thanks again!

jimhester added a commit that referenced this issue Nov 1, 2021
It seems relying on non-local static objects can cause memory issues,
that are resolved by moving to use a local static objects instead.

Fixes #244
@bersbersbers bersbersbers mentioned this issue Nov 8, 2021
14 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants