Segmentation fault involving code using write.xlsx (and tidyr) #267

Closed
bersbersbers opened this issue Oct 8, 2021 · 18 comments

@bersbersbers

I am cross-posting this here and in tidyverse/tidyr#1163 after a series of earlier investigations (thomasp85/patchwork#278, rstudio/rmarkdown#2229, tidyverse/ggplot2#4635).

I am seeing reproducible segmentation faults on Linux and Windows using the following code. There are two packages remaining which are involved, tidyr and openxlsx. I have excluded many packages along the way, but I am unable to get any further with regard to these two:

x <- data.frame()
invisible(gctorture2(500))

fun1 <- function() {
    as.data.frame(tidyr:::simplifyPieces(
        list(c("A", "A"), c("A", "B"), c("B", "A"), c("B", "B")), 2, FALSE
    )$strings)
}

fun2 <- function() {
    openxlsx::write.xlsx(as.data.frame(1), "tmp.xlsx", overwrite = TRUE)
}

y <- fun1()
fun2()
while (TRUE) {
    print(fun1())
}
@JanMarvin
Collaborator

Hi @bersbersbers , does fun2() crash? I don't see openxlsx used anywhere else? If so, why does it crash with our current master? In addition, this bug report is not really helpful, because we now have a symptom, but nothing else to work with. Any specific error message? Unless we crash with as.data.frame(1), this looks unrelated to me.

@JanMarvin added the "waiting for answer" label (If not answered, the issue will be closed in 7 days.) on Oct 8, 2021
@bersbersbers
Author

bersbersbers commented Oct 8, 2021

Hi @bersbersbers , does fun2() crash?

Usually not, no. But the same example without fun2() never crashes, either.

does it crash with our current master?

Good question. Yes (with 3309cb9).

In addition, this bug report is not really helpful because we now have a symptom, but nothing else to work with.

Have you tried reproducing it? I can on two completely unrelated systems. If you could, you might be able to check more easily which parts of openxlsx might be related.

Any specific error message?

Yes - segmentation faults and/or "recursive gc invocation". May look like this:

 *** caught segfault ***
address 0x55b100000004, cause 'memory not mapped'

Traceback:
 1: mode(x)
 2: format.default(x[[i]], ..., justify = justify)
 3: format(x[[i]], ..., justify = justify)
 4: format.data.frame(if (omit) x[seq_len(n0), , drop = FALSE] else x,     digits = digits, na.encode = FALSE)
 5: as.matrix(format.data.frame(if (omit) x[seq_len(n0), , drop = FALSE] else x,     digits = digits, na.encode = FALSE))
 6: print.data.frame(fun1())
 7: print(fun1())
An irrecoverable exception occurred. R is aborting now ...
Segmentation fault (core dumped)

@JanMarvin
Collaborator

Well it's crashing in fun1(). Therefore I conclude that it's not our fault. I'm not sure what tidyr does, maybe it's using an option set by us? On the other hand you're using an internal API to interact with their code, probably circumventing fail-safes they might have implemented. There are reasons why one should not do this.

@JanMarvin
Collaborator

Maybe use do.call("rbind",...) or the corresponding tidyverse functions, but this looks to me as an unintended use of code and not at all like a bug. Therefore I'm closing this.

@JanMarvin removed the "waiting for answer" label (If not answered, the issue will be closed in 7 days.) on Oct 8, 2021
@bersbersbers
Author

bersbersbers commented Oct 8, 2021

Well it's crashing in fun1(). Therefore I conclude that it's not our fault.

I'm not sure it's that easy. I saw similar code crash with saveRDS(ggplot()), but it turned out not to be related to ggplot2 at all. See the linked issue above,
tidyverse/ggplot2#4635 (comment).

On the other hand you're using an internal API to interact with their code, probably circumventing fail-safes they might have implemented. There are reasons why one should not do this.

True. Check
tidyverse/ggplot2#4635 (comment) near the bottom for an example that uses only public APIs. I made a deliberate effort to reduce the code to the absolute necessary minimum.

Maybe once more: I am not saying this is openxlsx's fault. But it clearly is involved, and maybe you guys can help. I think you would agree that the code above should not segfault at all.

@bersbersbers
Author

Well it's crashing in fun1().

Also check the second Details block in tidyverse/ggplot2#4635 (comment). This shows openxlsx very close to the top.

@JanMarvin
Collaborator

Hm, the only way I can see that we might be related to the crash is through some options. You could check this by running once in an entirely clean environment and once with our options set. Something like opt <- options() and restore after loading our code. Other than that I don't think that we set special things for the environment or classes.

The thing with the tidyverse interaction is that it's nearly impossible to debug whatever is going on. You begin with one package and a few hours later you're at some minor dependency of a minor dependency that is used in some of their examples (that's not entirely exaggerated). I'll look into it later, but I'm not going to promise anything.
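
The options check suggested above could be sketched roughly as follows. This is only an illustration: `option_diff` is a hypothetical helper, not part of openxlsx or base R, and it only reports a difference if the namespace was not already loaded in the session.

```r
# Hypothetical helper: report options that are added or changed
# when a package namespace is loaded.
option_diff <- function(pkg) {
  before <- options()
  loadNamespace(pkg)
  after <- options()

  # Options that exist only after loading
  added <- setdiff(names(after), names(before))

  # Options present in both snapshots whose values differ
  common <- intersect(names(after), names(before))
  same <- vapply(
    common,
    function(n) identical(after[[n]], before[[n]]),
    logical(1)
  )
  changed <- common[!same]

  list(added = added, changed = changed)
}

# Run in a fresh session so the namespace is not already loaded:
# option_diff("openxlsx")
```

Any names listed in `added` or `changed` would be candidates to reset with `options()` before rerunning the crashing example.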

@bersbersbers
Author

bersbersbers commented Oct 8, 2021

Hm, the only way I can see that we might be related to the crash is through some options. You could check this by running once in an entirely clean environment and once with our options set. Something like opt <- options() and restore after loading our code. Other than that I don't think that we set special things for the environment or classes.

Good idea, I'll try that next week.

The thing with the tidyverse interaction is that it's nearly impossible to debug whatever is going on. You begin with one package and a few hours later you're at some minor dependency of a minor dependency that is used in some of their examples (that's not entirely exaggerated).

I noticed the same thing ;) That's one reason why I converted the public API call to the private one. simplifyPieces is already implemented in C++: https://github.com/tidyverse/tidyr/blob/master/R/cpp11.R
https://github.com/tidyverse/tidyr/blob/master/src/simplifyPieces.cpp

@jmbarbone
Contributor

@bersbersbers, can you try to use the reprex package and include your session information?

I ran the below and didn't have any errors.

Note that I changed two things: redirected xlsx creation to a tempfile() (for cleanup) and removed the while (TRUE) loop so that it would stop without having to fail -- although I ran this well over 5 minutes without any issues.

x <- data.frame()
invisible(gctorture2(500))

fun1 <- function() {
  as.data.frame(tidyr:::simplifyPieces(
    list(c("A", "A"), c("A", "B"), c("B", "A"), c("B", "B")), 2, FALSE
  )$strings)
}

fun2 <- function() {
  openxlsx::write.xlsx(as.data.frame(1), tempfile(), overwrite = TRUE)
}

y <- fun1()
fun2()
for (i in 1:2) {
  print(fun1())
}
#>   c..A....A....B....B.. c..A....B....A....B..
#> 1                     A                     A
#> 2                     A                     B
#> 3                     B                     A
#> 4                     B                     B
#>   c..A....A....B....B.. c..A....B....A....B..
#> 1                     A                     A
#> 2                     A                     B
#> 3                     B                     A
#> 4                     B                     B

Created on 2021-10-08 by the reprex package (v2.0.1)

Session info
sessioninfo::session_info()
#> - Session info ---------------------------------------------------------------
#>  setting  value                       
#>  version  R version 4.1.0 (2021-05-18)
#>  os       Windows 10 x64              
#>  system   x86_64, mingw32             
#>  ui       RTerm                       
#>  language (EN)                        
#>  collate  English_United States.1252  
#>  ctype    English_United States.1252  
#>  tz       America/New_York            
#>  date     2021-10-08                  
#> 
#> - Packages -------------------------------------------------------------------
#>  package     * version    date       lib source                         
#>  assertthat    0.2.1      2019-03-21 [1] CRAN (R 4.1.0)                 
#>  backports     1.2.1      2020-12-09 [1] CRAN (R 4.1.0)                 
#>  cli           3.0.1      2021-07-17 [1] CRAN (R 4.1.0)                 
#>  crayon        1.4.1      2021-02-08 [1] CRAN (R 4.1.0)                 
#>  DBI           1.1.1      2021-01-15 [1] CRAN (R 4.1.0)                 
#>  digest        0.6.27     2020-10-24 [1] CRAN (R 4.1.0)                 
#>  dplyr         1.0.7      2021-06-18 [1] CRAN (R 4.1.0)                 
#>  ellipsis      0.3.2      2021-04-29 [1] CRAN (R 4.1.0)                 
#>  evaluate      0.14       2019-05-28 [1] CRAN (R 4.1.0)                 
#>  fansi         0.5.0      2021-05-25 [1] CRAN (R 4.1.0)                 
#>  fastmap       1.1.0      2021-01-25 [1] CRAN (R 4.1.0)                 
#>  fs            1.5.0      2020-07-31 [1] CRAN (R 4.1.0)                 
#>  generics      0.1.0      2020-10-31 [1] CRAN (R 4.1.0)                 
#>  glue          1.4.2      2020-08-27 [1] CRAN (R 4.1.0)                 
#>  highr         0.9        2021-04-16 [1] CRAN (R 4.1.0)                 
#>  htmltools     0.5.2      2021-08-25 [1] CRAN (R 4.1.1)                 
#>  knitr         1.34       2021-09-09 [1] CRAN (R 4.1.1)                 
#>  lifecycle     1.0.0      2021-02-15 [1] CRAN (R 4.1.0)                 
#>  magrittr      2.0.1      2020-11-17 [1] CRAN (R 4.1.0)                 
#>  openxlsx      4.2.4.9000 2021-09-28 [1] Github (ycphs/openxlsx@fe7bd5b)
#>  pillar        1.6.2      2021-07-29 [1] CRAN (R 4.1.0)                 
#>  pkgconfig     2.0.3      2019-09-22 [1] CRAN (R 4.1.0)                 
#>  purrr         0.3.4      2020-04-17 [1] CRAN (R 4.1.0)                 
#>  R6            2.5.1      2021-08-19 [1] CRAN (R 4.1.0)                 
#>  Rcpp          1.0.7      2021-07-07 [1] CRAN (R 4.1.0)                 
#>  reprex        2.0.1      2021-08-05 [1] CRAN (R 4.1.0)                 
#>  rlang         0.4.11     2021-04-30 [1] CRAN (R 4.1.0)                 
#>  rmarkdown     2.10       2021-08-06 [1] CRAN (R 4.1.0)                 
#>  rstudioapi    0.13       2020-11-12 [1] CRAN (R 4.1.0)                 
#>  sessioninfo   1.1.1      2018-11-05 [1] CRAN (R 4.1.0)                 
#>  stringi       1.7.4      2021-08-25 [1] CRAN (R 4.1.1)                 
#>  stringr       1.4.0      2019-02-10 [1] CRAN (R 4.1.0)                 
#>  styler        1.5.1      2021-07-13 [1] CRAN (R 4.1.0)                 
#>  tibble        3.1.4      2021-08-25 [1] CRAN (R 4.1.1)                 
#>  tidyr         1.1.3      2021-03-03 [1] CRAN (R 4.1.0)                 
#>  tidyselect    1.1.1      2021-04-30 [1] CRAN (R 4.1.0)                 
#>  utf8          1.2.2      2021-07-24 [1] CRAN (R 4.1.0)                 
#>  vctrs         0.3.8      2021-04-29 [1] CRAN (R 4.1.0)                 
#>  withr         2.4.2      2021-04-18 [1] CRAN (R 4.1.0)                 
#>  xfun          0.25       2021-08-06 [1] CRAN (R 4.1.0)                 
#>  yaml          2.2.1      2020-02-01 [1] CRAN (R 4.1.0)                 
#>  zip           2.2.0      2021-05-31 [1] CRAN (R 4.1.0)                 
#> 
#> [1] C:/Users/jmbar/Documents/R/win-library/4.1
#> [2] C:/Program Files/R/R-4.1.0/library

Additionally, the error you provided (below) is showing a traceback from print.data.frame() and failing in mode() when trying to format whichever column returned from fun1(). These all look like base functions, so it would be surprising if there is something in the openxlsx package that causes these to fail.

 *** caught segfault ***
address 0x55b100000004, cause 'memory not mapped'

Traceback:
 1: mode(x)
 2: format.default(x[[i]], ..., justify = justify)
 3: format(x[[i]], ..., justify = justify)
 4: format.data.frame(if (omit) x[seq_len(n0), , drop = FALSE] else x,     digits = digits, na.encode = FALSE)
 5: as.matrix(format.data.frame(if (omit) x[seq_len(n0), , drop = FALSE] else x,     digits = digits, na.encode = FALSE))
 6: print.data.frame(fun1())
 7: print(fun1())
An irrecoverable exception occurred. R is aborting now ...
Segmentation fault (core dumped)

On the other hand you're using an internal API to interact with their code, probably circumventing fail-safes they might have implemented. There are reasons why one should not do this.

True. Check
tidyverse/ggplot2#4635 (comment) near the bottom for an example that uses only public APIs. I made a deliberate effort to reduce the code to the absolute necessary minimum.

Using the non-exported functions for examples is actually less helpful because there could be some additional steps around them that you didn't account for. Can you make a more simple reprex using exported functions?

@JanMarvin
Collaborator

I tried it at home and I could confirm the crash (R 4.1.1 on Arch). After a rebuild of openxlsx it went away. Maybe something spurious caused by an updated package? I replaced the openxlsx:: call with some other package and it caused a similar error which went away after a rebuild too. Took me at least an hour of my life 🙈

@bersbersbers
Author

bersbersbers commented Oct 8, 2021

@bersbersbers, can you try to use the reprex package and include your session information?

x <- data.frame()
invisible(gctorture2(500))

fun1 <- function() {
  as.data.frame(tidyr:::simplifyPieces(
    list(c("A", "A"), c("A", "B"), c("B", "A"), c("B", "B")), 2, FALSE
  )$strings)
}

fun2 <- function() {
  openxlsx::write.xlsx(as.data.frame(1), tempfile(), overwrite = TRUE)
}

y <- fun1()
fun2()
for (i in 1:2) {
  print(fun1())
}
#>   c..A....A....B....B.. c..A....B....A....B..
#> 1                     A                     A
#> 2                     A                     B
#> 3                     B                     A
#> 4                     B                     B
#>   c..A....A....B....B.. c..A....B....A....B..
#> 1                     A                     A
#> 2                     A                     B
#> 3                     B                     A
#> 4                     B                     B

Created on 2021-10-08 by the reprex package (v2.0.1)

R version 4.1.1 (2021-08-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
 [1] rstudioapi_0.13   knitr_1.36        magrittr_2.0.1    R.cache_0.15.0
 [5] R6_2.5.1          rlang_0.4.11      fastmap_1.1.0     fansi_0.5.0
 [9] highr_0.9         styler_1.6.2      tools_4.1.1       xfun_0.26
[13] R.oo_1.24.0       utf8_1.2.2        cli_3.0.1         clipr_0.7.1
[17] withr_2.4.2       htmltools_0.5.2   ellipsis_0.3.2    yaml_2.2.1
[21] digest_0.6.28     tibble_3.1.5      lifecycle_1.0.1   crayon_1.4.1
[25] processx_3.5.2    callr_3.7.0       purrr_0.3.4       ps_1.6.0
[29] vctrs_0.3.8       R.utils_2.11.0    fs_1.5.0          evaluate_0.14
[33] glue_1.4.2        rmarkdown_2.11    reprex_2.0.1      compiler_4.1.1
[37] pillar_1.6.3      backports_1.2.1   R.methodsS3_1.8.1 pkgconfig_2.0.3

Note that I changed two things: redirected xlsx creation to a tempfile() (for cleanup) and removed the while (TRUE) loop so that it would stop without having to fail -- although I ran this well over 5 minutes without any issues.

That code still fails for me when pasted into R (though not when run as a reprex). Output on Windows:

Type 'q()' to quit R.

> x <- data.frame()
> invisible(gctorture2(500))
>
> fun1 <- function() {
+   as.data.frame(tidyr:::simplifyPieces(
+     list(c("A", "A"), c("A", "B"), c("B", "A"), c("B", "B")), 2, FALSE
+   )$strings)
+ }
>
> fun2 <- function() {
+   openxlsx::write.xlsx(as.data.frame(1), tempfile(), overwrite = TRUE)
+ }
>
> y <- fun1()
> fun2()
> for (i in 1:2) {
+   print(fun1())
+ }
  c..A....A....B....B.. c..A....B....A....B..
1                     A                     A
2                     A                     B
3                     B                     A
4                     B                     B

C:\Users\bers>

On the other hand you're using an internal API to interact with their code, probably circumventing fail-safes they might have implemented. There are reasons why one should not do this.

True. Check
tidyverse/ggplot2#4635 (comment) near the bottom for an example that uses only public APIs. I made a deliberate effort to reduce the code to the absolute necessary minimum.

Using the non-exported functions for examples is actually less helpful because there could be some additional steps around them that you didn't account for. Can you make a more simple reprex using exported functions?

You may be misunderstanding (or maybe I am). tidyverse/ggplot2#4635 (comment) does not use any non-exported functions:

df <- data.frame(A.A = 0, A.B = 0, B.A = 0, B.B = 0)

invisible(gctorture2(500))

invisible(data.table::as.data.table(1))

x <- tidyr::build_longer_spec(
    df, tidyr::everything(), names_to = c("C", "D"), names_sep = "\\."
)

openxlsx::write.xlsx(as.data.frame(1), "tmp.xlsx", overwrite = TRUE)

print(tidyr::build_longer_spec(
    df, tidyr::everything(), names_to = c("C", "D"), names_sep = "\\."
))

print(tidyr::build_longer_spec(
    df, tidyr::everything(), names_to = c("C", "D"), names_sep = "\\."
))

@bersbersbers
Author

I tried it at home and I could confirm the crash (R 4.1.1 on Arch).

Great, so I'm not crazy after all ;)

After a rebuild of openxlsx it went away.

What do you mean by "rebuild"? I reinstalled all packages two nights ago using install.packages(), and that did not fix the problem for me. Note that the problem is still a bit stochastic: after rebooting or similar changes, it sometimes does not reproduce for a few tries, until it suddenly does (and then does so consistently).

Maybe something spurious caused by an updated package?

Still possible of course, but as I said, I reinstalled everything...

I replaced the openxlsx:: call with some other package and it caused a similar error

Can you give an example of another call that produces the same error? That would be super helpful in the other issue, and might give tidyr people some additional incentive to look into this.

Thanks for your help so far!

@bersbersbers
Author

Took me at least an hour of my life 🙈

And you are only starting from 10 lines of code. Imagine having to isolate such a case from thousands of lines of code, with the problem first appearing in a shiny app using lots of rmarkdown and ggplot, and initially reproducing only through shinytest... I guess I'm well over 20 hours into this ;)

@JanMarvin
Collaborator

By "rebuild" I meant R CMD INSTALL . Maybe some library on my OS changed (GCC or Rcpp or something).

Well, my time is spent voluntarily ... 😄 Since you're on Windows, something else might be going on. I'll see if I can revive my Windows R install, but most likely it won't be soon, I'm afraid. Maybe end of next week. And for me it was random too. Wrapping fun1() in try() made the problem go away.

For the other libs, I just replaced the code with other packages of mine (readspss and readstata13). The latter caused issues since its installation was most likely too old. Once I rebuilt it, the problem went away.
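
For illustration, wrapping the first call in try() might look like the sketch below. `fun1_stub` is a hypothetical stand-in that does not call tidyr, so it will not reproduce the crash. Note that try() only intercepts R-level errors; it cannot catch a segfault, so any effect it has on the crash is presumably indirect.

```r
# fun1_stub is a hypothetical stand-in for fun1(); it does not call tidyr.
fun1_stub <- function() {
  as.data.frame(list(C = c("A", "A", "B", "B"), D = c("A", "B", "A", "B")))
}

# Wrap the first call; an R-level error becomes a "try-error" object
# instead of stopping the script.
y <- try(fun1_stub(), silent = TRUE)
if (inherits(y, "try-error")) {
  message("first call failed: ", conditionMessage(attr(y, "condition")))
}
```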

@bersbersbers
Author

Thanks for your investigations!

By "rebuild" I meant R CMD INSTALL . Maybe some library on my OS changed (GCC or Rcpp or something).

I see! I tried R CMD INSTALL . for openxlsx master, and it did not solve the problem for me. (I had done that before, anyway, by reinstalling all packages in my library, so I didn't expect a change, either.)

I have noticed that changing system-wide things, or restarting after a while (such as after a weekend), makes the problem seem to have gone away. But usually, this just means you have to repro it once using a smaller gctorture2 value, and it will reappear.
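
As a rough sketch of varying the gctorture2 value, the helper below turns GC torture on for a single expression and restores the previous setting afterwards. `with_gctorture` is a hypothetical name, not a base R function; only `gctorture2()` itself comes from base R.

```r
# Hypothetical helper: evaluate an expression under GC torture and
# restore the previous setting afterwards.
with_gctorture <- function(step, expr) {
  old <- gctorture2(step)               # returns the previous step value
  on.exit(gctorture2(old), add = TRUE)  # a step of 0 switches torture off
  force(expr)                           # expr is evaluated lazily, i.e. under torture
}

# Smaller steps trigger collections more often (and run more slowly).
res <- with_gctorture(50, as.data.frame(matrix(1:4, 2)))
```

Lowering `step` makes the collector run after fewer allocations, which is why a smaller value can bring back a crash that seemed to have disappeared.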

Adding try around the first call of fun1 seems to have a lasting effect, though - that's interesting! Maybe that hints again at tidyr.

I haven't been able to find a fun2 example using any other library, though. Here's two things I tried:

x <- data.frame()
invisible(gctorture2(50))

fun1 <- function() {
    as.data.frame(tidyr:::simplifyPieces(
        list(c("A", "A"), c("A", "B"), c("B", "A"), c("B", "B")), 2, FALSE
    )$strings)
}

fun2 <- function() {
    openxlsx::write.xlsx(as.data.frame(1), tempfile()) # repros
    # readstata13::save.dta13(as.data.frame(1), tempfile()) # does not repro
    # readspss::read.sav(system.file("extdata", "electric.sav", package = "readspss")) # does not repro
}

y <- fun1() # y <- try(fun1()) does not repro!
fun2()
while (TRUE) {
    print(fun1())
}

@bersbersbers
Author

bersbersbers commented Oct 11, 2021

Fyi, I can now repro this consistently for every version of tidyr using cpp11 (which is >= 1.1.1), but not for any earlier version using Rcpp. I think this is another bit of information that makes tidyr (or their dependencies) the most likely source. I'll focus on tidyverse/tidyr#1163 next - thanks for your help!
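
A quick way to check which side of that boundary a given tidyr version falls on is sketched below. `tidyr_uses_cpp11` is a hypothetical helper; the 1.1.1 threshold is taken from the comment above, not verified independently.

```r
# Hypothetical helper: does a given tidyr version fall in the range
# reported above as using cpp11 (>= 1.1.1)?
tidyr_uses_cpp11 <- function(version) {
  package_version(version) >= package_version("1.1.1")
}

tidyr_uses_cpp11("1.1.0")  # FALSE
tidyr_uses_cpp11("1.1.3")  # TRUE

# For the installed copy, if tidyr is available:
if (requireNamespace("tidyr", quietly = TRUE)) {
  print(tidyr_uses_cpp11(as.character(utils::packageVersion("tidyr"))))
}
```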

@bersbersbers
Author

Fyi, apparently fixed in r-lib/cpp11#245.

@JanMarvin
Collaborator

Thanks for the heads-up!
