build_site() segfault when ÷ character formatted as code #189

Closed
tobyhodges opened this issue May 27, 2024 · 5 comments · Fixed by #191

@tobyhodges

Running pkgdown::build_site() on a package that includes Markdown files with the ÷ character formatted as code triggers a segfault. The error output below comes from running the function on a minimal package whose index.md contains:

What happens if I `÷`?

> pkgdown::build_site()
── Installing package divisiontesting into temporary library ─────────────────────────
── Building pkgdown site for package divisiontesting ───────────────────────────
Reading from: /masking/my/path/R/divisiontesting
Writing to: /masking/my/path/R/divisiontesting/docs
── Initialising site ───────────────────────────────────────────────────────────
── Building home ───────────────────────────────────────────────────────────────
Reading index.md
\
 *** caught segfault ***
address 0x0, cause 'invalid permissions'
|
Traceback:
 1: parse(con, keep.source = TRUE, encoding = "UTF-8", srcfile = srcfile)
 2: doTryCatch(return(expr), name, parentenv, handler)
 3: tryCatchOne(expr, names, parentenv, handlers[[1L]])
 4: tryCatchList(expr, classes, parentenv, handlers)
 5: tryCatch(parse(con, keep.source = TRUE, encoding = "UTF-8", srcfile = srcfile),     error = function(e) NULL)
 6: safe_parse(text)
 7: autolink_url(text)
 8: FUN(X[[i]], ...)
 9: vapply(.x, .f, ..., FUN.VALUE = character(1), USE.NAMES = FALSE)
10: map_chr(text, fun, ...)
11: tweak_children(x, xpath_inline, autolink, replace = "contents")
12: downlit::downlit_html_node(html)
13: tweak_page(html, name, pkg = pkg)
14: render_page(pkg, "title-body", data = list(pagetitle = attr(body,     "title"), body = body, filename = filename, source = repo_source(pkg,     fs::path_rel(filename, pkg$src_path))), path = path)
15: FUN(X[[i]], ...)
16: lapply(mds, render_md, pkg = pkg)
17: build_home_md(pkg)
18: build_home(pkg, override = override, preview = FALSE)
19: build_site_local(pkg = pkg, examples = examples, run_dont_run = run_dont_run,     seed = seed, lazy = lazy, override = override, preview = preview,     devel = devel)
20: pkgdown::build_site(...)
ameters = list()), repo = NULL, development = list(        destination = "dev", mode = "default", version_label = "muted",         in_dev = FALSE), topics = list(name = c(hello.Rd = "hello"),         file_in = "hello.Rd", file_out = "hello.html", alias = list(            hello.Rd = "hello"), funs = list(hello.Rd = "hello()"),         title = c(hello.Rd = "Hello, World!"), rd = list(hello.Rd = list(            list("hello"), "\n", list("hello"), "\n", list("Hello, World!"),             "\n", list("\n", "hello()\n"), "\n", list("\n", "Prints 'Hello, world!'.\n"),             "\n", list("\n", "hello()\n"), "\n")), source = list(            hello.Rd = character(0)), keywords = list(character(0)),         concepts = list(character(0)), internal = FALSE), tutorials = list(        name = character(0), file_out = character(0), title = character(0),         pagetitle = character(0), url = character(0)), vignettes = list(        name = character(0), file_in = character(0), file_out = character(0),         title = character(0), description = character(0), depth = integer(0)),     bs_version = 5L, prefix = "")), examples = base::quote(TRUE),     run_dont_run = base::quote(FALSE), seed = base::quote(1014L),     lazy = base::quote(FALSE), override = base::quote(list()),     install = base::quote(FALSE), preview = base::quote(FALSE),     new_process = base::quote(FALSE), devel = base::quote(FALSE),     cli_colors = base::quote(256L), hyperlinks = base::quote(TRUE),     pkgdown_internet = base::quote(TRUE))
e::quote(list(pkg = list(package = "divisiontesting",     version = "0.1.0", lang = "en", src_path = "/Users/hodges/Documents/R/hacks/divisiontesting",     dst_path = "/Users/hodges/Documents/R/hacks/divisiontesting/docs",     install_metadata = FALSE, desc = <environment>, meta = list(        template = list(bootstrap = 5L)), figures = list(dev = "ragg::agg_png",         dpi = 96L, dev.args = list(), fig.ext = "png", fig.width = 7.29166666666667,         fig.height = NULL, fig.retina = 2L, fig.asp = 0.618046971569839,         bg = NULL, other.parameters = list()), repo = NULL, development = list(        destination = "dev", mode = "default", version_label = "muted",         in_dev = FALSE), topics = list(name = c(hello.Rd = "hello"),         file_in = "hello.Rd", file_out = "hello.html", alias = list(            hello.Rd = "hello"), funs = list(hello.Rd = "hello()"),         title = c(hello.Rd = "Hello, World!"), rd = list(hello.Rd = list(            list("hello"), "\n", list("hello"), "\n", list("Hello, World!"),             "\n", list("\n", "hello()\n"), "\n", list("\n", "Prints 'Hello, world!'.\n"),             "\n", list("\n", "hello()\n"), "\n")), source = list(            hello.Rd = character(0)), keywords = list(character(0)),         concepts = list(character(0)), internal = FALSE), tutorials = list(        name = character(0), file_out = character(0), title = character(0),         pagetitle = character(0), url = character(0)), vignettes = list(        name = character(0), file_in = character(0), file_out = character(0),         title = character(0), description = character(0), depth = integer(0)),     bs_version = 5L, prefix = ""), examples = TRUE, run_dont_run = FALSE,     seed = 1014L, lazy = FALSE, override = list(), install = FALSE,     preview = FALSE, new_process = FALSE, devel = FALSE, cli_colors = 256L,     hyperlinks = TRUE, pkgdown_internet = TRUE)), envir = base::quote(<environment>),     quote = base::quote(TRUE))
23: base::do.call(base::do.call, base::c(base::readRDS("/var/folders/rj/3gf6c_l166qc7fl3z_v4pbxw0000gr/T//RtmpcFL1ML/callr-fun-c0dc366e5621"),     base::list(envir = .GlobalEnv, quote = TRUE)), envir = .GlobalEnv,     quote = TRUE)
24: base::saveRDS(base::do.call(base::do.call, base::c(base::readRDS("/var/folders/rj/3gf6c_l166qc7fl3z_v4pbxw0000gr/T//RtmpcFL1ML/callr-fun-c0dc366e5621"),     base::list(envir = .GlobalEnv, quote = TRUE)), envir = .GlobalEnv,     quote = TRUE), file = "/var/folders/rj/3gf6c_l166qc7fl3z_v4pbxw0000gr/T//RtmpcFL1ML/callr-res-c0dc5d8ac72",     compress = FALSE)
25: base::withCallingHandlers({    NULL    base::saveRDS(base::do.call(base::do.call, base::c(base::readRDS("/var/folders/rj/3gf6c_l166qc7fl3z_v4pbxw0000gr/T//RtmpcFL1ML/callr-fun-c0dc366e5621"),         base::list(envir = .GlobalEnv, quote = TRUE)), envir = .GlobalEnv,         quote = TRUE), file = "/var/folders/rj/3gf6c_l166qc7fl3z_v4pbxw0000gr/T//RtmpcFL1ML/callr-res-c0dc5d8ac72",         compress = FALSE)    base::flush(base::stdout())    base::flush(base::stderr())    NULL    base::invisible()}, error = function(e) {    {        callr_data <- base::as.environment("tools:callr")$`__callr_data__`        err <- callr_data$err        if (FALSE) {            base::assign(".Traceback", base::.traceback(4), envir = callr_data)            utils::dump.frames("__callr_dump__")            base::assign(".Last.dump", .GlobalEnv$`__callr_dump__`,                 envir = callr_data)            base::rm("__callr_dump__", envir = .GlobalEnv)        }        e <- err$process_call(e)        e2 <- err$new_error("error in callr subprocess")        class <- base::class        class(e2) <- base::c("callr_remote_error", class(e2))        e2 <- err$add_trace_back(e2)        cut <- base::which(e2$trace$scope == "global")[1]        if (!base::is.na(cut)) {            e2$trace <- e2$trace[-(1:cut), ]        }        base::saveRDS(base::list("error", e2, e), file = base::paste0("/var/folders/rj/3gf6c_l166qc7fl3z_v4pbxw0000gr/T//RtmpcFL1ML/callr-res-c0dc5d8ac72",             ".error"))    }}, interrupt = function(e) {    {        callr_data <- base::as.environment("tools:callr")$`__callr_data__`        err <- callr_data$err        if (FALSE) {            base::assign(".Traceback", base::.traceback(4), envir = callr_data)            utils::dump.frames("__callr_dump__")            base::assign(".Last.dump", .GlobalEnv$`__callr_dump__`,                 envir = callr_data)            base::rm("__callr_dump__", envir = .GlobalEnv)        }        e <- err$process_call(e)        e2 <- 
err$new_error("error in callr subprocess")        class <- base::class        class(e2) <- base::c("callr_remote_error", class(e2))        e2 <- err$add_trace_back(e2)        cut <- base::which(e2$trace$scope == "global")[1]        if (!base::is.na(cut)) {            e2$trace <- e2$trace[-(1:cut), ]        }        base::saveRDS(base::list("error", e2, e), file = base::paste0("/var/folders/rj/3gf6c_l166qc7fl3z_v4pbxw0000gr/T//RtmpcFL1ML/callr-res-c0dc5d8ac72",             ".error"))    }}, callr_message = function(e) {    base::try(base::signalCondition(e))})
26: doTryCatch(return(expr), name, parentenv, handler)
27: tryCatchOne(expr, names, parentenv, handlers[[1L]])
28: tryCatchList(expr, names[-nh], parentenv, handlers[-nh])
29: doTryCatch(return(expr), name, parentenv, handler)
30: tryCatchOne(tryCatchList(expr, names[-nh], parentenv, handlers[-nh]),     names[nh], parentenv, handlers[[nh]])
31: tryCatchList(expr, classes, parentenv, handlers)
32: base::tryCatch(base::withCallingHandlers({    NULL    base::saveRDS(base::do.call(base::do.call, base::c(base::readRDS("/var/folders/rj/3gf6c_l166qc7fl3z_v4pbxw0000gr/T//RtmpcFL1ML/callr-fun-c0dc366e5621"),         base::list(envir = .GlobalEnv, quote = TRUE)), envir = .GlobalEnv,         quote = TRUE), file = "/var/folders/rj/3gf6c_l166qc7fl3z_v4pbxw0000gr/T//RtmpcFL1ML/callr-res-c0dc5d8ac72",         compress = FALSE)    base::flush(base::stdout())    base::flush(base::stderr())    NULL    base::invisible()}, error = function(e) {    {        callr_data <- base::as.environment("tools:callr")$`__callr_data__`        err <- callr_data$err        if (FALSE) {            base::assign(".Traceback", base::.traceback(4), envir = callr_data)            utils::dump.frames("__callr_dump__")            base::assign(".Last.dump", .GlobalEnv$`__callr_dump__`,                 envir = callr_data)            base::rm("__callr_dump__", envir = .GlobalEnv)        }        e <- err$process_call(e)        e2 <- err$new_error("error in callr subprocess")        class <- base::class        class(e2) <- base::c("callr_remote_error", class(e2))        e2 <- err$add_trace_back(e2)        cut <- base::which(e2$trace$scope == "global")[1]        if (!base::is.na(cut)) {            e2$trace <- e2$trace[-(1:cut), ]        }        base::saveRDS(base::list("error", e2, e), file = base::paste0("/var/folders/rj/3gf6c_l166qc7fl3z_v4pbxw0000gr/T//RtmpcFL1ML/callr-res-c0dc5d8ac72",             ".error"))    }}, interrupt = function(e) {    {        callr_data <- base::as.environment("tools:callr")$`__callr_data__`        err <- callr_data$err        if (FALSE) {            base::assign(".Traceback", base::.traceback(4), envir = callr_data)            utils::dump.frames("__callr_dump__")            base::assign(".Last.dump", .GlobalEnv$`__callr_dump__`,                 envir = callr_data)            base::rm("__callr_dump__", envir = .GlobalEnv)        }        e <- err$process_call(e)        
e2 <- err$new_error("error in callr subprocess")        class <- base::class        class(e2) <- base::c("callr_remote_error", class(e2))        e2 <- err$add_trace_back(e2)        cut <- base::which(e2$trace$scope == "global")[1]        if (!base::is.na(cut)) {            e2$trace <- e2$trace[-(1:cut), ]        }        base::saveRDS(base::list("error", e2, e), file = base::paste0("/var/folders/rj/3gf6c_l166qc7fl3z_v4pbxw0000gr/T//RtmpcFL1ML/callr-res-c0dc5d8ac72",             ".error"))    }}, callr_message = function(e) {    base::try(base::signalCondition(e))}), error = function(e) {    NULL    if (FALSE) {        base::try(base::stop(e))    }    else {        base::invisible()    }}, interrupt = function(e) {    NULL    if (FALSE) {        e    }    else {        base::invisible()    }})
An irrecoverable exception occurred. R is aborting now ...

In my investigations so far, I have come across no other characters that trigger the problem. I am unsure whether the problem is with pkgdown, pandoc, or somewhere else, and I am right at the limits of my R debugging abilities (so far!). Any suggestions you can provide for where to look next would be much appreciated, and I would be happy to provide more information from my side if needed.

Output of devtools::session_info():

> devtools::session_info()
─ Session info ─────────────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.4.0 (2024-04-24)
 os       macOS Sonoma 14.4.1
 system   aarch64, darwin20
 ui       RStudio
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       Europe/Berlin
 date     2024-05-27
 rstudio  2024.04.1+748 Chocolate Cosmos (desktop)
 pandoc   3.1.11 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/aarch64/ (via rmarkdown)

─ Packages ─────────────────────────────────────────────────────────────────────────
 package     * version date (UTC) lib source
 cachem        1.1.0   2024-05-16 [1] CRAN (R 4.4.0)
 callr         3.7.6   2024-03-25 [1] CRAN (R 4.4.0)
 cli           3.6.2   2023-12-11 [1] CRAN (R 4.4.0)
 crayon        1.5.2   2022-09-29 [1] CRAN (R 4.4.0)
 desc          1.4.3   2023-12-10 [1] CRAN (R 4.4.0)
 devtools      2.4.5   2022-10-11 [1] CRAN (R 4.4.0)
 digest        0.6.35  2024-03-11 [1] CRAN (R 4.4.0)
 ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.4.0)
 evaluate      0.23    2023-11-01 [1] CRAN (R 4.4.0)
 fansi         1.0.6   2023-12-08 [1] CRAN (R 4.4.0)
 fastmap       1.2.0   2024-05-15 [1] CRAN (R 4.4.0)
 fs            1.6.4   2024-04-25 [1] CRAN (R 4.4.0)
 glue          1.7.0   2024-01-09 [1] CRAN (R 4.4.0)
 htmltools     0.5.8.1 2024-04-04 [1] CRAN (R 4.4.0)
 htmlwidgets   1.6.4   2023-12-06 [1] CRAN (R 4.4.0)
 httpuv        1.6.15  2024-03-26 [1] CRAN (R 4.4.0)
 knitr         1.46    2024-04-06 [1] CRAN (R 4.4.0)
 later         1.3.2   2023-12-06 [1] CRAN (R 4.4.0)
 lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.4.0)
 magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.4.0)
 memoise       2.0.1   2021-11-26 [1] CRAN (R 4.4.0)
 mime          0.12    2021-09-28 [1] CRAN (R 4.4.0)
 miniUI        0.1.1.1 2018-05-18 [1] CRAN (R 4.4.0)
 pillar        1.9.0   2023-03-22 [1] CRAN (R 4.4.0)
 pkgbuild      1.4.4   2024-03-17 [1] CRAN (R 4.4.0)
 pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.4.0)
 pkgdown       2.0.9   2024-04-18 [1] CRAN (R 4.4.0)
 pkgload       1.3.4   2024-01-16 [1] CRAN (R 4.4.0)
 processx      3.8.4   2024-03-16 [1] CRAN (R 4.4.0)
 profvis       0.3.8   2023-05-02 [1] CRAN (R 4.4.0)
 promises      1.3.0   2024-04-05 [1] CRAN (R 4.4.0)
 ps            1.7.6   2024-01-18 [1] CRAN (R 4.4.0)
 purrr         1.0.2   2023-08-10 [1] CRAN (R 4.4.0)
 R6            2.5.1   2021-08-19 [1] CRAN (R 4.4.0)
 Rcpp          1.0.12  2024-01-09 [1] CRAN (R 4.4.0)
 remotes       2.5.0   2024-03-17 [1] CRAN (R 4.4.0)
 rlang         1.1.3   2024-01-10 [1] CRAN (R 4.4.0)
 rmarkdown     2.27    2024-05-17 [1] CRAN (R 4.4.0)
 rprojroot     2.0.4   2023-11-05 [1] CRAN (R 4.4.0)
 rstudioapi    0.16.0  2024-03-24 [1] CRAN (R 4.4.0)
 sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.4.0)
 shiny         1.8.1.1 2024-04-02 [1] CRAN (R 4.4.0)
 stringi       1.8.4   2024-05-06 [1] CRAN (R 4.4.0)
 stringr       1.5.1   2023-11-14 [1] CRAN (R 4.4.0)
 tibble        3.2.1   2023-03-20 [1] CRAN (R 4.4.0)
 urlchecker    1.0.1   2021-11-30 [1] CRAN (R 4.4.0)
 usethis       2.2.3   2024-02-19 [1] CRAN (R 4.4.0)
 utf8          1.2.4   2023-10-22 [1] CRAN (R 4.4.0)
 vctrs         0.6.5   2023-12-01 [1] CRAN (R 4.4.0)
 withr         3.0.0   2024-01-16 [1] CRAN (R 4.4.0)
 xfun          0.44    2024-05-15 [1] CRAN (R 4.4.0)
 xtable        1.8-4   2019-04-21 [1] CRAN (R 4.4.0)
 yaml          2.3.8   2023-12-11 [1] CRAN (R 4.4.0)

 [1] /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library

────────────────────────────────────────────────────────────────────────────────────
@tobyhodges
Author

A similar problem on another site helped me discover that the multiplication sign × also triggers the segfault. This made me wonder whether the problem affects the whole Latin-1 Supplement block of Unicode, but testing with other characters in that block (e.g. thorn, Þ) did not provoke the error.
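
While the cause is being tracked down, the two characters reported so far can be located in Markdown sources before a build. The following is only a sketch: `find_risky_code_spans()` is a hypothetical helper, not part of pkgdown or downlit, and it assumes single-backtick inline code and checks only ÷ (U+00F7) and × (U+00D7).

```r
# Hypothetical helper (not part of pkgdown or downlit): return the indices of
# lines that contain an inline code span with ÷ (U+00F7) or × (U+00D7).
# Only these two characters are checked because they are the only ones
# reported in this thread to trigger the crash.
find_risky_code_spans <- function(lines) {
  # Backtick, any non-backtick characters, one of the two signs, any
  # non-backtick characters, closing backtick.
  pattern <- "`[^`]*(\u00f7|\u00d7)[^`]*`"
  which(grepl(pattern, enc2utf8(lines)))
}

example_lines <- c(
  "What happens if I `\u00f7`?",  # division sign formatted as code
  "Plain \u00f7 outside code",     # not inside backticks
  "Compute `a \u00d7 b`",          # multiplication sign formatted as code
  "Thorn `\u00de` is fine"         # another Latin-1 Supplement character
)
print(find_risky_code_spans(example_lines))
```

On the example input this flags the first and third lines. The pattern is deliberately crude: a line such as "`x` ÷ `y`" would also be flagged, because the text between two separate code spans looks like a span to this regular expression.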

@hadley
Member

hadley commented May 28, 2024

Interestingly this worked just fine for me on R 4.3.2, but when I upgraded to R 4.4.0, I see the same problem as you.

Backtrace from C:

 * frame #0: 0x0000000184750904 libsystem_platform.dylib`_platform_strlen + 4
    frame #1: 0x00000001009f2954 libR.dylib`Rf_mkChar(name=0x0000000000000000) at envir.c:4076:19 [opt]
    frame #2: 0x0000000100a463ac libR.dylib`finalizeData at gram.c:0 [opt]
    frame #3: 0x0000000100a456dc libR.dylib`R_Parse(n=-1, status=0x000000016fdf91ac, srcfile=0x000000010876db68) at gram.c:4215:10 [opt]
    frame #4: 0x0000000100a45770 libR.dylib`R_ParseConn(con=<unavailable>, n=<unavailable>, status=<unavailable>, srcfile=<unavailable>) at gram.c:4277:12 [opt] [artificial]
    frame #5: 0x0000000100adca6c libR.dylib`do_parse(call=<unavailable>, op=<unavailable>, args=<unavailable>, env=<unavailable>) at source.c:294:6 [opt]

@hadley
Member

hadley commented May 28, 2024

Simpler reprex 😄

downlit::autolink_url("×")

hadley transferred this issue from r-lib/pkgdown May 28, 2024

@hadley
Member

hadley commented May 28, 2024

Moving to downlit since the source of the problem is there, but it's either a bug with R 4.4 or something is wrong with the way I'm parsing the code. A base R reprex is:

text <- "×"
srcfile <- srcfilecopy("test.r", text)

Encoding(text) <- "unknown"
con <- textConnection(text)
parse(con, keep.source = TRUE, encoding = "UTF-8", srcfile = srcfile)
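
Because a segfault aborts the whole R session, it can be more comfortable to try a reprex like the one above in a child process and inspect only its exit status. The sketch below uses base R's `system2()` to start a fresh `Rscript`; `run_isolated()` is a hypothetical helper name introduced here for illustration, not something taken from downlit or pkgdown.

```r
# Hypothetical helper: evaluate `code` (a string of R code) in a fresh
# Rscript process and return that process's exit status.
run_isolated <- function(code) {
  rscript <- file.path(R.home("bin"), "Rscript")
  system2(
    rscript,
    c("--vanilla", "-e", shQuote(code)),
    stdout = FALSE,  # discard child output; only the status matters here
    stderr = FALSE
  )
}

# Sanity checks with harmless code: 0 means the child exited normally,
# and an explicit quit(status = 3) is passed back as 3.
ok_status <- run_isolated("invisible(1 + 1)")
bad_status <- run_isolated("quit(status = 3)")
print(c(ok = ok_status, bad = bad_status))
```

To probe the crash itself, one could call `run_isolated('downlit::autolink_url("\\u00d7")')`. A non-zero status only says that the child did not finish cleanly; it does not distinguish a segfault from, say, downlit not being installed, so the child's stderr is still worth reading when the status is non-zero.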

@tobyhodges
Author

Thanks very much for the quick response and fix @hadley
