pkgdown consumed a huge amount (10+GB) of RAM on build_site() #783

Closed
Robinlovelace opened this issue Aug 13, 2018 · 28 comments
Labels
reprex needs a minimal reproducible example

Comments

@Robinlovelace

I'm not sure how or why this happened but I'm confident it's pkgdown because I tried it twice (both times it crashed my R session).

Here's the crash message:

<simpleError in process_initialize(self, private, command, args, stdin, stdout,     stderr, connections, poll_connection, env, cleanup, cleanup_tree,     wd, echo_cmd, supervise, windows_verbatim_args, windows_hide_window,     encoding, post_process): Cannot fork Unknown error -12 at unix/processx.c:333>
Warning message:
system call failed: Cannot allocate memory 
Error in process_initialize(self, private, command, args, stdin, stdout,  : 
  Cannot fork Unknown error -12 at unix/processx.c:333

Here's the dramatic memory use log on Ubuntu:

image

With:

packageVersion("pkgdown")
#> [1] '1.1.0'

Created on 2018-08-13 by the reprex package (v0.2.0).

@Robinlovelace
Author

Live update: now trying with

pkgdown::build_site(new_process = FALSE)

in a terminal, to see if that solves it. It seems to be something to do with running in parallel. Update: the crazy RAM use happened again, so this didn't solve it.

@Robinlovelace
Author

In the name of reproducibility, I think others will be able to reproduce this by cloning this repo and running build_site() from in there: https://github.com/geocompr/geocompkg

I suspect the culprit is https://github.com/geocompr/geocompkg/blob/master/vignettes/point-pattern-heatmap.Rmd which has some leaflet code.

@Robinlovelace
Author

Update on the last attempt, run from a terminal with new_process = F: it was definitely the R session itself that was munching the memory, because when I stopped R with quit() this happened:

image

Robinlovelace added a commit to geocompx/geocompkg that referenced this issue Aug 13, 2018
@Robinlovelace
Author

Another update: thinking that running it on a bigger computer with a different OS might help, I tried it on my work desktop and got an error. The error message should shed more light on it though, hope this helps:

image

@Robinlovelace
Author

In any case, for users, all the vignettes can be built by running build_articles(), which seems to run fine.

@Nowosad

Nowosad commented Aug 21, 2018

I also got this problem today. Vignette works well when knitted, but there is a memory leak while using build_site().

@jayhesselberth jayhesselberth added the reprex needs a minimal reproducible example label Oct 6, 2018
@jayhesselberth
Collaborator

Could you please build a minimal package that recreates the issue? The geocompkg package has too many dependencies to debug this effectively.

@Robinlovelace
Author

I think it was the number of vignettes, not the dependencies of the package whose website we were trying to build, that caused the issue. Agreed, it's hard to debug.

@hadley hadley closed this as completed Nov 6, 2018
@Robinlovelace
Author

FYI for anyone else reading this issue: run

build_article()

one-by-one and it seems to be fine. All good!

@samvaltenbergs

samvaltenbergs commented Dec 5, 2018

@Robinlovelace did you ever resolve this issue (other than having to build each article individually, which I don't want to do as part of our CI/CD pipeline)? I see @hadley closed the issue, but I don't see details of a resolution. I'm having the same issue: vignettes knit by themselves NP and I can build each article manually by itself, but when pkgdown::build_site() runs and builds all the articles at once it crashes because it "cannot allocate vector of size 11.0 Gb". The behavior is the same on my local machine in RStudio as it is in a Docker EE container as part of a CI/CD pipeline.

@Nowosad

Nowosad commented Dec 5, 2018

Hi @samvaltenbergs, I think the problem is still there. We (@Robinlovelace and I) haven't found a solution yet.

@Robinlovelace
Author

To confirm: no I didn't find a solution. I'm not 100% sure what led to the issue being closed. Any info on that @hadley would be appreciated. I think one thing that would help developers would be reproducible examples.

@jayhesselberth
Collaborator

If you are running very large vignettes it is going to be very difficult to debug this, and I would also advocate for breaking these up anyway (or at least trimming your example data sets down a bit).

If this is a bug, it's probably reproducible by a very short vignette.

@Robinlovelace
Author

Thanks for the feedback. Will try to create a reproducible example. Anyone else finding issues with memory usage: please add reproducible examples of what happened.
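
For anyone trying to narrow this down, one rough way to see which article drives memory use is to record R's peak heap usage around each build. The sketch below is only a starting point under stated assumptions: `peak_mem_mb()` is a hypothetical helper (not part of pkgdown), and `gc()` only reports memory managed by the current R session, so memory used by child processes or allocated outside R's heap will not show up.

```r
# Hypothetical helper (not part of pkgdown): evaluate an expression and
# return the peak memory seen by R's garbage collector, in megabytes.
peak_mem_mb <- function(expr) {
  gc(reset = TRUE)         # reset the "max used" statistics
  force(expr)              # evaluate the lazily passed expression
  mem <- gc()              # matrix with rows Ncells and Vcells
  sum(mem[, ncol(mem)])    # the final column is "max used" in Mb
}

# Example usage, assuming the working directory is a package with vignettes
# (commented out so the block runs anywhere):
# for (a in c("sea-level-rise")) {
#   cat(a, ":", peak_mem_mb(pkgdown::build_article(a)), "Mb\n")
# }
```

If the numbers stay small while the system monitor shows gigabytes in use, that would suggest the memory is being consumed outside the R heap that `gc()` tracks.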

@Nowosad

Nowosad commented Dec 5, 2018

Hi @jayhesselberth, I spent an hour a few weeks ago trying to create a short reproducible example and I was unable to do so...

@hadley
Member

hadley commented Dec 5, 2018

It was closed because we don’t have a reprex and it seems unlikely that the root cause is pkgdown.

@Robinlovelace
Author

I think it's possible to create a reproducible example. May be in bash or have lots of system() calls though. Will aim to do this when on a decent computer in a session I can afford to crash. Steps to reproduce as far as I remember:

git clone git@github.com:geocompr/geocompkg
cd geocompkg
R
pkgdown::build_site()

@samvaltenbergs

samvaltenbergs commented Dec 7, 2018

@hadley, @Robinlovelace and @Nowosad I understand that it's difficult to troubleshoot without a reproducible example, but unfortunately I can't share the vignettes due to corporate policy (The Man)... I can show a screenshot that demonstrates the failure signature: building the articles by themselves works okay but when pkgdown tries to put the site together after all the vignettes are turned into HTML it hits a wall. And then if I immediately run the same build_articles() command (at which point the vignettes are already turned into HTML) everything is AOK.

build_site

What has me scratching my head is that the vignettes are all very simple: the only piece of R code that is executed across the seven vignettes is a call to Sys.Date(). That is, there isn't any R code being executed that could possibly create something in memory that's 11Gb in size.

What is even more strange is that the "11Gb failure" only occurs the first time build_articles() or build_site() is run for a given R session. If I delete the docs\articles folder (which will cause pkgdown to rebuild all the HTML vignettes) it runs without an issue. But if I then restart the R session and run it fresh, it fails again...

build_site_restart

Any ideas?

@Robinlovelace
Author

Robinlovelace commented Dec 10, 2018

Reproducible example verified:

asciicast

Heads-up @hadley and @jayhesselberth please try to reproduce.

@Robinlovelace
Author

Further evidence that this is an issue with build_site() and not the contents of the vignettes: when you build them individually, e.g. with the following commands, it works:

# build the home page:
pkgdown::build_home()

# build vignettes one by one, e.g.
pkgdown::build_article("sea-level-rise")

# build all articles in one call:
pkgdown::build_articles()

# or build all articles in an explicit loop:
# (the pattern is a regex, so escape the dot and anchor it to the file ending)
articles = list.files(path = "vignettes/", pattern = "\\.Rmd$")
articles = tools::file_path_sans_ext(articles)
for(i in articles) {
  pkgdown::build_article(i)
}
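
A slightly more defensive variant of the loop above might record which articles fail instead of stopping at the first error. This is a sketch only: `build_each()` is a hypothetical helper, not a pkgdown function, and `tryCatch()` can only catch ordinary R errors (such as "cannot allocate vector"), not a crash that takes down the whole R session.

```r
# Hypothetical helper: call `build` on each article name and record
# failures instead of stopping at the first error.
# The default `build` is only evaluated when the function is called.
build_each <- function(articles, build = pkgdown::build_article) {
  ok <- vapply(articles, function(name) {
    tryCatch({
      build(name)
      TRUE
    }, error = function(e) {
      message("Failed to build '", name, "': ", conditionMessage(e))
      FALSE
    })
  }, logical(1))
  names(ok) <- articles
  ok
}

# Example usage, assuming the working directory is the package root
# (commented out so the block runs anywhere):
# articles <- tools::file_path_sans_ext(
#   list.files("vignettes/", pattern = "\\.Rmd$")
# )
# build_each(articles)
```

The returned named logical vector shows at a glance which vignettes built, which may help when reducing a failing package to a minimal example.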

@hadley
Member

hadley commented Dec 10, 2018

@Robinlovelace that video doesn't really add anything. All build_articles() does is call build_article() with purrr::walk(), so it's already basically equivalent to your code. Including the contents of traceback() would be more useful.
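
To illustrate the point about purrr::walk(): it calls a function once per element for its side effects, which is what a plain for loop does too. The toy sketch below uses `build_one()` as a stand-in for build_article() (it is not a pkgdown function, and the article names are only examples) so that it runs without a package on disk.

```r
library(purrr)

# Stand-in for build_article(): just records the names it was called with.
built <- character()
build_one <- function(name, quiet = TRUE) {
  built <<- c(built, name)
  invisible(name)
}

articles <- c("sea-level-rise", "point-pattern-heatmap")

# Version 1: walk() calls build_one() on each element, in order.
walk(articles, build_one, quiet = TRUE)
via_walk <- built

# Version 2: the same calls written as an explicit loop.
built <- character()
for (name in articles) build_one(name, quiet = TRUE)
via_loop <- built

identical(via_walk, via_loop)
```

Both versions make the same calls in the same order, so a difference in behaviour between build_articles() and a manual loop would have to come from something other than the iteration itself.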

@hadley
Member

hadley commented Dec 10, 2018

I have started the process of installing all the geocompr dependencies so I can reproduce locally.

@Robinlovelace
Author

Sure. 1st time I've used the function. Hope the output is useful (means nothing to me)!

traceback()
5: stop(err[[2]])
4: get_result(output = out, options)
3: callr::r(function(..., crayon_enabled, crayon_colors, pkgdown_internet) {
       options(crayon.enabled = crayon_enabled, crayon.colors = crayon_colors, 
           pkgdown.internet = pkgdown_internet)
       pkgdown::build_site(...)
   }, args = args, show = TRUE, )
2: build_site_external(pkg = pkg, examples = examples, document = document, 
       run_dont_run = run_dont_run, seed = seed, lazy = lazy, override = override, 
       preview = preview)
1: pkgdown::build_site()

@Robinlovelace
Author

The good news: with the work-around I've managed to re-build and update the site: https://github.com/geocompr/geocompkg

Heads-up @samvaltenbergs that work-around #783 (comment) may be of use if build_site() is still playing-up for you.

@Robinlovelace
Author

btw @hadley apologies for all the dependencies.

@hadley
Member

hadley commented Dec 10, 2018

After running pkgdown::build_site(new_process = FALSE) I see:

> traceback()
10: knitr::knit_meta_add(old_knit_meta, attr(old_knit_meta, "knit_meta_id"))
9: render(input = md_file, output_file = html_file, output_format = override_output_format, 
       output_options = list(self_contained = FALSE), quiet = TRUE, 
       encoding = "UTF-8")
8: discover_rmd_resources(input_file, encoding, discover_single_resource)
7: rmarkdown::find_external_resources(input_path, "UTF-8") at rmarkdown.R#48
6: render_rmarkdown(pkg, input = input, output = output_file, output_format = format, 
       output_options = options, quiet = quiet) at build-articles.R#216
5: .f(.x[[i]], ...)
4: purrr::walk(pkg$vignettes$name, build_article, pkg = pkg, quiet = quiet, 
       lazy = lazy) at build-articles.R#133
3: build_articles(pkg, lazy = lazy, override = override, preview = FALSE) at build.r#367
2: build_site_local(pkg = pkg, examples = examples, document = document, 
       run_dont_run = run_dont_run, seed = seed, lazy = lazy, override = override, 
       preview = preview) at build.r#287
1: pkgdown::build_site(new_process = F)

@hadley
Member

hadley commented Dec 10, 2018

If I run it again, it fails later:

Reading 'vignettes/solutions06.Rmd'
Error in CPL_geos_is_empty(st_geometry(x)) : 
  Evaluation error: IllegalArgumentException: Points of LinearRing do not form a closed linestring.

@Robinlovelace
Author

Thanks for reporting. I've raised an issue here: geocompx/geocompkg#2
