Intermittent leaked worker connections when a future errors #820

@DavisVaughan

I've been having issues running furrr's tests lately, where I see a lot of this:

Warning message:
In .Internal(gc(verbose, reset, full)) :
  closing unused connection 4 (<-localhost:11913)

If you run devtools::test() in furrr, you should see that most of the time. (Not anymore: I've disabled the problematic tests for now in futureverse/furrr#308.)

I've tracked this down to a minimal-ish future only issue:

library(future)

fn <- function() {
  futures <- vector("list", length = 10)

  for (i in 1:10) {
    futures[[i]] <- future(
      expr = stop("oh no")
    )
  }

  value(futures)
}

# You probably have to run this a few times
plan("multisession", workers = 2)
tryCatch(fn(), error = function(cnd) {})
plan(sequential)
gc()

Alternatively, you can run fn() in a loop like this, with options(warn = 1) so that you see warnings immediately:

# Alternatively:
options(warn = 1)
for (i in 1:100) {
  plan("multisession", workers = 2)
  tryCatch(fn(), error = function(cnd) {})
  plan(sequential)
  gc()
}
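
A possible way to check for the leak without waiting for the gc() warning is to count open connections before and after the run. This is only a sketch: n_open_connections() and leaked_connections() are hypothetical helpers, not part of future, built on base R's showConnections(). I'm assuming a leaked worker socket still shows up there until gc() closes it. A text connection stands in for the worker socket so that the example runs without starting any workers.

```r
# Hypothetical helper (not part of future): number of open user connections.
# showConnections(all = FALSE) leaves out stdin, stdout and stderr.
n_open_connections <- function() {
  nrow(showConnections(all = FALSE))
}

# Hypothetical helper: evaluate `code` and return how many more connections
# are open afterwards than before.
leaked_connections <- function(code) {
  before <- n_open_connections()
  force(code)  # `code` is a lazy argument, so it is evaluated here
  n_open_connections() - before
}

# Stand-in for a leaked worker socket: open a connection and never close it.
n_leaked <- leaked_connections({
  con <- textConnection("stand-in for a worker socket")
})
print(n_leaked)

# Clean up the stand-in connection.
close(con)
```

With the reprex, the same helper could wrap the whole plan("multisession", workers = 2), tryCatch(fn(), ...), plan(sequential) sequence. A non-zero result before calling gc() would then point at a connection that was not shut down.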

This leak is intermittent, which is probably going to make this hard to track down.

It looks like this:

(Videos: test1.mov, test2.mov)

This is not great!

There are a few things that seem to be required to make this appear:

  • The future() must error, i.e. it must call stop("oh no") or something similar
  • I think you need to use plan("multisession", workers = 2). At least, you need a few workers.
  • I needed to use 10 futures to get it to reproduce pretty reliably. With just 2 futures it did still occur, but way less frequently.

This comes up on the furrr side with tests that look like this:

furrr_test_that("unused components can be absorbed", {
  x <- list(c(1, 2), c(3, 5))

  fn1 <- function(x) {
    x
  }

  expect_error({
    future_pmap_dbl(x, fn1)
  })
})

furrr_test_that() causes the test to be run across different plan() styles, including multisession with 2 workers. This test expects an error, i.e. the future_pmap_dbl() call will error when trying to run fn1 on the worker. expect_error() then plays the role of the tryCatch() in my minimal reprex.

This feels like a medium-to-high-ish severity issue? Not being able to shut down workers seems quite bad!
