Description
Background
We can use no more than 125 parallel PSOCK, SOCK, and MPI workers, because R has a hard-coded upper limit of 125+3 open connections (128 in total, three of which are pre-allocated for stdin, stdout, and stderr).
If we use
ncores <- availableCores()
cl <- makeClusterPSOCK(ncores)
we risk running into:
Error: Cannot create 192 parallel PSOCK nodes. Each node needs
one connection, but there are only 124 connections left out of
the maximum 128 available on this R installation
on modern machines. A way to mitigate this problem is to use:
ncores <- min(freeConnections(), availableCores())
This has a corner-case problem: we might now end up with ncores = 0, which gives:
> cl <- parallelly::makeClusterPSOCK(ncores)
Error: Number of 'workers' must be one or greater: 0
Idea
Maybe we could introduce a new argument constraints, e.g.
ncores <- availableCores(constraints = "connections")
to cover the case when we work with PSOCK, SOCK, and MPI clusters? Note that not all parallel backends rely on connections, so this should not be the default. For example, mclapply() and callr backends do not use connections.
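To make the idea concrete, here is a minimal sketch of the proposed behavior, written as a standalone wrapper. The name available_workers() and its cores/free parameters are hypothetical and not part of parallelly; they are injected only so that the sketch can be tried without starting any workers. The actual proposal would live inside availableCores() itself.

```r
# Hypothetical wrapper illustrating the proposed 'constraints' argument.
# 'cores' and 'free' default to the parallelly functions, but can be
# supplied directly so the logic can be tried in isolation.
available_workers <- function(constraints = NULL,
                              cores = parallelly::availableCores(),
                              free = parallelly::freeConnections()) {
  ncores <- as.integer(cores)
  if ("connections" %in% constraints) {
    # Only connection-based backends (PSOCK, SOCK, MPI) need this cap;
    # 'free' is evaluated lazily, so it is not queried otherwise.
    ncores <- min(as.integer(free), ncores)
  }
  ncores
}

available_workers(cores = 192L)                               # no cap applied
available_workers("connections", cores = 192L, free = 124L)   # capped at 124
```

Note that this sketch can still return 0 when no connections are free; it does not settle how that case should be handled.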
One thing to figure out is what to do when there are zero free connections available. Currently, availableCores() is designed to always return at least 1L. That "contract" is easy to understand and program with. To keep this guarantee also when freeConnections() == 0, we could do:
ncores <- min(freeConnections(), ncores)
ncores <- max(1L, ncores)
and leave it to the downstream code to fail gracefully. For example, makeClusterPSOCK() already handles this with:
Error: Cannot create 1 parallel PSOCK nodes. Each node needs one
connection but there are only 0 connections left out of the maximum
128 available on this R installation
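The clamping above can be checked in isolation with the numbers from this issue. This is only a sketch: cap_by_connections() is a hypothetical helper name, and the inputs are passed in as plain integers rather than queried from freeConnections() and availableCores().

```r
# Hypothetical helper: cap the core count by the number of free
# connections, but never return less than 1L.
cap_by_connections <- function(ncores, free) {
  ncores <- min(as.integer(free), as.integer(ncores))
  max(1L, ncores)
}

cap_by_connections(192L, 124L)  # 124: capped by the free connections
cap_by_connections(8L, 124L)    # 8: cores are the limiting factor
cap_by_connections(8L, 0L)      # 1: the "at least 1L" contract is kept
```

In the last case the returned value of 1 is deliberately passed on, so that makeClusterPSOCK() reports the missing connection itself instead of the caller seeing the less informative "must be one or greater: 0" error.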