Background
As documented in help("connections", package="base"), the maximum number of connections one can have open in R (in addition to the three that are always reserved) is 125:
"A maximum of 128 connections can be allocated (not necessarily open) at any one time. Three of these are pre-allocated (see stdout). The OS will impose limits on the numbers of connections of various types, but these are usually larger than 125."
Here is an example showing what happens when we try to open too many connections:
> cons <- list()
> for (ii in 1:126) { cons[[ii]] <- textConnection("foo") }
Error in textConnection("foo") : all connections are in use
> nrow(showConnections())
[1] 125
> head(showConnections())
description class mode text isopen can read can write
3 "\"foo\"" "textConnection" "r" "text" "opened" "yes" "no"
4 "\"foo\"" "textConnection" "r" "text" "opened" "yes" "no"
5 "\"foo\"" "textConnection" "r" "text" "opened" "yes" "no"
6 "\"foo\"" "textConnection" "r" "text" "opened" "yes" "no"
7 "\"foo\"" "textConnection" "r" "text" "opened" "yes" "no"
8 "\"foo\"" "textConnection" "r" "text" "opened" "yes" "no"
> tail(showConnections())
description class mode text isopen can read can write
122 "\"foo\"" "textConnection" "r" "text" "opened" "yes" "no"
123 "\"foo\"" "textConnection" "r" "text" "opened" "yes" "no"
124 "\"foo\"" "textConnection" "r" "text" "opened" "yes" "no"
125 "\"foo\"" "textConnection" "r" "text" "opened" "yes" "no"
126 "\"foo\"" "textConnection" "r" "text" "opened" "yes" "no"
127 "\"foo\"" "textConnection" "r" "text" "opened" "yes" "no"
Issue
There are several use cases where one might hit the upper limit on the number of open connections in R. A common one is SNOW compute clusters. SNOW clusters, as implemented by the parallel package (a core R package), use one connection per SNOW worker. These days, more users have access to large clusters or machines with a large number of cores, making it more likely that they will try to use clusters with more than 125 nodes.
> library("parallel")
> cl <- makeCluster(126L)
Error in socketConnection(port = port, server = TRUE, blocking = TRUE, :
all connections are in use
> nrow(showConnections())
[1] 125
The problem with the low NCONNECTIONS limit in relation to SNOW clusters has been discussed by others in the past, e.g.
- https://stat.ethz.ch/pipermail/r-devel/2004-March/029295.html
- https://stat.ethz.ch/pipermail/r-sig-hpc/2010-March/000590.html
- https://stat.ethz.ch/pipermail/r-sig-hpc/2012-May/001372.html
- http://digitheadslabnotebook.blogspot.com/2012/12/r-in-cloud.html
- https://stat.ethz.ch/pipermail/r-sig-hpc/2014-February/001818.html
- SnowParam: cannot create 126 workers; 125 connections available in this session Bioconductor/BiocParallel#55
- https://stat.ethz.ch/pipermail/r-devel/2021-August/081033.html
Troubleshooting
The total limit of 128 connections is hardcoded into the R source code as constant / macro NCONNECTIONS in src/main/connections.c;
#define NCONNECTIONS 128 /* snow needs one per slave node */
which is used to preallocate an array of Rconnection objects of this size;
static Rconnection Connections[NCONNECTIONS];
The NCONNECTIONS limit was increased from 50 to 128 in R 2.4.0 (released October 2006), which appears to have been done for the same reason as explained here.
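To illustrate how a fixed-size table of this kind leads to the limit of 125, here is a minimal sketch in C. It is a toy model, not R's actual implementation: the names toy_conn, pool, toy_init, toy_alloc, and toy_index are hypothetical, and only NCONNECTIONS, the Connections array name, and the error text are taken from the material above. It assumes that a free slot is marked by NULL and that the first three slots are reserved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define NCONNECTIONS 128  /* same value as quoted from src/main/connections.c */
#define NRESERVED 3       /* the three pre-allocated connections */

/* Hypothetical stand-in for Rconnection; only its address matters here. */
typedef struct toy_conn { int id; } toy_conn;

static toy_conn pool[NCONNECTIONS];          /* backing storage for the toy model */
static toy_conn *Connections[NCONNECTIONS];  /* NULL marks a free slot (assumption) */

/* Clear the table and reserve the first three slots. */
static void toy_init(void) {
    assert(NRESERVED < NCONNECTIONS);
    for (int i = 0; i < NCONNECTIONS; i++) Connections[i] = NULL;
    for (int i = 0; i < NRESERVED; i++) Connections[i] = &pool[i];
}

/* Return the index of a newly allocated slot, or -1 when the table is full. */
static int toy_alloc(void) {
    for (int i = NRESERVED; i < NCONNECTIONS; i++) {
        if (Connections[i] == NULL) {
            Connections[i] = &pool[i];
            return i;
        }
    }
    fprintf(stderr, "all connections are in use\n");
    return -1;
}

/* Linear search for a connection; a miss scans all NCONNECTIONS slots. */
static int toy_index(const toy_conn *con) {
    if (con == NULL) return -1;
    for (int i = 0; i < NCONNECTIONS; i++)
        if (Connections[i] == con) return i;
    return -1;
}
```

In this model, calling toy_alloc() repeatedly after toy_init() succeeds 125 times (slots 3 to 127) and then fails, which mirrors the R session shown earlier. A lookup of a connection that is not in the table has to visit every slot, so its cost grows with NCONNECTIONS rather than with the number of connections actually open.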
Wish
- Increase the NCONNECTIONS limit, to say, 1024.
  - I've verified that it works with NCONNECTIONS=16384 on Linux (see comment below). Similar checks may have to be done on macOS and Windows as well.
  - This would only require a simple update of the above constant / macro.
  - The disadvantage of increasing the limit is that it will also increase the linear-search time of internal int ConnIndex(Rconnection con) for non-existing connections. Using a linked list would avoid this particular problem (see below).
- Make the error message informative about the actual limit, e.g. all 128 connections are in use.
- An alternative, and possibly better, approach would be to re-implement Connections as a linked list, which (including its memory usage) could grow and shrink as needed. This could even remove having a limit at all. This would require redesign of the code and increase the risk of introducing bugs. (This idea was proposed by @mtmorgan).
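As a rough illustration of the linked-list idea, the sketch below keeps one heap-allocated node per open connection, so memory use follows the number of open connections and there is no compile-time cap. It is only a sketch under stated assumptions: conn_node, conn_list, list_init, list_add, and list_remove are hypothetical names, the real Rconnection struct is not modelled, and ids here simply count upwards from 3 instead of reusing freed numbers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node type; one node per open connection. */
typedef struct conn_node {
    int id;                    /* connection number handed to the caller */
    struct conn_node *next;
} conn_node;

typedef struct conn_list {
    conn_node *head;
    int next_id;               /* next id to hand out (never reused in this sketch) */
    int count;                 /* number of connections currently registered */
} conn_list;

/* Start with an empty list; ids begin after the three reserved connections. */
static void list_init(conn_list *l) {
    assert(l != NULL);
    l->head = NULL;
    l->next_id = 3;
    l->count = 0;
}

/* Register a new connection; returns its id, or -1 if malloc fails. */
static int list_add(conn_list *l) {
    conn_node *n = malloc(sizeof *n);
    if (n == NULL) return -1;
    n->id = l->next_id++;
    n->next = l->head;
    l->head = n;
    l->count++;
    return n->id;
}

/* Unregister a connection by id; returns 1 if it was found, 0 otherwise. */
static int list_remove(conn_list *l, int id) {
    for (conn_node **p = &l->head; *p != NULL; p = &(*p)->next) {
        if ((*p)->id == id) {
            conn_node *dead = *p;
            *p = dead->next;   /* unlink, then release the node's memory */
            free(dead);
            l->count--;
            return 1;
        }
    }
    return 0;
}
```

With this layout, the only limit is available memory (and whatever the OS imposes on the underlying resources), and a lookup for a non-existing connection visits only the nodes that are currently open instead of a fixed number of slots. The trade-off noted above still applies: any such change touches every place that currently indexes the array directly.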