Q: process connections do not involve R connections, correct? #91
Comments
|
The CRAN version uses R connections, so that is limited. The GH version uses its own connection implementation, so that is not limited. It is only limited by the OS limits for the number of open file descriptors. Plus on Windows |
|
Btw. I guess you mean |
|
Btw. it is also easy to try :)
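A minimal sketch of one way to try this (not the snippet originally posted in this comment): it counts the entries in R's own connection table before and after starting a process with a piped stdout. It assumes a processx version that uses its own connection implementation and a Unix-like system with a `sleep` executable on the PATH; the variable names are illustrative.

```r
library(processx)

# Number of entries in R's own connection table before starting anything.
before <- nrow(showConnections(all = TRUE))

# Assumption: a Unix-like system where a `sleep` executable is on the PATH.
p <- process$new("sleep", "5", stdout = "|")
con <- p$get_output_connection()

# Entries in R's connection table while the process and its pipe exist.
after <- nrow(showConnections(all = TRUE))

# If processx uses its own connection implementation, this should print 0.
print(after - before)

# Clean up the background process.
p$kill()
```

If the difference is zero, the piped stdout did not take a slot in R's connection table, so only the OS limit on open file descriptors applies.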
|
|
Great. Yes, I played around with it like that, but I wanted to make sure I didn't miss anything. Among other things, I think this is a big advantage when it comes to parallelism; machines with more than 125 cores will soon be commonly available, and people are starting to hit the limit with classical SOCK clusters of the parallel package. Of course, it's not too hard to bump up the limit in R itself - but it requires some convincing :/ Any ETAs for CRAN releases, or do you play it by ear? |
|
CRAN is ignoring my |
|
Oh... that's unfortunate. Since there are no obvious errors, hopefully it's just that t/he/y are busy right now. PS. I only knew about ftp://cran.r-project.org/incoming/ - didn't know more details are available under https://win-builder.r-project.org/incoming_pretest/ - useful. |
|
Follow up: It looks like you've merged in some of the processx internals to callr - is that correct? If so, do comments in this thread also apply to callr? Specifically,
|
Yes, the same code is in callr now, so all this applies. I'll rewrite the Windows I/O with IOCP soonish, in the next 1-3 months, and then this limitation will go away. |
|
Awesome. Thxs. |
|
Just for the record (in case someone stumbles upon this thread): Since processx 3.1.0 (2018-05-15) there is no longer a limit of "64 connections" on Windows. From NEWS of processx 3.1.0:
|
|
Any news on Linux? I have a 128 core machine and through experimentation determined that makeCluster(124) was the largest I could create. |
|
processx/callr never had a limit on Linux - it was Windows. So, you should be good to go using as many parallel {callr} processes as you'd like. I saw your PR on future (thxs), so if you relate this to PSOCK workers vs callr workers, then yes, you can use the future.callr backend to parallelize on your local machine with as many callr workers as you'd like. |
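A minimal sketch of switching to callr workers through the future framework, assuming the future and future.callr packages are installed. The worker count of 4 and the variable names are arbitrary choices for illustration, not values from this thread.

```r
library(future)

# Assumption: future.callr is installed; 4 workers is an arbitrary choice.
plan(future.callr::callr, workers = 4)

# Each future is evaluated in a separate callr background R session.
fs <- lapply(1:4, function(i) future(Sys.getpid()))

# Collect the process IDs of the workers that evaluated the futures.
pids <- vapply(fs, value, integer(1))
print(pids)

# Return to sequential evaluation.
plan(sequential)
```

The printed process IDs should differ from `Sys.getpid()` in the main session, since each future runs in its own background R process rather than over a PSOCK connection.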
|
Sure, no problem. I am waiting for the day the future package fully covers Rmpi. ;) As for this, earlier today I reached the point where I was sure it was an R issue. The issue I was having was starting workers using the doParallel package:
|
|
FYI, I've collected info and references on the 125 connection limit in R in HenrikBengtsson/Wishlist-for-R#28. If you're into building R from source, then you'll see there that it's just a single line of code you need to tweak to increase this limit. |
|
I saw that. It seems everything we are doing on this AMD machine is built from source. Any hints on getting performance from R on an AMD Epyc machine? ;) I used EasyBuild for this, so I am going to have to see what the innards of that module contain for sure. |
I'd just like to confirm that processx is not limited by the maximum number of connections R can have open, i.e. NCONNECTIONS=128. I've played around with things such as p <- process$new(..., stdout = "|") and p$get_output_connection() and it appears not to be, but could you please confirm?
EDIT: stdout not stdin