When processing futures in external R processes, it may be useful for the main/calling R process to have access to some information about the worker's R process even before the future is resolved (cf. discussion in Issue #93), e.g.
> info <- worker_info()
> str(info)
List of 3
 $ r      :List of 15
  ..$ platform      : chr "x86_64-pc-linux-gnu"
  ..$ arch          : chr "x86_64"
  ..$ os            : chr "linux-gnu"
  ..$ system        : chr "x86_64, linux-gnu"
  ..$ status        : chr ""
  ..$ major         : chr "3"
  ..$ minor         : chr "3.3"
  ..$ year          : chr "2017"
  ..$ month         : chr "03"
  ..$ day           : chr "06"
  ..$ svn rev       : chr "72310"
  ..$ language      : chr "R"
  ..$ version.string: chr "R version 3.3.3 (2017-03-06)"
  ..$ nickname      : chr "Another Canoe"
  ..$ os.type       : chr "unix"
 $ system :List of 8
  ..$ sysname       : chr "Linux"
  ..$ release       : chr "4.4.0-72-generic"
  ..$ version       : chr "#93-Ubuntu SMP Fri Mar 31 14:07:41 UTC 2017"
  ..$ nodename      : chr "hb-x1"
  ..$ machine       : chr "x86_64"
  ..$ login         : chr "unknown"
  ..$ user          : chr "hb"
  ..$ effective_user: chr "hb"
 $ process:List of 1
  ..$ pid: int 4246
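The definition of `worker_info()` is not shown here. As a minimal sketch only, a helper that returns a list of this shape could be assembled from base R's `R.version`, `.Platform`, `Sys.info()` and `Sys.getpid()`. The function body below is an assumption reconstructed from the `str()` output, not necessarily the code that was originally proposed.

```r
# Hypothetical helper: gathers basic facts about the R process it runs in.
# The three-part layout (r, system, process) mirrors the str() output above.
worker_info <- function() {
  list(
    # R version details plus the OS type; unclass() drops the "simple.list"
    # class so that c() returns a plain named list
    r = c(unclass(R.version), list(os.type = .Platform$OS.type)),
    # Sys.info() returns a named character vector; convert it to a list
    system = as.list(Sys.info()),
    # Process ID of the current R process
    process = list(pid = Sys.getpid())
  )
}

info <- worker_info()
str(info$process)
```

When called in the main session this describes the main process; it only describes a worker when it is evaluated on that worker.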
This information should be exposed in the Future API as an element of a Future object, e.g.
> f <- future(Sys.sleep(300))
> info <- f$worker
For persistent clusters, such as the ones created by parallel::makePSOCKcluster(), this information could be collected once at setup, e.g. when calling plan(cluster, workers = cl). The function future::makeClusterPSOCK(), or more specifically future::makeNodePSOCK(), could even collect this information when setting up each worker, and plan(cluster, workers = cl) could then add it only if it is missing. Collecting it at setup would also serve as a first validation that the worker and the master can communicate properly (beyond just establishing the connection). The disadvantage is a small additional overhead, but since these workers are persistent, that is, each one serves many futures, this should be a minor problem.
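To illustrate the "collect once at setup, add only if missing" idea, here is a rough sketch using only the parallel package. The `worker_info()` helper, the `add_worker_info()` function and the `"worker_info"` attribute name are all hypothetical and chosen for illustration; this is not how the future package is known to store the information.

```r
library(parallel)

# Hypothetical helper (assumption): describes the R process it is evaluated in
worker_info <- function() {
  list(
    r = c(unclass(R.version), list(os.type = .Platform$OS.type)),
    system = as.list(Sys.info()),
    process = list(pid = Sys.getpid())
  )
}

# Hypothetical setup step: query each worker once and cache the result as an
# attribute on its node object, skipping nodes that already carry one
add_worker_info <- function(cl) {
  for (i in seq_along(cl)) {
    if (is.null(attr(cl[[i]], "worker_info"))) {
      # cl[i] is a one-node cluster; the round trip also confirms that
      # the main process and this worker can communicate
      attr(cl[[i]], "worker_info") <- clusterCall(cl[i], worker_info)[[1]]
    }
  }
  cl
}

cl <- makePSOCKcluster(2L)
cl <- add_worker_info(cl)

# The cached information is available without contacting the workers again
pids <- vapply(cl, function(node) attr(node, "worker_info")$process$pid, integer(1))

stopCluster(cl)
```

Each worker is contacted only once, when the cluster is set up, so the extra cost is not paid again for every future.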
retrieving session information including the process ID from the
corresponding R process. The same information is also collected by
plan(cluster) and plan(multisession) if not already available, e.g. when
parallel::makeCluster() is used instead. This makes it possible to find
session information for a future that is not yet resolved.
(Issue #142)