Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ROBUSTNESS: Produce an error (not NULL) when a forked (mc)process crashes #199

Closed
HenrikBengtsson opened this issue Feb 15, 2018 · 5 comments
Milestone

Comments

@HenrikBengtsson
Copy link
Owner

Issue

A forked (mc)process that crashes returns NULL, e.g.

> library(future)
> plan(multicore, workers = 2L)
> f <- future({ quit("no"); TRUE })
> v <- value(f)
> str(v)
NULL

Ideally this should be detected and produce an FutureError, as for instance with multisession, e.g.

> library(future)
> plan(multisession, workers = 2L)
> f <- future({ quit("no"); TRUE })
> v <- value(f)
Error in unserialize(node$con) : 
Failed to retrieve the value of MultisessionFuture from cluster node #1 (on 'localhost').  The reason reported was 'error reading from connection'

Troubleshooting

This behavior originates from the parallel package, e.g.

> library(parallel)
> y <- mclapply(1:2, FUN = function(x) { quit("no"); TRUE })
> str(y)
List of 2
 $ : NULL
 $ : NULL

and more precisely:

> library(parallel)
> f <- mcparallel({ quit("no"); TRUE })
> str(f)
List of 2
 $ pid: int 19583
 $ fd : int [1:2] 3 8
 - attr(*, "class")= chr [1:3] "parallelJob" "childProcess" "process"
> v <- mccollect(f)
> str(v)
List of 1
 $ 19583: NULL

Action

It does not seem to be possible to detect this. It'll require a modification in the future expression and how results are returned by the future workers. The good news, as part of trying to resolve Issues #25, #59, #67, #154, #188, this is already in the pipeline - hopefully already for the 1.8.0 release.

See also

This is related to Issue #155 and probably also Issue #198.

@HenrikBengtsson HenrikBengtsson added this to the Next release milestone Feb 15, 2018
@HenrikBengtsson HenrikBengtsson changed the title ROBUSTNESS: Produce an error (not NULL) when a forked (mc)process crashesROBUSTNESS: A forked (mc)process that crashes ROBUSTNESS: Produce an error (not NULL) when a forked (mc)process crashes Feb 15, 2018
@wlandau
Copy link

wlandau commented Feb 22, 2018

+1 for this! Would really help with ropensci/drake#227.

@HenrikBengtsson
Copy link
Owner Author

It'll also avoid HenrikBengtsson/Wishlist-for-R#57 - which reveals itself in cases such as in Issue #200.

I've got a pretty good idea how to implement this, and hopefully it's completely backward compatible with all existing future code so it can be rolled out without major efforts. Need to do solid testing though.

@wlandau
Copy link

wlandau commented Feb 22, 2018

Awesome! Afterwards, will all backends throw an error on value(f) if f crashes?

@HenrikBengtsson
Copy link
Owner Author

Yes, the objective is a consistent behavior across backends. Programming with futures should be independent of backend. This has always been the goal - any discrepancies are oversights and basically considered bugs.

HenrikBengtsson added a commit that referenced this issue Feb 23, 2018
@HenrikBengtsson
Copy link
Owner Author

This now solved in the develop branch in the sense that exceptional errors in the orchestration of futures are detected an reported as FutureError conditions, e.g.

library(future)
plan(multicore, workers = 2L)
f <- future({ quit("no"); TRUE })
> v <- value(f)
Error: Internal error: Unexpected value retrieved for MulticoreFuture future ('<none>'): '{; quit("no"); TRUE; }'
> class(tryCatch(v <- value(f), error = identity))
[1] "FutureError"     "simpleError"     "error"           "FutureCondition" "condition"      

ROAD MAP: Although it does not look much to the world, there is a major internal update where futures can now return a much richer set of results than just the "value". I've made this update backward compatible with existing future backends, but eventually we'll have to make sure all future backends will support these new richer set of results. This far I've prepared and tested this for future.callr. I also need to do the same for future.batchtools etc. so it'll take some time. I hope to updated all those, or at least make sure it works, before submitting these updates of the future package to CRAN.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants