Capturing and outputting stdout #232

Closed
HenrikBengtsson opened this issue Jun 14, 2018 · 3 comments
Labels
Backend API Part of the Future API that only backend package developers rely on enhancement Frontend API Part of the Future API that users of futures rely on

@HenrikBengtsson
Owner

Since future 1.8.0 and the introduction of the (internal) class FutureResults, futures support retrieval of not just values and errors, but also richer sets of results. The first candidate for utilizing this, and also a frequently requested feature in parallel processing, is to capture output and relay it on the master process.

Below are my thoughts about how to introduce this as a new feature. All feedback is welcome, particularly feedback that raises concerns about problems that implementing it one way or the other might introduce.

Note: This is not about relaying output in a "live" fashion - captured output will only be relayed after the future is resolved and its result has been received by the master process.

Capturing

Some backends capture standard output for us, e.g. callr. Others don't and then we need to (internally) wrap the future expression in a capture_output() statement. Since both alternatives exist, there should be an argument to the (internal) getExpression() for this such that each future backend can implement its own approach, e.g. getExpression(future, stdout = TRUE/FALSE).

The developer should be able to disable any capturing, that is, suppress standard output. This could be done as f <- future(expr, stdout = FALSE) and v %<-% { expr } %stdout% FALSE.
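A minimal sketch of what "wrapping the future expression" could look like, using only base R. The helper name wrap_expression() and the returned list(value, stdout) structure are hypothetical and not part of the future package; the sketch also assumes that stdout = FALSE means "capture and discard" rather than "let the output through".

```r
# Hypothetical helper: returns a new expression that, when evaluated,
# diverts standard output to an in-memory connection while 'expr' runs.
wrap_expression <- function(expr, stdout = TRUE) {
  bquote({
    con <- rawConnection(raw(0L), open = "w")
    sink(con, type = "output")
    # Always remove the diversion, even if 'expr' signals an error
    value <- tryCatch(.(expr), finally = sink(type = "output"))
    # Keep the captured output only if requested; otherwise discard it
    out <- if (.(stdout)) rawToChar(rawConnectionValue(con)) else NULL
    close(con)
    list(value = value, stdout = out)
  })
}

expr <- quote({ cat("Hello world!\n"); print(1:3); 42L })

# Nothing is printed here; the output ends up in res$stdout
res <- eval(wrap_expression(expr), envir = new.env())

# With stdout = FALSE the output is dropped and res2$stdout is NULL
res2 <- eval(wrap_expression(expr, stdout = FALSE), envir = new.env())
```

A backend that already captures standard output by itself would skip this wrapping and fill in the stdout field from its own log instead.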

Outputting

Analogously to how value() signals errors by default (signal = TRUE), it could/should output captured standard output by default (stdout = TRUE). Should this be done each time value() is called? That would be the easiest. The alternative, to do it only the first time, would also be possible, but then we would need to record in the Future object that the output has already been relayed. For v %<-% { expr }, value() will only be called once, so there we don't have to worry about this.

Examples:

> library(future)
> plan(multisession)
> f <- future({ cat("Hello world!\n"); print(1:3); 42L })
> cat("Waiting ...\n")
Waiting ...
> v <- value(f)
Hello world!
[1] 1 2 3
> v
[1] 42
> 1 + 2
[1] 3
> value(f)
Hello world!
[1] 1 2 3
[1] 42

and

> v %<-% { cat("Hello world!\n"); print(1:3); 42L }
> v
Hello world!
[1] 1 2 3
[1] 42

as well as

> v %<-% { cat("Hello world!\n"); print(1:3); 42L } %stdout% FALSE
> v
[1] 42
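The trade-off discussed under "Outputting" (relay on every value() call versus only on the first one) can be sketched with a toy result object. Everything here is hypothetical: new_result(), relay_value(), and the once argument are illustration-only names, and the environment merely stands in for whatever state the Future object would have to carry.

```r
# Hypothetical stand-in for a resolved future's result
new_result <- function(value, stdout) {
  res <- new.env()
  res$value <- value
  res$stdout <- stdout
  res$relayed <- FALSE  # state that only the relay-once variant needs
  res
}

# Hypothetical stand-in for value(): relays captured output, returns the value
relay_value <- function(res, stdout = TRUE, once = FALSE) {
  if (stdout && !(once && res$relayed)) {
    cat(res$stdout, sep = "")
    res$relayed <- TRUE
  }
  res$value
}

res <- new_result(42L, "Hello world!\n[1] 1 2 3\n")
v <- relay_value(res)                  # relays the output
v <- relay_value(res)                  # relays it again (the "easiest" option)
v <- relay_value(res, once = TRUE)     # silent, since it was already relayed
v <- relay_value(res, stdout = FALSE)  # silent, relaying disabled
```

The relay-every-time variant needs no bookkeeping at all, which is the simplicity argument made above; the relay-once variant needs the extra relayed flag.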

API?

What should the API for getting the captured output of a resolved future be - getStandardOutput(f)? Do we even need to introduce a function for this - would capture.output(value(f)) work equally well? The advantage would obviously be that we keep the Future API to a bare minimum, which is always easier to maintain.
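One detail worth keeping in mind when weighing capture.output(value(f)): capture.output() also records the printed form of any visible value. The sketch below uses a hypothetical stand-in, fake_value(), that behaves like a value() call that relays "Hello world!" and returns 42L, so it runs without the future package.

```r
# Hypothetical stand-in for value(f) on a future that printed "Hello world!"
fake_value <- function() {
  cat("Hello world!\n")
  42L
}

# The visible return value is printed and therefore captured as well
out_all <- utils::capture.output(fake_value())
# out_all is c("Hello world!", "[1] 42")

# Assigning the result makes it invisible, so only the relayed output is captured
out_only <- utils::capture.output(v <- fake_value())
# out_only is "Hello world!" and v is 42L
```

So if no dedicated accessor is added, the assignment form is the one that would isolate the relayed output.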

Standard error?

Due to current limitations of R, it is more or less impossible for us to reliably capture standard-error (stderr) output. For more details, see HenrikBengtsson/Wishlist-for-R#55.

@HenrikBengtsson
Owner Author

HenrikBengtsson commented Jul 8, 2018

In the feature/stdout branch, there's now a prototype where stdout is automatically captured for each future and re-outputted when the value of the future is queried. For example,

> library(future)
> plan(multiprocess)

> f <- future({ cat("hello"); 42 })
> value(f)
hello[1] 42
> value(f)
hello[1] 42
> 

> y %<-% { cat("world"); 42 }
> y
world[1] 42
> y
[1] 42
> 

This will work with any future backend (after some very minor modifications to those backends; already available in their 'develop' branches), e.g.

> plan(future.batchtools::batchtools_local)
> f <- future({ cat("hello"); 42 })
> value(f)
hello[1] 42
> value(f)
hello[1] 42
> 

This will work out of the box for higher-level API such as future.apply. For example,

> library(future.apply)
> plan(multisession, workers = 3L)
> y <- future_lapply(1:10, FUN = function(i) { print(i); i^2 })
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9
[1] 10
> 

The same for doFuture, e.g.

> library(doFuture)
> registerDoFuture()
> plan(multisession, workers = 3L)
> y <- foreach(i = 1:10) %dopar% { print(i); i^2 }
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9
[1] 10
> 

Note that in the above example, there are three workers meaning that there will be three futures (one per chunk) that process the ten elements. Because of this, the captured stdout will be outputted in three "batches" when the values of the futures are collected. (Again, this is not "live streaming" of the standard output).
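The "batches" behaviour can be imitated sequentially in base R. This is only a sketch: it assumes the ten elements are split into three chunks in the way parallel::splitIndices() does, which may differ from how future.apply or doFuture actually chunk, and it uses capture.output() in place of a real worker.

```r
# Split 10 element indices into 3 chunks (assumed chunking, see above)
chunks <- parallel::splitIndices(10L, 3L)

# "Resolve" each chunk: capture its output instead of printing it right away
results <- lapply(chunks, function(idx) {
  out <- utils::capture.output(
    values <- lapply(idx, function(i) { print(i); i^2 })
  )
  list(values = values, stdout = out)
})

# Collect the results: each chunk's captured output is relayed as one batch
for (res in results) writeLines(res$stdout)

# Flatten the per-chunk values back into one list of ten elements
y <- unlist(lapply(results, `[[`, "values"), recursive = FALSE)
```

Nothing is printed while the chunks are being processed; all lines belonging to a chunk appear together when its result is collected.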

UPDATE 2018-07-09 23:55 UTC: Added proof-of-concept foreach() example with the doFuture adapter.

HenrikBengtsson added a commit that referenced this issue Jul 10, 2018
…pt output (as in previous versions of the future package) [#232]
HenrikBengtsson added a commit to HenrikBengtsson/future.callr that referenced this issue Jul 10, 2018
HenrikBengtsson added a commit to HenrikBengtsson/future.BatchJobs that referenced this issue Jul 10, 2018
HenrikBengtsson/future#232

Merge branch 'develop' of github.com:HenrikBengtsson/future.BatchJobs into develop
@HenrikBengtsson HenrikBengtsson added this to the Next release milestone Jul 10, 2018
@HenrikBengtsson
Owner Author

I've now merged branch feature/stdout into the develop branch (= next release). Backends future.callr, future.batchtools, and future.BatchJobs have been updated on CRAN to support stdout relaying (minor tweaks were needed). To test the develop version, install it using:

remotes::install_github('HenrikBengtsson/future@develop')

I'm hoping to submit future 1.9.0 to CRAN in the not-too-distant future.

@HenrikBengtsson
Owner Author

future 1.9.0 with support for relaying stdout just hit CRAN.
