Viewing/signaling conditions signaled in futures #49
Comments
@DavisVaughan Sometimes an error occurs in a specific sub-process because of the parameters passed to it, especially when doing parameter tuning for a model. It's very useful to |
@yogat3ch Just as a heads up, this sort of condition-handling stuff ended up with me refining/generalizing the code there into the catchr package, which you might want to look at. It's not as focused on
library(future)
library(furrr)
library(catchr)
plan(sequential)
write_to_file <- function(cond) {
  cond_class <- class(cond)[1]
  msg <- paste(cond_class, ":", cond$message)
  write(msg, file = "out.txt", append = TRUE)
}
cond_to_file <- make_catch_fn(
  warning = c(write_to_file, muffle),
  message = c(write_to_file, muffle),
  error = c(write_to_file, exit_with("Returned error!"))
)
# Example in action
fn <- function(x) {
  message("HEY")
  warning("UHOH")
  if (x == 2) stop("It broke")
  paste("end", "result!")
}
res <- future_map(1:2, ~cond_to_file(fn(.x)))
res
In the example above, the |
Hi @burchill, Many thanks! If I specify the calls to catchr functions using |
future now relays all conditions as of early 2019
library(furrr)
#> Loading required package: future
#> Warning: package 'future' was built under R version 4.0.2
plan(multisession, workers = 2)
future_map(1:5, ~{
  if (.x == 3L || .x == 5L) {
    warning("oh no!")
  }
  .x
})
#> Warning in ...furrr_fn(...): oh no!
#> Warning in ...furrr_fn(...): oh no!
#> [[1]]
#> [1] 1
#>
#> [[2]]
#> [1] 2
#>
#> [[3]]
#> [1] 3
#>
#> [[4]]
#> [1] 4
#>
#> [[5]]
#> [1] 5
plan(sequential)
Created on 2020-08-06 by the reprex package (v0.3.0) |
I noticed this! Thanks for reminding me about this. However, I'm still using |
can you try and provide a full reproducible example of what you'd like to do? |
I was not seeing messages surface when running a cluster plan in an RStudio background process in August 2019 for some reason, but it definitely appears that messages are now surfacing! I thought I had the dev version installed but perhaps I didn't. |
@DavisVaughan Here's the reprex:
|
Ah, currently messages are captured in the future on the individual workers, and are relayed back to the user all at once after that future finishes. For the example above, furrr will make 2 future objects, one with the elements 1:2 and one with elements 3:5. So as soon as the map over 1:2 finishes, you get those messages, then 3:5 finishes a little bit later and you get those messages. This example makes things a little clearer:
cl <- future::makeClusterPSOCK(2)
future::plan(future::cluster, workers = cl)
furrr::future_map(1:5, ~{
  idx <- .x
  purrr::map(1:3, ~{
    message(paste("Progress message:", .x, "from idx", idx))
    Sys.sleep(1)
  })
  .x
})
# These messages are relayed after indices 1:2 finish
Progress message: 1 from idx 1
Progress message: 2 from idx 1
Progress message: 3 from idx 1
Progress message: 1 from idx 2
Progress message: 2 from idx 2
Progress message: 3 from idx 2
# These messages are relayed a little bit later after 3:5 finish
# (They are a little slower because they have one more element to process (5))
Progress message: 1 from idx 3
Progress message: 2 from idx 3
Progress message: 3 from idx 3
Progress message: 1 from idx 4
Progress message: 2 from idx 4
Progress message: 3 from idx 4
Progress message: 1 from idx 5
Progress message: 2 from idx 5
Progress message: 3 from idx 5
What you are asking for is "near real time updates". These have recently been made possible in future so that it can support near real time progress updates. It is supported for multisession, sequential, and cluster futures, but not multicore as of right now. It is not publicly advertised, but it is possible to hook in to this feature for your own "near real time" messages. The idea is to subclass your message with
immediateMessage <- function(..., domain = NULL, appendLF = TRUE) {
  msg <- .makeMessage(..., domain = domain, appendLF = appendLF)
  call <- sys.call()
  m <- simpleMessage(msg, call)
  cls <- class(m)
  cls <- setdiff(cls, "condition")
  cls <- c(cls, "immediateCondition", "condition")
  class(m) <- cls
  message(m)
  invisible(m)
}
cl <- future::makeClusterPSOCK(2)
future::plan(future::cluster, workers = cl)
furrr::future_map(1:5, ~{
  idx <- .x
  purrr::map(1:3, ~{
    immediateMessage(paste("Progress message:", .x, "from idx", idx))
    Sys.sleep(1)
  })
  .x
})
# These are relayed as soon as possible, essentially in real time
# Notice how they are not sequential
Progress message: 1 from idx 1
Progress message: 1 from idx 3
Progress message: 2 from idx 1
Progress message: 2 from idx 3
Progress message: 3 from idx 1
Progress message: 3 from idx 3
Progress message: 1 from idx 2
Progress message: 1 from idx 4
Progress message: 2 from idx 2
Progress message: 2 from idx 4
Progress message: 3 from idx 2
Progress message: 3 from idx 4
Progress message: 1 from idx 5
Progress message: 2 from idx 5
Progress message: 3 from idx 5
Now, I'm not sure that this feature is 100% stable. @HenrikBengtsson would be able to tell you if this is a good idea or not. |
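The subclassing step can be checked locally, without any workers, to confirm that a handler registered for the extra class actually sees the message. This is only a minimal base R sketch: new_immediate_message() is a hypothetical helper (not part of future or furrr) that builds the condition by hand with the same class order as immediateMessage(), and it says nothing about how future itself relays these conditions.

```r
# Hypothetical helper: build a message condition that also carries the
# "immediateCondition" class, in the same order immediateMessage() uses.
new_immediate_message <- function(text) {
  structure(
    list(message = paste0(text, "\n"), call = NULL),
    class = c("simpleMessage", "message", "immediateCondition", "condition")
  )
}

seen <- character()
withCallingHandlers(
  message(new_immediate_message("Progress message: 1 from idx 1")),
  # The handler is registered for the extra class only, not for "message".
  immediateCondition = function(cond) {
    seen <<- c(seen, conditionMessage(cond))
    invokeRestart("muffleMessage")  # stop it from also printing to stderr
  }
)
seen
```

Because the condition still inherits from "message", ordinary message handlers and suppressMessages() would also catch it; the extra class is just an additional hook.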
Yes, the idea is to be able to use immediateCondition:s for other purposes as well - not just progress updates. There are some simple wrappers for
Having said this, I don't want to over-promise anything right now so tread carefully. For example, there is a risk that immediateCondition:s might be signaled twice - once in a near-live fashion and once when the future value is queried. I cannot remember where it stands on that right now. I know that progressr's 'progression' conditions, which inherit from 'immediateCondition', have built-in protection against this. BTW, note that you can "sticky" progressr messages, e.g |
Hi @DavisVaughan,
It does indeed!
Yes exactly.
This is exactly what I need! Thank you for the reprex as I would have stumbled around for a while trying to figure out how to implement that on my own! Much gratitude for your expert and expedited assistance!
@HenrikBengtsson Excellent, I'll have a look. Thanks for hopping in here to help out!
Noted. Thank you for letting me know. I'm intending to use this to monitor a long-running background ML model build, so duplicate signaling should not be an issue.
I'll check out
That's a neat feature, I can think of a use for it already. Thanks for mentioning! Thank you both so much for the excellent advice! 🤞 that I'm able to implement it without any snags. Grateful for everyone's detailed advice and efforts on these wonderfully useful packages! |
One of the issues that I've thought about in furrr (and in future in general, actually), is that when the plan being used isn't sequential/transparent, the only condition that gets signaled to functions outside of the future call is the error condition. To the best of my knowledge, warnings and messages that arise within future_map/future are basically gone once the results are returned.

While I understand that the whole concept of futures precludes higher-up processes from dictating how the lower processes within the futures deal with these conditions, it seems to me inadvisable that potentially important warnings/messages are automatically thrown out. It seems like this design choice hampers "good code".

I don't know what your thoughts are about this, but if you're interested in implementing something like this, I've extended the future_map functions in my personal custom codebase so that I can collect all the messages, warnings, and errors signaled in these functions, and I can then signal/view everything that was collected, in case the functions calling future_map need to do anything about them.

I uploaded the relevant code as a gist that can be accessed here. Basically, I wrote a function that "pries open" the future_map functions, wraps the .f function in a function that collects conditions, and saves them as new functions. I went with this strategy for my code because I didn't see the point in hard-copying your code into new functions just to change something so small, but it has the added benefit of making things very flexible. I don't know if there are any theoretical problems with the condition collection in collect_all, but it's worked in all the scenarios I've been using it.

Just food for thought, basically! If you think such things are in the purview of this package, I could spruce it up and make a pull request, but I'd also be fine with you making your own code conceptually based off mine, if you just acknowledge me. At some level, I think stuff like this should probably be addressed in the future package itself, so I'll probably be asking Henrik about it regardless.
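For readers without the gist at hand, the general idea of wrapping .f so that it collects conditions can be sketched in base R roughly as follows. This is not the gist's code: collect_conditions() and with_collection() are hypothetical names, and lapply() stands in for future_map() so the example runs without any packages.

```r
# Hypothetical helper: evaluate `expr`, record every message and warning it
# signals (muffling them), and capture an error instead of propagating it.
collect_conditions <- function(expr) {
  collected <- list()
  record <- function(cond) collected[[length(collected) + 1L]] <<- cond
  value <- tryCatch(
    withCallingHandlers(
      expr,
      message = function(m) { record(m); invokeRestart("muffleMessage") },
      warning = function(w) { record(w); invokeRestart("muffleWarning") }
    ),
    error = function(e) { record(e); NULL }  # value is NULL after an error
  )
  list(value = value, conditions = collected)
}

# Hypothetical wrapper: turn any .f into a version that returns its value
# together with the conditions it signaled.
with_collection <- function(.f) function(...) collect_conditions(.f(...))

res <- lapply(1:2, with_collection(function(x) {
  message("HEY")
  warning("UHOH")
  if (x == 2) stop("It broke")
  x
}))

# Classes of the conditions collected for the second element
vapply(res[[2]]$conditions, function(cond) class(cond)[1], character(1))
```

Since the conditions come back as ordinary list elements alongside the value, the caller can inspect them or re-signal them afterwards; whether that suits a given plan is left as an assumption here.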