You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
After discussing #83 and seeing that it might be a little more work than expected for users to update to progressr from my hacky .progress = TRUE argument in furrr, I am reconsidering keeping the .progress argument, but having it use progressr internally. It would be focused on signaling the simplest progress update possible, and would not wrap with_progress() as #78 does. That would still be left to the user.
I am curious about your thoughts on this.
It would work like this:
library(dplyr)
library(tidyr)
library(furrr)
library(progressr)
df<- nest(mtcars, data=-gear)
# Without dplyr
with_progress({
future_map(df$data, nrow, .progress=TRUE)
})
# In a pipelinedf %>%
mutate(x= with_progress(future_map(data, nrow, .progress=TRUE)))
# The old way, for comparisondf %>%
mutate(x= future_map(data, nrow, .progress=TRUE))
I am also considering letting .progress take FALSE/TRUE and some other type of argument that would control how the internal progressor is created, just to give it a little more flexibility, but I'm not quite sure yet. I might start simpler and do that if it is requested.
Pros of this:
I think it is a little bit easier to use vs creating the p <- progressor() object and ticking it manually with p() inside the .f function. As we've seen in Adding a progress bar by default in a package #78, it seems to be a little tricky to get it exactly right, especially with dplyr.
The default of .progress = FALSE will have zero overhead
Perhaps this is a longer horizon "pro", but with a global calling handler we could completely remove the need for with_progress() and it would look exactly like the original furrr implementation.
Cons of this:
It requires managing a .progress option in furrr, but I think that is okay.
It is a little strange to have .progress = TRUE which signals progress updates, but then you also have to wrap it in with_progress() to see the updates. It also requires a little documentation to explain this, but it could go right in the docs for the .progress argument.
It would call p() on every element of .x, which might be overkill and result in performance issues. This is related to Limit the number of progression conditions signaled #81, but if it was a real issue then users could use a custom progressor object and control the progress updates themselves.
Nested progress updates are still an issue, but I think that is a separate issue from this one, and whatever improvements come there would flow naturally into furrr.
Technically furrr's .progress = TRUE argument "worked" on multicore, and this currently won't, but I can live with that.
The text was updated successfully, but these errors were encountered:
I and @DavisVaughan talked offline, but for those who might follow/see this, lots of these comments/ideas bring up the issues that relate to trying to identify exactly what progressr should and should not do and how it should be done. These are design problems that are not straightforward to decide so it'll take some time.
Regarding:
Technically furrr's .progress = TRUE argument "worked" on multicore, and this currently won't, but I can live with that.
This one is "simple" - support near-live progress updates using the multicore future backend can and will be implemented in the future package. The first iteration will probably relay such progress information via the local file system shared by the workers and the main R session. Either way, it's independent of the progressr package.
Related discussions in #83, #3, #78
After discussing #83 and seeing that it might be a little more work than expected for users to update to progressr from my hacky
.progress = TRUE
argument in furrr, I am reconsidering keeping the.progress
argument, but having it use progressr internally. It would be focused on signaling the simplest progress update possible, and would not wrapwith_progress()
as #78 does. That would still be left to the user.I am curious about your thoughts on this.
It would work like this:
I am also considering letting
.progress
takeFALSE/TRUE
and some other type of argument that would control how the internal progressor is created, just to give it a little more flexibility, but I'm not quite sure yet. I might start simpler and do that if it is requested.Pros of this:
I think it is a little bit easier to use vs creating the
p <- progressor()
object and ticking it manually withp()
inside the.f
function. As we've seen in Adding a progress bar by default in a package #78, it seems to be a little tricky to get it exactly right, especially with dplyr.The default of
.progress = FALSE
will have zero overheadThis approach still allows users to create a progressor object themselves and tick it manually, as displayed in Adding a progress bar by default in a package #78.
Perhaps this is a longer horizon "pro", but with a global calling handler we could completely remove the need for
with_progress()
and it would look exactly like the original furrr implementation.Cons of this:
It requires managing a
.progress
option in furrr, but I think that is okay.It is a little strange to have
.progress = TRUE
which signals progress updates, but then you also have to wrap it inwith_progress()
to see the updates. It also requires a little documentation to explain this, but it could go right in the docs for the.progress
argument.It would call
p()
on every element of.x
, which might be overkill and result in performance issues. This is related to Limit the number of progression conditions signaled #81, but if it was a real issue then users could use a custom progressor object and control the progress updates themselves.Nested progress updates are still an issue, but I think that is a separate issue from this one, and whatever improvements come there would flow naturally into furrr.
Technically furrr's
.progress = TRUE
argument "worked" on multicore, and this currently won't, but I can live with that.The text was updated successfully, but these errors were encountered: