-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
mutate() doesn't need to calculate for empty groups #4088
Comments
Yes, thanks for pointing that out, I thought |
|
I'm a bit torn here. What if evaluation of groups had a desired side effect? Maybe we could decide this based on the new |
I don't understand the point about |
@romainfrancois this probably needs a test for when all groups are empty. What type does the generated column have? |
Turns out in that case, this lines gets it: https://github.com/tidyverse/dplyr/blob/master/src/mutate.cpp#L168 library(dplyr, warn.conflicts = FALSE)
d <- tibble(f = factor(levels = c("a", "b")))
d %>%
group_by(f) %>%
mutate(x = print(32.5))
#> [1] 32.5
#> # A tibble: 0 x 2
#> # Groups: f [2]
#> # … with 2 variables: f <fct>, x <dbl> but it might be better to handle it with the case where there are no groups: Line 490 in 1c474b5
|
Oh, thanks for the correction! Now I'm not sure know why I assumed so... |
@krlmlr Do you have any actual examples in your mind? I've come up with this concern, but I couldn't find any useful cases. For side effects, I think
|
Just my own FUD here. I think it's fine if we document the behavior, also with a test. |
Thanks, agreed. |
This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/ |
The verbs that returns the same numbers of rows as the input (e.g.
mutate()
,filter()
) probably doesn't need to calculate the empty groups because they never appear in the result. Actually,filter()
andarrange()
seem to skip empty groups, butmutate()
doesn't:Created on 2019-01-08 by the reprex package (v0.2.1)
Skipping empty groups is not only good for performance, but also nice to avoid warnings or errors related to 0-length vectors. I feel the warnings above seem a bit puzzling.
Maybe
Gatherer
can checkindices.size()
around here?dplyr/src/mutate.cpp
Lines 272 to 276 in ce7f2bc
But, of course,
summarise()
cannot skip empty groups. This meansGatherer
needs to know the context whether the operation is forsummary
or not to determine whether to skip the calculation or not. So, this might make the implementation a bit complex.I think this problem is not so serious as #4061 because I expect this doesn't break the existing code so hard, but it's nice if it is possible to avoid unnecessary computations.
The text was updated successfully, but these errors were encountered: