Simpler group_map() #4251
This looks interesting! Any thought to having i.e. Or perhaps maybe a separate |
A group_map_dfr <- function(.tbl, .f, ..., .id = NULL) bind_rows(group_map(.tbl, .f, ...), .id = .id) But what that does not do is relate back to the grouping variables of Perhaps the user can make sure they include the grouping variables, and then do a |
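One possible reading of "include the grouping variables" is to have the mapped function bind the one-row key tibble `.y` onto its own result before the pieces are row-bound. This is only a sketch: it assumes a dplyr version in which `group_map()` returns a list, and the `mean_length` column is a hypothetical example.

```r
library(dplyr)

# group_map_dfr() as written in the comment above
group_map_dfr <- function(.tbl, .f, ..., .id = NULL) {
  bind_rows(group_map(.tbl, .f, ...), .id = .id)
}

# Assumption: the user carries the keys along by binding .y to each result
res <- iris %>%
  group_by(Species) %>%
  group_map_dfr(~ bind_cols(.y, tibble(mean_length = mean(.x$Sepal.Length))))
```

Here `res` has one row per group, with the `Species` key next to the computed column.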
I really like this! For reference, this is the kind of thing that I was trying to do:
library(dplyr)
library(ggplot2)
library(patchwork)
data <- iris %>%
  group_by(Species)
plot_length_by_width <- function(data) {
  data %>%
    ggplot(aes(x = Sepal.Length,
               y = Sepal.Width)) +
    geom_point() +
    ggtitle(paste(data[["Species"]]))
}
data %>%
  group_map( ~ plot_length_by_width(.x),
             keep = TRUE) %>%
  wrap_plots()
Created on 2019-03-06 by the reprex package (v0.2.1.9000)
and it is super easy to accomplish with this new approach! I especially like the |
I suspect my head is not in the right place but I don't get the motivation for this yet. The first example, where a function is mapped over the groups induced by Then, if we use I feel like I am missing something. |
The advantage of this For most cases it does not matter all that much, but when for some reason you have an empty group, the |
But indeed @sharlagelfand's example might as well be:
data %>%
  group_split(keep = TRUE) %>%
  purrr::map(plot_length_by_width) %>%
  wrap_plots() |
At the time of applying the function, yes, but then it's generally absent from the result, right? It feels like this thread does not yet have an example that actually uses the We seem to be converging on evaluating them in terms of ease of access to the grouping keys vs data at the time of applying the mapped function and in the result. |
The result is whatever the user makes of it. This no longer makes assumptions or requirements on that, it just applies the function to each group, a group being:
If it wasn't for empty groups then we would not need this proposed The other thing is that with this implementation,
group_map <- function(.tbl, .f, ..., keep = FALSE) {
  .f <- as_group_map_function(.f)
  # call the function on each group
  chunks <- group_split(.tbl, keep = keep)
  keys <- group_keys(.tbl)
  group_keys <- map(seq_len(nrow(keys)), function(i) keys[i, , drop = FALSE])
  map2(chunks, group_keys, .f, ...)
} |
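To try the implementation above outside the dplyr sources, a stand-in can be sketched with purrr. The name `group_map_sketch()` is hypothetical, and `purrr::as_mapper()` is swapped in for the internal `as_group_map_function()`; that substitution is an assumption, not what the PR does. The `keep` argument is left out to keep the sketch independent of how a given dplyr version spells it.

```r
library(dplyr)
library(purrr)

# Hypothetical stand-in for the proposed group_map(); not the dplyr code
group_map_sketch <- function(.tbl, .f, ...) {
  .f <- as_mapper(.f)                      # assumption: replaces as_group_map_function()
  chunks <- group_split(.tbl)              # one data frame per group
  keys <- group_keys(.tbl)                 # one row of keys per group
  key_rows <- map(seq_len(nrow(keys)), function(i) keys[i, , drop = FALSE])
  map2(chunks, key_rows, .f, ...)          # .x = group data, .y = one-row key tibble
}

res <- iris %>%
  group_by(Species) %>%
  group_map_sketch(~ tibble(species = as.character(.y$Species), n = nrow(.x)))
```

The result is a plain list with one element per group; nothing is assumed about what the mapped function returns.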
More thoughts: Is the impetus for change that current |
It's like we have what we have now, because we wanted something between
but we implemented this too early, i.e. before they can return tibbles (#2326). This still makes sense to have a
to splice the resulting columns of 🙈
library(dance)
iris %>%
  group_by(Species) %>%
  jive( ~ broom::tidy(lm(Sepal.Length ~ Sepal.Width)))
#> # A tibble: 6 x 6
#> # Groups: Species [3]
#> Species term estimate std.error statistic p.value
#> <fct> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 setosa (Intercept) 2.64 0.310 8.51 3.74e-11
#> 2 setosa Sepal.Width 0.690 0.0899 7.68 6.71e-10
#> 3 versicolor (Intercept) 3.54 0.563 6.29 9.07e- 8
#> 4 versicolor Sepal.Width 0.865 0.202 4.28 8.77e- 5
#> 5 virginica (Intercept) 3.91 0.757 5.16 4.66e- 6
#> 6 virginica Sepal.Width 0.902 0.253 3.56 8.43e- 4
Created on 2019-03-08 by the reprex package (v0.2.1.9000) |
To come back to your last question @jennybc I'd say "both". The thing we currently call And the name |
This new implementation feels like a much better fit to the name @jennybc as an example of how you might use
plot_length_by_width <- function(data, group) {
  data %>%
    ggplot(aes(Sepal.Length, Sepal.Width)) +
    geom_point() +
    ggtitle(group$Species)
}
iris %>%
  group_by(Species) %>%
  group_map(plot_length_by_width)
(The fact that the original code actually works is a bug in ggplot2: |
I can have a go at group_modify() in another branch/PR. Is that a temporary measure until we have something like summarise() without the vec_size() == 1 constraint and with support for auto-splicing tibble results? Such a thing would have a more natural quosure-like interface. |
I think |
group_map() becomes the proposed
✅ so this now has:
|
This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/ |
Following discussions on twitter, ... here is a simpler group_map() that just lives up to its name and returns a list:
This also simplifies the implementation, but makes it harder to get the results it previously made when it was assuming the results from .f were always data frames.
This makes group_map() similar to group_split() + map() with the bonus of having .y available, which is sometimes necessary for empty groups.
I'd need to fix the documentation if this is the direction we want to take.
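The empty-group case can be sketched as follows. This assumes a dplyr version in which `group_map()` returns a list and `group_by()` accepts `.drop = FALSE`. The `setosa` rows are filtered out on purpose so that one group is empty.

```r
library(dplyr)

by_species <- iris %>%
  filter(Species != "setosa") %>%
  group_by(Species, .drop = FALSE)   # keep the now-empty setosa group

# The empty chunk has zero rows, so its key cannot be read from .x;
# the one-row key tibble .y still carries it.
sizes <- by_species %>%
  group_map(~ paste(.y$Species, nrow(.x)))
```

With `group_split() + map()` alone, the function applied to the empty chunk would have no row from which to recover which species it belongs to.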