Automatically unpack in across() #6360

Closed
hadley opened this issue Jul 24, 2022 · 5 comments · Fixed by #6429

@hadley
Member

hadley commented Jul 24, 2022

To support this sort of functionality:

library(dplyr, warn.conflicts = FALSE)

quantile_df <- function(x, probs = c(0.25, 0.5, 0.75)) {
  tibble(
    "{{x}}_val" := quantile(x, probs),
    "{{x}}_quant" := probs
  )
}

df <- tibble(
  grp = rep(1:3, each = 10),
  x = runif(30),
  y = rnorm(30)
)

df |> 
  group_by(grp) |> 
  summarise(across(x:y, quantile_df))
@hadley hadley added this to the 1.1.0 milestone Jul 24, 2022
@hadley hadley added feature a feature request or enhancement each-col ↔️ labels Jul 24, 2022
@DavisVaughan
Member

See also #5563

@DavisVaughan DavisVaughan self-assigned this Aug 22, 2022
@DavisVaughan
Member

DavisVaughan commented Aug 22, 2022

UI proposed by @lionel- and me.

Essentially, the idea is to add one more layer on top of what we already get back from across(cols, fns, .names = spec).

@param .unpack If `.fns` returns a data frame, should that data frame be unpacked?
  - If `TRUE`, a `"_"` separator is placed between the original column/function name combination and the data frame column names. This is identical to using `"{outer}_{inner}"` (see below).
  - If `FALSE`, no unpacking is done.
  - A glue specification can also be used, using the keywords `"{outer}"` and `"{inner}"`. `outer` represents the names formed by the `.names` argument, and `inner` represents the names of the data frame returned by `.fns`.
# Default
# Same as:
# `.names = "{col}_{fn}", .unpack = "{outer}_{inner}"`
across(x:y, quantiles, .names = NULL, .unpack = TRUE) 

# Don't unpack 
across(x:y, quantiles, .unpack = FALSE)

# Unpack with a specific glue spec
across(x:y, quantiles, .unpack = "{outer}.{inner}")
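
For readers who want to try the proposal, here is a minimal sketch. It assumes dplyr 1.1.0 or later (the milestone attached to this issue), where `across()` accepts `.unpack`. The helper `quantile_df()` is adapted from the opening comment but uses plain column names (`val`, `quant`), which is an assumption made here so that the outer name is prepended only once. `reframe()` is used instead of `summarise()` because the helper returns several rows per group.

```r
library(dplyr, warn.conflicts = FALSE)

# Illustrative helper: returns a 3-row data frame with plain column names.
quantile_df <- function(x, probs = c(0.25, 0.5, 0.75)) {
  tibble(val = quantile(x, probs), quant = probs)
}

df <- tibble(
  grp = rep(1:3, each = 10),
  x = runif(30),
  y = rnorm(30)
)

# .unpack = TRUE joins outer and inner names with "_".
unpacked <- df |>
  group_by(grp) |>
  reframe(across(x:y, quantile_df, .unpack = TRUE))

# A glue spec controls how outer and inner names are combined.
dotted <- df |>
  group_by(grp) |>
  reframe(across(x:y, quantile_df, .unpack = "{outer}.{inner}"))

names(unpacked)
names(dotted)
```

With a single function and the default `.names`, the outer name is just the column name, so the unpacked columns are named from the column plus the inner data frame column.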

@hadley
Member Author

hadley commented Aug 22, 2022

I assume you discussed and rejected adding another variable to the existing name specification? i.e. across(x:y, quantiles, .names = "{outer}.{inner}.{fn}")? Or do we assume that when unpacking there's only ever a single .fns? (It's hard to imagine having two data frame functions returning the same number of rows, but I guess it might happen?)

@DavisVaughan
Member

Note: We also realized that we'd probably have to default it to .unpack = FALSE. I think it is impossible to do across() expansion with .unpack = TRUE, so we'd only ever attempt it when it is FALSE.

@DavisVaughan
Member

DavisVaughan commented Aug 22, 2022

I think having the additional argument is nice because we are going to need it regardless, to be able to opt in to the unpacking (or out, if we figure out a way to allow it to default to .unpack = TRUE).

I think putting the glue spec in .unpack is nice because it allows people to mix functions that need unpacking with those that don't:

across(
  x:y, 
  list(lag = ~multilag(.x, 1:5), double = ~.x * 2), 
  .names = "{col}.{fn}",
  .unpack = "{outer}.{inner}"
)

In this example, when x.lag and y.lag are unpacked further, they use the .unpack glue spec, but x.double and y.double ignore it.
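
A runnable sketch of this mixed case follows, again assuming dplyr 1.1.0 or later. `multilag()` is not a dplyr function; the version below is a hypothetical stand-in that returns one column per lag, named `n1`, `n2`, and so on.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical helper (not part of dplyr): one lagged column per value of n,
# returned as a data frame with the same number of rows as x.
multilag <- function(x, n) {
  out <- lapply(n, function(k) dplyr::lag(x, k))
  names(out) <- paste0("n", n)
  tibble::as_tibble(out)
}

df <- tibble(x = 1:4, y = 5:8)

out <- df |>
  mutate(
    across(
      x:y,
      list(lag = ~multilag(.x, 1:2), double = ~.x * 2),
      .names = "{col}.{fn}",
      .unpack = "{outer}.{inner}"
    )
  )

names(out)
```

Here the data frame results are expected to expand into columns such as `x.lag.n1`, while the vector results keep the names given by `.names`, such as `x.double`.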

> It's hard to imagine having two data frame functions returning the same number of rows, but I guess it might happen?

multilag() returns a data frame with the same number of rows as the input, and I could see other functions doing that too. For example, someone might combine multilag() with a multilead() in the same across() call.
