This repository has been archived by the owner on Jun 5, 2020. It is now read-only.

Tangling with dots #24

Open
hadley opened this issue Feb 22, 2018 · 0 comments

hadley commented Feb 22, 2018

I wrote this for Advanced R, but it doesn't feel quite right there. I think it might be better in the programming with dplyr vignette (whatever that ends up being).

Tangling with dots

In our grouped_mean() example above, we allow the user to select one grouping variable and one summary variable. What if we wanted to allow the user to select more than one? One option would be to use ..., and there are three possible ways we could use it:

  • Pass ... on to the mean() function. That would make it easy to set
    na.rm = TRUE. This is the easiest to implement.

  • Allow the user to select multiple grouping variables.

  • Allow the user to select multiple variables to summarise.

Implementing each one of these is relatively straightforward, but what if we want to be able to group by multiple variables, summarise multiple variables, and pass extra arguments on to mean()? Generally, I think it is better to avoid this sort of API (instead relying on multiple functions that each do one thing), but sometimes it is the lesser of two evils, so it is useful to have a technique in your back pocket to handle it.

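library(dplyr)
library(purrr)
library(rlang)

grouped_mean <- function(df, groups, vars, args) {
  # Build one call per variable, e.g. mean(hp, na.rm = TRUE)
  var_means <- map(vars, function(var) expr(mean(!!var, !!!args)))
  # Name each call after its variable so summarise() labels the output columns
  names(var_means) <- map_chr(vars, expr_name)

  df %>%
    dplyr::group_by(!!!groups) %>%
    dplyr::summarise(!!!var_means)
}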

grouped_mean(mtcars, exprs(vs, am), exprs(hp, drat, wt), list(na.rm = TRUE))
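
To see what the function hands to summarise(), it can help to build the list of calls on its own. The following is a minimal sketch that repeats the first two lines of the function body outside the function, assuming rlang and purrr are installed; the top-level objects vars and args simply mirror the function arguments.

```r
library(rlang)
library(purrr)

# The same inputs that the call above passes as `vars` and `args`
vars <- exprs(hp, drat, wt)
args <- list(na.rm = TRUE)

# One unevaluated call per variable; `!!` inserts the variable,
# `!!!` splices the extra arguments into the call to mean()
var_means <- map(vars, function(var) expr(mean(!!var, !!!args)))
names(var_means) <- map_chr(vars, expr_name)

# A named list of calls: mean(hp, na.rm = TRUE), mean(drat, na.rm = TRUE), ...
var_means
```

Because the list is named, splicing it into summarise() with !!! behaves like writing each `name = mean(...)` pair by hand.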

If you use this design a lot, you may also want to provide an alias to exprs() with a better name. For example, dplyr provides the vars() wrapper to support the scoped verbs (e.g. summarise_if(), mutate_at()). aes() in ggplot2 is similar, although it does a little more: it requires all arguments to be named, names the first two arguments (x and y) by default, and automatically renames arguments so you can use the base names for aesthetics (e.g. pch vs shape).

grouped_mean(mtcars, vars(vs, am), vars(hp, drat, wt), list(na.rm = TRUE))
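
If you would rather define your own alias, one possible sketch is a thin wrapper around rlang's enexprs(). The name cols() is hypothetical and chosen only for illustration; it is not part of dplyr or rlang.

```r
library(rlang)

# Hypothetical alias: captures the expressions supplied via `...`
# without evaluating them, much like calling exprs() directly
cols <- function(...) enexprs(...)

# Returns a list of the symbols vs and am
cols(vs, am)
```

With a grouped_mean() that takes lists of expressions, the call would then read grouped_mean(mtcars, cols(vs, am), cols(hp, drat, wt), list(na.rm = TRUE)).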

Exercises

  1. Implement the three variants of grouped_mean() described above:

    # ... passed on to mean
    grouped_mean <- function(df, group_by, summarise, ...) {}
    # ... selects variables to summarise
    grouped_mean <- function(df, group_by, ...) {}
    # ... selects variables to group by
    grouped_mean <- function(df, ..., summarise) {}
    
@lionel- lionel- transferred this issue from tidyverse/dplyr Feb 7, 2019