any_of does not respect variable name order #186

Closed
jogrue opened this issue Apr 8, 2020 · 2 comments
Labels
bug an unexpected problem or unintended behavior
jogrue commented Apr 8, 2020

I am running the following script on R 3.6.3 on Windows 10 (64-bit), with tidyverse 1.3.0, dplyr 0.8.5, and tidyselect 1.0.0.
library(tidyverse)

# Sys information
Sys.info()
R.version
packageVersion("dplyr")
packageVersion("tidyselect")
packageVersion("tidyverse")

my_tibble <- tibble(first_var = 1:3,
                    third_var = 1:3,
                    second_var = 1:3)
sorted_vars <- c("first_var", "second_var", "third_var", "fourth_var")

# Variable names in original order
my_tibble %>% names
# all_of gives an error because of no "fourth_var"
select(my_tibble, all_of(sorted_vars)) %>%
  names
# any_of works but does not respect order in "sorted_vars"
select(my_tibble, any_of(sorted_vars)) %>%
  names
# this works as expected
select(
  my_tibble,
  all_of(
    sorted_vars[sorted_vars %in% names(my_tibble)]
  )
) %>% names

At first, the variables are in the original tibble order.

my_tibble %>% names
[1] "first_var"  "third_var"  "second_var"

all_of gives an error, of course, because my list of variable names includes a non-existing "fourth_var".

select(my_tibble, all_of(sorted_vars)) %>%
  names
Error: Can't subset columns that don't exist.
x The column `fourth_var` doesn't exist.
Run `rlang::last_error()` to see where the error occurred.

all_of based on a variable name list that only includes names for existing variables works as expected. Columns are sorted in the order of the provided list and not in the original order.

select(
  my_tibble,
  all_of(
    sorted_vars[sorted_vars %in% names(my_tibble)]
  )
) %>% names
[1] "first_var"  "second_var" "third_var"

any_of works, but I would have expected it to select variables in the order of the list provided. Instead it returns the columns in the same order as in the original tibble.

select(my_tibble, any_of(sorted_vars)) %>%
  names
[1] "first_var"  "third_var"  "second_var"

From the help file ("The order of selected columns is determined by the inputs."), I would also have expected any_of to respect the order of the variable names provided. Am I missing something here?
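
For comparison, the ordering expected here can be sketched in base R, without tidyselect. This is only an illustration of the intended result, not how any_of is implemented; `my_df`, `present`, and `reordered` are names made up for this example.

```r
# Base R sketch of the expected ordering; all object names are illustrative.
my_df <- data.frame(first_var = 1:3, third_var = 1:3, second_var = 1:3)
sorted_vars <- c("first_var", "second_var", "third_var", "fourth_var")

# intersect() keeps the order of its first argument and drops names
# that do not occur in the data frame ("fourth_var" here).
present <- intersect(sorted_vars, names(my_df))

# Subset by name, so the columns come back in the order of `present`.
reordered <- my_df[, present, drop = FALSE]
names(reordered)
#> [1] "first_var"  "second_var" "third_var"
```

This is equivalent to the `sorted_vars[sorted_vars %in% names(my_tibble)]` workaround above, as long as `sorted_vars` contains no duplicates.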

hadley commented Apr 23, 2020

library(tidyselect)

df <- data.frame(x = 1, y = 2)
unname(eval_select(any_of(c("y", "x")), df))
#> [1] 1 2
unname(eval_select(all_of(c("y", "x")), df))
#> [1] 2 1
unname(eval_select(one_of(c("y", "x")), df))
#> [1] 2 1

Created on 2020-04-23 by the reprex package (v0.3.0)
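
The two position orders in the reprex above can be reproduced in base R, which may help show the difference. This is a sketch only and makes no claim about tidyselect internals; `wanted`, `data_order`, `input_order`, and `input_order_lenient` are illustrative names.

```r
df <- data.frame(x = 1, y = 2)
wanted <- c("y", "x")

# Positions in data order: scan the columns and keep those that were requested.
# This matches what any_of() returned in the reprex.
data_order <- which(names(df) %in% wanted)
data_order
#> [1] 1 2

# Positions in input order: look up each requested name in turn.
# This matches what all_of() and one_of() returned.
input_order <- match(wanted, names(df))
input_order
#> [1] 2 1

# A lenient, order-preserving lookup: unknown names give NA and are dropped.
pos <- match(c("z", "y", "x"), names(df))
input_order_lenient <- pos[!is.na(pos)]
input_order_lenient
#> [1] 2 1
```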

hadley added the bug label (an unexpected problem or unintended behavior) on Apr 23, 2020
lionel- added this to the 1.1.0 milestone on May 8, 2020
lionel- closed this as completed in 64b5afb on May 8, 2020