Closed
Labels: bug (an unexpected problem or unintended behavior)
Description
`c_across()` combined with `all_of()` fails to select a set of columns when it is given one particular character vector of column names. This is a really weird error; please see my demo below. Thank you!
library(tidyverse)
### The behavior of tidy-select changed with the 1.0.6 release of dplyr.
### If all_of() was modified in that release, it is likely the cause of the problem below.
# Generate a random dataset with column names A, B, C, D, E
FOOBAR.df <- data.frame(matrix(rnorm(50), nrow = 10)) %>% `colnames<-`(LETTERS[1:5])
# To get row sums, c_across() with all_of() and a character vector always did the trick with dplyr <= 1.0.5
FOOBAR.df %>% rowwise() %>% transmute(sum(c_across(all_of(c("A", "B", "C", "D", "E")))))
#> # A tibble: 10 x 1
#> # Rowwise:
#> `sum(c_across(all_of(c("A", "B", "C", "D", "E"))))`
#> <dbl>
#> 1 -0.967
#> 2 -0.168
#> 3 0.0993
#> 4 -0.959
#> 5 -2.26
#> 6 -4.41
#> 7 1.48
#> 8 -0.261
#> 9 2.16
#> 10 -3.38
# But I get an error that is specific to the vector of column names used in my dataset, so I am typing in exactly the names I used when the error occurred
colnames(FOOBAR.df) <- c("Prevotella", "Streptococcus", "Gemella", "Rothia", "Haemophilus")
# Trying to get the row sums over all the variables named in the character vector results in an error.
FOOBAR.df %>% rowwise() %>% transmute(sum(c_across(all_of(c("Prevotella", "Streptococcus", "Gemella", "Rothia", "Haemophilus"))))) #%>% tryCatch(error = function(e) "no such index at level 1")
#> Error: Problem with `mutate()` input `..1`.
#> ℹ `..1 = sum(...)`.
#> x no such index at level 1
#>
#> ℹ The error occurred in row 1.
# Why do I say this error seems specific to this vector of column names? I tried the following tests:
# a) Change one spelling: writing "Prevotella", the first column name above, as "Prevotela" with only one l works fine
colnames(FOOBAR.df) <- c("Prevotela", "Streptococcus", "Gemella", "Rothia", "Haemophilus")
FOOBAR.df %>% rowwise() %>% transmute(sum(c_across(all_of(c("Prevotela", "Streptococcus", "Gemella", "Rothia", "Haemophilus")))))
#> # A tibble: 10 x 1
#> # Rowwise:
#> `sum(...)`
#> <dbl>
#> 1 -0.967
#> 2 -0.168
#> 3 0.0993
#> 4 -0.959
#> 5 -2.26
#> 6 -4.41
#> 7 1.48
#> 8 -0.261
#> 9 2.16
#> 10 -3.38
# b) Removing one or more elements from the vector also works fine
FOOBAR.df %>% rowwise() %>% transmute(sum(c_across(all_of(c("Streptococcus", "Gemella", "Rothia", "Haemophilus")))))
#> # A tibble: 10 x 1
#> # Rowwise:
#> `sum(c_across(all_of(c("Streptococcus", "Gemella", "Rothia", "Haemophilus"))…
#> <dbl>
#> 1 -1.60
#> 2 -1.10
#> 3 -0.0986
#> 4 -0.576
#> 5 -1.25
#> 6 -4.27
#> 7 1.96
#> 8 0.491
#> 9 1.44
#> 10 -2.45
# c) Storing the names in a variable and passing that variable to all_of() also works fine
test_colname_vec <- c("Prevotella", "Streptococcus", "Gemella", "Rothia", "Haemophilus")
colnames(FOOBAR.df) <- test_colname_vec
FOOBAR.df %>% rowwise() %>% transmute(sum(c_across(all_of(test_colname_vec))))
#> # A tibble: 10 x 1
#> # Rowwise:
#> `sum(c_across(all_of(test_colname_vec)))`
#> <dbl>
#> 1 -0.967
#> 2 -0.168
#> 3 0.0993
#> 4 -0.959
#> 5 -2.26
#> 6 -4.41
#> 7 1.48
#> 8 -0.261
#> 9 2.16
#> 10 -3.38
# In summary, the error seems related to how all_of() or c_across() handles an inline character vector as its argument,
# and it also appears sensitive to the content and length of the strings. This is really confusing, and the same code had zero problems in a dplyr 1.0.5 environment
# FYI, the version of R I was using is listed below:
version[c("system", "version.string")]
#> _
#> system x86_64, darwin15.6.0
#> version.string R version 3.6.2 (2019-12-12)

Created on 2021-05-12 by the reprex package (v2.0.0)
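
As a cross-check that does not go through `c_across()` or `all_of()` at all, the expected row sums can be computed with base R. The sketch below is not part of the original reprex: the data frame `df`, the vector `genera`, and the seed are hypothetical stand-ins, and it only illustrates what the failing call is expected to return, not why it fails.

```r
# Hypothetical stand-in for the data in the report: 10 rows, 5 numeric columns.
set.seed(1)
genera <- c("Prevotella", "Streptococcus", "Gemella", "Rothia", "Haemophilus")
df <- as.data.frame(matrix(rnorm(50), nrow = 10))
colnames(df) <- genera

# Base R row sums over the named columns; no tidy-select is involved.
row_totals <- rowSums(df[, genera])

# The same totals computed one row at a time, mirroring what
# rowwise() followed by sum() is meant to produce.
row_totals_loop <- vapply(
  seq_len(nrow(df)),
  function(i) sum(unlist(df[i, genera])),
  numeric(1)
)

# Both approaches should agree up to floating-point tolerance.
stopifnot(isTRUE(all.equal(unname(row_totals), row_totals_loop)))
```

If the base R totals match the output of the working variants (a) to (c) on the same data, that would suggest the column selection step, rather than the summation, is where the failing call goes wrong.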