group_by and summarise don't work after another analysis. "Can't subset elements that don't exist." #5733

Closed
szimmer opened this issue Feb 3, 2021 · 3 comments · Fixed by #5765

@szimmer commented Feb 3, 2021

When running an analysis with group_by and summarise, the analysis works in isolation, but it fails with an error when an unrelated analysis precedes it in the same session. This does not occur with dplyr 1.0.2. The issue was originally posted on community.rstudio.com by someone else: https://community.rstudio.com/t/it-works-alone-but-fails-successively/95017

Working example:

library(tidyverse)
library(palmerpenguins)

# code B
penguins %>%
  group_by(species) %>%
  summarise(
    n = n(),
    across(starts_with("bill_"), mean, na.rm = TRUE),
    Area = mean(bill_length_mm * bill_depth_mm, na.rm = TRUE),
    across(ends_with("_g"), mean, na.rm = TRUE),
  )
#> # A tibble: 3 x 6
#>   species       n bill_length_mm bill_depth_mm  Area body_mass_g
#> * <fct>     <int>          <dbl>         <dbl> <dbl>       <dbl>
#> 1 Adelie      152           38.8          18.3  712.       3701.
#> 2 Chinstrap    68           48.8          18.4  900.       3733.
#> 3 Gentoo      124           47.5          15.0  712.       5076.

Created on 2021-02-03 by the reprex package (v1.0.0)

Session info
sessioninfo::session_info()
#> - Session info ---------------------------------------------------------------
#>  setting  value                       
#>  version  R version 4.0.3 (2020-10-10)
#>  os       Windows 10 x64              
#>  system   x86_64, mingw32             
#>  ui       RTerm                       
#>  language (EN)                        
#>  collate  English_United States.1252  
#>  ctype    English_United States.1252  
#>  tz       America/New_York            
#>  date     2021-02-03                  
#> 
#> - Packages -------------------------------------------------------------------
#>  package        * version date       lib source        
#>  assertthat       0.2.1   2019-03-21 [1] CRAN (R 4.0.2)
#>  backports        1.2.0   2020-11-02 [1] CRAN (R 4.0.3)
#>  broom            0.7.4   2021-01-29 [1] CRAN (R 4.0.3)
#>  cellranger       1.1.0   2016-07-27 [1] CRAN (R 4.0.2)
#>  cli              2.0.2   2020-02-28 [1] CRAN (R 4.0.2)
#>  colorspace       2.0-0   2020-11-11 [1] CRAN (R 4.0.3)
#>  crayon           1.3.4   2017-09-16 [1] CRAN (R 4.0.2)
#>  DBI              1.1.0   2019-12-15 [1] CRAN (R 4.0.2)
#>  dbplyr           2.0.0   2020-11-03 [1] CRAN (R 4.0.3)
#>  digest           0.6.27  2020-10-24 [1] CRAN (R 4.0.3)
#>  dplyr          * 1.0.4   2021-02-02 [1] CRAN (R 4.0.3)
#>  ellipsis         0.3.1   2020-05-15 [1] CRAN (R 4.0.2)
#>  evaluate         0.14    2019-05-28 [1] CRAN (R 4.0.2)
#>  fansi            0.4.1   2020-01-08 [1] CRAN (R 4.0.2)
#>  forcats        * 0.5.1   2021-01-27 [1] CRAN (R 4.0.3)
#>  fs               1.5.0   2020-07-31 [1] CRAN (R 4.0.2)
#>  generics         0.1.0   2020-10-31 [1] CRAN (R 4.0.3)
#>  ggplot2        * 3.3.3   2020-12-30 [1] CRAN (R 4.0.3)
#>  glue             1.4.2   2020-08-27 [1] CRAN (R 4.0.3)
#>  gtable           0.3.0   2019-03-25 [1] CRAN (R 4.0.2)
#>  haven            2.3.1   2020-06-01 [1] CRAN (R 4.0.2)
#>  highr            0.8     2019-03-20 [1] CRAN (R 4.0.2)
#>  hms              1.0.0   2021-01-13 [1] CRAN (R 4.0.3)
#>  htmltools        0.5.0   2020-06-16 [1] CRAN (R 4.0.2)
#>  httr             1.4.2   2020-07-20 [1] CRAN (R 4.0.2)
#>  jsonlite         1.7.2   2020-12-09 [1] CRAN (R 4.0.3)
#>  knitr            1.30    2020-09-22 [1] CRAN (R 4.0.3)
#>  lifecycle        0.2.0   2020-03-06 [1] CRAN (R 4.0.2)
#>  lubridate        1.7.9.2 2020-11-13 [1] CRAN (R 4.0.3)
#>  magrittr         2.0.1   2020-11-17 [1] CRAN (R 4.0.3)
#>  modelr           0.1.8   2020-05-19 [1] CRAN (R 4.0.2)
#>  munsell          0.5.0   2018-06-12 [1] CRAN (R 4.0.2)
#>  palmerpenguins * 0.1.0   2020-07-23 [1] CRAN (R 4.0.3)
#>  pillar           1.4.7   2020-11-20 [1] CRAN (R 4.0.3)
#>  pkgconfig        2.0.3   2019-09-22 [1] CRAN (R 4.0.2)
#>  purrr          * 0.3.4   2020-04-17 [1] CRAN (R 4.0.2)
#>  R.cache          0.14.0  2019-12-06 [1] CRAN (R 4.0.3)
#>  R.methodsS3      1.8.1   2020-08-26 [1] CRAN (R 4.0.3)
#>  R.oo             1.24.0  2020-08-26 [1] CRAN (R 4.0.3)
#>  R.utils          2.10.1  2020-08-26 [1] CRAN (R 4.0.3)
#>  R6               2.5.0   2020-10-28 [1] CRAN (R 4.0.3)
#>  Rcpp             1.0.5   2020-07-06 [1] CRAN (R 4.0.2)
#>  readr          * 1.4.0   2020-10-05 [1] CRAN (R 4.0.3)
#>  readxl           1.3.1   2019-03-13 [1] CRAN (R 4.0.2)
#>  rematch2         2.1.2   2020-05-01 [1] CRAN (R 4.0.2)
#>  reprex           1.0.0   2021-01-27 [1] CRAN (R 4.0.3)
#>  rlang            0.4.10  2020-12-30 [1] CRAN (R 4.0.3)
#>  rmarkdown        2.5     2020-10-21 [1] CRAN (R 4.0.3)
#>  rstudioapi       0.11    2020-02-07 [1] CRAN (R 4.0.2)
#>  rvest            0.3.6   2020-07-25 [1] CRAN (R 4.0.2)
#>  scales           1.1.1   2020-05-11 [1] CRAN (R 4.0.2)
#>  sessioninfo      1.1.1   2018-11-05 [1] CRAN (R 4.0.3)
#>  stringi          1.5.3   2020-09-09 [1] CRAN (R 4.0.3)
#>  stringr        * 1.4.0   2019-02-10 [1] CRAN (R 4.0.2)
#>  styler           1.3.2   2020-02-23 [1] CRAN (R 4.0.3)
#>  tibble         * 3.0.6   2021-01-29 [1] CRAN (R 4.0.3)
#>  tidyr          * 1.1.2   2020-08-27 [1] CRAN (R 4.0.2)
#>  tidyselect       1.1.0   2020-05-11 [1] CRAN (R 4.0.2)
#>  tidyverse      * 1.3.0   2019-11-21 [1] CRAN (R 4.0.3)
#>  utf8             1.1.4   2018-05-24 [1] CRAN (R 4.0.2)
#>  vctrs            0.3.5   2020-11-17 [1] CRAN (R 4.0.3)
#>  withr            2.3.0   2020-09-22 [1] CRAN (R 4.0.3)
#>  xfun             0.19    2020-10-30 [1] CRAN (R 4.0.3)
#>  xml2             1.3.2   2020-04-23 [1] CRAN (R 4.0.2)
#>  yaml             2.2.1   2020-02-01 [1] CRAN (R 4.0.2)
#> 
#> [1] ../Documents/R/win-library/4.0
#> [2] C:/Program Files/R/R-4.0.3/library

Broken example:

library(tidyverse)
library(palmerpenguins)

# code A
penguins %>%
  group_by(species, island) %>%
  summarise(
    prob = c(.25, .75),
    across(
      c(bill_length_mm, bill_depth_mm, flipper_length_mm),
      ~ quantile(., prob, na.rm = TRUE)
    )
  )
#> `summarise()` has grouped output by 'species', 'island'. You can override using the `.groups` argument.
#> # A tibble: 10 x 6
#> # Groups:   species, island [5]
#>    species   island     prob bill_length_mm bill_depth_mm flipper_length_mm
#>    <fct>     <fct>     <dbl>          <dbl>         <dbl>             <dbl>
#>  1 Adelie    Biscoe     0.25           37.7          17.6              185.
#>  2 Adelie    Biscoe     0.75           40.7          19.0              193 
#>  3 Adelie    Dream      0.25           36.8          17.5              185 
#>  4 Adelie    Dream      0.75           40.4          18.8              193 
#>  5 Adelie    Torgersen  0.25           36.7          17.4              187 
#>  6 Adelie    Torgersen  0.75           41.1          19.2              195 
#>  7 Chinstrap Dream      0.25           46.3          17.5              191 
#>  8 Chinstrap Dream      0.75           51.1          19.4              201 
#>  9 Gentoo    Biscoe     0.25           45.3          14.2              212 
#> 10 Gentoo    Biscoe     0.75           49.6          15.7              221

# code B
penguins %>%
  group_by(species) %>%
  summarise(
    n = n(),
    across(starts_with("bill_"), mean, na.rm = TRUE),
    Area = mean(bill_length_mm * bill_depth_mm, na.rm = TRUE),
    across(ends_with("_g"), mean, na.rm = TRUE),
  )
#> Error: Can't subset elements that don't exist.
#> x Location 5 doesn't exist.
#> i There are only 3 elements.

rlang::last_error()
#> <error/vctrs_error_subscript_oob>
#> Can't subset elements that don't exist.
#> x Location 5 doesn't exist.
#> i There are only 3 elements.
#> Backtrace:
#>   1. `%>%`(...)
#>  32. dplyr::cur_group()
#>  33. peek_mask("cur_group()")$current_key()
#>  34. vctrs::vec_slice(private$keys, self$get_current_group())
#>  36. vctrs:::stop_subscript_oob(...)
#>  37. vctrs:::stop_subscript(...)
#> Run `rlang::last_trace()` to see the full context.

Created on 2021-02-03 by the reprex package (v1.0.0)

@romainfrancois added the bug (an unexpected problem or unintended behavior) label Feb 5, 2021
@romainfrancois self-assigned this Feb 5, 2021

@rtaph commented Feb 5, 2021

I have run into this as well. I suspect it has to do with not being able to pass additional arguments to the function through across(). I have temporarily switched to a formula specification in across(), which works in 1.0.4.

library(dplyr)

# works in 1.0.3 and 1.0.4
storms %>%
  mutate(across(name, ~forcats::fct_reorder(., .x = hour)))
#> # A tibble: 10,010 x 13
#>    name   year month   day  hour   lat  long status category  wind pressure
#>    <fct> <dbl> <dbl> <int> <dbl> <dbl> <dbl> <chr>  <ord>    <int>    <int>
#>  1 Amy    1975     6    27     0  27.5 -79   tropi… -1          25     1013
#>  2 Amy    1975     6    27     6  28.5 -79   tropi… -1          25     1013
#>  3 Amy    1975     6    27    12  29.5 -79   tropi… -1          25     1013
#>  4 Amy    1975     6    27    18  30.5 -79   tropi… -1          25     1013
#>  5 Amy    1975     6    28     0  31.5 -78.8 tropi… -1          25     1012
#>  6 Amy    1975     6    28     6  32.4 -78.7 tropi… -1          25     1012
#>  7 Amy    1975     6    28    12  33.3 -78   tropi… -1          25     1011
#>  8 Amy    1975     6    28    18  34   -77   tropi… -1          30     1006
#>  9 Amy    1975     6    29     0  34.4 -75.8 tropi… 0           35     1004
#> 10 Amy    1975     6    29     6  34   -74.8 tropi… 0           40     1002
#> # … with 10,000 more rows, and 2 more variables: ts_diameter <dbl>,
#> #   hu_diameter <dbl>

# works in 1.0.3, fails in 1.0.4
storms %>%
  mutate(across(name, forcats::fct_reorder, .x = hour))
#> Error: Problem with `mutate()` input `..1`.
#> x object 'hour' not found
#> ℹ Input `..1` is `(function (.cols = everything(), .fns = NULL, ..., .names = NULL) ...`.

Created on 2021-02-05 by the reprex package (v1.0.0)
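
For illustration, the formula style described above could be applied to code B from the original report as sketched below. This is only a sketch under assumptions: the name result is hypothetical, and the thread does not confirm that this rewrite avoids the "Can't subset elements that don't exist" error in 1.0.4, which differs from the "object 'hour' not found" error shown here. It only shows how the extra na.rm argument moves from the dots of across() into the lambda itself.

```r
library(dplyr)
library(palmerpenguins)

# Code B from the original report, rewritten so that na.rm = TRUE is
# supplied inside a formula lambda rather than through the `...` of across().
# `result` is a hypothetical name used only for this illustration.
result <- penguins %>%
  group_by(species) %>%
  summarise(
    n = n(),
    across(starts_with("bill_"), ~ mean(.x, na.rm = TRUE)),
    Area = mean(bill_length_mm * bill_depth_mm, na.rm = TRUE),
    across(ends_with("_g"), ~ mean(.x, na.rm = TRUE))
  )

result
```

The columns and grouping are unchanged from code B, so the result should have one row per species with the same six columns as the working example.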

@romainfrancois added this to the 1.0.5 milestone Feb 11, 2021
romainfrancois added a commit that referenced this issue Feb 15, 2021
…ause it is internally referenced and weird stuff happens.

closes #5733
romainfrancois added a commit that referenced this issue Feb 15, 2021
* no longer need expr_protect()

* Only modify the content of .current_group, not the object itself, because it is internally referenced and weird stuff happens.

closes #5733

* news