Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should complete() and expand() disallow accessing the group columns? #1299

Closed
DavisVaughan opened this issue Jan 10, 2022 · 0 comments · Fixed by #1300
Closed

Should complete() and expand() disallow accessing the group columns? #1299

DavisVaughan opened this issue Jan 10, 2022 · 0 comments · Fixed by #1300
Labels
ask :bowtie: breaking change ☠️ API change likely to affect existing code bug an unexpected problem or unintended behavior grids #️⃣ expanding, nesting, crossing, ... group 👨‍👨‍👦‍👦

Comments

@DavisVaughan
Copy link
Member

DavisVaughan commented Jan 10, 2022

With #1289 we added a grouped-df method for complete(). Originally I made the decision to pass cur_data_all() through to each ungrouped call to complete() because it seemed like people were (nonsensically) completing on grouping variables. This actually causes issues!

tidyr/R/complete.R

Lines 99 to 108 in c73cf21

dplyr::summarise(
data,
complete(
data = dplyr::cur_data_all(),
...,
fill = fill,
explicit = explicit
),
.groups = "keep"
)

I am thinking we should just pass cur_data() through instead, which would avoid the issue seen below.

This would result in a breaking change that people would no longer be able to complete on a grouping variable (You'd get an error saying that the grouping variable isn't found). But I think that completing on a grouping variable is a bit useless, as the whole point is to complete within that variable, so you shouldn't have access to it.

library(tidyr)
library(dplyr, warn.conflicts = FALSE)

df <- tibble(
  item_id = c(1:2, 2, 3),
  value1 = c(1, NA, 3, 4),
  item_name = c("a", "a", "b", "b"),
  value2 = 4:7,
  group = c(1:2, 1, 2)
)
gdf <- group_by(df, group)
gdf
#> # A tibble: 4 × 5
#> # Groups:   group [2]
#>   item_id value1 item_name value2 group
#>     <dbl>  <dbl> <chr>      <int> <dbl>
#> 1       1      1 a              4     1
#> 2       2     NA a              5     2
#> 3       2      3 b              6     1
#> 4       3      4 b              7     2

# expand "within" each group. so the group value should be repeated.
expand(gdf, item_id, item_name)
#> # A tibble: 8 × 3
#> # Groups:   group [2]
#>   group item_id item_name
#>   <dbl>   <dbl> <chr>    
#> 1     1       1 a        
#> 2     1       1 b        
#> 3     1       2 a        
#> 4     1       2 b        
#> 5     2       2 a        
#> 6     2       2 b        
#> 7     2       3 a        
#> 8     2       3 b

# (the NAs in the 1 `group` should be 1, and should be 2 in the 2 `group`)
# we don't get that here because we passed the "full" data including grouping
# cols through to `complete.data.frame()`, which did a full-join resulting in
# missing values for those columns. So then `summarize()` thought it didn't
# need to add the grouping columns back on.
complete(gdf, item_id, item_name)
#> # A tibble: 8 × 5
#> # Groups:   group [3]
#>   group item_id item_name value1 value2
#>   <dbl>   <dbl> <chr>      <dbl>  <int>
#> 1     1       1 a              1      4
#> 2    NA       1 b             NA     NA
#> 3    NA       2 a             NA     NA
#> 4     1       2 b              3      6
#> 5     2       2 a             NA      5
#> 6    NA       2 b             NA     NA
#> 7    NA       3 a             NA     NA
#> 8     2       3 b              4      7

Created on 2022-01-10 by the reprex package (v2.0.1)

@DavisVaughan DavisVaughan added bug an unexpected problem or unintended behavior breaking change ☠️ API change likely to affect existing code group 👨‍👨‍👦‍👦 grids #️⃣ expanding, nesting, crossing, ... ask :bowtie: labels Jan 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ask :bowtie: breaking change ☠️ API change likely to affect existing code bug an unexpected problem or unintended behavior grids #️⃣ expanding, nesting, crossing, ... group 👨‍👨‍👦‍👦
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant