group_by logs incorrect value #4

Closed
aniruhil opened this issue Feb 2, 2019 · 10 comments

Comments

@aniruhil

aniruhil commented Feb 2, 2019

Great idea! I was planning to use it to teach the tidyverse next week and noticed that group_by() logs an incorrect value: it reports 0 groups. I am using your code example too, so I'm not sure what is going on here. sessionInfo() output follows, in case that helps.

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tidylog)
#> 
#> Attaching package: 'tidylog'
#> The following objects are masked from 'package:dplyr':
#> 
#>     anti_join, distinct, filter, filter_all, filter_at, filter_if,
#>     full_join, group_by, group_by_all, group_by_at, group_by_if,
#>     inner_join, left_join, mutate, mutate_all, mutate_at,
#>     mutate_if, right_join, select, select_all, select_at,
#>     select_if, semi_join, transmute, transmute_all, transmute_at,
#>     transmute_if
#> The following object is masked from 'package:stats':
#> 
#>     filter
summary <- mtcars %>%
  select(mpg, cyl, hp) %>%
  filter(mpg > 15) %>%
  mutate(mpg_round = round(mpg)) %>%
  group_by(cyl, mpg_round) %>%
  tally() %>%
  filter(n >= 1)
#> select: dropped 8 variables (disp, drat, wt, qsec, vs, …) 
#> filter: removed 6 rows (19%) 
#> mutate: new variable 'mpg_round' with 15 unique values and 0% NA 
#> group_by: 0 groups [] 
#> filter: no rows removed


sessionInfo()
#> R version 3.5.2 (2018-12-20)
#> Platform: x86_64-apple-darwin15.6.0 (64-bit)
#> Running under: macOS Sierra 10.12.6
#> 
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> loaded via a namespace (and not attached):
#>  [1] compiler_3.5.2  magrittr_1.5    tools_3.5.2     htmltools_0.3.6
#>  [5] yaml_2.2.0      Rcpp_1.0.0      stringi_1.2.4   rmarkdown_1.11 
#>  [9] highr_0.7       knitr_1.21      stringr_1.3.1   xfun_0.4       
#> [13] digest_0.6.18   evaluate_0.12
@elbersb
Owner

elbersb commented Feb 2, 2019

That's indeed odd. Just a hunch: Could you upgrade your glue package and try again?

@aniruhil
Author

aniruhil commented Feb 2, 2019

Force-updated glue (though the system threw the expected "Skipping install of 'glue' from a github remote, the SHA1 (8188cea6) has not changed since last install." message). Same result:

group_by: 0 groups [] 

@elbersb
Owner

elbersb commented Feb 2, 2019

Does it work with other data frames?

Could you run this and post the output?

summary <- mtcars %>%
  select(mpg, cyl, hp) %>%
  filter(mpg > 15) %>%
  mutate(mpg_round = round(mpg)) %>%
  group_by(cyl, mpg_round)
str(summary)

@aniruhil aniruhil closed this as completed Feb 2, 2019
@aniruhil
Author

aniruhil commented Feb 2, 2019

summary <- mtcars %>%
   select(mpg, cyl, hp) %>%
   filter(mpg > 15) %>%
   mutate(mpg_round = round(mpg)) %>%
   group_by(cyl, mpg_round)

select: dropped 8 variables (disp, drat, wt, qsec, vs, …)
filter: removed 6 rows (19%)
mutate: new variable 'mpg_round' with 15 unique values and 0% NA
group_by: 0 groups []

str(summary)
Classes ‘grouped_df’, ‘tbl_df’, ‘tbl’ and 'data.frame':	26 obs. of  4 variables:
 $ mpg      : num  21 21 22.8 21.4 18.7 18.1 24.4 22.8 19.2 17.8 ...
 $ cyl      : num  6 6 4 6 8 6 4 4 6 6 ...
 $ hp       : num  110 110 93 110 175 105 62 95 123 123 ...
 $ mpg_round: num  21 21 23 21 19 18 24 23 19 18 ...
 - attr(*, "groups")=Classes ‘tbl_df’, ‘tbl’ and 'data.frame':	17 obs. of  3 variables:
  ..$ cyl      : num  4 4 4 4 4 4 4 4 4 6 ...
  ..$ mpg_round: num  21 22 23 24 26 27 30 32 34 18 ...
  ..$ .rows    :List of 17
  .. ..$ : int 26
  .. ..$ : int 17
  .. ..$ : int  3 8
  .. ..$ : int 7
  .. ..$ : int 22
  .. ..$ : int 21
  .. ..$ : int  15 23
  .. ..$ : int 14
  .. ..$ : int 16
  .. ..$ : int  6 10
  .. ..$ : int 9
  .. ..$ : int 25
  .. ..$ : int  1 2 4
  .. ..$ : int  13 19
  .. ..$ : int  11 18 24
  .. ..$ : int 12
  .. ..$ : int  5 20
  ..- attr(*, ".drop")= logi TRUE
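
The str() output above shows 17 rows in the "groups" attribute, so the log line should have reported 17 groups rather than 0. A minimal sketch for cross-checking that count with dplyr's public accessors n_groups() and group_vars(), run in a session without tidylog attached. The message built at the end only imitates the log lines in this thread; it is not tidylog's actual code.

```r
library(dplyr)

# Same pipeline as in the report, with dplyr:: prefixes so that no
# tidylog wrapper is picked up if tidylog happens to be attached
grouped <- mtcars %>%
  dplyr::select(mpg, cyl, hp) %>%
  dplyr::filter(mpg > 15) %>%
  dplyr::mutate(mpg_round = round(mpg)) %>%
  dplyr::group_by(cyl, mpg_round)

# Public accessors: no reliance on how grouped_df stores its groups
n <- dplyr::n_groups(grouped)       # 17, matching the str() output
vars <- dplyr::group_vars(grouped)  # "cyl" "mpg_round"

# Imitation of the log line format seen in this thread (an assumption)
msg <- sprintf("group_by: %d groups [%s]", n, paste(vars, collapse = ", "))
print(msg)
```

If the printed line disagrees with what tidylog logs for the same pipeline, the problem is in the logging rather than in the grouping itself.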

@aniruhil
Author

aniruhil commented Feb 2, 2019

closed it by accident; sorry!!

also tried it with the diamonds data:

diamonte <- diamonds %>%
   filter(carat > 1) %>%
   group_by(cut)

filter: removed 36438 rows (68%)
group_by: 0 groups []

@aniruhil aniruhil reopened this Feb 2, 2019
@elbersb
Owner

elbersb commented Feb 2, 2019

Are you running a development version of dplyr?

library(dplyr)
sessionInfo()

@aniruhil
Author

aniruhil commented Feb 2, 2019

dplyr version is 0.8.0, release candidate scheduled for Feb 1

[1] ggplot2_3.1.0.9000 reprex_0.2.1 tidylog_0.1.0 dplyr_0.8.0

If I roll back to dplyr 0.7.8, group_by() works, so something in 0.8.0 is causing the break.
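
One change in dplyr 0.8.0 that seems relevant here: a grouped data frame keeps its grouping structure in a single "groups" attribute (visible in the str() output earlier in this thread), whereas 0.7.x, as far as I recall, spread it over several attributes such as "vars" and "indices". Code that reads those attributes directly might come up empty under 0.8.0; the thread does not say whether that is what tidylog did. A sketch for checking which layout an installation uses, where uses_groups_attr is a made-up name:

```r
library(dplyr)

# Installed dplyr version (0.8.0 for the failing case in this thread)
dplyr_version <- utils::packageVersion("dplyr")
print(dplyr_version)

grouped <- dplyr::group_by(mtcars, cyl)

# Attribute names reveal how the grouping structure is stored
print(names(attributes(grouped)))

# Hypothetical flag: TRUE when the 0.8.0-style "groups" attribute is present
uses_groups_attr <- "groups" %in% names(attributes(grouped))
print(uses_groups_attr)

# Public accessors, which avoid depending on either storage layout
print(dplyr::group_vars(grouped))
print(dplyr::n_groups(grouped))
```

Going through group_vars() and n_groups() rather than attr() should give the same answer on either side of the 0.8.0 change, assuming both accessors are available in the installed version.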

@elbersb
Owner

elbersb commented Feb 2, 2019

08d11d2 should fix this. Can you reinstall tidylog and try it out?

@aniruhil
Author

aniruhil commented Feb 2, 2019

08d11d2 works like a charm. Thanks for the quick fix, and this immensely useful package that will benefit all tidyverse learners.

Have a great weekend!

@elbersb
Owner

elbersb commented Feb 2, 2019

Thanks for the report! Now we support dplyr 0.8 even before its release :)

@elbersb elbersb closed this as completed Feb 2, 2019