select not working for grouped_df #170
I think the select verb is not working as expected for grouped_df.
require(dplyr)
group_by(mtcars, vs) %.% select(mpg)
Error: index out of bounds
str(group_by(mtcars, vs) %.% subset(select = mpg))
Classes ‘grouped_df’, ‘tbl_df’, ‘tbl’ and 'data.frame': 32 obs. of 1 variable:
 $ mpg: num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
Here is the implementation of
So the error comes from the
Maybe we should drop the grouping, or force selection of the grouping variables. @hadley?
Referenced this issue on Jan 23, 2014
Added a commit on Jan 27, 2014
It seems to me Romain's proposal to implicitly select grouping variables was superior from a DRY or "war on boilerplate" point of view. Now we have to write
Mandated repetition of
An alternative would be to remove the constraint and allow grouping variables to be dropped. This makes more sense than it might seem: imagine you are doing resampling and are only interested in the variation of an estimator, not in exactly which group each estimate comes from. But there are implications: I don't think it's natural to preserve the grouping when the grouping variables are gone, so dropping them would become an implicit ungroup.
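To make the two options concrete, here is a rough base R sketch of both behaviours. It is not dplyr's implementation: the helpers group_by_sketch, select_keep_groups and select_drop_groups are hypothetical names, and a "grouped" data frame is modelled simply as a data.frame carrying a "group_vars" attribute, which is an assumption made only for illustration.

```r
# Hypothetical helpers, NOT dplyr's implementation. A "grouped" data frame is
# modelled as a plain data.frame with a "group_vars" attribute (an assumption).
group_by_sketch <- function(df, vars) {
  stopifnot(all(vars %in% names(df)))
  attr(df, "group_vars") <- vars
  df
}

# Option 1: implicitly select the grouping variables; the grouping is preserved.
select_keep_groups <- function(df, cols) {
  groups <- attr(df, "group_vars")
  out <- df[, union(groups, cols), drop = FALSE]
  attr(out, "group_vars") <- groups
  out
}

# Option 2: allow grouping variables to be dropped. If any grouping variable is
# missing from the result, the grouping is removed (an implicit ungroup).
select_drop_groups <- function(df, cols) {
  groups <- attr(df, "group_vars")
  out <- df[, cols, drop = FALSE]
  attr(out, "group_vars") <- if (all(groups %in% cols)) groups else NULL
  out
}

g <- group_by_sketch(mtcars, "vs")

kept <- select_keep_groups(g, "mpg")     # columns vs and mpg, still grouped by vs
dropped <- select_drop_groups(g, "mpg")  # column mpg only, no grouping left
```

With option 1 the caller never has to repeat the grouping variable in the selection; with option 2 the result is an ordinary ungrouped table whenever a grouping variable is left out.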