Incorrect group-by columns when joining grouped data frames with overlapping columns #2330

davidkretch opened this issue Dec 17, 2016 · 0 comments


davidkretch commented on Dec 17, 2016

A join renames overlapping column names by adding suffixes, but it does not update the corresponding group column names in the vars attribute. This becomes a problem when a group column is not used as a join key: the result still records the old name as a group column, and mutate then fails on it.


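library(dplyr)

# Two data frames that share both column names
df1 <- data.frame(x = 1:10, y = 1:10)
df2 <- df1

# Group by both columns, then join on x only, so y overlaps without being a key
df1g <- df1 %>% group_by(x, y)

df3 <- inner_join(df1g, df2, by = "x")

# Fails on the affected version
df3 %>%
  mutate(a = 1)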


Error in mutate_impl(.data, dots) : unknown column 'y.x' 

I have a fix and will submit a pull request.
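
One way to spot this kind of inconsistency is to compare the recorded group columns with the columns actually present. The sketch below uses base R only. `missing_group_cols` is a hypothetical helper, not a dplyr function, and the two vectors are written out by hand to mirror the example above: the join output has columns x, y.x and y.y, while the group attribute still names x and y.

```r
# Hypothetical helper (not part of dplyr): group columns absent from the data.
missing_group_cols <- function(group_names, col_names) {
  setdiff(group_names, col_names)
}

# Values written out by hand to mirror the example above.
group_names <- c("x", "y")           # groups recorded before the join
col_names   <- c("x", "y.x", "y.y")  # columns after the suffixes are applied

stale <- missing_group_cols(group_names, col_names)
print(stale)  # "y": a group column that no longer exists under that name
```

With dplyr loaded, the same check could presumably be run on a real result as `missing_group_cols(as.character(groups(df3)), names(df3))`; an empty result would mean the group columns and the data agree.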

krlmlr closed this in #2334 on Jan 26, 2017
krlmlr added a commit that referenced this issue Jan 26, 2017

* Fix subset_join to update group column names in attribute vars when they are duplicate column names.

* Add tests for appropriate group columns after join.

* Add test for group indices on expanding join with grouped data frame.

* Fix build_index_cpp to report correct missing group column name. Currently when a group column name does not exist in the data frame, it reports a name from the names vector (all columns) instead of the vars vector (group columns).

* Add test for error message on non-existent group columns.
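
The renaming idea in the first item can be illustrated with a small base R sketch. This is not the subset_join code from the pull request: `rename_group_vars` and its arguments are hypothetical, and the sketch assumes the default `.x` suffix for columns coming from the left-hand table.

```r
# Hypothetical helper (not the actual dplyr fix): rename group columns that
# collide with a non-key column of the right-hand table.
rename_group_vars <- function(group_vars, x_names, y_names, by, suffix = ".x") {
  # Non-key columns present in both inputs receive a suffix in the join output
  dup <- setdiff(intersect(x_names, y_names), by)
  out <- group_vars
  hit <- group_vars %in% dup
  out[hit] <- paste0(group_vars[hit], suffix)
  out
}

# Same shape as the example in the issue: grouped by x and y, joined on x only
renamed <- rename_group_vars(
  group_vars = c("x", "y"),
  x_names    = c("x", "y"),
  y_names    = c("x", "y"),
  by         = "x"
)
print(renamed)  # "x" "y.x"
```

Group columns used as join keys keep their names, so only `y` changes here.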
lock bot locked this as resolved and limited conversation to collaborators on Jun 8, 2018