outer joins don't keep join columns from both sides #4589

@skinner

The docs say:

    "full_join() return all rows and all columns from both x and y"

But it doesn't return all columns. I expect the result of this join to have four columns, but it has three:

> ta <- tibble(a=c(NA, 2, 3, 3), b=c(1, 2, 3, 4))
> tx <- tibble(x=c(3, 4, 5, NA), y=c(3, 4, 5, 6))
> dplyr::full_join(ta, tx, by=c(a="x"), na_matches = "never")
# A tibble: 7 x 3
      a     b     y
  <dbl> <dbl> <dbl>
1    NA     1    NA
2     2     2    NA
3     3     3     3
4     3     4     3
5     4    NA     4
6     5    NA     5
7    NA    NA     6

I'm interested in questions like "how many unique values of a have a matching x?" (and vice versa). To answer that, I'd need both a and x to exist in the output.

Presumably, dplyr collapses the join columns into one because in some joins (e.g. an inner join) they'd be identical. But in outer joins they can differ.

Left, right, and full joins all appear to behave this way. This is with dplyr 0.8.1.
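
A possible workaround, sketched here rather than offered as a recommended approach: copy each key column before joining, so both original keys survive next to the collapsed join column. The names a_orig and x_orig are hypothetical helper columns introduced only for this illustration. Newer dplyr releases (1.0.0 and later, as far as I know) also accept keep = TRUE on joins, which may make the copies unnecessary.

```r
library(dplyr)

ta <- tibble(a = c(NA, 2, 3, 3), b = c(1, 2, 3, 4))
tx <- tibble(x = c(3, 4, 5, NA), y = c(3, 4, 5, 6))

# Copy each key before joining. a_orig and x_orig are hypothetical
# names, not part of dplyr; they preserve the per-side key values
# that the join would otherwise collapse into the single column a.
joined <- full_join(
  mutate(ta, a_orig = a),
  mutate(tx, x_orig = x),
  by = c(a = "x"),
  na_matches = "never"
)

# A row is a real match only if both original keys are present.
# This relies on na_matches = "never", so an NA key never matches.
matched <- filter(joined, !is.na(a_orig), !is.na(x_orig))

# "How many unique a have a matching x?"
n_matched_a <- n_distinct(matched$a_orig)
```

The same filter with x_orig in place of a_orig in the last line would answer the reverse question. The cost is two extra columns in the result, which can be dropped with select() once the counts are computed.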

Labels: feature (a feature request or enhancement), tables 🧮 (joins and set operations)
