The docs say that full_join() will "return all rows and all columns from both x and y", but it doesn't return all columns. I expected the result of this join to have four columns, but it has three:
> ta <- tibble(a=c(NA, 2, 3, 3), b=c(1, 2, 3, 4))
> tx <- tibble(x=c(3, 4, 5, NA), y=c(3, 4, 5, 6))
> dplyr::full_join(ta, tx, by=c(a="x"), na_matches = "never")
# A tibble: 7 x 3
a b y
<dbl> <dbl> <dbl>
1 NA 1 NA
2 2 2 NA
3 3 3 3
4 3 4 3
5 4 NA 4
6 5 NA 5
7 NA NA 6
I'm interested in questions like "how many unique values of a have a matching x?" (and vice versa). To answer that, I need both a and x to exist in the output.
Presumably, dplyr collapses the join columns down to one because in some joins (e.g. inner join) they would always be the same. But in outer joins they can differ: a row that exists on only one side has a value in one key and nothing in the other.
Left, right, and full joins all appear to have this behavior. This is with dplyr_0.8.1.
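For anyone hitting the same thing, here is a possible workaround, sketched on the example above. It assumes it is acceptable to copy each key column under a new name before joining; the names a_orig and x_orig are made up for illustration and are not part of dplyr. Because the copies are ordinary non-key columns, they survive the join, and a row counts as matched when both copies are non-NA.

```r
library(dplyr)

ta <- tibble(a = c(NA, 2, 3, 3), b = c(1, 2, 3, 4))
tx <- tibble(x = c(3, 4, 5, NA), y = c(3, 4, 5, 6))

# Copy each key under a hypothetical name so it is carried through the join
# as a regular column instead of being merged into the single key column.
ta_keyed <- mutate(ta, a_orig = a)
tx_keyed <- mutate(tx, x_orig = x)

joined <- full_join(ta_keyed, tx_keyed, by = c(a = "x"), na_matches = "never")

# Rows where both copies are present are the rows that actually matched.
matched <- filter(joined, !is.na(a_orig), !is.na(x_orig))

n_a_with_x <- n_distinct(matched$a_orig)  # unique a that have a matching x
n_x_with_a <- n_distinct(matched$x_orig)  # unique x that have a matching a
```

This only sidesteps the behavior; it would still be nicer if the join itself could return both key columns.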