Implement right join and outer join #96

Closed
hadley opened this Issue Oct 25, 2013 · 11 comments

5 participants
Owner

hadley commented Oct 25, 2013

Since they probably are common enough to be worthwhile

Collaborator

romainfrancois commented Oct 25, 2013

Quick questions:

  • Is right_join( x, y ) the same as left_join( y, x )?
  • What is outer_join?
Owner

hadley commented Oct 25, 2013

  1. Yes, in terms of the rows, but the columns will be in a different order.
  2. It's basically union(left_join(x, y), right_join(x, y)), i.e. it preserves all rows in both data frames.
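The two answers above can be sketched in base R. This is only an illustration under stated assumptions: merge() stands in for the proposed right_join() and outer_join(), unique(rbind(...)) stands in for union(), and the data frames x and y are made up for the example.

```r
# Hypothetical example data; the key column is "id".
x <- data.frame(id = 1:3, a = 11:13)
y <- data.frame(id = 3:5, b = 23:25)

# Base R stand-ins for the joins discussed in this issue (assumption:
# merge() is used here only to illustrate the semantics).
left  <- merge(x, y, by = "id", all.x = TRUE)  # keeps every row of x
right <- merge(x, y, by = "id", all.y = TRUE)  # keeps every row of y
outer <- merge(x, y, by = "id", all = TRUE)    # keeps every row of both

# 1. A right join of (x, y) has the same rows as a left join of (y, x);
#    only the column order differs.
flipped <- merge(y, x, by = "id", all.x = TRUE)
names(right)    # "id" "a" "b"
names(flipped)  # "id" "b" "a"
same_rows <- isTRUE(all.equal(right, flipped[names(right)]))

# 2. The outer join is the union of the left and right joins.
both <- unique(rbind(left, right))
both <- both[order(both$id), ]
rownames(both) <- NULL
same_union <- isTRUE(all.equal(both, outer))
```

Both same_rows and same_union should come out TRUE for this data.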

@hadley hadley self-assigned this Aug 1, 2014

@hadley hadley modified the milestones: 0.4, 0.3 Sep 12, 2014

@hadley hadley modified the milestones: 0.3, 0.4 Oct 1, 2014

@hadley hadley assigned romainfrancois and unassigned hadley and romainfrancois Oct 1, 2014

@hadley hadley modified the milestones: 0.3.1, 0.3 Oct 1, 2014

Owner

hadley commented Oct 1, 2014

Would be nice to have right_join and outer_join for 0.3.1. @romainfrancois if you implement the internals for data frames (I think it just needs outer_join()), I'll add methods for everything else.

Collaborator

romainfrancois commented Oct 1, 2014

There is internal code for right_join now. It just needs testing.
I'll get to outer_join.

romainfrancois added a commit that referenced this issue Oct 4, 2014

Collaborator

romainfrancois commented Oct 4, 2014

I've added the generic and the methods for data.frame and tbl_df for right_join and some tests.

Collaborator

romainfrancois commented Oct 4, 2014

Getting this for outer_join:

> a <- data.frame(x=1:3,y=2:4)
> b <- data.frame(x=3:5,z=3:5)
> res <- outer_join(a,b, "x")
> a
  x y
1 1 2
2 2 3
3 3 4
> b
  x z
1 3 3
2 4 4
3 5 5
> res
  x  y  z
1 1  2 NA
2 2  3 NA
3 3  4  3
4 4 NA  4
5 5 NA  5

Let me know if this is what you expect. I'm transferring this one to you @hadley as it's now about making the other various methods and documentation.

Owner

jennybc commented Nov 14, 2014

I get strange results with right_join(). I can illustrate with code scaled down from the current test of right_join(), which only has an expectation for the column order.

(a <- data.frame(x = 1:5, y = 2:6))
(b <- data.frame(x = 3:7, z = 4:8))
right_join(a, b)
left_join(b, a)

Actual run:

> (a <- data.frame(x = 1:5, y = 2:6))
  x y
1 1 2
2 2 3
3 3 4
4 4 5
5 5 6
> (b <- data.frame(x = 3:7, z = 4:8))
  x z
1 3 4
2 4 5
3 5 6
4 6 7
5 7 8
> right_join(a, b)
Joining by: "x"
  x  y z
1 3  4 4
2 4  5 5
3 5  6 6
4 3 NA 7
5 3 NA 8
> left_join(b, a)
Joining by: "x"
  x z  y
1 3 4  4
2 4 5  5
3 5 6  6
4 6 7 NA
5 7 8 NA

It seems like the last two elements of x in the joined result should be 6, 7, as they are in the left_join() result, instead of 3, 3 as they are in the right_join() result.
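A stronger expectation than column order alone might look like the sketch below. The helper check_right_join() and the base R stand-in base_right_join() are hypothetical names introduced for illustration; they are not part of dplyr or its test suite. The stand-in uses merge() so the block runs without dplyr.

```r
# Data from the example above.
a <- data.frame(x = 1:5, y = 2:6)
b <- data.frame(x = 3:7, z = 4:8)

# Hypothetical helper: checks that a right-join function keeps the key
# values of its second argument and fills unmatched columns with NA.
check_right_join <- function(right_join_fun) {
  res <- right_join_fun(a, b)
  res <- res[order(res$x), ]
  stopifnot(
    identical(names(res), c("x", "y", "z")),  # column order
    identical(res$x, b$x),                    # keys come from b: 3 to 7
    identical(res$z, b$z),                    # every row of b is kept
    identical(res$y, c(4L, 5L, 6L, NA, NA))   # unmatched rows get NA
  )
  invisible(TRUE)
}

# Base R stand-in for right_join(); an assumption, not dplyr's code.
base_right_join <- function(x, y) merge(x, y, by = "x", all.y = TRUE)

ok <- check_right_join(base_right_join)
```

A result with 3, 3 in the last two key values, as in the run above, would fail the second check. With dplyr loaded, right_join itself could presumably be passed to the helper in place of the stand-in.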

@hadley hadley modified the milestones: 0.3.1, 0.4 Nov 18, 2014

Owner

jennybc commented Nov 30, 2014

The fix introduced by c9bea13 clears up the bug explored in my example above, so that's RESOLVED.

@hadley hadley modified the milestones: 0.4, 0.5 Jan 2, 2015

@hadley hadley closed this Jan 2, 2015

alexfun commented Jan 29, 2015

Is it worth implementing "left" and "right" versions of semi_join and anti_join, since the existing ones are essentially "left" versions?

Any chance we'll have methods for tbl_sql for these joins? I am not sure if that's included in "add methods for everything else".

Owner

hadley commented May 27, 2015

@piccolbo can you please open a new issue so I don't lose track of this?
