Implement right join and outer join #96

Closed
hadley opened this issue Oct 25, 2013 · 11 comments
Comments

Member

@hadley hadley commented Oct 25, 2013

These are probably common enough to be worth implementing.

Member

@romainfrancois romainfrancois commented Oct 25, 2013

Quick questions:

  • Is right_join(x, y) the same as left_join(y, x)?
  • What is outer_join?

Member Author

@hadley hadley commented Oct 25, 2013

  1. Yes, in terms of the rows, but the columns will be in a different order.
  2. It's basically union(left_join(x, y), right_join(x, y)) - i.e. preserve all rows in both data frames.
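
The two answers above can be illustrated with a minimal sketch. It uses base R's `merge()` as a stand-in for the join verbs discussed in this issue (an assumption made for illustration; it is not dplyr's implementation), with `all.y = TRUE` playing the role of a right join and `all = TRUE` the role of an outer join.

```r
# Base R stand-ins for the joins discussed above (not dplyr's implementation).
a <- data.frame(x = 1:3, y = 2:4)
b <- data.frame(x = 3:5, z = 3:5)

# Right join of (a, b): keep every row of b.
right_ab <- merge(a, b, by = "x", all.y = TRUE)

# Left join of (b, a): also keeps every row of b, but b's columns come first.
left_ba <- merge(b, a, by = "x", all.x = TRUE)

# Outer join: keep every row of both data frames.
outer_ab <- merge(a, b, by = "x", all = TRUE)

print(names(right_ab))  # key column, then a's columns, then b's
print(names(left_ba))   # key column, then b's columns, then a's
print(outer_ab)
```

After reordering the columns, `right_ab` and `left_ba` hold the same rows, and `outer_ab` contains every key that appears in either input.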

romainfrancois added a commit that referenced this issue Oct 25, 2013
@hadley hadley self-assigned this Aug 1, 2014
@hadley hadley added this to the 0.4 milestone Sep 12, 2014
@hadley hadley removed this from the 0.3 milestone Sep 12, 2014
@hadley hadley added this to the 0.3 milestone Oct 1, 2014
@hadley hadley removed this from the 0.4 milestone Oct 1, 2014
@hadley hadley assigned romainfrancois and unassigned hadley and romainfrancois Oct 1, 2014
@hadley hadley added this to the 0.3.1 milestone Oct 1, 2014
@hadley hadley removed this from the 0.3 milestone Oct 1, 2014
Member Author

@hadley hadley commented Oct 1, 2014

Would be nice to have right_join and outer_join for 0.3.1. @romainfrancois if you implement the internals for data frames (I think it just needs outer_join()), I'll add methods for everything else.

Member

@romainfrancois romainfrancois commented Oct 1, 2014

There is internal code for right_join now. It just needs testing.
I'll get to outer_join.

romainfrancois added a commit that referenced this issue Oct 4, 2014
Member

@romainfrancois romainfrancois commented Oct 4, 2014

I've added the generic and the methods for data.frame and tbl_df for right_join and some tests.

romainfrancois added a commit that referenced this issue Oct 4, 2014
Member

@romainfrancois romainfrancois commented Oct 4, 2014

I'm getting this for outer_join:

> a <- data.frame(x = 1:3, y = 2:4)
> b <- data.frame(x = 3:5, z = 3:5)
> res <- outer_join(a, b, "x")
> a
  x y
1 1 2
2 2 3
3 3 4
> b
  x z
1 3 3
2 4 4
3 5 5
>
> res
  x  y  z
1 1  2 NA
2 2  3 NA
3 3  4  3
4 4 NA  4
5 5 NA  5

Let me know if this is what you expect. I'm transferring this one to you @hadley as it's now about making the other various methods and documentation.

Member

@jennybc jennybc commented Nov 14, 2014

I get strange results with right_join(). I can illustrate with code scaled down from the current test of right_join(), which only has an expectation for the column order.

(a <- data.frame(x = 1:5, y = 2:6))
(b <- data.frame(x = 3:7, z = 4:8))
right_join(a, b)
left_join(b, a)

Actual output:

> (a <- data.frame(x = 1:5, y = 2:6))
  x y
1 1 2
2 2 3
3 3 4
4 4 5
5 5 6
> (b <- data.frame(x = 3:7, z = 4:8))
  x z
1 3 4
2 4 5
3 5 6
4 6 7
5 7 8
> right_join(a, b)
Joining by: "x"
  x  y z
1 3  4 4
2 4  5 5
3 5  6 6
4 3 NA 7
5 3 NA 8
> left_join(b, a)
Joining by: "x"
  x z  y
1 3 4  4
2 4 5  5
3 5 6  6
4 6 7 NA
5 7 8 NA

It seems like the last two elements of x in the joined result should be 6, 7, as they are in the left_join() result, instead of 3, 3 as they are in the right_join() result.
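
A check on the values, not just the column order, would catch this. The following is a minimal sketch of such a check: `same_join_result()` is a hypothetical helper written for this illustration, the expected table is written out by hand from `a` and `b`, and base R's `merge()` serves as the reference. The dplyr call is guarded so the sketch still runs without the package installed.

```r
a <- data.frame(x = 1:5, y = 2:6)
b <- data.frame(x = 3:7, z = 4:8)

# Expected right join of (a, b) on "x", written out by hand.
expected <- data.frame(x = 3:7, y = c(4L, 5L, 6L, NA, NA), z = 4:8)

# Hypothetical helper: compare two join results, ignoring row and column order.
same_join_result <- function(res, ref, key = "x") {
  res <- res[order(res[[key]]), names(ref), drop = FALSE]
  ref <- ref[order(ref[[key]]), , drop = FALSE]
  nrow(res) == nrow(ref) &&
    all(mapply(function(u, v) identical(as.numeric(u), as.numeric(v)), res, ref))
}

# Base R reference: all.y = TRUE keeps every row of b.
reference <- merge(a, b, by = "x", all.y = TRUE)
print(same_join_result(reference, expected))

# The output reported above, with 3, 3 in place of 6, 7 in the key column.
reported <- data.frame(x = c(3L, 4L, 5L, 3L, 3L),
                       y = c(4L, 5L, 6L, NA, NA),
                       z = 4:8)
print(same_join_result(reported, expected))

# Optional: run the same check against dplyr if it is installed.
if (requireNamespace("dplyr", quietly = TRUE)) {
  print(same_join_result(dplyr::right_join(a, b, by = "x"), expected))
}
```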

@hadley hadley removed this from the 0.3.1 milestone Nov 18, 2014
@hadley hadley added this to the 0.4 milestone Nov 18, 2014
Member

@jennybc jennybc commented Nov 30, 2014

The fix introduced by c9bea13 clears up the bug explored in my example above, so that's RESOLVED.

@hadley hadley added this to the 0.4 milestone Jan 2, 2015
@hadley hadley removed this from the 0.5 milestone Jan 2, 2015
@hadley hadley closed this Jan 2, 2015

@alexfun alexfun commented Jan 29, 2015

Is it worth implementing "left" and "right" versions of semi_join and anti_join, since the existing ones are essentially the "left" versions?
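
To make the question concrete, here is a minimal base R sketch. `semi()` and `anti()` are hypothetical helpers written for this illustration (they are not dplyr's code) and assume a single key column named "x". A "right" version amounts to swapping the two arguments.

```r
a <- data.frame(x = 1:5, y = 2:6)
b <- data.frame(x = 3:7, z = 4:8)

# Hypothetical stand-ins: filter the first table by matches in the second.
semi <- function(x, y, key = "x") x[x[[key]] %in% y[[key]], , drop = FALSE]
anti <- function(x, y, key = "x") x[!(x[[key]] %in% y[[key]]), , drop = FALSE]

print(semi(a, b))  # "left" semi join: rows of a with a match in b
print(anti(a, b))  # "left" anti join: rows of a without a match in b
print(semi(b, a))  # "right" version: rows of b with a match in a
print(anti(b, a))  # "right" version: rows of b without a match in a
```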


@piccolbo piccolbo commented May 26, 2015

Any chance we'll have methods for tbl_sql for these joins? I am not sure whether that is included in "add methods for everything".

Member Author

@hadley hadley commented May 27, 2015

@piccolbo can you please open a new issue so I don't lose track of this?

@lock lock bot locked as resolved and limited conversation to collaborators Jun 10, 2018
5 participants