*_join(x, y, by=c()) should do a cross join #2041

Closed
crowding opened this issue Jul 28, 2016 · 7 comments

Comments

@crowding

It makes sense that inner_join with no by or by=NULL will do a natural join, but if I explicitly specify a join on no keys, i.e. inner_join(x, y, by=c()), I think it should produce the Cartesian product.
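
As a point of reference, base R's merge() already treats a zero-length by as a request for the Cartesian product. The sketch below uses two made-up data frames (x and y are illustrative, not taken from this issue). It also shows a wrinkle any implementation would have to deal with: c() evaluates to NULL, so by = c() and by = NULL arrive as the same value, whereas character(0) can be told apart.

```r
# Two small illustrative data frames with no column names in common.
x <- data.frame(a = 1:2)
y <- data.frame(b = c("u", "v", "w"))

# Base R's merge() returns the Cartesian product when `by` has length zero.
cross <- merge(x, y, by = NULL)
nrow(cross)  # 6: every row of x paired with every row of y

# c() evaluates to NULL, so a function cannot tell by = c() from by = NULL.
# An explicitly typed empty vector such as character(0) is a different value.
identical(c(), NULL)           # TRUE
identical(character(0), NULL)  # FALSE
```

This is only meant to pin down the semantics being asked for; it says nothing about how dplyr handles by internally.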

@krlmlr
Member

krlmlr commented Sep 20, 2016

To me, a separate verb looks like a cleaner solution, and by = c() should perhaps raise an error.

@crowding
Author

crowding commented Sep 21, 2016

Perhaps for interactive or one-off analysis. But when I am using dplyr join inside a reusable function, mathematical consistency trumps "looks cleaner."

@krlmlr
Member

krlmlr commented Sep 21, 2016

Which result do you expect for a left join of a nonempty data frame with an empty data frame with by = c()?

left_join(iris, mtcars[0,], by = c())

By definition we'd need to return all of iris plus the mtcars columns filled with NULLs (NA in R). This means we can't simply forward this case to a (future) cross_join() function. I'm also not aware of any SQL DBMS that supports this corner case.
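
The result described for this corner case can be sketched in base R. Note that left_join_no_keys is a hypothetical helper written for illustration, not a dplyr function, and it assumes x and y share no column names.

```r
# Hypothetical helper (not part of dplyr): left join on zero key columns.
# Assumes x and y have no column names in common.
left_join_no_keys <- function(x, y) {
  if (nrow(y) == 0L) {
    # Nothing in y can match, so keep every row of x and pad y's columns with NA.
    pad <- y[rep(NA_integer_, nrow(x)), , drop = FALSE]
    rownames(pad) <- NULL
    return(cbind(x, pad))
  }
  # Otherwise every row of x matches every row of y: the Cartesian product.
  merge(x, y, by = NULL)
}

res <- left_join_no_keys(iris, mtcars[0, ])
dim(res)  # 150 16: all rows of iris, plus the 11 mtcars columns filled with NA
```

The empty-y branch is what separates this from a plain cross join, which would return zero rows here.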

@crowding
Author

I agree with that test case. The SQL implementations I've checked with so far did in fact produce that result from either LEFT JOIN or LEFT JOIN ON 1=1.
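
The ON 1=1 idea has a rough base R analogue: give both tables a constant key column and join on it. In the sketch below, the helper name left_join_on_true and the column name .k are assumptions chosen for illustration, and merge(..., all.x = TRUE) stands in for LEFT JOIN.

```r
x <- data.frame(a = 1:2)
y_full <- data.frame(b = c(10, 20, 30))
y_empty <- y_full[0, , drop = FALSE]

# Hypothetical helper: left join on a constant key, mimicking LEFT JOIN ... ON 1=1.
left_join_on_true <- function(x, y) {
  x$.k <- rep(1L, nrow(x))
  y$.k <- rep(1L, nrow(y))
  out <- merge(x, y, by = ".k", all.x = TRUE)
  out$.k <- NULL  # drop the helper key again
  out
}

nrow(left_join_on_true(x, y_full))  # 6
left_join_on_true(x, y_empty)       # both rows of x, with b set to NA
```

With a non-empty y this behaves like a cross join. With an empty y it keeps the rows of x, which matches the test case above.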

@krlmlr
Member

krlmlr commented Sep 21, 2016

Then it might be worthwhile indeed to support it in dplyr. Nice 1=1 trick!

@krlmlr
Member

krlmlr commented Nov 7, 2016

We should definitely keep this test in mind if/when we do #557. Currently this looks like a lot of effort for little gain, especially on the C++ side.

@hadley
Member

hadley commented Feb 2, 2017

Part of the more general #2240

@hadley hadley closed this as completed Feb 2, 2017
@lock lock bot locked as resolved and limited conversation to collaborators Jun 8, 2018