Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generic joins #146

Merged
merged 1 commit into from Feb 9, 2019

Conversation

@krlmlr
Copy link
Member

krlmlr commented Aug 23, 2018

Proposing a syntax to pass arbitrary join predicates, e.g. for working with spatial data on the database. Needs a test.

library(tidyverse)
library(dbplyr)
#> 
#> Attaching package: 'dbplyr'
#> The following objects are masked from 'package:dplyr':
#> 
#>     ident, sql

tbl1 <- memdb_frame(a = 1:3, b = 4:2)
tbl2 <- memdb_frame(c = 1:3, b = 2:0)

tbl12 <- left_join(tbl1, tbl2, by = sql("TBL_LEFT.b < TBL_RIGHT.c"))
tbl12 %>% sql_render()
#> <SQL> SELECT `TBL_LEFT`.`a` AS `a`, `TBL_LEFT`.`b` AS `b.x`, `TBL_RIGHT`.`c` AS `c`, `TBL_RIGHT`.`b` AS `b.y`
#>   FROM `ygyjgeeqdc` AS `TBL_LEFT`
#>   LEFT JOIN `urhwkxzyib` AS `TBL_RIGHT`
#>   ON (TBL_LEFT.b < TBL_RIGHT.c)
tbl12
#> # Source:   lazy query [?? x 4]
#> # Database: sqlite 3.22.0 [:memory:]
#>       a   b.x     c   b.y
#>   <int> <int> <int> <int>
#> 1     1     4    NA    NA
#> 2     2     3    NA    NA
#> 3     3     2     3     0

Created on 2018-08-23 by the reprex package (v0.2.0).

@krlmlr krlmlr requested review from edgararuiz and hadley as code owners Aug 23, 2018
@krlmlr

This comment has been minimized.

Copy link
Member Author

krlmlr commented Aug 31, 2018

@hadley @edgararuiz: Do we want to support joins with custom join predicates? Another way to achieve this is a cross join with a filter, but

  • it would work better if we had a cross_join() verb (full_join() with empty by doesn't work in edge cases where column names overlap)
  • perhaps some databases perform better with JOIN ... ON than with a filter on a cross join in a subquery
@krlmlr

This comment has been minimized.

Copy link
Member Author

krlmlr commented Oct 22, 2018

left_join(tbl1, tbl2, on = sql("TBL_LEFT.b < TBL_RIGHT.c"))

?

@krlmlr

This comment has been minimized.

Copy link
Member Author

krlmlr commented Oct 22, 2018

left_join(tbl1, tbl2, sql_on = "TBL_LEFT.b < TBL_RIGHT.c")

@hadley

This comment has been minimized.

Copy link
Member

hadley commented Oct 22, 2018

Yeah, I like that as a temporary stop gap.

@krlmlr krlmlr changed the title WIP: Generic joins Generic joins Nov 5, 2018
@krlmlr

This comment has been minimized.

Copy link
Member Author

krlmlr commented Nov 5, 2018

A proper on argument that would take an expression would be more in line with the rest of the interface, but is not as useful for spatial joins due to #181.

I wonder if we should rename the aliases to x and y, or if this would cause too much trouble.

@hadley

This comment has been minimized.

Copy link
Member

hadley commented Jan 2, 2019

I don't think we need any extra work on this interface, as it's mostly meant as a stop gap until we come up with a general interface. But could you please add a couple of very simple tests of the SQL generation? (Just to make sure everything is plumbed together correctly)

@krlmlr krlmlr force-pushed the krlmlr:f-generic-joins branch from ffb07bb to c4c78b3 Feb 2, 2019
@krlmlr

This comment has been minimized.

Copy link
Member Author

krlmlr commented Feb 2, 2019

Rebased on current master, added a few integration tests. Do we still need tests for the SQL generation?

@hadley

This comment has been minimized.

Copy link
Member

hadley commented Feb 5, 2019

I think it might be better to turn the test into a regression test on the generated SQL? (i.e. using expect_known_output())

@krlmlr krlmlr force-pushed the krlmlr:f-generic-joins branch from 3b9e590 to 3ae337d Feb 9, 2019
@krlmlr

This comment has been minimized.

Copy link
Member Author

krlmlr commented Feb 9, 2019

Done, also adapted to the new naming.

@hadley hadley merged commit 4b5aab9 into tidyverse:master Feb 9, 2019
3 checks passed
3 checks passed
codecov/patch 92.85% of diff hit (target 82.58%)
Details
codecov/project 82.62% (+0.04%) compared to 8d38547
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@hadley

This comment has been minimized.

Copy link
Member

hadley commented Feb 9, 2019

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.