Closed
Description
In databases one often needs to join multiple tables. In dbplyr this produces many nested subqueries, where one would usually write a single query by hand.
Example
library(dplyr, warn.conflicts = FALSE)
library(dbplyr, warn.conflicts = FALSE)
lf <- lazy_frame(x = 1, a = 1)
lf2 <- lazy_frame(x = 1, b = 2)
lf3 <- lazy_frame(x = 1, c = 3)
left_join(lf, lf2, by = "x") %>%
  left_join(lf3, by = "x")
Created on 2022-05-10 by the reprex package (v2.0.1)
Currently produces
SELECT `LHS`.`x` AS `x`, `a`, `b`, `c`
FROM (
SELECT `LHS`.`x` AS `x`, `a`, `b`
FROM `df1` AS `LHS`
LEFT JOIN `df2` AS `RHS`
ON (`LHS`.`x` = `RHS`.`x`)
) `LHS`
LEFT JOIN `df3` AS `RHS`
ON (`LHS`.`x` = `RHS`.`x`)
It would be nicer if it could produce something like
SELECT `df1`.`x` AS `x`, `a`, `b`, `c`
FROM `df1`
LEFT JOIN `df2`
ON (`df1`.`x` = `df2`.`x`)
LEFT JOIN `df3`
ON (`df1`.`x` = `df3`.`x`)
Thoughts & Questions
- Is the result of joins in subqueries and multiple joins in one query necessarily the same?
- What about table aliases?
  - Now we have x_as and y_as. I think if they are not provided it might make more sense to not use a table alias.
  - If x_as is provided in a join which is not the first join, then maybe a subquery should be generated.
  - If y_as is provided it can always be used.
- A FULL JOIN can be tricky to combine with other joins:
  - SQLite does not directly support FULL JOIN.
  - The columns joined by are wrapped in coalesce().
- If semi_join() or anti_join() is followed by left/right/inner/full_join() they cannot be combined because WHERE is evaluated after JOIN.
So, I think that
- a sequence of left_join() and inner_join() can be combined in one query
- a sequence of semi_join() and anti_join() might be combined (though nested queries might be more efficient)
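For the left_join() case, the first question above can be spot-checked by running both SQL shapes from this issue against the same data. The following is only a sketch on made-up rows, assuming the DBI and RSQLite packages are installed and using an in-memory SQLite database. The table contents are hypothetical, and an ORDER BY is added to both queries so the results can be compared row by row. Agreement on one dataset is a sanity check, not a proof of equivalence.

```r
library(DBI)

# In-memory SQLite database with three small tables (illustrative data only).
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "df1", data.frame(x = c(1, 2, 3), a = c(10, 20, 30)))
dbWriteTable(con, "df2", data.frame(x = c(1, 2, 2), b = c("p", "q", "r")))
dbWriteTable(con, "df3", data.frame(x = c(2, 3), c = c(TRUE, FALSE)))

# Nested form, as in the "Currently produces" query above.
nested <- dbGetQuery(con, "
  SELECT `LHS`.`x` AS `x`, `a`, `b`, `c`
  FROM (
    SELECT `LHS`.`x` AS `x`, `a`, `b`
    FROM `df1` AS `LHS`
    LEFT JOIN `df2` AS `RHS`
      ON (`LHS`.`x` = `RHS`.`x`)
  ) `LHS`
  LEFT JOIN `df3` AS `RHS`
    ON (`LHS`.`x` = `RHS`.`x`)
  ORDER BY `x`, `b`
")

# Flat form, as in the proposed single query above.
flat <- dbGetQuery(con, "
  SELECT `df1`.`x` AS `x`, `a`, `b`, `c`
  FROM `df1`
  LEFT JOIN `df2`
    ON (`df1`.`x` = `df2`.`x`)
  LEFT JOIN `df3`
    ON (`df1`.`x` = `df3`.`x`)
  ORDER BY `x`, `b`
")

dbDisconnect(con)

# Compare the two result sets, including column names and types.
print(identical(nested, flat))
```

Both joins here key on the x column of the first table, which is the situation in the example. A chain where a later join keys on a column that comes from an earlier right-hand table would need its own check.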