Optimise more nested selects #213
Comments
This is going to require a richer data structure in |
One use case where this becomes particularly relevant is databases without a very "smart" optimizer. At work we are switching from Postgres to CockroachDB, and multiple nested queries might become a problem. Postgres is smart enough to optimize them, but CockroachDB is likely to be extra slow because of them.
Or does this optimisation need to be performed by Maybe |
I think we should be able to simplify the following SQL:

library(dplyr, warn.conflicts = FALSE)
library(dbplyr, warn.conflicts = FALSE)
memdb_frame(x = 1, y = 2) %>%
filter(x > 1) %>%
mutate(z = x + 2) %>%
select(y) %>%
show_query()
#> <SQL>
#> SELECT `y`
#> FROM (SELECT `x`, `y`, `x` + 2.0 AS `z`
#> FROM (SELECT *
#> FROM `dbplyr_yicivbbksg`
#> WHERE (`x` > 1.0)))

But we wouldn't try to simplify code like this:

memdb_frame(x = 1, y = 2) %>%
mutate(z = x + 2) %>%
filter(z > 1) %>%
select(y) %>%
show_query()
#> <SQL>
#> SELECT `y`
#> FROM (SELECT `x`, `y`, `x` + 2.0 AS `z`
#> FROM `dbplyr_owxqpctbzw`)
#> WHERE (`z` > 1.0)

Since that would require a full dependency analysis of which variables are used by each stage.
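
To make the distinction between the two pipelines concrete, here is a rough base R sketch of the kind of check a dependency analysis would involve. It is only an illustration, not how dbplyr works internally: `vars_used()` and `depends_on_new_cols()` are hypothetical helpers, and the stages are represented as plain quoted expressions rather than dbplyr's own ops. The only real function relied on is base R's `all.vars()`.

```r
# Hypothetical helper (not part of dbplyr): collect every variable name
# referenced by a list of quoted expressions.
vars_used <- function(exprs) {
  unique(unlist(lapply(exprs, all.vars)))
}

# Hypothetical helper: do any later stages refer to a column that the
# mutate() stage creates? `mutate_exprs` is a named list of expressions.
depends_on_new_cols <- function(mutate_exprs, later_exprs) {
  any(vars_used(later_exprs) %in% names(mutate_exprs))
}

mutate_exprs <- list(z = quote(x + 2))

# First pipeline: filter(x > 1) and select(y) never touch `z`.
first <- depends_on_new_cols(mutate_exprs, list(quote(x > 1), quote(y)))
print(first)
#> [1] FALSE

# Second pipeline: filter(z > 1) reads `z`, which the mutate() defines.
second <- depends_on_new_cols(mutate_exprs, list(quote(z > 1), quote(y)))
print(second)
#> [1] TRUE
```

In the first case nothing downstream needs `z`, so the intermediate subquery adds nothing. In the second case the filter reads `z`, so collapsing the query means tracking that dependency through each stage, which is the analysis described above.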
I think the best way to do this will be to unify the op underlying |
Created on 2019-01-10 by the reprex package (v0.2.1.9000)