-
Notifications
You must be signed in to change notification settings - Fork 171
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Like mutate, summarise needs to create subqueries when necessary #114
Comments
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This isn't quite as simple as applying the algorithm from |
This comment has been minimized.
This comment has been minimized.
Updated reprex: library(dplyr, warn.conflicts = FALSE)
lf <- dbplyr::lazy_frame(x = 1:5)
lf %>%
mutate(
x1 = mean(x, na.rm = TRUE),
x2 = sum(x, na.rm = TRUE),
x3 = x1 / x2
)
#> <SQL>
#> SELECT `x`, `x1`, `x2`, `x1` / `x2` AS `x3`
#> FROM (SELECT `x`, AVG(`x`) OVER () AS `x1`, SUM(`x`) OVER () AS `x2`
#> FROM `df`) `dbplyr_ksmttkjckb`
lf %>%
summarise(
x1 = mean(x, na.rm = TRUE),
x2 = sum(x, na.rm = TRUE),
x3 = x1 / x2
)
#> <SQL>
#> SELECT AVG(`x`) AS `x1`, SUM(`x`) AS `x2`, `x1` / `x2` AS `x3`
#> FROM `df` Created on 2019-03-14 by the reprex package (v0.2.1.9000) |
Maybe it's enough to make this an error? It's not really clear what the above example means, and we really want to the user to rewrite to: library(dplyr, warn.conflicts = FALSE)
lf <- dbplyr::lazy_frame(x = 1:5)
lf %>%
summarise(
x1 = mean(x, na.rm = TRUE),
x2 = sum(x, na.rm = TRUE)
) %>%
mutate(x3 = x1 / x2)
#> <SQL>
#> SELECT `x1`, `x2`, `x1` / `x2` AS `x3`
#> FROM (SELECT AVG(`x`) AS `x1`, SUM(`x`) AS `x2`
#> FROM `df`) `dbplyr_xejwocofsh` Created on 2019-03-14 by the reprex package (v0.2.1.9000) |
The only confusing thing about this fix is that the "wrong" version still works in a non- Local reprex:
I can file a Edit: or does this fix cover both cases? That wasn't clear from looking at the commit. |
I think it might make sense in R due to differing semantics. It is simply not feasible to make dplyr and a database agree for every possible input. |
@AjarKeen commented on Jan 8, 2018, 9:44 PM UTC:
cc @javierluraschi since I've only tested this with
sparklyr
, not with otherdbplyr
backends.My session info after I run the above:
I'm not sure if this is related to a similar-sounding issue with mutate / rename that was resolved in dbplyr 1.2.0. I can work around it for now, but it will get clunky fast as I get into more complex aggregation operations with
sparklyr
.This issue was moved by krlmlr from tidyverse/dplyr/issues/3295.
The text was updated successfully, but these errors were encountered: