dbplyr generated SQL would be much easier to understand if one could add custom subquery names or even create CTE ("with" clause). For example
library(dplyr, warn.conflicts = FALSE)
library(dbplyr, warn.conflicts = FALSE)
tbl_lazy(nycflights13::flights) %>%
group_by(dest) %>%
summarise(
count = n(),
dist = mean(distance, na.rm = TRUE),
delay = mean(arr_delay, na.rm = TRUE)
) %>%
filter(delay, count > 20, dest != "HNL")
produces (I added some indentation to improve readability already a bit)
SELECT *
FROM (
SELECT `dest`, COUNT(*) AS `count`, AVG(`distance`) AS `dist`, AVG(`arr_delay`) AS `delay`
FROM `df`
GROUP BY `dest`
) `q01`
WHERE ((`delay`) AND (`count` > 20.0) AND (`dest` != 'HNL'))
This would be a bit easier with a custom name for the subquery
SELECT *
FROM (
SELECT `dest`, COUNT(*) AS `count`, AVG(`distance`) AS `dist`, AVG(`arr_delay`) AS `delay`
FROM `df`
GROUP BY `dest`
) `destination_summary`
WHERE ((`delay`) AND (`count` > 20.0) AND (`dest` != 'HNL'))
and even better with a CTE
WITH `destination_summary` AS (
SELECT `dest`, COUNT(*) AS `count`, AVG(`distance`) AS `dist`, AVG(`arr_delay`) AS `delay`
FROM `df`
GROUP BY `dest`
)
SELECT *
FROM `destination_summary`
WHERE ((`delay`) AND (`count` > 20.0) AND (`dest` != 'HNL'))
I haven't thought about syntax yet (especially CTE might be a bit tricky) but if you like the idea I'll try to come up with a concept.
dbplyrgenerated SQL would be much easier to understand if one could add custom subquery names or even create CTE ("with" clause). For exampleproduces (I added some indentation to improve readability already a bit)
This would be a bit easier with a custom name for the subquery
and even better with a CTE
I haven't thought about syntax yet (especially CTE might be a bit tricky) but if you like the idea I'll try to come up with a concept.