Window frame
window_frame() is currently ignored for last() and nth():
library(tidyverse)
library(dbplyr)
tbl <- memdb_frame(a = 4:1, g = rep(1:2, 2))
# Print the rendered SQL before the usual tbl output.
print.tbl_dbi <- function(x, ...) {
  message(sql_render(x))
  NextMethod()
}
# first() seems to work.
tbl %>%
  arrange(a) %>%
  group_by(g) %>%
  mutate(l = first(a))
#> SELECT `a`, `g`, FIRST_VALUE(`a`) OVER (PARTITION BY `g` ORDER BY `a`) AS `l`
#> FROM (SELECT *
#> FROM `dbplyr_001`
#> ORDER BY `a`)
#> # Source: lazy query [?? x 3]
#> # Database: sqlite 3.29.0 [:memory:]
#> # Groups: g
#> # Ordered by: a
#> a g l
#> <int> <int> <int>
#> 1 2 1 2
#> 2 4 1 2
#> 3 1 2 1
#> 4 3 2 1
# last() doesn't: each row gets its own value of a, not the group's last value.
tbl %>%
  group_by(g) %>%
  arrange(a) %>%
  mutate(l = last(a))
#> SELECT `a`, `g`, LAST_VALUE(`a`) OVER (PARTITION BY `g` ORDER BY `a`) AS `l`
#> FROM (SELECT *
#> FROM `dbplyr_001`
#> ORDER BY `a`)
#> # Source: lazy query [?? x 3]
#> # Database: sqlite 3.29.0 [:memory:]
#> # Groups: g
#> # Ordered by: a
#> a g l
#> <int> <int> <int>
#> 1 2 1 2
#> 2 4 1 4
#> 3 1 2 1
#> 4 3 2 3
# We need "ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING":
tbl %>%
  mutate(l = sql(!!win_over(sql("LAST_VALUE(a)"), "g", "a", c(0, Inf), con = tbl$src$con)))
#> SELECT `a`, `g`, LAST_VALUE(a) OVER (PARTITION BY `g` ORDER BY `a` ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS `l`
#> FROM `dbplyr_001`
#> # Source: lazy query [?? x 3]
#> # Database: sqlite 3.29.0 [:memory:]
#> a g l
#> <int> <int> <int>
#> 1 2 1 4
#> 2 4 1 4
#> 3 1 2 3
#> 4 3 2 3
Created on 2019-10-08 by the reprex package (v0.3.0)
@hannesmuehleisen: Do you know if the default range specification for window functions is standardized across databases? SQLite has:
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW EXCLUDE NO OTHERS
which explains why the example fails, but does it necessarily fail for other databases? Should we always pass an unbounded range to mimic dplyr semantics?
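To make the question concrete, here is a minimal sketch that bypasses dbplyr and sends raw SQL to an in-memory SQLite database through DBI. It assumes RSQLite bundles SQLite 3.25 or later (needed for window functions). The table name `t` and the column aliases `l_default` and `l_full` are made up for illustration. It compares LAST_VALUE() under the default frame with the same call under an explicit unbounded frame.

```r
library(DBI)

# In-memory SQLite database holding the same data as the reprex above.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "t", data.frame(a = 4:1, g = rep(1:2, 2)))

res <- dbGetQuery(con, "
  SELECT a, g,
    -- No frame clause: SQLite applies its default frame, which ends at the current row.
    LAST_VALUE(a) OVER (PARTITION BY g ORDER BY a) AS l_default,
    -- Explicit frame that spans the whole partition.
    LAST_VALUE(a) OVER (
      PARTITION BY g ORDER BY a
      ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS l_full
  FROM t
  ORDER BY g, a
")

print(res)
dbDisconnect(con)
```

With the default frame, `l_default` should simply repeat `a`, as in the failing example, while `l_full` should hold the largest `a` of each group. Whether other backends behave the same way depends on their default frame, which this sketch does not test.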