SQL code generated for summarise(quantile()) for SQL Sever is invalid.
library(dplyr, warn.conflicts = F)
library(dbplyr, warn.conflicts = F)
data <- tibble::tibble(ind = rep(1:3, 100),
value = runif(300))
# desired output of the further generated SQL queries
data %>%
group_by(ind) %>%
summarise(q05_value = quantile(value, 0.05))
#> # A tibble: 3 x 2
#> ind q05_value
#> <int> <dbl>
#> 1 1 0.0240
#> 2 2 0.0667
#> 3 3 0.0625
# Translation to Postgres works:
df_postgres <- tbl_lazy(data, con = simulate_postgres())
df_postgres %>%
group_by(ind) %>%
summarise(q05_value = quantile(value, 0.05, na.rm = TRUE)) %>%
show_query()
#> <SQL>
#> SELECT
#> `ind`,
#> PERCENTILE_CONT(0.05) WITHIN GROUP (ORDER BY `value`) AS `q05_value`
#> FROM `df`
#> GROUP BY `ind`
# Translation to SQL Server doesn't work:
df_mssql <- tbl_lazy(data, con = simulate_mssql())
df_mssql %>%
group_by(ind) %>%
summarise(q05_value = quantile(value, 0.05, na.rm = TRUE)) %>%
show_query()
#> <SQL>
#> SELECT
#> `ind`,
#> PERCENTILE_CONT(0.05) WITHIN GROUP (ORDER BY `value`) OVER () AS `q05_value`
#> FROM `df`
#> GROUP BY `ind`
Created on 2023-01-25 with reprex v2.0.2
Attempt to run generated SQL on SQL Server fails with the following error:
Column 'tab.value' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
SQL Server uses window function type of syntax for PERCENTILE_CONT() as opposed to Postgres which utilizes aggregate approach.
Working SQL for SQL Server for the demonstrated query would be the following:
SELECT DISTINCT
ind, PERCENTILE_CONT(0.05) WITHIN GROUP (ORDER BY value) OVER (PARTITION BY ind) AS q05_value
FROM tab;
SQL code generated for
summarise(quantile())for SQL Sever is invalid.Created on 2023-01-25 with reprex v2.0.2
Attempt to run generated SQL on SQL Server fails with the following error:
Column 'tab.value' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.SQL Server uses window function type of syntax for
PERCENTILE_CONT()as opposed to Postgres which utilizes aggregate approach.Working SQL for SQL Server for the demonstrated query would be the following: