The SQL code generated for summarise(quantile()) on SQL Server is invalid.
library(dplyr, warn.conflicts=F)
library(dbplyr, warn.conflicts=F)
data<-tibble::tibble(ind= rep(1:3, 100),
value= runif(300))
# desired output of the further generated SQL queries
data %>%
group_by(ind) %>%
summarise(q05_value= quantile(value, 0.05))
#> # A tibble: 3 x 2
#>     ind q05_value
#>   <int>     <dbl>
#> 1     1    0.0240
#> 2     2    0.0667
#> 3     3    0.0625

# Translation to Postgres works:
df_postgres<- tbl_lazy(data, con= simulate_postgres())
df_postgres %>%
group_by(ind) %>%
summarise(q05_value= quantile(value, 0.05, na.rm=TRUE)) %>%
show_query()
#> <SQL>
#> SELECT
#>   `ind`,
#>   PERCENTILE_CONT(0.05) WITHIN GROUP (ORDER BY `value`) AS `q05_value`
#> FROM `df`
#> GROUP BY `ind`

# Translation to SQL Server doesn't work:
df_mssql<- tbl_lazy(data, con= simulate_mssql())
df_mssql %>%
group_by(ind) %>%
summarise(q05_value= quantile(value, 0.05, na.rm=TRUE)) %>%
show_query()
#> <SQL>
#> SELECT
#>   `ind`,
#>   PERCENTILE_CONT(0.05) WITHIN GROUP (ORDER BY `value`) OVER () AS `q05_value`
#> FROM `df`
#> GROUP BY `ind`
Running the generated SQL on SQL Server fails with the following error: Column 'tab.value' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
SQL Server supports PERCENTILE_CONT() only with window-function syntax (it requires an OVER clause), whereas Postgres treats it as an aggregate.
Working SQL for SQL Server for the demonstrated query would be the following:
SELECT DISTINCT
ind, PERCENTILE_CONT(0.05) WITHIN GROUP (ORDER BY value) OVER (PARTITION BY ind) AS q05_value
FROM tab;
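The difference between the two SQL forms can be reproduced on local data. The following is a minimal sketch, not part of the original report: it assumes an arbitrary seed and uses names = FALSE only to keep the quantile values unnamed. summarise() plays the role of the aggregate form, and mutate() followed by distinct() plays the role of the window form with SELECT DISTINCT.

```r
library(dplyr, warn.conflicts = FALSE)

set.seed(1)  # arbitrary seed, only to make this sketch reproducible
data <- tibble::tibble(ind = rep(1:3, 100), value = runif(300))

# Aggregate style: one row per group, like PERCENTILE_CONT ... GROUP BY in Postgres.
agg <- data %>%
  group_by(ind) %>%
  summarise(q05_value = quantile(value, 0.05, names = FALSE)) %>%
  arrange(ind)

# Window style: the quantile is repeated on every row of its group and the
# duplicates are then dropped, like SELECT DISTINCT ... OVER (PARTITION BY ind).
win <- data %>%
  group_by(ind) %>%
  mutate(q05_value = quantile(value, 0.05, names = FALSE)) %>%
  ungroup() %>%
  distinct(ind, q05_value) %>%
  arrange(ind)

# Both styles should yield the same three rows.
same_groups <- identical(agg$ind, win$ind)
same_values <- isTRUE(all.equal(agg$q05_value, win$q05_value))
```

On local data both forms agree, so the problem is only in how each backend expects the SQL to be spelled.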
Yeah, I explored dbplyr's inner workings a bit and had the same thought.
But this problem has two sides. To write a dplyr statement that translates correctly for SQL Server, you might use mutate() instead of summarise(). But in that case the statement won't work with Postgres:
library(dplyr, warn.conflicts=F)
library(dbplyr, warn.conflicts=F)
data<-tibble::tibble(ind= rep(1:3, 100),
value= runif(300))
# Translation to SQL Server works:
df_mssql<- tbl_lazy(data, con= simulate_mssql())
df_mssql %>%
group_by(ind) %>%
mutate(q05_value= quantile(value, 0.05, na.rm=TRUE)) %>%
select (ind, q05_value) %>%
distinct()
#> <SQL>
#> SELECT DISTINCT
#>   `ind`,
#>   PERCENTILE_CONT(0.05) WITHIN GROUP (ORDER BY `value`) OVER (PARTITION BY `ind`) AS `q05_value`
#> FROM `df`

# Translation to Postgres is incorrect:
df_postgres<- tbl_lazy(data, con= simulate_postgres())
df_postgres %>%
group_by(ind) %>%
mutate(q05_value= quantile(value, 0.05, na.rm=TRUE)) %>%
select (ind, q05_value) %>%
distinct()
#> <SQL>
#> SELECT DISTINCT
#>   `ind`,
#>   PERCENTILE_CONT(0.05) WITHIN GROUP (ORDER BY `value`) OVER (PARTITION BY `ind`) AS `q05_value`
#> FROM `df`
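Until the translation is fixed, one possible stopgap is to pick the verb per backend. The sketch below is only an illustration: group_quantile() and the q_value column are hypothetical names, not part of dplyr or dbplyr, and the check assumes that SQL Server connections carry the class "Microsoft SQL Server", as the ones created by simulate_mssql() do.

```r
library(dplyr, warn.conflicts = FALSE)
library(dbplyr, warn.conflicts = FALSE)

# Hypothetical helper (not part of dplyr or dbplyr): per-group quantile that
# uses the window form on SQL Server and the aggregate form elsewhere.
group_quantile <- function(tbl, group, col, prob) {
  grouped <- tbl %>% group_by({{ group }})
  # Assumption: SQL Server connections have the class "Microsoft SQL Server".
  if (inherits(remote_con(tbl), "Microsoft SQL Server")) {
    grouped %>%
      mutate(q_value = quantile({{ col }}, !!prob, na.rm = TRUE)) %>%
      select({{ group }}, q_value) %>%
      distinct()
  } else {
    grouped %>%
      summarise(q_value = quantile({{ col }}, !!prob, na.rm = TRUE))
  }
}

data <- tibble::tibble(ind = rep(1:3, 100), value = runif(300))

# Render the SQL for both simulated backends without touching a database.
mssql_sql <- tbl_lazy(data, con = simulate_mssql()) %>%
  group_quantile(ind, value, 0.05) %>%
  sql_render()
postgres_sql <- tbl_lazy(data, con = simulate_postgres()) %>%
  group_quantile(ind, value, 0.05) %>%
  sql_render()
```

Printing mssql_sql and postgres_sql shows which form each backend receives. This only hides the difference from the caller; the underlying translation issue remains.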
Created on 2023-01-25 with reprex v2.0.2