Add support for quantile() #169

Closed
halldc opened this issue Sep 26, 2018 · 15 comments
Labels
feature a feature request or enhancement func trans 🌍 Translation of individual functions to SQL help wanted ❤️ we'd love your help! tidy-dev-day 🤓 Tidyverse Developer Day rstd.io/tidy-dev-day

@halldc

halldc commented Sep 26, 2018

Would it be possible to add support for the quantile() function?

Quite a few databases support PERCENTILE_CONT and PERCENTILE_DISC, which I think could make this possible (e.g. Google BigQuery, PostgreSQL, and Redshift).

I'd be willing to help of course, but would need some pointers on where to start.

@hadley hadley added feature a feature request or enhancement help wanted ❤️ we'd love your help! func trans 🌍 Translation of individual functions to SQL labels Jan 2, 2019
@hadley hadley added this to the v1.4.0 milestone Jan 9, 2019
@batpigandme batpigandme added the tidy-dev-day 🤓 Tidyverse Developer Day rstd.io/tidy-dev-day label Jan 19, 2019
@hadley
Member

hadley commented Feb 6, 2019

@edavidaja that is perfect, thank you!

@hadley
Member

hadley commented Feb 6, 2019

A few more tweaks to make it a bit easier for me to parse:

Aggregation function:

Window function:

  • sql-server: PERCENTILE_CONT(p) WITHIN GROUP (ORDER BY x)
  • postgres: PERCENTILE_CONT(p) WITHIN GROUP (ORDER BY x)
  • redshift: PERCENTILE_CONT(p) WITHIN GROUP (ORDER BY x)
  • salesforce: PERCENTILE_CONT(p) WITHIN GROUP (ORDER BY x)
  • mariadb: PERCENTILE_CONT(p) WITHIN GROUP (ORDER BY x)

@hadley
Member

hadley commented Feb 6, 2019

For the databases that support the PERCENTILE_CONT() window function, I don't see how we can use this in an aggregation (i.e. summarise()) context.

The SQL Server docs seem to be the only place that addresses this problem; they suggest using DISTINCT:

SELECT DISTINCT DepartmentName  
,PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY BaseRate)  
    OVER (PARTITION BY DepartmentName) AS MedianCont  
,PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY BaseRate)  
    OVER (PARTITION BY DepartmentName) AS MedianDisc  
FROM dbo.DimEmployee;  

So maybe the best we can do is supply translations for those as window functions, and then suggest the user use distinct()?
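
To illustrate the idea with a purely local analogue (a sketch on an in-memory data frame, not translated SQL; the column name `median_mpg` is made up for the example): computing the quantile per group as a window-style value with mutate() repeats it on every row, and distinct() then collapses the result to one row per group, which matches what summarise() would return directly.

```r
library(dplyr)

# Window-style: every row in a group carries the group's median,
# then distinct() collapses the repeats to one row per group.
windowed <- mtcars %>%
  group_by(cyl) %>%
  mutate(median_mpg = quantile(mpg, 0.5, names = FALSE)) %>%
  ungroup() %>%
  distinct(cyl, median_mpg) %>%
  arrange(cyl)

# Aggregation-style: one row per group directly.
aggregated <- mtcars %>%
  group_by(cyl) %>%
  summarise(median_mpg = quantile(mpg, 0.5, names = FALSE)) %>%
  arrange(cyl)
```

Both data frames should hold the same three rows; the difference is only in how many intermediate rows are produced along the way.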

@hadley
Member

hadley commented Feb 6, 2019

Oops, those aren't window functions, but are "ordered-set aggregate" functions, and they work just fine with GROUP BY:

library(DBI)

con <- dbConnect(RPostgres::Postgres())
dbGetQuery(con, "
  SELECT cyl, PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY mpg) 
  FROM mtcars 
  GROUP BY cyl  
")
#>   cyl percentile_cont
#> 1   4            26.0
#> 2   6            19.7
#> 3   8            15.2

Created on 2019-02-06 by the reprex package (v0.2.1.9000)
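
For reference, a minimal sketch of how this might look from the dplyr side once a translation exists. It assumes a dbplyr version that translates quantile() (1.4.0 or later, going by the milestone on this issue) and uses a simulated Postgres backend, so no live connection is needed and nothing is executed. The column values and the name `median_mpg` are placeholders.

```r
library(dplyr)
library(dbplyr)

# A lazy table backed by a simulated Postgres connection:
# it only generates SQL and never talks to a database.
lf <- lazy_frame(cyl = 4, mpg = 26, con = simulate_postgres())

query <- lf %>%
  group_by(cyl) %>%
  summarise(median_mpg = quantile(mpg, 0.5, na.rm = TRUE))

# Render the generated SQL as a character string for inspection.
sql_text <- as.character(sql_render(query))
cat(sql_text, "\n")
```

The rendered query should contain a PERCENTILE_CONT(...) WITHIN GROUP (ORDER BY ...) expression together with a GROUP BY clause; the exact quoting and layout depend on the dbplyr version.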

@hadley hadley closed this as completed in 19909fa Feb 6, 2019
@halldc
Author

halldc commented Feb 6, 2019

Thanks @hadley! 🎉

@hadley
Member

hadley commented Mar 17, 2019

Looks like the link to Teradata was actually to Teradata's distribution of Presto. It seems like Teradata itself is ANSI compliant: https://docs.teradata.com/reader/756LNiPSFdY~4JcCCcR5Cw/RgAqeSpr93jpuGAvDTud3w
