New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Emulate PERCENTILE_DISC using PERCENT_RANK and LAST_VALUE #8257

Open
lukaseder opened this Issue Jan 25, 2019 · 0 comments

Comments

Projects
None yet
1 participant
@lukaseder
Copy link
Member

lukaseder commented Jan 25, 2019

Only few databases support PERCENTILE_DISC, and if they do, they might not support the window function version. However, it can be emulated using PERCENT_RANK and FIRST_VALUE. For example, in Oracle, the following query produces the same result for x1 and x2:

Window function emulation without partition by

SELECT
  length,
  percentile_disc(0.5) WITHIN GROUP (ORDER BY length) OVER() x1,
  first_value(length) OVER (ORDER BY CASE WHEN p <= 0.5 THEN p END DESC NULLS LAST) x2
FROM (
  SELECT
    length,
    percent_rank() OVER (ORDER BY length) p
  FROM film
) t
ORDER BY length;

Window function emulation with partition by

SELECT
  rating,
  length,
  percentile_disc(0.2) WITHIN GROUP (ORDER BY length) OVER (PARTITION BY rating) x1,
  first_value(length) OVER (PARTITION BY rating ORDER BY CASE WHEN p <= 0.2 THEN p END DESC NULLS LAST) x2
FROM (
  SELECT
    rating,
    length,
    percent_rank() OVER (PARTITION BY rating ORDER BY length) p
  FROM film
) t
ORDER BY rating, length;

Aggregate function emulation without group by

SELECT
  percentile_disc(0.7) WITHIN GROUP (ORDER BY length) x1,
  MAX(x2) x2
FROM (
  SELECT
    length,
    p,
    first_value(length) OVER (ORDER BY CASE WHEN p <= 0.7 THEN p END DESC NULLS LAST) x2
  FROM (
    SELECT
      length,
      percent_rank() OVER (ORDER BY length) p
    FROM film
  ) t
) t;

Aggregate function emulation with group by

SELECT
  percentile_disc(0.3) WITHIN GROUP (ORDER BY length) x1,
  MAX(x2) x2
FROM (
  SELECT
    rating,
    length,
    p,
    first_value(length) OVER (PARTITION BY rating ORDER BY CASE WHEN p <= 0.3 THEN p END DESC NULLS LAST) x2
  FROM (
    SELECT
      rating,
      length,
      percent_rank() OVER (PARTITION BY rating ORDER BY length) p
    FROM film
  ) t
) t
GROUP BY rating;
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment