Skip to content

Conversation

charettes
Copy link
Member

@charettes charettes commented Nov 6, 2022

Also avoid an unnecessary pushdown when aggregating over a query that doesn't have aggregate annotations.


This optimizes the Query.get_aggregation in the following way:

  1. Only perform a subquery wrapping if there are pre-existing aggregations annotation at the time aggregate is called. If there aren't existing aggregates (as opposed to pre-existing aggregation previously) then as long as the non-summarized aggregates (the ones passed to .aggregate) have their references to other annotations inlined the latter can be pruned.
  2. In cases where a subquery wrapping is still required (e.g. pre-existing aggregations), make sure that all unreferenced annotations are pruned from the inner query.

@charettes charettes force-pushed the count-optimizations branch 3 times, most recently from 6620872 to 52d4fd3 Compare November 6, 2022 09:18
@charettes charettes force-pushed the count-optimizations branch from 80641de to d0744a6 Compare November 6, 2022 09:46
@charettes
Copy link
Member Author

The last remaining failure is unrelated to these changes but to a limitation in 72b23c0.

If the expression wrapper happens not to be directly selected (e.g. say that it is in a Func) then BooleanField.select_format will never be called and thus it will generate empty SQL. The problem manifests itself in this PR because what was previously wrapped in a suqbuery, and thus selected in the inner query, is now inlined in the aggregate function call.

@charettes charettes force-pushed the count-optimizations branch from 52d4fd3 to 6348f42 Compare November 6, 2022 17:12
@charettes
Copy link
Member Author

The full match handling issue should be addressed by #16266, this PR is rebased on top of it.

@charettes charettes marked this pull request as ready for review November 6, 2022 17:33
@charettes charettes force-pushed the count-optimizations branch 4 times, most recently from 7fd9210 to 6617592 Compare November 7, 2022 20:19
Also avoid an unnecessary pushdown when aggregating over a query that doesn't
have aggregate annotations.
@felixxm felixxm force-pushed the count-optimizations branch from 6617592 to 59bea9e Compare November 9, 2022 12:23
Copy link
Member

@felixxm felixxm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@charettes Thanks 👍

@felixxm felixxm merged commit 59bea9e into django:main Nov 9, 2022
@timgraham
Copy link
Member

Hi Simon, this causes a failure of test_filter_count on CockroachDB. Perhaps you can advise.

======================================================================
FAIL: test_filter_count (expressions_window.tests.WindowFunctionTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/tim/code/django/tests/expressions_window/tests.py", line 1160, in test_filter_count
    self.assertEqual(
AssertionError: 12 != 5

before SQL:

SELECT COUNT(*)
FROM
  (SELECT *
   FROM
     (SELECT RANK() OVER (PARTITION BY "expressions_window_employee"."department"
                          ORDER BY "expressions_window_employee"."salary" DESC) AS "department_salary_rank"
      FROM "expressions_window_employee") "qualify"
   WHERE "department_salary_rank" = 1) subquery;

after SQL:

SELECT COUNT(*)
FROM
  (SELECT "col1"
   FROM
     (SELECT *
      FROM
        (SELECT "expressions_window_employee"."id" AS "col1",
                RANK() OVER (PARTITION BY "expressions_window_employee"."department"
                             ORDER BY "expressions_window_employee"."salary" DESC) AS "qual0"
         FROM "expressions_window_employee"
         GROUP BY "col1",
                  "expressions_window_employee"."department") "qualify"
      WHERE "qual0" = 1 ) "qualify_mask") subquery;

@charettes charettes deleted the count-optimizations branch November 10, 2022 01:45
@charettes
Copy link
Member Author

charettes commented Nov 10, 2022

The query is definitely more complex but it seems to maintain its semantic?

SELECT
    "expressions_window_employee"."id" AS "col1",
    RANK() OVER (PARTITION BY "expressions_window_employee"."department" ORDER BY "expressions_window_employee"."salary" DESC) AS "qual0"
FROM
    "expressions_window_employee"
GROUP BY
    "col1",
    "expressions_window_employee"."department"

Is equivalent to

SELECT
    "expressions_window_employee"."id" AS "col1",
    RANK() OVER (PARTITION BY "expressions_window_employee"."department" ORDER BY "expressions_window_employee"."salary" DESC) AS "qual0"
FROM
    "expressions_window_employee"
GROUP BY "col1"

Because "expressions_window_employee"."department" is functionally dependent on "expressions_window_employee"."id" (the (id, department) tuples don't have more results than (id,) tuples) and there are no multi-valued relationship involved so the query is equivalent to

SELECT
    "expressions_window_employee"."id" AS "col1",
    RANK() OVER (PARTITION BY "expressions_window_employee"."department" ORDER BY "expressions_window_employee"."salary" DESC) AS "qual0"
FROM
    "expressions_window_employee"

Because there a no duplicate tuples by id that are produced.

Then the wrapping query is a similar passthrough with a "qual0" = 1 filter, the other one sieves all the columns to only "col1" and the final one performs a COUNT(*) on the number of rows and since the col1 cannot be NULL it semantically correct.

It seems like a bug in CockroachDB to me. I submitted #16277 to reduce the complexity of the query as this was an unexpected change in behaviour from my patch but that's something I would report upstream.

@charettes
Copy link
Member Author

For posterity here are the SQL diff of before and after of this change when running the following test suites annotations expressions queries aggregation aggregation_regress update delete delete_regress expressions_window

--- test_filter_alias_with_double_f (annotations.tests.AliasTests):postgresql:321ecb40f4
+++ test_filter_alias_with_double_f (annotations.tests.AliasTests):postgresql:9bd174b9a7
@@ -11,10 +11,8 @@
 WHERE "annotations_book"."rating" = ("annotations_book"."rating")
 ORDER BY "annotations_book"."id" ASC
 LIMIT 1
-SELECT COUNT(*)
-FROM
-  (SELECT "annotations_book"."id" AS "col1"
-   FROM "annotations_book"
-   WHERE "annotations_book"."rating" = ("annotations_book"."rating")) subquery
 SELECT COUNT(*) AS "__count"
 FROM "annotations_book"
+WHERE "annotations_book"."rating" = ("annotations_book"."rating")
+SELECT COUNT(*) AS "__count"
+FROM "annotations_book"
--- test_aggregate_over_annotation (annotations.tests.NonAggregateAnnotationTestCase):postgresql:321ecb40f4
+++ test_aggregate_over_annotation (annotations.tests.NonAggregateAnnotationTestCase):postgresql:9bd174b9a7
@@ -1,6 +1,4 @@
-SELECT SUM("other_age")
-FROM
-  (SELECT "annotations_author"."age" AS "other_age"
-   FROM "annotations_author") subquery
+SELECT SUM("annotations_author"."age") AS "otherage_sum"
+FROM "annotations_author"
 SELECT SUM("annotations_author"."age") AS "age_sum"
 FROM "annotations_author"
--- test_aggregate_over_full_expression_annotation (annotations.tests.NonAggregateAnnotationTestCase):postgresql:321ecb40f4
+++ test_aggregate_over_full_expression_annotation (annotations.tests.NonAggregateAnnotationTestCase):postgresql:9bd174b9a7
@@ -1,6 +1,4 @@
-SELECT SUM(("selected")::integer)
-FROM
-  (SELECT %s AS "selected"
-   FROM "annotations_book") subquery
+SELECT SUM((%s)::integer) AS "selected__sum"
+FROM "annotations_book"
 SELECT COUNT(*) AS "__count"
 FROM "annotations_book"
--- test_aggregate_rawsql_annotation (expressions.tests.BasicExpressionsTests):postgresql:321ecb40f4
+++ test_aggregate_rawsql_annotation (expressions.tests.BasicExpressionsTests):postgresql:9bd174b9a7
@@ -1,6 +1,2 @@
-SELECT COUNT("__col1")
-FROM
-  (SELECT (SUM(num_chairs) OVER (
-                                 ORDER BY num_employees)) AS "salary",
-          "expressions_company"."id" AS "__col1"
-   FROM "expressions_company") subquery
+SELECT COUNT("expressions_company"."id") AS "count"
+FROM "expressions_company"
--- test_aggregate_subquery_annotation (expressions.tests.BasicExpressionsTests):postgresql:321ecb40f4
+++ test_aggregate_subquery_annotation (expressions.tests.BasicExpressionsTests):postgresql:9bd174b9a7
@@ -1,9 +1,6 @@
-SELECT COUNT("__col1") FILTER (
-                               WHERE "ceo_salary" > %s)
-FROM
-  (SELECT
-     (SELECT U0."salary"
-      FROM "expressions_employee" U0
-      WHERE U0."id" = ("expressions_company"."ceo_id")) AS "ceo_salary",
-          "expressions_company"."id" AS "__col1"
-   FROM "expressions_company") subquery
+SELECT COUNT("expressions_company"."id") FILTER (
+                                                 WHERE
+                                                     (SELECT U0."salary"
+                                                      FROM "expressions_employee" U0
+                                                      WHERE U0."id" = ("expressions_company"."ceo_id")) > %s) AS "ceo_salary_gt_20"
+FROM "expressions_company"
--- test_annotate_values_aggregate (expressions.tests.BasicExpressionsTests):postgresql:321ecb40f4
+++ test_annotate_values_aggregate (expressions.tests.BasicExpressionsTests):postgresql:9bd174b9a7
@@ -1,7 +1,3 @@
-SELECT SUM(("salaries" + "__col1"))
-FROM
-  (SELECT "expressions_company"."num_employees" AS "col1",
-          "expressions_employee"."salary" AS "salaries",
-          "expressions_company"."num_employees" AS "__col1"
-   FROM "expressions_company"
-   INNER JOIN "expressions_employee" ON ("expressions_company"."ceo_id" = "expressions_employee"."id")) subquery
+SELECT SUM(("expressions_employee"."salary" + "expressions_company"."num_employees")) AS "result"
+FROM "expressions_company"
+INNER JOIN "expressions_employee" ON ("expressions_company"."ceo_id" = "expressions_employee"."id")
--- test_annotate_values_count (expressions.tests.BasicExpressionsTests):postgresql:321ecb40f4
+++ test_annotate_values_count (expressions.tests.BasicExpressionsTests):postgresql:9bd174b9a7
@@ -1,4 +1,2 @@
-SELECT COUNT(*)
-FROM
-  (SELECT (%s) AS "foo"
-   FROM "expressions_company") subquery
+SELECT COUNT(*) AS "__count"
+FROM "expressions_company"
--- test_filtering_on_annotate_that_uses_q (expressions.tests.BasicExpressionsTests):postgresql:321ecb40f4
+++ test_filtering_on_annotate_that_uses_q (expressions.tests.BasicExpressionsTests):postgresql:9bd174b9a7
@@ -1,5 +1,3 @@
-SELECT COUNT(*)
-FROM
-  (SELECT "expressions_company"."num_employees" > %s AS "num_employees_check"
-   FROM "expressions_company"
-   WHERE ("expressions_company"."num_employees" > %s)) subquery
+SELECT COUNT(*) AS "__count"
+FROM "expressions_company"
+WHERE ("expressions_company"."num_employees" > %s)
--- test_filtered_aggregate_ref_annotation (aggregation.test_filter_argument.FilteredAggregateTests):postgresql:321ecb40f4
+++ test_filtered_aggregate_ref_annotation (aggregation.test_filter_argument.FilteredAggregateTests):postgresql:9bd174b9a7
@@ -1,6 +1,3 @@
-SELECT COUNT("__col1") FILTER (
-                               WHERE "double_age" > %s)
-FROM
-  (SELECT ("aggregation_author"."age" * %s) AS "double_age",
-          "aggregation_author"."id" AS "__col1"
-   FROM "aggregation_author") subquery
+SELECT COUNT("aggregation_author"."id") FILTER (
+                                                WHERE ("aggregation_author"."age" * %s) > %s) AS "cnt"
+FROM "aggregation_author"
--- test_filtered_aggregate_ref_multiple_subquery_annotation (aggregation.test_filter_argument.FilteredAggregateTests):postgresql:321ecb40f4
+++ test_filtered_aggregate_ref_multiple_subquery_annotation (aggregation.test_filter_argument.FilteredAggregateTests):postgresql:9bd174b9a7
@@ -1,23 +1,18 @@
-SELECT MAX("__col1") FILTER (
-                             WHERE (NOT "authors_have_other_books"
-                                    AND "has_authors"))
-FROM
-  (SELECT "aggregation_book"."publisher_id" AS "col1",
-          EXISTS
-     (SELECT %s AS "a"
-      FROM "aggregation_book_authors" U0
-      WHERE U0."book_id" = ("aggregation_book"."id")
-      LIMIT 1) AS "has_authors",
-          EXISTS
-     (SELECT %s AS "a"
-      FROM "aggregation_book" V0
-      INNER JOIN "aggregation_book_authors" V1 ON (V0."id" = V1."book_id")
-      WHERE (V1."author_id" IN
-               (SELECT U0."id"
-                FROM "aggregation_author" U0
-                INNER JOIN "aggregation_book" U1 ON (U0."id" = U1."contact_id")
-                WHERE U1."id" = ("aggregation_book"."id"))
-             AND NOT (V0."id" = ("aggregation_book"."id")))
-      LIMIT 1) AS "authors_have_other_books",
-          "aggregation_book"."rating" AS "__col1"
-   FROM "aggregation_book") subquery
+SELECT MAX("aggregation_book"."rating") FILTER (
+                                                WHERE (NOT EXISTS
+                                                         (SELECT %s AS "a"
+                                                          FROM "aggregation_book" V0
+                                                          INNER JOIN "aggregation_book_authors" V1 ON (V0."id" = V1."book_id")
+                                                          WHERE (V1."author_id" IN
+                                                                   (SELECT U0."id"
+                                                                    FROM "aggregation_author" U0
+                                                                    INNER JOIN "aggregation_book" U1 ON (U0."id" = U1."contact_id")
+                                                                    WHERE U1."id" = ("aggregation_book"."id"))
+                                                                 AND NOT (V0."id" = ("aggregation_book"."id")))
+                                                          LIMIT 1)
+                                                       AND EXISTS
+                                                         (SELECT %s AS "a"
+                                                          FROM "aggregation_book_authors" U0
+                                                          WHERE U0."book_id" = ("aggregation_book"."id")
+                                                          LIMIT 1))) AS "max_rating"
+FROM "aggregation_book"
--- test_filtered_aggregate_ref_subquery_annotation (aggregation.test_filter_argument.FilteredAggregateTests):postgresql:321ecb40f4
+++ test_filtered_aggregate_ref_subquery_annotation (aggregation.test_filter_argument.FilteredAggregateTests):postgresql:9bd174b9a7
@@ -1,12 +1,9 @@
-SELECT COUNT("__col1") FILTER (
-                               WHERE "earliest_book_year" = %s)
-FROM
-  (SELECT
-     (SELECT EXTRACT(%s
-                     FROM U0."pubdate")
-      FROM "aggregation_book" U0
-      WHERE U0."contact_id" = ("aggregation_author"."id")
-      ORDER BY U0."pubdate" ASC
-      LIMIT 1) AS "earliest_book_year",
-          "aggregation_author"."id" AS "__col1"
-   FROM "aggregation_author") subquery
+SELECT COUNT("aggregation_author"."id") FILTER (
+                                                WHERE
+                                                    (SELECT EXTRACT(%s
+                                                                    FROM U0."pubdate")
+                                                     FROM "aggregation_book" U0
+                                                     WHERE U0."contact_id" = ("aggregation_author"."id")
+                                                     ORDER BY U0."pubdate" ASC
+                                                     LIMIT 1) = %s) AS "cnt"
+FROM "aggregation_author"
--- test_aggregation_default_after_annotation (aggregation.tests.AggregateTestCase):postgresql:321ecb40f4
+++ test_aggregation_default_after_annotation (aggregation.tests.AggregateTestCase):postgresql:9bd174b9a7
@@ -1,4 +1,2 @@
-SELECT COALESCE(SUM("double_num_awards"), %s)
-FROM
-  (SELECT ("aggregation_publisher"."num_awards" * %s) AS "double_num_awards"
-   FROM "aggregation_publisher") subquery
+SELECT COALESCE(SUM(("aggregation_publisher"."num_awards" * %s)), %s) AS "value"
+FROM "aggregation_publisher"
--- test_aggregation_default_not_in_aggregate (aggregation.tests.AggregateTestCase):postgresql:321ecb40f4
+++ test_aggregation_default_not_in_aggregate (aggregation.tests.AggregateTestCase):postgresql:9bd174b9a7
@@ -1,7 +1,6 @@
 SELECT SUM("__col1")
 FROM
-  (SELECT COALESCE(AVG("aggregation_book"."rating"), %s) AS "avg_rating",
-          "aggregation_publisher"."num_awards" AS "__col1"
+  (SELECT "aggregation_publisher"."num_awards" AS "__col1"
    FROM "aggregation_publisher"
    LEFT OUTER JOIN "aggregation_book" ON ("aggregation_publisher"."id" = "aggregation_book"."publisher_id")
    GROUP BY "aggregation_publisher"."id") subquery
--- test_aggregation_subquery_annotation_multivalued (aggregation.tests.AggregateTestCase):postgresql:321ecb40f4
+++ test_aggregation_subquery_annotation_multivalued (aggregation.tests.AggregateTestCase):postgresql:9bd174b9a7
@@ -1,17 +1,9 @@
 SELECT COUNT(*)
 FROM
-  (SELECT
-     (SELECT U0."id"
-      FROM "aggregation_author" U0
-      INNER JOIN "aggregation_book_authors" U1 ON (U0."id" = U1."author_id")
-      INNER JOIN "aggregation_book" U2 ON (U1."book_id" = U2."id")
-      WHERE (U2."name" = ("aggregation_book"."name")
-             AND U0."id" = ("aggregation_author"."id"))) AS "subquery_id",
-          COUNT("aggregation_book_authors"."book_id") AS "count"
+  (SELECT "aggregation_author"."id" AS "col1"
    FROM "aggregation_author"
    LEFT OUTER JOIN "aggregation_book_authors" ON ("aggregation_author"."id" = "aggregation_book_authors"."author_id")
    LEFT OUTER JOIN "aggregation_book" ON ("aggregation_book_authors"."book_id" = "aggregation_book"."id")
-   GROUP BY "aggregation_author"."id",
-            "subquery_id") subquery
+   GROUP BY "col1") subquery
 SELECT COUNT(*) AS "__count"
 FROM "aggregation_author"
--- test_annotate_values_aggregate (aggregation.tests.AggregateTestCase):postgresql:321ecb40f4
+++ test_annotate_values_aggregate (aggregation.tests.AggregateTestCase):postgresql:9bd174b9a7
@@ -1,6 +1,4 @@
-SELECT SUM("age_alias")
-FROM
-  (SELECT "aggregation_author"."age" AS "age_alias"
-   FROM "aggregation_author") subquery
 SELECT SUM("aggregation_author"."age") AS "sum_age"
 FROM "aggregation_author"
+SELECT SUM("aggregation_author"."age") AS "sum_age"
+FROM "aggregation_author"
--- test_more_aggregation (aggregation.tests.AggregateTestCase):postgresql:321ecb40f4
+++ test_more_aggregation (aggregation.tests.AggregateTestCase):postgresql:9bd174b9a7
@@ -32,8 +32,7 @@
 WHERE "aggregation_book"."id" = %s
 SELECT AVG("__col1")
 FROM
-  (SELECT COUNT("aggregation_book_authors"."author_id") AS "num_authors",
-          "aggregation_book"."rating" AS "__col1"
+  (SELECT "aggregation_book"."rating" AS "__col1"
    FROM "aggregation_book"
    LEFT OUTER JOIN "aggregation_book_authors" ON ("aggregation_book"."id" = "aggregation_book_authors"."book_id")
    INNER JOIN "aggregation_book_authors" T4 ON ("aggregation_book"."id" = T4."book_id")
--- test_aggregate_on_relation (aggregation_regress.tests.AggregationTests):postgresql:321ecb40f4
+++ test_aggregate_on_relation (aggregation_regress.tests.AggregationTests):postgresql:9bd174b9a7
@@ -1,7 +1,6 @@
 SELECT SUM("__col1")
 FROM
-  (SELECT AVG("aggregation_regress_book"."price") AS "avg_price",
-          "aggregation_regress_publisher"."num_awards" AS "__col1"
+  (SELECT "aggregation_regress_publisher"."num_awards" AS "__col1"
    FROM "aggregation_regress_book"
    INNER JOIN "aggregation_regress_publisher" ON ("aggregation_regress_book"."publisher_id" = "aggregation_regress_publisher"."id")
    GROUP BY "aggregation_regress_book"."id",
--- test_annotate_and_join (aggregation_regress.tests.AggregationTests):postgresql:321ecb40f4
+++ test_annotate_and_join (aggregation_regress.tests.AggregationTests):postgresql:9bd174b9a7
@@ -1,6 +1,6 @@
 SELECT COUNT(*)
 FROM
-  (SELECT COUNT(T3."name") AS "c"
+  (SELECT "aggregation_regress_author"."id" AS "col1"
    FROM "aggregation_regress_author"
    LEFT OUTER JOIN "aggregation_regress_author_friends" ON ("aggregation_regress_author"."id" = "aggregation_regress_author_friends"."from_author_id")
    LEFT OUTER JOIN "aggregation_regress_author" T3 ON ("aggregation_regress_author_friends"."to_author_id" = T3."id")
@@ -11,6 +11,6 @@
                  WHERE (U2."name" = %s
                         AND U1."from_author_id" = ("aggregation_regress_author"."id"))
                  LIMIT 1))
-   GROUP BY "aggregation_regress_author"."id") subquery
+   GROUP BY "col1") subquery
 SELECT COUNT(*) AS "__count"
 FROM "aggregation_regress_author"
--- test_annotated_conditional_aggregate (aggregation_regress.tests.AggregationTests):postgresql:321ecb40f4
+++ test_annotated_conditional_aggregate (aggregation_regress.tests.AggregationTests):postgresql:9bd174b9a7
@@ -1,8 +1,5 @@
 SELECT AVG(CASE
-               WHEN "__col1" < %s THEN "discount_price"
+               WHEN "aggregation_regress_book"."pages" < %s THEN ("aggregation_regress_book"."price" * %s)
                ELSE NULL
-           END)
-FROM
-  (SELECT ("aggregation_regress_book"."price" * %s) AS "discount_price",
-          "aggregation_regress_book"."pages" AS "__col1"
-   FROM "aggregation_regress_book") subquery
+           END) AS "test"
+FROM "aggregation_regress_book"
--- test_conditional_aggregate (aggregation_regress.tests.AggregationTests):postgresql:321ecb40f4
+++ test_conditional_aggregate (aggregation_regress.tests.AggregationTests):postgresql:9bd174b9a7
@@ -1,10 +1,10 @@
 SELECT SUM(CASE
-               WHEN "__col1" > %s THEN %s
+               WHEN "c" > %s THEN %s
                ELSE NULL
            END)
 FROM
   (SELECT "aggregation_regress_book"."id" AS "col1",
-          COUNT("aggregation_regress_book_authors"."author_id") AS "__col1"
+          COUNT("aggregation_regress_book_authors"."author_id") AS "c"
    FROM "aggregation_regress_book"
    LEFT OUTER JOIN "aggregation_regress_book_authors" ON ("aggregation_regress_book"."id" = "aggregation_regress_book_authors"."book_id")
    GROUP BY "col1") subquery
--- test_more (aggregation_regress.tests.AggregationTests):postgresql:321ecb40f4
+++ test_more (aggregation_regress.tests.AggregationTests):postgresql:9bd174b9a7
@@ -1,9 +1,9 @@
 SELECT COUNT(*)
 FROM
-  (SELECT COUNT("aggregation_regress_book_authors"."author_id") AS "num_authors"
+  (SELECT "aggregation_regress_book"."id" AS "col1"
    FROM "aggregation_regress_book"
    LEFT OUTER JOIN "aggregation_regress_book_authors" ON ("aggregation_regress_book"."id" = "aggregation_regress_book_authors"."book_id")
-   GROUP BY "aggregation_regress_book"."id") subquery
+   GROUP BY "col1") subquery
 SELECT MAX("num_authors")
 FROM
   (SELECT COUNT("aggregation_regress_book_authors"."author_id") AS "num_authors"
--- test_more_more_more (aggregation_regress.tests.AggregationTests):postgresql:321ecb40f4
+++ test_more_more_more (aggregation_regress.tests.AggregationTests):postgresql:9bd174b9a7
@@ -30,8 +30,7 @@
 ORDER BY "sheets" ASC
 SELECT COUNT(*)
 FROM
-  (SELECT "aggregation_regress_book"."publisher_id" AS "col1",
-          COUNT("aggregation_regress_book"."publisher_id") AS "publisher__count"
+  (SELECT "aggregation_regress_book"."publisher_id" AS "col1"
    FROM "aggregation_regress_book"
    GROUP BY "col1") subquery
 SELECT COUNT(*)
--- test_q_annotation_aggregate (aggregation_regress.tests.AggregationTests):postgresql:321ecb40f4
+++ test_q_annotation_aggregate (aggregation_regress.tests.AggregationTests):postgresql:9bd174b9a7
@@ -1,4 +1,2 @@
-SELECT COUNT(*)
-FROM
-  (SELECT "aggregation_regress_book"."id" IS NOT NULL AS "has_pk"
-   FROM "aggregation_regress_book") subquery
+SELECT COUNT(*) AS "__count"
+FROM "aggregation_regress_book"
--- test_filter_count (expressions_window.tests.WindowFunctionTests):postgresql:321ecb40f4
+++ test_filter_count (expressions_window.tests.WindowFunctionTests):postgresql:9bd174b9a7
@@ -1,8 +1,13 @@
 SELECT COUNT(*)
 FROM
-  (SELECT *
+  (SELECT "col1"
    FROM
-     (SELECT RANK() OVER (PARTITION BY "expressions_window_employee"."department"
-                          ORDER BY "expressions_window_employee"."salary" DESC) AS "department_salary_rank"
-      FROM "expressions_window_employee") "qualify"
-   WHERE "department_salary_rank" = %s) subquery
+     (SELECT *
+      FROM
+        (SELECT "expressions_window_employee"."id" AS "col1",
+                RANK() OVER (PARTITION BY "expressions_window_employee"."department"
+                             ORDER BY "expressions_window_employee"."salary" DESC) AS "qual0"
+         FROM "expressions_window_employee"
+         GROUP BY "col1",
+                  "expressions_window_employee"."department") "qualify"
+      WHERE "qual0" = %s ) "qualify_mask") subquery
--- test_related_ordering_with_count (expressions_window.tests.WindowFunctionTests):postgresql:321ecb40f4
+++ test_related_ordering_with_count (expressions_window.tests.WindowFunctionTests):postgresql:9bd174b9a7
@@ -1,6 +1,6 @@
 SELECT COUNT(*)
 FROM
-  (SELECT SUM("expressions_window_employee"."salary") OVER (PARTITION BY "expressions_window_employee"."department"
-                                                            ORDER BY "expressions_window_classification"."code") AS "department_sum"
+  (SELECT "expressions_window_employee"."id" AS "col1"
    FROM "expressions_window_employee"
-   LEFT OUTER JOIN "expressions_window_classification" ON ("expressions_window_employee"."classification_id" = "expressions_window_classification"."id")) subquery
+   LEFT OUTER JOIN "expressions_window_classification" ON ("expressions_window_employee"."classification_id" = "expressions_window_classification"."id")
+   GROUP BY "col1") subquery

The expressions_window seems like the only undesirable change and is tackled by #16277

@charettes
Copy link
Member Author

It's interesting to see that there is still room for optimization when pruning aggregate annotations that were referring to JOINs.

aggregation.tests.AggregateTestCase.test_aggregation_default_not_in_aggregate is a good example of that

--- test_aggregation_default_not_in_aggregate (aggregation.tests.AggregateTestCase):postgresql:321ecb40f4
+++ test_aggregation_default_not_in_aggregate (aggregation.tests.AggregateTestCase):postgresql:9bd174b9a7
@@ -1,7 +1,6 @@
 SELECT SUM("__col1")
 FROM
-  (SELECT COALESCE(AVG("aggregation_book"."rating"), %s) AS "avg_rating",
-          "aggregation_publisher"."num_awards" AS "__col1"
+  (SELECT "aggregation_publisher"."num_awards" AS "__col1"
    FROM "aggregation_publisher"
    LEFT OUTER JOIN "aggregation_book" ON ("aggregation_publisher"."id" = "aggregation_book"."publisher_id")
    GROUP BY "aggregation_publisher"."id") subquery

The COALESCE(AVG("aggregation_book"."rating"), %s) is pruned so there is no need for the LEFT OUTER JOIN "aggregation_book" anymore that forces the use of the GROUP BY and of the subquery pushdown.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants