Fixed #35235 -- ArrayAgg() doesn't return default when filter contains __in=[]. #17890

sharonwoo · 2024-02-20T08:13:34Z

https://code.djangoproject.com/ticket/35235

New to Django core here and wanting to try for next djangonaut run, thought I could try to triage by writing some quickie tests to see if I could replicate the issue... and I can't. My 2 additional tests (and all of them, actually) pass locally.

Let me know if this is not a good approach.

github-actions

Hello! Thank you for your contribution 💪

As it's your first contribution be sure to check out the patch review checklist.

If you're fixing a ticket from Trac make sure to set the "Has patch" flag and include a link to this PR in the ticket!

If you have any design or process questions then you can ask in the Django forum.

Welcome aboard ⛵️!

tests/postgres_tests/test_aggregates.py

sharonwoo · 2024-02-22T09:42:50Z

tests/postgres_tests/test_aggregates.py

+            )
+            self.assertEqual([], result)
+
+        self.assertIn("[] != '{}'", str(context.exception))


This test, with _output_field_or_none as @cached_property, raises an exception as result gives '{}'. I then call the context.exception to show what the result is... a bit clumsy I know

This is only for experiments, right?
Normally, I'd use:

qs = AggregateTestModel.objects.annotate( test_array_agg=ArrayAgg( "stattestmodel__int1", filter=Q(pk__in=[]), default=Value([]), ) ) self.assertSequenceEqual(qs.get().test_array_agg, [])

felixxm · 2024-02-22T09:49:32Z

tests/postgres_tests/test_aggregates.py

+                .first()
+                .test_array_agg
+            )
+            self.assertEqual([], result)


It's preferred to have an expected value as a second argument.

Suggested change

self.assertEqual([], result)

self.assertSequenceEqual(result, [])

shangxiao · 2024-02-22T09:54:04Z

tests/postgres_tests/test_aggregates.py

+                    test_array_agg=ArrayAgg(
+                        "stattestmodel__int1",
+                        filter=Q(pk__in=[]),
+                        default=Value([]),


An optional idea: maybe setup this as a subtest to test the conditions:

filter value:

[-1]

[]

default value:

[]

Value([])

This increases our "black box" coverage. (Simon mentioned that Value(…) is no longer necessary for defaults and we may want to update the docs to reflect that but it still may be valuable to use for coverage.)

Edit: Clarified conditions as filter value × default value as per Mariusz' setup in the ticket.

shangxiao · 2024-02-22T10:03:13Z

I wonder if we should also add a test around _output_field_or_none though it could be a bit too trivial 🤔

eg

def test_output_field_or_none(self):
    expression = FooExpression()
    self.assertIsNone(assert expression._output_field_or_none)
    new_output_field = BooleanField()
    expression.output_field = new_output_field
    self.assertIs(expression._output_field_or_none, new_output_field)

shangxiao · 2024-02-23T03:49:11Z

tests/postgres_tests/test_aggregates.py

+    def test_array_agg_with_empty_filter_and_default_values(self):
+        for filter_value in ([-1], []):
+            for default_value in ([], Value([])):
+                with self.subTest(filter=filter_value, default=default_value):


charettes · 2024-02-23T04:05:03Z

@shangxiao

I wonder if we should also add a test around _output_field_or_none though it could be a bit too trivial 🤔

I think it might be worth it yes just to capture that is it an unexpected behaviour of BaseExpression

Also do you know how to trigger a benchmark run on a PR? I figured it might be worth a run to assert our expectations wrt/to _output_field_or_none uncaching hold.

My expectations are that since every expression is composed of other expressions that they resolve their output field from and eventually cache them it should have no impact but it might be valuable to confirm prior to merging if it's just a few more CPU cycle away.

charettes · 2024-02-23T04:09:15Z

BTW I also just noticed _output_field_resolved_to_none which gets assigned in output_field and it seems that flag might also warrant clearing? I wonder if its existence warrants the usage of a slightly more involved solution that has output_field assignment clear both _output_field_or_none and _output_field_resolved_to_none instead.

shangxiao · 2024-02-23T07:04:05Z

BTW I also just noticed _output_field_resolved_to_none which gets assigned in output_field and it seems that flag might also warrant clearing? I wonder if its existence warrants the usage of a slightly more involved solution that has output_field assignment clear both _output_field_or_none and _output_field_resolved_to_none instead.

I saw that. Initially I figured we could ignore it as it's only used in that module, but as folks always say we don't know how this module gets used out in the wild 😝

A simpler (and possibly more Pythonic?) approach could just be to remove flags altogether and raise a different exception type, and use that as the distinguishing factor as to whether _output_field_or_none() reraises.

eg

--- a/django/db/models/expressions.py
+++ b/django/db/models/expressions.py
@@ -172,7 +172,6 @@ class BaseExpression:
     empty_result_set_value = NotImplemented
     # aggregate specific fields
     is_summary = False
-    _output_field_resolved_to_none = False
     # Can the expression be used in a WHERE clause?
     filterable = True
     # Can the expression can be used as a source expression in Window?
@@ -310,8 +309,7 @@ class BaseExpression:
         """Return the output type of this expressions."""
         output_field = self._resolve_output_field()
         if output_field is None:
-            self._output_field_resolved_to_none = True
-            raise FieldError("Cannot resolve expression type, unknown output_field")
+            raise ValueError("Cannot resolve expression type, unknown output_field")
         return output_field

     @property
@@ -323,8 +321,7 @@ class BaseExpression:
         try:
             return self.output_field
         except FieldError:
-            if not self._output_field_resolved_to_none:
-                raise
+            raise

     def _resolve_output_field(self):
         """

shangxiao · 2024-02-23T07:18:10Z

Or maybe this is better as it keeps the API consistent - this way it doesn't disrupt folks expecting FieldError from .output_field.

I know Mariusz doesn't like specific exceptions but this is a practice I find valuable - a practice which I got from old mate Raymond Hettinger in one of his presentations.

Anyway though the point I'm making is that "state is the root of all evil" (aside from premature optimisation that is 😁)

--- a/django/db/models/expressions.py
+++ b/django/db/models/expressions.py
@@ -166,13 +166,16 @@ class Combinable:
         return NegatedExpression(self)


+class OutputFieldIsNoneError(FieldError):
+    pass
+
+
 class BaseExpression:
     """Base class for all query expressions."""

     empty_result_set_value = NotImplemented
     # aggregate specific fields
     is_summary = False
-    _output_field_resolved_to_none = False
     # Can the expression be used in a WHERE clause?
     filterable = True
     # Can the expression can be used as a source expression in Window?
@@ -310,8 +313,9 @@ class BaseExpression:
         """Return the output type of this expressions."""
         output_field = self._resolve_output_field()
         if output_field is None:
-            self._output_field_resolved_to_none = True
-            raise FieldError("Cannot resolve expression type, unknown output_field")
+            raise OutputFieldIsNoneError(
+                "Cannot resolve expression type, unknown output_field"
+            )
         return output_field

     @property
@@ -322,9 +326,10 @@ class BaseExpression:
         """
         try:
             return self.output_field
+        except OutputFieldIsNoneError:
+            return
         except FieldError:
-            if not self._output_field_resolved_to_none:
-                raise
+            raise

     def _resolve_output_field(self):
         """

charettes · 2024-02-23T15:09:28Z

I like the OutputFieldIsNoneError approach a lot, it seems like it should have been this way in the first place.

One small tweak I would make; the whole except FieldError: raise branch is now unnecessary

--- a/django/db/models/expressions.py
+++ b/django/db/models/expressions.py
@@ -166,13 +166,16 @@ class Combinable:
         return NegatedExpression(self)


+class OutputFieldIsNoneError(FieldError):
+    pass
+
+
 class BaseExpression:
     """Base class for all query expressions."""

     empty_result_set_value = NotImplemented
     # aggregate specific fields
     is_summary = False
-    _output_field_resolved_to_none = False
     # Can the expression be used in a WHERE clause?
     filterable = True
     # Can the expression can be used as a source expression in Window?
@@ -310,8 +313,9 @@ class BaseExpression:
         """Return the output type of this expressions."""
         output_field = self._resolve_output_field()
         if output_field is None:
-            self._output_field_resolved_to_none = True
-            raise FieldError("Cannot resolve expression type, unknown output_field")
+            raise OutputFieldIsNoneError(
+                "Cannot resolve expression type, unknown output_field"
+            )
         return output_field

     @property
@@ -322,9 +326,10 @@ class BaseExpression:
         """
         try:
             return self.output_field
+        except OutputFieldIsNoneError:
+            return
-         except FieldError:
-            if not self._output_field_resolved_to_none:
-                raise

     def _resolve_output_field(self):
         """

shangxiao · 2024-02-23T16:38:51Z

Ah yes of course you're right, one of those forest-trees moments 😂

shangxiao · 2024-02-25T07:42:30Z

@sharonwoo Just need some tests around _output_field_or_none 👍

sharonwoo · 2024-02-28T17:29:24Z

tests/expressions/tests.py

-        with self.assertRaisesMessage(FieldError, msg):
-            Value(object()).output_field
+    def test_output_field_resolution(self):
+        # why does this fail, but test_resolve_output_field_with_null pass?


For my own understanding - to understand the difference between this test and the set of tests containing test_resolve_output_field_with_null

Still reading https://code.djangoproject.com/ticket/33397 and #15271 Q__Q

test_output_field_resolution should see the expression combinations put into the annotation raise OutputFieldIsNoneError EXCEPT the three handled by db.models.expressions._connector_combinations, 'IntegerField', 'DecimalField', 'FloatField'.

… __in=[]

sharonwoo · 2024-05-07T07:18:14Z

Sorry, let me come back to this this or next week :(

sharonwoo marked this pull request as draft February 20, 2024 08:14

github-actions bot reviewed Feb 20, 2024

View reviewed changes

percarlsen reviewed Feb 20, 2024

View reviewed changes

tests/postgres_tests/test_aggregates.py Outdated Show resolved Hide resolved

shangxiao reviewed Feb 20, 2024

View reviewed changes

tests/postgres_tests/test_aggregates.py Outdated Show resolved Hide resolved

sharonwoo commented Feb 22, 2024

View reviewed changes

felixxm reviewed Feb 22, 2024

View reviewed changes

shangxiao reviewed Feb 22, 2024

View reviewed changes

sharonwoo changed the title ~~ArrayAgg inconsistent return value when no results - could not replicate issue~~ Fixed #35235 -- ArrayAgg() doesn't return default when filter contains __in=[]. Feb 23, 2024

sharonwoo force-pushed the arrayagg-inconsistent-return-value-ticket-35235 branch from 51f14bf to c4c12a0 Compare February 23, 2024 03:18

sharonwoo marked this pull request as ready for review February 23, 2024 03:27

sharonwoo force-pushed the arrayagg-inconsistent-return-value-ticket-35235 branch from c4c12a0 to 80fab95 Compare February 23, 2024 03:33

shangxiao reviewed Feb 23, 2024

View reviewed changes

felixxm added the benchmark Apply to have ASV benchmarks run on a PR label Feb 23, 2024

sharonwoo force-pushed the arrayagg-inconsistent-return-value-ticket-35235 branch from 80fab95 to b148ed7 Compare February 24, 2024 14:48

sharonwoo force-pushed the arrayagg-inconsistent-return-value-ticket-35235 branch 3 times, most recently from 1e346f5 to 407d534 Compare February 27, 2024 05:11

sharonwoo marked this pull request as draft February 27, 2024 05:13

sharonwoo mentioned this pull request Feb 28, 2024

Fixed #35257 -- Corrected resolving output_field for IntegerField/DecimalField with NULL. #17916

Merged

sharonwoo commented Feb 28, 2024

View reviewed changes

sharonwoo added 2 commits March 4, 2024 16:35

Fixed #35235 -- ArrayAgg() not returning default when filter contains…

145af2b

… __in=[]

Fixed #35235 -- ArrayAgg() not returning default when filter contains…

14d3065

… __in=[]

sharonwoo force-pushed the arrayagg-inconsistent-return-value-ticket-35235 branch from 407d534 to 14d3065 Compare March 4, 2024 08:35

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fixed #35235 -- ArrayAgg() doesn't return default when filter contains __in=[]. #17890

Fixed #35235 -- ArrayAgg() doesn't return default when filter contains __in=[]. #17890

sharonwoo commented Feb 20, 2024

github-actions bot left a comment

sharonwoo Feb 22, 2024 •

edited

Loading

felixxm Feb 22, 2024

felixxm Feb 22, 2024

shangxiao Feb 22, 2024 •

edited

Loading

shangxiao commented Feb 22, 2024

shangxiao Feb 23, 2024

charettes commented Feb 23, 2024 •

edited

Loading

charettes commented Feb 23, 2024 •

edited

Loading

shangxiao commented Feb 23, 2024

shangxiao commented Feb 23, 2024 •

edited

Loading

charettes commented Feb 23, 2024 •

edited

Loading

shangxiao commented Feb 23, 2024

shangxiao commented Feb 25, 2024

sharonwoo Feb 28, 2024

sharonwoo Mar 1, 2024 •

edited

Loading

sharonwoo commented May 7, 2024

	self.assertEqual([], result)
	self.assertSequenceEqual(result, [])

Fixed #35235 -- ArrayAgg() doesn't return default when filter contains __in=[]. #17890

Are you sure you want to change the base?

Fixed #35235 -- ArrayAgg() doesn't return default when filter contains __in=[]. #17890

Conversation

sharonwoo commented Feb 20, 2024

github-actions bot left a comment

Choose a reason for hiding this comment

sharonwoo Feb 22, 2024 • edited Loading

Choose a reason for hiding this comment

felixxm Feb 22, 2024

Choose a reason for hiding this comment

felixxm Feb 22, 2024

Choose a reason for hiding this comment

shangxiao Feb 22, 2024 • edited Loading

Choose a reason for hiding this comment

shangxiao commented Feb 22, 2024

shangxiao Feb 23, 2024

Choose a reason for hiding this comment

charettes commented Feb 23, 2024 • edited Loading

charettes commented Feb 23, 2024 • edited Loading

shangxiao commented Feb 23, 2024

shangxiao commented Feb 23, 2024 • edited Loading

charettes commented Feb 23, 2024 • edited Loading

shangxiao commented Feb 23, 2024

shangxiao commented Feb 25, 2024

sharonwoo Feb 28, 2024

Choose a reason for hiding this comment

sharonwoo Mar 1, 2024 • edited Loading

Choose a reason for hiding this comment

sharonwoo commented May 7, 2024

sharonwoo Feb 22, 2024 •

edited

Loading

shangxiao Feb 22, 2024 •

edited

Loading

charettes commented Feb 23, 2024 •

edited

Loading

charettes commented Feb 23, 2024 •

edited

Loading

shangxiao commented Feb 23, 2024 •

edited

Loading

charettes commented Feb 23, 2024 •

edited

Loading

sharonwoo Mar 1, 2024 •

edited

Loading