Fixed #27849 -- Added filtering support to aggregates. #8352

Merged
merged 1 commit into from Aug 12, 2017

Contributor

@orf orf commented Apr 12, 2017

Re: #8306, this MR is an initial stab at supporting .filter and .exclude on any Aggregate.

We do this in two ways: on Postgres we use the SQL:2003 FILTER syntax, AGG(field) FILTER (WHERE ...), and on other databases we emulate it with AGG(CASE WHEN ... THEN field ELSE NULL END). The FILTER syntax is currently only supported on Postgres.

The reason for supporting both, despite being functionally equivalent, is that the postgres syntax is faster.
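To make the emulation concrete, here is a minimal sketch using Python's standard-library sqlite3 module. The email table and its columns are made up for illustration and are not part of this change. Only the CASE form is executed; the FILTER form appears as a comment because older SQLite builds do not accept it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table, loosely following the mailbox example.
conn.execute("CREATE TABLE email (mailbox TEXT, unread INTEGER)")
conn.executemany(
    "INSERT INTO email VALUES (?, ?)",
    [("a", 1), ("a", 0), ("a", 1), ("b", 0)],
)

# SQL:2003 form (not executed here):
#   COUNT(unread) FILTER (WHERE unread = 1)
# Emulated form: rows failing the condition become NULL, which COUNT ignores.
emulated = conn.execute(
    "SELECT mailbox, "
    "COUNT(CASE WHEN unread = 1 THEN unread ELSE NULL END), "
    "COUNT(CASE WHEN unread = 0 THEN unread ELSE NULL END) "
    "FROM email GROUP BY mailbox ORDER BY mailbox"
).fetchall()
print(emulated)
```

Both conditional counts come from a single pass over the table, which is the point of the feature.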

I'm pretty sure the implementation might make someone more knowledgeable about django aggregates sick, but I think it's a good initial draft. It's slightly tricky because while the CASE syntax is nested within the aggregate the FILTER syntax appears outside, so I think we need a custom AggregateFilter expression to handle these two types.

Code quality is poor, I just wanted to use this as a proof of concept to spur a discussion about how (or even if) to best support this.

Made up example use-case.

one_week_ago = timezone.now().date() - timedelta(days=7)

Mailboxes.objects.annotate(
   read_emails=Count('emails').filter(unread=False),
   unread_emails=Count('emails').filter(unread=True),
   recent_emails=Count('emails').filter(received_date__lt=one_week_ago)
)

return compiler.compile(FilterWhere(self.value, self.condition, output_field=self.output_field))

def filter(self, **kwargs):
    return AggregateFilter(value=self.value, condition=self.condition | Q(**kwargs))
Contributor

I think chaining filter/exclude should be an AND, not an OR. It makes more sense to me and I'm pretty sure that's what queryset filter/exclude do.

Member

@charettes charettes Apr 12, 2017

This will also have to take into account that filter(m2m__field=val1, m2m__otherfield=val2) != filter(m2m__field=val1).filter(m2m__otherfield=val2), as explained in spanning multi-valued relationships.

Contributor Author

Hah, yeah, I'm not sure why I made it an OR. I've updated it. I'm not sure how to handle the spanning of relationships like that though :/


def test_double_filtered_aggregates(self):
    agg = Sum('age').filter(name='test2').filter(name='test')
Contributor

i.e. here I would expect it to filter out everything.
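A rough pure-Python sketch of the AND semantics being asked for. FilteredSum is a hypothetical stand-in operating on plain dicts, not the class in this patch:

```python
class FilteredSum:
    """Hypothetical stand-in for a filterable aggregate over dict rows."""

    def __init__(self, field, conditions=()):
        self.field = field
        self.conditions = tuple(conditions)

    def filter(self, **kwargs):
        # Each call narrows the aggregate: new conditions are ANDed on.
        return FilteredSum(self.field, self.conditions + tuple(kwargs.items()))

    def apply(self, rows):
        return sum(
            row[self.field]
            for row in rows
            if all(row[key] == value for key, value in self.conditions)
        )


rows = [{"name": "test", "age": 40}, {"name": "test2", "age": 60}]
single = FilteredSum("age").filter(name="test").apply(rows)
double = FilteredSum("age").filter(name="test2").filter(name="test").apply(rows)
print(single, double)
```

With AND semantics no row can satisfy both name conditions, so the chained aggregate sees nothing. Python's sum() returns 0 for an empty input, whereas SQL's SUM would yield NULL.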

@charettes
Member

I'm still wondering what's the use case for chaining aggregate filters. Is there a reason why allowing aggregates to accept a filter parameter accepting Q objects is not enough?

What if we want to add an AggregateOrderBy as well? How would the template be defined to make sure both ordering and filtering be used at the same time?

@orf
Contributor Author

orf commented Apr 12, 2017

Hey @charettes,
I'm unclear on the nicest API for this feature. I personally quite like the idea of .filter and .exclude methods; it feels quite natural and like the usual querysets (with the usual benefits). However, if chaining them is too complex then maybe we could leave that unsupported, but then it's not quite like a queryset :(

My reasoning behind a separate method is also down to the implementation. As far as I can tell allowing aggregates to accept a Q object in the constructor wouldn't be possible without overriding the as_sql method of the Function expression. I wanted to avoid this as I couldn't work out how to do that nicely. Thus the filter (and exclude) methods return a separate Expression that wraps the inner aggregate.

Again, this is just a draft and is most likely not the best way to go implementation wise. If you think there is a better way and could share a gist or some more details that would be awesome!

@orf
Contributor Author

orf commented Apr 12, 2017

Ahh, I see that the FILTER syntax is standardized in SQL:2003. That means the FilterWhere should be moved out of the contrib.postgres package and into the aggregate file itself.

As it's standardized, perhaps it should be the cornerstone of this. As in, the FilterWhere class itself is responsible for falling back to a CASE WHEN emulation depending on a database feature flag? Rather than the current code with a class that produces a FilterWhere for postgres or a CASE WHEN expression for all others.

Thoughts?

@orf
Contributor Author

orf commented Apr 23, 2017

Hey @charettes, I've changed my merge request to accept a Q object via a filter keyword now as discussed previously. Can you give me some feedback?

I'm not sure if I like adding WhereNode to the list of allowed types that When accepts, perhaps there is a better way? Also how should I test this, should I write two tests for each and every aggregate? Should I add them as a separate TestCase or inline them?

Member

@charettes charettes left a comment

I'm not sure if I like adding WhereNode to the list of allowed types that When accepts, perhaps there is a better way?

You shouldn't have to do that I believe, I'll give it a try during the week.

Also how should I test this, should I write two tests for each and every aggregate? Should I add them as a separate TestCase or inline them?

I don't think it's worth adding a test case for every existing aggregate. A single test case for the filtering behavior of Aggregate and any subclasses that override as_sql() should be enough (I haven't found any so far).


if not supports_filter:
    case_statement = self.get_case_expression()
    self.set_source_expressions([case_statement])
Member

I think you should avoid altering self in this case; clone self and call set_source_expressions on the clone.
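A minimal sketch of the clone-then-modify pattern, using a hypothetical Node class rather than Django's real expression classes:

```python
import copy


class Node:
    """Hypothetical expression node; not Django's actual class."""

    def __init__(self, source_expressions):
        self.source_expressions = list(source_expressions)

    def set_source_expressions(self, exprs):
        self.source_expressions = list(exprs)

    def as_fallback(self, wrapped):
        # Work on a shallow copy so the original node stays reusable.
        clone = copy.copy(self)
        clone.set_source_expressions([wrapped])
        return clone


original = Node(["age"])
fallback = original.as_fallback("CASE WHEN cond THEN age END")
print(original.source_expressions, fallback.source_expressions)
```

Because the setter assigns a fresh list on the clone, the original node can be compiled again (for example against another backend) without carrying the CASE wrapper.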

@orf
Contributor Author

orf commented Apr 24, 2017

Hey, I've added a few more tests, including one that supports using both a CASE statement and a filter parameter. It produces some weird query akin to:

SUM(CASE WHEN x THEN CASE WHEN y THEN z ELSE NULL END ELSE NULL END)

I thought this would fail to be honest but it seems to work fine. We could try and merge the query with the inner CASE to produce a single statement, but to be honest I don't think it's worth it.
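For what it's worth, a quick standard-library sqlite3 check (with a made-up table t) suggests that the nested form and a single merged CASE give the same result, at least on SQLite:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: x and y act as the two conditions, z is the summed value.
conn.execute("CREATE TABLE t (x INTEGER, y INTEGER, z INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 1, 10), (1, 0, 20), (0, 1, 30), (1, 1, 5)],
)

nested = conn.execute(
    "SELECT SUM(CASE WHEN x = 1 THEN "
    "CASE WHEN y = 1 THEN z ELSE NULL END "
    "ELSE NULL END) FROM t"
).fetchone()[0]
merged = conn.execute(
    "SELECT SUM(CASE WHEN x = 1 AND y = 1 THEN z ELSE NULL END) FROM t"
).fetchone()[0]
print(nested, merged)
```

Only the rows where both conditions hold contribute in either form, so merging the two CASE expressions would be a cosmetic change.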

I also changed the __repr__ of Aggregate to include the query:

Sum(F(test)) WHERE (AND: ('name', 'test')).

Also using Count(*) with 'filter' isn't supported, at least with CASE statements. I think it would work with postgres and the FILTER (WHERE...), but it's easiest to just catch it in the __init__ for all cases I think.

How can I do a test run on Oracle? It would be good to just do one test run about now to see if it passes.

@charettes
Member

Hey @orf,

We could try and merge the query with the inner CASE to produce a single statement, but to be honest I don't think it's worth it.

As you said, I don't think it's worth it.

Here are some adjustments that get rid of a lot of boilerplate.

master...charettes:filter-aggregation

How can I do a test run on Oracle? It would be good to just do one test run about now to see if it passes.

I'll do that just now.

It would be great to have tests for aggregates that take more than a single expression as well. E.g. a GROUP_CONCAT test might work on all supported backends (GroupConcat(F('field'), Value(';'), filter=Q(field__contains='data'))).

@charettes
Member

buildbot, test on oracle.

@orf
Contributor Author

orf commented Apr 29, 2017

Thank you very much for those changes @charettes, they look excellent. I have cherry picked them and rebased on master.

I agree about the tests, but I could not find any core aggregates that support multiple parameters. GROUP_CONCAT is supported on some backends but I'm not sure which ones. Do you want me to add a custom GroupConcat expression just for the tests, or am I getting the wrong end of the stick? I did add a test for the postgres StringAgg though.

I also cannot see the results of the oracle test, I had a look in Jenkins but could see no build triggered. Assuming that it passes, and that multiple parameter filtered expressions work as expected, shall I start adding some documentation?

Edit: I did change your __repr__ method to call super().__repr__ if no filter is specified. A few tests broke as it output filter=None, and I think it's best to only output filter=x if filter is specified. If you disagree I can fix up the tests to expect filter=x from the repr() calls.

Member

@charettes charettes left a comment

It would be great to have test coverage for the annotate() case and filtering across relations and m2m fields as well.

This will also require documentation.

from django.contrib.postgres.aggregates import StringAgg
q = Q(name='test')
agg = StringAgg(F('name'), delimiter=';', filter=q)
self.assertEqual(Author.objects.aggregate(name=agg)['name'], 'test')
Member

Since StringAgg is only implemented on PostgreSQL which uses FILTER and not the CASE(WHEN()) fallback this doesn't actually test the source_expressions[0] wrapping. Since we don't have any builtin aggregate with multiple argument this test might not be required after all.

@orf
Contributor Author

orf commented May 7, 2017

I added some extra tests for m2m and foreign key relations, and added an initial bit of documentation, however I found an issue with the current implementation. Running this query:

pages_annotate = Sum('book__pages', filter=Q(book__rating__gt=3))
age_agg = Sum('age', filter=Q(total_pages__gte=400))
Author.objects.annotate(total_pages=pages_annotate).aggregate(summed_age=age_agg)

will throw an error. I added a test called test_filtered_aggregate_on_annotate for it, and the reason it errors as far as I can tell is that when you call .aggregate() that relies on an annotated field it uses the source_expression of the annotation to get the field, which winds up throwing an error in Aggregate.resolve_expression as the .filter attribute is already resolved to a WhereNode.

If you bypass this error it will generate a query similar to this:

SELECT sum(CASE WHEN ... THEN sum(...) END) FROM (... inner query)

Which throws a SQL error about misusing the sum() aggregate.
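The failure mode can be reproduced outside the ORM. The sketch below uses standard-library sqlite3 with a made-up, denormalized book table: an aggregate nested directly inside another is rejected, while the same condition works once the inner aggregate is computed in a subquery and referenced by its alias.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per book, with the author's age repeated.
conn.execute("CREATE TABLE book (author TEXT, age INTEGER, pages INTEGER)")
conn.executemany(
    "INSERT INTO book VALUES (?, ?, ?)",
    [("a", 30, 300), ("a", 30, 200), ("b", 50, 100)],
)

# An aggregate nested directly inside another aggregate is rejected.
try:
    conn.execute(
        "SELECT SUM(CASE WHEN SUM(pages) >= 400 THEN age END) FROM book"
    ).fetchall()
    nested_error = None
except sqlite3.Error as exc:
    nested_error = str(exc)

# Computing the inner aggregate in a subquery and filtering on its alias works.
summed_age = conn.execute(
    "SELECT SUM(CASE WHEN total_pages >= 400 THEN age END) FROM ("
    "SELECT author, MAX(age) AS age, SUM(pages) AS total_pages "
    "FROM book GROUP BY author)"
).fetchone()[0]
print(nested_error, summed_age)
```

So the outer filter needs to reference the annotation's alias in the subquery rather than re-embedding the inner aggregate expression.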

I've spent half a day trying to dig into this but I can't find a suitable solution. I will work on some more documentation this week.

@charettes
Member

@orf, thanks for your continued efforts on this, I'll try to have a look at the annotate().aggregate() issue this week!

@orf
Contributor Author

orf commented May 18, 2017 via email

@charettes
Member

nice work @orf, I'll try to have a look at this issue this week-end.

@charettes
Member

Hey @orf! I spent some time tonight trying to figure out what could be wrong and after 2 hours of hair pulling I identified the issue.

When the ORM performs aggregation over annotations it pushes the annotations into a subquery and performs the aggregations on it. The code that takes care of the alias rewrites was simply not aware of the filter attribute and wasn't rewriting its content.

I've submitted a PR to your branch with the fix but it's really just a hack. I think we should instead try being a better get_source_expressions()/set_source_expressions() citizen and stuff filter in there, but my initial attempts are breaking other stuff.
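A simplified sketch of the idea, using hypothetical Func and FilteredAggregate classes and plain strings in place of real expressions: once the filter is exposed as the last source expression, any generic code that rewrites source expressions (such as alias relabeling) reaches it too.

```python
class Func:
    """Hypothetical base class holding positional source expressions."""

    def __init__(self, *expressions):
        self.source_expressions = list(expressions)

    def get_source_expressions(self):
        return self.source_expressions

    def set_source_expressions(self, exprs):
        self.source_expressions = list(exprs)


class FilteredAggregate(Func):
    def __init__(self, *expressions, filter=None):
        super().__init__(*expressions)
        self.filter = filter

    def get_source_expressions(self):
        exprs = super().get_source_expressions()
        if self.filter is not None:
            # Expose the filter as the trailing source expression.
            return exprs + [self.filter]
        return exprs

    def set_source_expressions(self, exprs):
        if self.filter is not None:
            # The trailing expression is the (possibly rewritten) filter.
            self.filter = exprs[-1]
            exprs = exprs[:-1]
        super().set_source_expressions(exprs)


def relabel(node, mapping):
    # Generic rewrite that knows nothing about filters.
    node.set_source_expressions(
        [mapping.get(e, e) for e in node.get_source_expressions()]
    )


agg = FilteredAggregate("age", filter="total_pages >= 400")
relabel(agg, {
    "age": "subquery.age",
    "total_pages >= 400": "subquery.total_pages >= 400",
})
print(agg.source_expressions, agg.filter)
```

The rewrite helper never mentions the filter, yet the filter ends up relabeled, which is the behavior the alias rewrite was missing.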

@charettes changed the title from "[WIP] Support shorthand syntax for filtered aggregates" to "Fixed #27849 -- Added filtering support to aggregates." on Jun 9, 2017
@charettes
Member

charettes commented Jun 9, 2017

Thanks for merging the last adjustments!

I'll give it a review in the next few days but given it includes code from me as well I think it would be great if we could find someone else to review my changes.

Do you think you could take care of fixing the flake8/docs failures and the probable adjustments to the documentation for the remaining part of the review?

return [e._output_field_or_none for e in super().get_source_expressions()]

def get_source_expressions(self):
    source_expressions = super(Aggregate, self).get_source_expressions()
Member

This could be a bare super().

if self.filter:
    self.filter = exprs[-1]
    exprs = exprs[:-1]
return super(Aggregate, self).set_source_expressions(exprs)
Member

This one as well.

 if not summarize:
-    expressions = c.get_source_expressions()
+    expressions = super(Aggregate, c).get_source_expressions()
Member

We might want to mention we use super() to avoid checking against self.filter.

@@ -226,6 +226,11 @@ Models
number of rows fetched by the Python database client when streaming results
from the database. For databases that don't support server-side cursors, it
controls the number of results Django fetches from the database adapter.
Aggregations
Member

Needs a new line. Maybe drop the s?

Aggregations
~~~~~~~~~~~~

* Built in aggregates now accept a `filter` parameter that can be used to
Member

Use double `` around filter. You might want to mention that it uses the SQL 2003 FILTER clause when the backend supports it and falls back to Case(When()) otherwise?


.. versionchanged:: 2.0

The filter argument to annotations was added
Member

Add a trailing .?

@@ -359,7 +379,7 @@ Given:

Here's an example with the ``Count`` aggregate::

>>> a, b = Publisher.objects.annotate(num_books=Count('book', distinct=True)).filter(book__rating__gt=3.0)
>>> a, b = Publisher.objects.annotate(num_books=Count('book', distinct=True)).filter(book__rating__gte=3.0)
Member

This change looks unrelated to our work here?

@orf
Contributor Author

orf commented Jun 9, 2017

Thank you so much @charettes! I'll fix up the comments you made and the flake8 etc. I'm also going to try and dive into what your changes exactly do 👍

I could send out a message on the django-dev mailing list to see if anyone is willing to review.

@orf
Contributor Author

orf commented Jun 9, 2017

I'm not sure what's occurring here: I fixed the docs and flake8 errors, but it's still failing. It seems like the CI jobs are running the old code!

@charettes
Member

It looks like you still have conflicts with docs/releases/2.0.txt. Try rebasing on the latest master and force pushing again.

@orf
Contributor Author

orf commented Jun 9, 2017

Thanks! Ok, I've made the changes you requested, I still need to dig into the commits you made, but if you're happy and the tests pass I think it's good to be merged? Anything else you need me to change or do?

About the docs: is the current amount of docs OK? I'm not sure if a separate section is needed, or something more than just the documentation on the parameter. Maybe this could be added later though.

Member

@charettes charettes left a comment

Looking good, thanks!

The actual documentation is looking good but I'll defer to @timgraham for that as he's the expert here :) I suppose a mention in the conditional aggregation section would be good! How about replacing the current example to use Count('pk', filter=Q(account_type=Client.REGULAR)) instead?

Makes me wonder if we should be using when instead of filter as the kwarg here. It seems easier to parse than filter: Count('pk', when=Q(account_type=Client.REGULAR)) 🤷‍♂️

@orf
Copy link
Contributor Author

orf commented Jun 9, 2017 via email

@charettes
Member

Would a code sample in the release notes be a good idea as well?

Not sure about that. I mean, showing how Count(Case(When(foo='bar', then='field'))) can now be reduced to Count('field', filter=Q(foo='bar')) might be interesting, but I'm not sure we usually include such examples in the release notes.

@orf
Contributor Author

orf commented Jun 18, 2017

Sorry, noted! I've amended and rebased on master. Let me know if there is anything else.

Each Author object in the annotated result set will have a ``num_books``
as well as a ``highly_rated_books`` attribute.

.. note::
Contributor

Please add a blank line in-between. The line limit is 79 for documentation, so you can easily squeeze the paragraph a little bit.

@orf
Contributor Author

orf commented Jun 25, 2017

@atombrella I'm not entirely sure what your comment was referring to. As far as I can see all the text is below the 79 character limit (with the exception of the code snippet). Did you want another blank line between the paragraph and the .. note::?

In any case I changed the note to be a warning, as it is a warning and not a note, and rebased on master.

@atombrella
Contributor

@orf Yes, a blank line between warning (or note). It seems you wrapped the paragraph at around 72 characters. You can increase that to 79. My comments are mostly trivial style.

Code looks great :)

... )
... regular=Count('pk', filter=Q(account_type=Client.REGULAR)),
... gold=Count('pk', filter=Q(account_type=Client.GOLD)),
... platinum=Count('pk', filter=Q(account_type=Client.PLATINUM)
Member

Add a trailing comma

@@ -31,11 +58,39 @@ def default_alias(self):
expressions = self.get_source_expressions()
if len(expressions) == 1 and hasattr(expressions[0], 'name'):
return '%s__%s' % (expressions[0].name, self.name.lower())
-raise TypeError("Complex expressions require an alias")
+raise TypeError('Complex expressions require an alias')
Member

Please revert this unrelated change.

@@ -373,6 +373,15 @@ should define the desired ``output_field``. For example, adding an
The ``**extra`` kwargs are ``key=value`` pairs that can be interpolated
into the ``template`` attribute.

The ``filter`` argument requires a ``django.db.Q`` object which will be used
to filter the rows that are aggregated. See the
:doc:`Aggregation guide </topics/db/aggregation>` for more information on
Member

I would link to the section where it's discussed.

Contributor Author

I'm not sure I know how, I've been grepping but cannot find an example I can use. I think I'm being dense though.

Contributor

Filtering on annotations

You should probably make a link for that section.

@@ -373,6 +373,15 @@ should define the desired ``output_field``. For example, adding an
The ``**extra`` kwargs are ``key=value`` pairs that can be interpolated
into the ``template`` attribute.

The ``filter`` argument requires a ``django.db.Q`` object which will be used
Member

django.db.Q isn't the correct path. Please make it a link and that will fix it.


.. versionchanged:: 2.0

The filter argument to annotations was added.
Member

Add double backticks around filter.
Chop "to annotations".

cls.b1 = Book.objects.create(
    isbn='159059725', name='The Definitive Guide to Django: Web Development Done Right',
    pages=447, rating=4.5, price=Decimal('30.00'), contact=cls.a1, publisher=cls.p1,
    pubdate=datetime.date(2007, 12, 6)
Member

add a trailing comma

if self.filter is None:
    return super().__repr__()

return '{}({}, filter={})'.format(
Member

Shouldn't be part of this PR, but it looks like redundancy in __repr__() methods would be a good candidate for a refactor.

Contributor Author

Are you sure? This just displays the filter in the Aggregate's repr, which I think is pretty important. Can you suggest another way to include it?

Contributor Author

Oh, sorry, I misread your comment. I thought you were saying this change should not be part of this PR. I agree about the redundancy in __repr__ being refactored.

Member

I think that you can add filter to kwargs in __init__ method and remove redundant __repr__ (see #8759).

Member

Passing filter as a kwarg will cause it to be in self.extra as well, which could interfere with as_sql() formatting.

Member

You're right, thanks. I proposed Aggregate.__repr__() refactoring in #8759.

Member

@orf Please rebase from master and use _get_repr_options() instead of __repr__().

Contributor Author

Thanks!

@@ -236,6 +236,14 @@ Models
from the database. For databases that don't support server-side cursors, it
controls the number of results Django fetches from the database adapter.

Aggregation
Member

You can chop this heading. The sections are alphabetized anyway.


* Built in aggregates now accept a ``filter`` parameter that can be used to
exclude elements from an aggregate based on a condition. On backends that
support it the SQL 2003 ``FILTER WHERE`` syntax is used, otherwise a standard
Member

If this implementation detail is important, it seems odd to mention it in the release notes but not in the docs.

Contributor Author

I think it's good to note somewhere in the documentation, maybe not in the release notes however.

Aggregation
~~~~~~~~~~~

* Built in aggregates now accept a ``filter`` parameter that can be used to
Member

Please link to the docs that discuss this in more detail.

@@ -31,11 +58,39 @@ def default_alias(self):
expressions = self.get_source_expressions()
if len(expressions) == 1 and hasattr(expressions[0], 'name'):
return '%s__%s' % (expressions[0].name, self.name.lower())
-raise TypeError("Complex expressions require an alias")
+raise TypeError('Complex expressions require an alias')
Contributor

Please undo this change (it's the only double-quote to single-quote change left).

... )
... regular=Count('pk', filter=Q(account_type=Client.REGULAR)),
... gold=Count('pk', filter=Q(account_type=Client.GOLD)),
... platinum=Count('pk', filter=Q(account_type=Client.PLATINUM)
Contributor

You miss a closing parenthesis.

Contributor Author

👍 well spotted!

@orf
Contributor Author

orf commented Jun 30, 2017

I've made most of the changes Tim requested. I feel the docs are still a bit shoddy, but I'm unsure of how to fix that. For example, the most interesting example use case is under the Conditional Expressions documentation, but it seems weird to link to that directly from the changelog.

I'm also not sure how to link to specific documentation sub-sections 👎

On backends that support it the `SQL 2003 FILTER WHERE`_ syntax is used, otherwise
this is emulated using a ``CASE`` statement.

.. _`SQL 2003 FILTER WHERE`: http://modern-sql.com/feature/filter
Contributor

I think the policy is not to include links to third-party pages, but only to official documentation or Wikipedia, but the closest thing in the SQL:2003 article was a highlight about increased OLAP functionality.

Perhaps a SQL example is better to illustrate what it translates into if the SQL:2003 filter is implemented?

Contributor Author

Good idea, I'll add an example

@orf
Contributor Author

orf commented Jul 5, 2017

I've finished making the changes requested in the review. I've added an example of the SQL that is generated when the backend supports FILTER WHERE and when it doesn't to the documentation, I can remove it if it's too much information (but I personally quite like having it there).

I also added section links, re-worked the changelog message and rebased on master.

On backends that support it this aggregate will produce a query leveraging
the SQL 2003 FILTER WHERE syntax and produce a query similar to this::

SELECT count('id') FILTER (WHERE account_type=1) as regular,
Contributor

It's a small thing, but the following should be added to the SQL blocks.

.. code-block:: sql

Contributor Author

Fixed, thanks!

Member

@felixxm felixxm left a comment

I think you should also add tests with filter argument to the expressions.tests.ReprTests.test_aggregates.

If you need two annotations with two separate filters you can use
the ``filter`` argument with any aggregate. For example, to generate
a list of authors with a count of highly rated books you can
issue the query::
Member

Please wrap docs at 79 chars, e.g.

If you need two annotations with two separate filters you can use the
``filter`` argument with any aggregate. For example, to generate a list of
authors with a count of highly rated books you can issue the query::

@orf
Contributor Author

orf commented Jul 20, 2017

I've made the changes as requested, and can add tests for the _get_repr_options stuff once #8793 is merged.

I used the {**x} syntax, which is 3.5+ and should be OK for master, but the tests seem to be running on 3.4, where it obviously fails. For now I will replace {**...} with dict(), but I'm loath to do so.
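For reference, a small sketch of the two spellings. The option names are placeholders, not the actual contents of _get_repr_options():

```python
# Placeholder options dict; the keys are illustrative only.
base_options = {"distinct": False}

# Python 3.5+ only: dict unpacking inside a dict display.
unpacked = {**base_options, "filter": "Q(a=1)"}

# Also works on Python 3.4: dict() with a mapping plus keyword arguments.
via_dict = dict(base_options, filter="Q(a=1)")

print(unpacked == via_dict)
```

Both forms build a new dict and leave base_options untouched. The dict() form only works when the extra keys are valid keyword names, which holds here.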

@timgraham
Member

The decision about whether or not to support Python 3.4 for Django 2.0 is still up for debate.

self.assertEqual(repr(StdDev('a', filter=filter)), "StdDev(F(a), filter=(AND: ('a', 1)), sample=False)")
self.assertEqual(repr(Sum('a', filter=filter)), "Sum(F(a), filter=(AND: ('a', 1)))")
self.assertEqual(repr(Variance('a', sample=True, filter=filter)),
"Variance(F(a), filter=(AND: ('a', 1)), sample=True)")
Member

Please use hanging indentation due to the coding-style:

self.assertEqual(
    repr(Variance('a', sample=True, filter=filter)),
    "Variance(F(a), filter=(AND: ('a', 1)), sample=True)",
)

def test_aggregate_str(self):
    q = Q(name='test')
    agg = Sum('test', filter=q)
    self.assertIn(str(q), str(agg))
Member

I think test_aggregate_str can be removed.

@@ -15,6 +15,74 @@
from .models import Author, Book, Publisher, Store


class FilteredAggregateTests(TestCase):
Member

IMO it would be worth adding another Author, e.g.:

cls.a3 = Author.objects.create(name='name', age=40)
cls.a1.friends.add(cls.a3)

because in most cases you test only single row aggregations. Maybe e.g.

def test_filtered_aggregates(self):
    agg = Sum('age', filter=Q(name__startswith='test'))
    self.assertEqual(Author.objects.aggregate(age=agg)['age'], 100)

or

def test_case_aggregate(self):
    agg = Sum(Case(When(friends__age=40, then=F('friends__age'))), filter=Q(friends__name='test'))
    self.assertEqual(Author.objects.aggregate(age=agg)['age'], 40)

instead of current versions.

Contributor Author

Good point, I've changed the tests to try and rely on more than one row.