Aggregations API #21

Open
dvirsky opened this issue Feb 15, 2018 · 4 comments
@dvirsky
Contributor

dvirsky commented Feb 15, 2018

This is a proposed API for aggregations.

The aggregation pipeline is well suited to a builder/fluent-style API. It consists of the following elements, which can repeat and which transform the pipeline:

  1. The base filter query (non repeating)
  2. Load - load properties from the document (if they are not in the sortables)
  3. Group by (with its reducers)
  4. Sort by
  5. Apply expression on values
  6. Limit

These steps can be chained repeatedly to transform the pipeline. So here's what I have in mind:

# query can be a string or a structured query object
req = (
    AggregateRequest(query)
    .load('@foo', '@bar')
    .group_by(
        ('@foo', '@bar'),
        # reducers
        count().as('total'),
        count_distinct('@bar').as('num_bars'),
        # alternative proposal
        num_bars=count_distinct('@bar'),
    )
    .apply("sqrt(@foo/@num_bars)", as='sqr')
    .sort_by(Sort.desc('@sqr'), Sort.asc('@other'), max_results=100)
    .limit(0, 10)
)

resp = client.aggregate(req)
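
To make the fluent style concrete, here is a minimal, self-contained sketch of how such a builder could serialize a chain into FT.AGGREGATE arguments. It is not the client's implementation: `AggregateSketch`, `Reducer`, and the helper functions are hypothetical names, and because `as` is a reserved word in Python the sketch assumes an `alias` method instead. Sorting is left out to keep it short.

```python
class Reducer:
    """Hypothetical reducer: a function name, its arguments, and an optional alias."""

    def __init__(self, name, *args):
        self.name = name
        self.args = args
        self.alias_name = None

    def alias(self, name):
        # `as` is a Python keyword, so this sketch assumes `alias` instead.
        self.alias_name = name
        return self

    def to_args(self):
        out = ["REDUCE", self.name, str(len(self.args)), *self.args]
        if self.alias_name:
            out += ["AS", self.alias_name]
        return out


def count():
    return Reducer("COUNT")


def count_distinct(field):
    return Reducer("COUNT_DISTINCT", field)


class AggregateSketch:
    """Hypothetical fluent builder; every step returns self so calls can be chained."""

    def __init__(self, query="*"):
        self._args = [query]

    def load(self, *fields):
        self._args += ["LOAD", str(len(fields)), *fields]
        return self

    def group_by(self, fields, *reducers):
        self._args += ["GROUPBY", str(len(fields)), *fields]
        for reducer in reducers:
            self._args += reducer.to_args()
        return self

    def apply(self, expr, alias):
        self._args += ["APPLY", expr, "AS", alias]
        return self

    def limit(self, offset, num):
        self._args += ["LIMIT", str(offset), str(num)]
        return self

    def build_args(self):
        return list(self._args)


req = (
    AggregateSketch("*")
    .load("@foo", "@bar")
    .group_by(
        ("@foo", "@bar"),
        count().alias("total"),
        count_distinct("@bar").alias("num_bars"),
    )
    .apply("sqrt(@foo/@num_bars)", "sqr")
    .limit(0, 10)
)
print(req.build_args())
```

Each method appends its own arguments and returns the builder, so the serialized order follows the order of the chained calls.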
@filipecosta90
Collaborator

filipecosta90 commented Sep 25, 2019

@mnunberg, filter expressions are still not supported in the Python client, correct? Here is an example from the documentation:

FT.AGGREGATE 
  ...
  FILTER "@name=='foo' && @age < 20"
  ...
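
For illustration only, a FILTER step is just the keyword followed by the expression, so the raw argument list could be assembled by hand as below. `filter_args` is a hypothetical helper and `idx` is a placeholder index name; neither comes from the client.

```python
def filter_args(expression):
    """Return the arguments for one FILTER step (hypothetical helper)."""
    return ["FILTER", expression]


# "idx" is a placeholder index name used only for this sketch.
args = ["FT.AGGREGATE", "idx", "*"] + filter_args("@name=='foo' && @age < 20")
print(args)
```

Assuming a redis-py connection, a list like this could be sent with its generic `execute_command(*args)` method until the client exposes filters directly.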

@stryt2

stryt2 commented Sep 27, 2019

Hello,

The "APPLY {expr} AS {name}" entry under "Parameters in detail" on the official RediSearch page says that the result of

APPLY ... can be referenced by further APPLY / SORTBY / GROUPBY / REDUCE operations down the pipeline.

However, in the current build_args method of AggregateRequest, the GROUPBY keyword and its fields always come before the APPLY keyword and its fields. For example:

import redisearch

aggregate_request = redisearch.aggregation.AggregateRequest()
# Call 1
aggregate_request.apply(foo="@bar / 2").group_by("@foo", redisearch.reducers.count())
# or Call 2
# aggregate_request.group_by("@foo", redisearch.reducers.count()).apply(foo="@bar / 2")

print(aggregate_request.build_args())

Both Call 1 and Call 2 produce the same arguments, regardless of the order in which the methods are called:

['*', 'GROUPBY', '1', '@foo', 'REDUCE', 'COUNT', '0', 'APPLY', '@bar / 2', 'AS', 'foo']

However, shouldn't the expected behaviour of Call 1 be

['*', 'APPLY', '@bar / 2', 'AS', 'foo', 'GROUPBY', '1', '@foo', 'REDUCE', 'COUNT', '0']

i.e. shouldn't the order of the keywords depend on the order of the calls?

This matters because Call 2 results in an error (if the field foo does not already exist) saying

No such property foo

whereas Call 1 should not.

 

Any help to get around this issue is greatly appreciated. Thanks.
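
As a sketch of the behaviour being asked for (not a patch against the actual client), a builder could record each step as it is called and emit the steps in that order. `OrderedAggregate` is a hypothetical class, and its `group_by` is simplified to one field and one zero-argument reducer.

```python
class OrderedAggregate:
    """Hypothetical builder that emits pipeline steps in the order they were called."""

    def __init__(self, query="*"):
        self._query = query
        self._steps = []  # one argument list per step, in call order

    def apply(self, **kwexpr):
        for alias, expr in kwexpr.items():
            self._steps.append(["APPLY", expr, "AS", alias])
        return self

    def group_by(self, field, reducer_name):
        # Simplified: a single field and a single reducer with no arguments.
        self._steps.append(["GROUPBY", "1", field, "REDUCE", reducer_name, "0"])
        return self

    def build_args(self):
        args = [self._query]
        for step in self._steps:
            args.extend(step)
        return args


call_1 = OrderedAggregate().apply(foo="@bar / 2").group_by("@foo", "COUNT")
call_2 = OrderedAggregate().group_by("@foo", "COUNT").apply(foo="@bar / 2")

print(call_1.build_args())  # APPLY comes before GROUPBY
print(call_2.build_args())  # GROUPBY comes before APPLY
```

With this approach Call 1 defines foo before grouping on it, while Call 2 keeps the order that triggers the error.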

@filipecosta90
Collaborator

Hi there @stryt2, we were discussing the same error internally while revising this client and extending the redisearch-go client (which we are trying to give the same look and feel as this one), and we found exactly the same problem.
We should have a PR to correct the above very soon.

@stryt2

stryt2 commented Sep 27, 2019

Glad to hear that. Thanks very much.

4 participants