Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added `filters` aggregation #6119

Closed
wants to merge 1 commit into from

Conversation

Projects
None yet
7 participants
@uboness
Copy link
Contributor

uboness commented May 11, 2014

A multi-bucket aggregation where multiple filters can be defined (each filter defines a bucket). The buckets will collect all the documents that match their associated filter.

This aggregation can be very useful when one wants to compare analytics between different criterias. It can also be accomplished using multiple definitions of the single filter aggregation, but here, the user will only need to define the sub-aggregations only once.

Closes #6118

Added Filters aggregation
A multi-bucket aggregation where multiple filters can be defined (each filter defines a bucket). The buckets will collect all the documents that match their associated filter.

This aggregation can be very useful when one wants to compare analytics between different criterias. It can also be accomplished using multiple definitions of the single filter aggregation, but here, the user will only need to define the sub-aggregations only once.

Closes #6118

@uboness uboness added the review label May 11, 2014

@uboness

This comment has been minimized.

Copy link
Contributor Author

uboness commented May 11, 2014

"filters" : {
"filters" : {
"errors" : { "query" : { "match" : { "body" : "error" } } },
"warnings" : { "query" : { "match" : { "body" : "error" } } }

This comment has been minimized.

Copy link
@jpountz

jpountz May 12, 2014

Contributor

I think the 2nd query should look for warning in the body instead of error?

This comment has been minimized.

Copy link
@uboness

uboness May 12, 2014

Author Contributor

indeed

"aggs" : {
"messages" : {
"filters" : {
"filters" : {

This comment has been minimized.

Copy link
@jpountz

jpountz May 12, 2014

Contributor

Should it be consistent with the filter aggregation and expect filters to be directly defined in the body of the aggregation definition instead of repeating filters twice?

This comment has been minimized.

Copy link
@uboness

uboness May 12, 2014

Author Contributor

problematic... you can also specify addition settings, such as keyed... this requires additional level. I couldn't come up with a proper naming for the filters object...

This comment has been minimized.

Copy link
@jpountz

jpountz May 12, 2014

Contributor

Do we really need keyed here? Maybe it would make sense to enforce a keyed output given that it perfectly matches the way the request is expressed?

This comment has been minimized.

Copy link
@uboness

uboness May 12, 2014

Author Contributor

we could potentially, though it won't be consistent with other buckets aggs that we have, and also, I'm not really sure what's the preferred way ppl would like the response... @rashidkpc any feedback?

public static interface Bucket extends MultiBucketsAggregation.Bucket {
}

Collection<? extends Bucket> getBuckets();

This comment has been minimized.

Copy link
@jpountz

jpountz May 12, 2014

Contributor

It should be just Collection<Bucket>.

This comment has been minimized.

Copy link
@uboness

uboness May 12, 2014

Author Contributor

indeed


private final long bucketOrd(long owningBucketOrdinal, int filterOrd) {
return owningBucketOrdinal * filters.length + filterOrd;
}

This comment has been minimized.

Copy link
@jpountz

jpountz May 12, 2014

Contributor

👍

Bucket bucket = buckets.get(0);
bucket.aggregations.reduce(bigArrays);
return bucket;
}

This comment has been minimized.

Copy link
@jpountz

jpountz May 12, 2014

Contributor

I think we could safely remove this special condition and rely on the logic below without performance penalty in the case of a single bucket?

This comment has been minimized.

Copy link
@uboness

uboness May 12, 2014

Author Contributor

yeah..... we could

@jpountz

This comment has been minimized.

Copy link
Contributor

jpountz commented May 12, 2014

I like the way that this aggregation manages buckets (like the range agg). I left a couple of questions inlined.

@rashidkpc

This comment has been minimized.

Copy link
Member

rashidkpc commented May 12, 2014

Love it, exactly what we needed.

@radu-gheorghe

This comment has been minimized.

You probably want to say "body": "warning" here for the "warnings" filter?

Nice feature though, I'm just looking through the docs :)

@jpountz jpountz added v1.3.0 and removed v2.0.0 labels Jun 13, 2014

@s1monw

This comment has been minimized.

Copy link
Contributor

s1monw commented Jun 18, 2014

@uboness what's the status here? I remove the review tag for now...

@s1monw s1monw removed the review label Jun 18, 2014

@uboness

This comment has been minimized.

Copy link
Contributor Author

uboness commented Jun 18, 2014

I think it's pretty much ready... there's only a debate about the json structur... maybe @clintongormley can have a look

@s1monw

This comment has been minimized.

Copy link
Contributor

s1monw commented Jun 18, 2014

+1 thanks @uboness

@clintongormley

This comment has been minimized.

Copy link
Member

clintongormley commented Jun 19, 2014

@uboness I think it should be:

{
     "aggs" : {
         "messages" : {
             "filters" : {
                 "errors" : { "query" : { "match" : { "body" : "error" } } },
                 "warnings" : { "query" : { "match" : { "body" : "error" } } }
             },
             "aggs" : { "monthly" : { "histogram" : { "field" : "timestamp", "interval" : "1M" } } }
         }
     }
 }

This fits nicely with the filter agg. It doesn't make sense to allow keyed because there is no intrinsic order to the keys of a hash in JSON. If you want the equivalent of keyed: false you could potentially support this format too:

{
     "aggs" : {
         "messages" : {
             "filters" : [
                 { "query" : { "match" : { "body" : "error" } } },
                 { "query" : { "match" : { "body" : "error" } } }
             ],
             "aggs" : { "monthly" : { "histogram" : { "field" : "timestamp", "interval" : "1M" } } }
         }
     }
 }

... but I think that's just providing options for the sake of it.

@uboness

This comment has been minimized.

Copy link
Contributor Author

uboness commented Jun 19, 2014

@clintongormley the idea behind the "keyed" version was to be future proof... it's consistent with other aggs, in the future if we'll need to add meta data to the application (which we actually working on doing in another PR #6465) it'll be mixed with the aggs data... and that's no good

@clintongormley

This comment has been minimized.

Copy link
Member

clintongormley commented Jun 19, 2014

@uboness PR #6465 will work happily with my example structure. The meta element isn't part of the agg body:

{
     "aggs" : {
         "messages" : {
             "filters" : {
                 "errors" : { "query" : { "match" : { "body" : "error" } } },
                 "warnings" : { "query" : { "match" : { "body" : "error" } } }
             },
             "meta": { "foo": "bar" },
             "aggs" : { "monthly" : { "histogram" : { "field" : "timestamp", "interval" : "1M" } } }
         }
     }
 }

The filter agg doesn't have space for any extra params in the body either:

{
     "aggs" : {
         "messages" : {
             "filter" : {"query" : { "match" : { "body" : "error" } } }
             "meta": { "foo": "bar" }
         }
     }
 }

So the only distinction between these filter and filters is the existence of a key, which predicates returning the result as {keyed: true} (as {keyed: false} would be meaningless here.

@uboness

This comment has been minimized.

Copy link
Contributor Author

uboness commented Jun 19, 2014

@clintongormley actually, you're right... my bad.. will work to see how it works out, thx

@colings86 colings86 self-assigned this Jul 16, 2014

@colings86

This comment has been minimized.

Copy link
Member

colings86 commented Aug 1, 2014

Closing as PR 6974 supersedes this

@colings86 colings86 closed this Aug 1, 2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.