Percentiles aggregations are always keyed and suggestion on non keyed response #5870

Mpdreamz · 2014-04-18T09:56:08Z

I realize the percentiles aggregation is still experimental, which is probably the cause of this:

https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/elasticsearch/search/aggregations/metrics/percentiles/PercentilesParser.java#L60

The routine that is currently in place to write the percentiles non_keyed output will write the aggregation like this:

"aggs" : {
  "my_percentiles": [
       { .. }, 
       { .. }
   ]
}

https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/elasticsearch/search/aggregations/metrics/percentiles/InternalPercentiles.java#L131

This makes it the only aggregation that returns an array directly instead of wrapping it in an object. My suggestion would be to return it like this:

"aggs" : {
    "my_percentiles": {
        "percentiles": [
            { .. },
            { .. }
        ]
    }
}

This would make the response very similar to the non-keyed range aggregation response:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-range-aggregation.html#search-aggregations-bucket-range-aggregation


uboness · 2014-04-18T10:32:04Z

@Mpdreamz percentiles are not a bucketing agg. It's a metrics agg, and like all other metrics aggs, it's not keyed (see min/stats/etc...)

Mpdreamz · 2014-04-18T10:54:45Z

I realise it's not a bucketing agg :)

The percentiles agg always hits the `if (keyed)` code path, making the else routine dead code.

The else routine introduces an array at a position where all the other aggregations introduce an object, which would make parsing the aggregations generically much harder.

i.e. "name_of_agg" : [ vs "name_of_agg" : {

If we can remove the dead code and thus the chance to introduce an array at that position that would be great too.
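To illustrate the parsing concern, here is a minimal Python sketch of a generic reader that assumes every aggregation is a JSON object. The function name `metric_fields`, the `max_price` aggregation and the sample values are made up for illustration; they are not taken from the Elasticsearch code base.

```python
import json


def metric_fields(aggregations):
    """List the field names of each aggregation, assuming each one is a JSON object."""
    fields = {}
    for name, body in aggregations.items():
        if not isinstance(body, dict):
            # A bare array at this position breaks the "name": { ... } assumption.
            raise TypeError(
                f"aggregation {name!r} is a {type(body).__name__}, expected an object"
            )
        fields[name] = sorted(body.keys())
    return fields


# Hypothetical response where the percentiles are wrapped in an object.
wrapped = json.loads(
    '{"max_price": {"value": 10},'
    ' "my_percentiles": {"percentiles": [{"key": 1, "value": 60.4}]}}'
)

# Hypothetical response in the shape the else routine would write (bare array).
bare = json.loads(
    '{"max_price": {"value": 10},'
    ' "my_percentiles": [{"key": 1, "value": 60.4}]}'
)
```

With the wrapped shape the reader can treat every aggregation the same way; with the bare array it needs a special case for percentiles.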

uboness · 2014-04-18T11:06:05Z

@Mpdreamz Not sure I'm following you tbh... maybe I'm missing something...

If you send "keyed" : false, the else path is executed, so it's not really dead code. We decided to make the object form the default (i.e. "keyed" : true) as we believe it's probably the form most ppl would like to get back. Like with other aggs, we did leave an option to get the percentiles as an array of values.
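For reference, a request that takes the else path could be built roughly like this in Python. This is only a sketch: the field name `load_time` and the aggregation name `my_percentiles` are placeholders, not something from this thread's data.

```python
import json

# Request body asking for the non-keyed (array) form of the percentiles.
# "load_time" is a placeholder field name used only for illustration.
request_body = {
    "aggs": {
        "my_percentiles": {
            "percentiles": {
                "field": "load_time",
                "keyed": False,
            }
        }
    }
}

# Serialized form that would be sent as the search request body.
request_json = json.dumps(request_body)
```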

Mpdreamz · 2014-04-18T12:44:55Z

Ok, my bad. I was thrown off because keyed usually defaults to false (e.g. in the range aggregation).

keyed also usually specifies the behaviour of an inner property (e.g. the buckets property inside a range aggregation), whereas with percentiles it controls how the entire aggregation is returned.

More specifically:

"aggregations": {
      "myagg": [
         {
            "key": 1,
            "value": 60.4
         },
         {
            "key": 5,
            "value": 62
         },
         {
            "key": 25,
            "value": 70
         },

For keyed:false responses I would much rather see it return

"aggregations": {
      "myagg": {
         "values": [
            {
               "key": 1,
               "value": 60.4
            },
            {
               "key": 5,
               "value": 62
            },
            {
               "key": 25,
               "value": 70
            }
         ]
      }
}

All other aggregations follow the pattern "name_of_agg": <start_of_object>, even simple metrics such as min/max:

"max" : {
    "value": 10
}

The way nonkeyed percentiles are implemented right now feels like this:

"max" : 10

And (as far as I could tell) non keyed percentiles are the only ones breaking the pattern here.

uboness · 2014-04-18T14:25:23Z

yeah.. agree, I think "values" : [] is more consistent and also makes more sense (as it's future proof)

uboness · 2014-05-07T16:02:16Z

We decided to change the response structure: instead of nesting all the percentiles directly under the aggregation name, they are now nested under an intermediate values object (or, when the keyed flag is set to false, under a values array).

This is a breaking change but we feel it's important to make it while percentiles are still considered experimental. The new format is more future proof as it'll allow us to potentially add additional info under the aggregation in a later phase if there'll be a need for it. The new format is also somewhat more consistent with the other metrics aggs.
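As a rough illustration of what this means for clients, the Python sketch below reads both forms of the new structure into the same mapping. The helper name `percentile_values` is hypothetical, and the exact key formatting of the keyed form (e.g. "1.0") is an assumption; the sample numbers are the ones used earlier in this thread.

```python
def percentile_values(agg):
    """Return {percent: value} from a percentiles aggregation in the new format.

    Handles both the keyed form ("values" is an object) and the
    non-keyed form ("values" is an array of key/value entries).
    """
    values = agg["values"]
    if isinstance(values, dict):
        # Keyed form: the percent is the property name.
        return {float(percent): value for percent, value in values.items()}
    # Non-keyed form: each entry carries its own "key" and "value".
    return {float(entry["key"]): entry["value"] for entry in values}


# Assumed shape of the keyed form.
keyed = {"values": {"1.0": 60.4, "5.0": 62, "25.0": 70}}

# Non-keyed form, as proposed in this thread.
non_keyed = {
    "values": [
        {"key": 1, "value": 60.4},
        {"key": 5, "value": 62},
        {"key": 25, "value": 70},
    ]
}
```

Either way the aggregation name now maps to an object, so any extra info added under it later would not change how `values` is read.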


jpountz added v1.2.0 labels May 2, 2014

uboness self-assigned this May 7, 2014

uboness mentioned this issue May 7, 2014

Changed the respnose structure of the percentiles aggregation #6079

Closed

jpountz added the breaking label May 7, 2014

uboness added a commit that referenced this issue May 7, 2014

Changed the respnose structure of the percentiles aggregation where n…

e0523d3

…ow all the percentiles are placed under a `values` object (or `values` array in case the `keyed` flag is set to `false` Closes #5870

uboness closed this as completed in fc52db1 May 7, 2014

clintongormley added the :Analytics/Aggregations Aggregations label Jun 6, 2015
