Ability to assume missing field as zero in aggregations #5298

bobrik · 2014-02-28T15:14:41Z

To keep docs reasonably small we omit fields that have a zero value, but when we use the avg or extended_stats aggregation it would be nice to have missing values treated as zeroes too.

In the example below we have 31 970 816 docs in the bucket, but only 7 310 of them have a non-zero value, so the stats are computed over those 7 310 docs only.

{
   "aggregations": {
      "country": {
         "buckets": [
            {
               "key": "RU",
               "doc_count": 31970816,
               "cents": {
                  "count": 7310,
                  "min": 8,
                  "max": 169800,
                  "avg": 514.1964432284542,
                  "sum": 3758776,
                  "sum_of_squares": 60978796462,
                  "variance": 8077434.639111836,
                  "std_deviation": 2842.0827994820693
               }
            }
         ]
      }
   }
}
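
To make the gap concrete, here is a rough sketch that recomputes the stats from the sums in the response above, once over the 7 310 docs that have a value and once over all 31 970 816 docs in the bucket with missing values counted as zero. The helper name `stats_from_sums` is made up for illustration, and the population-variance formula is an assumption that happens to reproduce the reported numbers. Zeroes add nothing to `sum` or `sum_of_squares`, so only the divisor changes.

```python
import math


def stats_from_sums(total, sum_of_squares, count):
    """Derive avg, variance and std deviation from running sums.

    Hypothetical helper: uses the population variance E[x^2] - E[x]^2.
    """
    mean = total / count
    variance = sum_of_squares / count - mean ** 2
    return {
        "count": count,
        "avg": mean,
        "variance": variance,
        "std_deviation": math.sqrt(variance),
    }


# Values copied from the "cents" stats in the response above.
TOTAL = 3758776
SUM_OF_SQUARES = 60978796462

# Current behaviour: only docs that actually have the field are counted.
reported = stats_from_sums(TOTAL, SUM_OF_SQUARES, 7310)

# Requested behaviour: every doc in the bucket counts, missing treated as 0.
zero_filled = stats_from_sums(TOTAL, SUM_OF_SQUARES, 31970816)

print(reported["avg"], zero_filled["avg"])
```

With zero-filling the average drops from about 514 cents to well under one cent per doc, which is the number we actually care about here. The min would also become 0 as soon as one doc lacks the field.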

Maybe an additional boolean parameter could be introduced for the extended_stats and avg aggregations, like assume_zeroes?

cc @uboness


roytmana · 2014-03-01T00:35:36Z

I was about to submit a request on this too.
I would suggest that any aggregation operating on a field should have a missing option. If specified, the aggregation should accumulate documents with missing values under that value and honor any nested aggregations within. It should never assume a value like 0, since it may clash with actual keys.

I was planning to show an example of the enormous query needed for a two-level aggregation that has to cover all values, including missing and other, at both levels using the missing aggregation. It can be done, but not only is the query huge and highly repetitive, the result also needs heavy post-processing to move the second-level keys nested under the missing agg into the first-level buckets.

Please please do implement missing as an option in all bucketing aggs!

I am not even asking for an option to also aggregate other - the keys that were left out due to the size parameter - although it would be very useful :-)

jpountz · 2014-09-05T11:03:44Z

I agree the behavior feels wrong with the avg or stats aggregations. Maybe we could support a missing option like sorting does.

jpountz · 2015-05-15T14:34:39Z

Closed through #11042. Most aggs now have a missing option that lets you configure the value to use when a document has no value for the field.
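
For the case from the original report, a request body using the missing option might look roughly like the sketch below. The field names `country` and `cents` are assumptions taken from the aggregation names in the response earlier in this thread, and `build_request` is just an illustrative helper, not part of any client library.

```python
import json


def build_request(missing_value=0):
    """Build a search body that buckets by country and computes
    extended_stats on cents, treating docs without the field as
    `missing_value`. Field names are assumed, not confirmed."""
    return {
        "size": 0,  # only the aggregation results are needed, no hits
        "aggs": {
            "country": {
                "terms": {"field": "country"},
                "aggs": {
                    "cents": {
                        "extended_stats": {
                            "field": "cents",
                            "missing": missing_value,
                        }
                    }
                },
            }
        },
    }


body = build_request()
print(json.dumps(body, indent=3))
```

With `"missing": 0` the count for a bucket should line up with its doc_count, so avg and std_deviation are computed over every doc rather than only the ones that carry the field.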

clintongormley added the discuss label Jul 10, 2014

jpountz added adoptme and removed discuss labels Sep 5, 2014

clintongormley added the :Analytics/Aggregations Aggregations label Dec 29, 2014

jpountz closed this as completed May 15, 2015
