Inconsistent return value across metric aggregations when no docs in bucket contain field #29066

Closed
peteharverson opened this issue Mar 14, 2018 · 1 comment · Fixed by #30460


Elasticsearch version (bin/elasticsearch --version):
Version 6.2.2

Description of the problem including expected versus actual behavior:
Different metric aggregations are returning different values when none of the documents in the bucket contain the field used in the aggregation. avg, min and max for example return null, whereas the percentiles agg returns NaN. I would expect the return values to be consistent across aggregations, whether it be null or NaN.

Ran the following aggregations, where some buckets contained docs without the test.sslTime field.

avg agg:

"aggs": {
    "2": {
      "date_histogram": {
        "field": "createdDate",
        "interval": "15m",
        "time_zone": "Europe/London",
        "min_doc_count": 1
      },
      "aggs": {
        "3": {
          "terms": {
            "field": "test.testId.keyword",
            "size": 5,
            "order": {
              "_term": "desc"
            }
          },
          "aggs": {
            "1": {
              "avg": {
                "field": "test.sslTime"
              }
            }
          }
        }
      }
    }
  }

Percentiles agg, to obtain the median:

"aggs": {
    "2": {
      "date_histogram": {
        "field": "createdDate",
        "interval": "15m",
        "time_zone": "Europe/London",
        "min_doc_count": 1
      },
      "aggs": {
        "3": {
          "terms": {
            "field": "test.testId.keyword",
            "size": 5,
            "order": {
              "_term": "desc"
            }
          },
          "aggs": {
            "1": {
              "percentiles": {
                "field": "test.sslTime",
                "percents": [
                  50
                ],
                "keyed": false
              }
            }
          }
        }
      }
    }
  }
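
The two request bodies above differ only in the innermost metric aggregation. As a rough sketch, both can be built from one template in Python; `build_aggs` is a hypothetical helper name introduced here for illustration, while the field names and settings are copied from the requests above.

```python
import json


def build_aggs(metric_name, metric_body):
    """Wrap one metric aggregation in the date_histogram -> terms nesting used above."""
    return {
        "aggs": {
            "2": {
                "date_histogram": {
                    "field": "createdDate",
                    "interval": "15m",
                    "time_zone": "Europe/London",
                    "min_doc_count": 1,
                },
                "aggs": {
                    "3": {
                        "terms": {
                            "field": "test.testId.keyword",
                            "size": 5,
                            "order": {"_term": "desc"},
                        },
                        # Only this innermost aggregation changes between the two requests.
                        "aggs": {"1": {metric_name: metric_body}},
                    }
                },
            }
        }
    }


avg_request = build_aggs("avg", {"field": "test.sslTime"})
median_request = build_aggs(
    "percentiles",
    {"field": "test.sslTime", "percents": [50], "keyed": False},
)

print(json.dumps(median_request, indent=2))
```

Keeping the outer bucketing identical makes it easier to see that any difference in the responses comes from the metric aggregation itself.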

With examples of the responses:

From the avg agg:

        {
          "3": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "1": {
                  "value": null
                },
                "key": "VAL1",
                "doc_count": 6
              }
            ]
          },
          "key_as_string": "2018-02-03T11:45:00.000Z",
          "key": 1517658300000,
          "doc_count": 6
        }

and from the percentiles agg:

        {
          "3": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "1": {
                  "values": [
                    {
                      "key": 50,
                      "value": "NaN"
                    }
                  ]
                },
                "key": "VAL1",
                "doc_count": 6
              }
            ]
          },
          "key_as_string": "2018-02-03T11:45:00.000Z",
          "key": 1517658300000,
          "doc_count": 6
        }
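
Until the return values are consistent, a client has to treat both `null` and `"NaN"` as "no value". The following is a minimal Python sketch of that normalization, assuming the bucket shapes shown in the responses above; `normalize_metric` and `bucket_values` are hypothetical helper names, not part of any Elasticsearch client.

```python
import math


def normalize_metric(value):
    """Return None for any 'no value' marker (null or NaN), else a float."""
    if value is None:
        return None
    number = float(value)  # float("NaN") parses to nan
    return None if math.isnan(number) else number


def bucket_values(bucket, agg_name="1"):
    """Extract normalized values from a single-value or non-keyed percentiles result."""
    result = bucket[agg_name]
    if "values" in result:  # percentiles with "keyed": false
        return [normalize_metric(entry["value"]) for entry in result["values"]]
    return [normalize_metric(result["value"])]


# Buckets copied from the two responses above.
avg_bucket = {"1": {"value": None}, "key": "VAL1", "doc_count": 6}
pct_bucket = {
    "1": {"values": [{"key": 50, "value": "NaN"}]},
    "key": "VAL1",
    "doc_count": 6,
}

print(bucket_values(avg_bucket))  # [None]
print(bucket_values(pct_bucket))  # [None]
```
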

colings86 added the :Analytics/Aggregations Aggregations label Mar 14, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-search-aggs

colings86 added the >bug label Apr 24, 2018
polyfractal added a commit that referenced this issue Jun 18, 2018
The other metric aggregations (min/max/etc) return `null` as their XContent value and string when nothing was computed (due to empty/missing fields). Percentiles and Percentile Ranks, however, return `NaN`, which is inconsistent and confusing for the user. This fixes the inconsistency by making the aggs return `null`. This applies to both the numeric value and the "as string" value.

Note: like the metric aggs, this does not change the value if fetched directly from the percentiles object, which will return as `NaN`/`"NaN"`. This only changes the XContent output.
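
The distinction in the note can be illustrated with a small Python sketch: the in-memory value stays `NaN`, and only the serialized output becomes `null`. This is an illustration of the described behavior, not the Elasticsearch implementation; `PercentileResult` and `to_response` are hypothetical names.

```python
import json
import math


class PercentileResult:
    """Hypothetical stand-in for a percentiles result; not Elasticsearch code."""

    def __init__(self, percent, value):
        self.percent = percent
        self.value = value  # stays NaN when nothing was computed

    def to_response(self):
        """Build the serialized form, mapping NaN to None (JSON null)."""
        has_value = not math.isnan(self.value)
        return {"key": self.percent, "value": self.value if has_value else None}


empty = PercentileResult(50, float("nan"))
print(math.isnan(empty.value))          # True: direct access still yields NaN
print(json.dumps(empty.to_response()))  # {"key": 50, "value": null}
```
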

While this is a bugfix, it still breaks backwards compatibility (bwc) in a minor way, as the response changes from the prior version.

Closes #29066