
Histogram Aggregation accepts Dates and Date Intervals #23193

Closed
pickypg opened this issue Feb 15, 2017 · 7 comments
Labels
:Analytics/Aggregations Aggregations >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo)

Comments

@pickypg
Member

pickypg commented Feb 15, 2017

While working with an example data set, I watched a user accidentally fire off a histogram aggregation against a date field with a date interval:

PUT /test/type/1
{
  "date": "2017-02-15T17:39:00.000Z"
}

PUT /test/type/2
{
  "date": "2017-02-15T17:40:00.000Z"
}

GET /test/_search
{
  "size": 0,
  "aggs": {
    "my_histogram": {
      "histogram": {
        "field": "date",
        "interval": "1d"
      }
    }
  }
}

A few things of interest:

  1. I would expect the request to be rejected, because the interval is invalid for a histogram agg.
  2. Instead the interval is ignored and the value defaults to 1. In the above example this creates around 60,000 buckets, because the two dates are one minute (60,000 milliseconds) apart and date fields are really epoch-millisecond numbers behind the scenes.
    • For the above example, the "took" time is 8ms, but the actual time to receive the response is disturbingly long because of the number of buckets.
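The bucket math above is easy to check directly: date fields are stored as epoch milliseconds, so two timestamps one minute apart span 60,000 distinct numeric values, each of which becomes its own bucket with an interval of 1. A small standalone sketch (mine, not code from this issue):

```java
import java.time.Instant;

public class BucketMath {
    public static void main(String[] args) {
        long t1 = Instant.parse("2017-02-15T17:39:00.000Z").toEpochMilli();
        long t2 = Instant.parse("2017-02-15T17:40:00.000Z").toEpochMilli();
        // With a numeric interval of 1, every millisecond between the min
        // and max value gets its own bucket (empty buckets included).
        long buckets = (t2 - t1) + 1;
        System.out.println(buckets); // 60001
    }
}
```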
@ChineseElectricPanda

Hi, I've had a look into this issue, and it seems the interval is being parsed as a numerical value: in this case, 1d is parsed as (double) 1. The same happens if 1f is entered instead. Trying the request with an interval of 1000d results in a bucket size of 1000 (one second), so I don't think it's defaulting to 1; it's just parsing 1d as 1. A differently formatted time interval (like 24h) returns a parsing exception as expected.

I'm new to the Elasticsearch project, so I'm not sure whether it is intended behavior for the histogram agg to parse interval strings with a d or f suffix as numbers. The parser could be changed to reject any interval value containing letters, but that would probably break other legitimate use cases.
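The parsing behavior described above can be reproduced outside Elasticsearch, since it comes straight from Java's `Double.parseDouble` (a standalone sketch, not code from the Elasticsearch parser):

```java
public class IntervalParsing {
    public static void main(String[] args) {
        System.out.println(Double.parseDouble("1d"));    // 1.0    ("d" suffix = double literal)
        System.out.println(Double.parseDouble("1f"));    // 1.0    ("f" suffix = float literal)
        System.out.println(Double.parseDouble("1000d")); // 1000.0 (one second, in millis)
        try {
            Double.parseDouble("24h");                   // "h" is not a valid suffix
        } catch (NumberFormatException e) {
            System.out.println("24h -> NumberFormatException");
        }
    }
}
```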

@ppf2
Member

ppf2 commented Nov 13, 2017

+1 We encountered a similar incident where it caused ES to OOM.

          "histogram" : {
            "field" : "crawl_date",
            "interval" : 1
          }

@markharwood
Contributor

Since #27581, Elasticsearch should now have the necessary safeguards to prevent an OOM situation.

Generally speaking, we strive for consistency in our APIs, and "1d" is a valid number; having said that, it is also a very common choice of interval for date histograms. The consequences of choosing the wrong kind of histogram should no longer be catastrophic (no OOMs), but the question remains how many users would know that the next step is to pick a date_histogram rather than a regular histogram.
Maybe the correct answer here is to reject the search if the target field is a date field and the chosen interval uses a date-like expression. Or maybe just reject histogram aggs on date fields completely?

cc @elastic/es-search-aggs
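The first suggestion above could be sketched as an up-front validation step. This is entirely hypothetical (the method and parameter names are illustrative, not Elasticsearch code): reject a plain histogram when the target field is a date and the raw interval string looks like a date-math expression.

```java
public class IntervalValidation {
    // Hypothetical helper: fieldType and rawInterval are illustrative names.
    static void validateHistogramInterval(String fieldType, String rawInterval) {
        // A number followed by a date-math unit, e.g. "1d", "24h", "30m".
        boolean dateLike = rawInterval.matches("\\d+(ms|s|m|h|d|w|M|y)");
        if ("date".equals(fieldType) && dateLike) {
            throw new IllegalArgumentException(
                "interval [" + rawInterval + "] looks like a date interval; "
                + "use a date_histogram aggregation for date fields");
        }
    }

    public static void main(String[] args) {
        validateHistogramInterval("double", "1d"); // fine: 1d is a valid double
        try {
            validateHistogramInterval("date", "1d");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```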

@rjernst rjernst added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label May 4, 2020
@polyfractal
Contributor

I had a closer look at this to see if it still affects master. The answer is "yes", but the reason for it is actually subtle.

As it turns out, the histogram aggregation doesn't support date intervals. For example, this fails:

{
  "size": 0,
  "aggs": {
    "my_histogram": {
      "histogram": {
        "field": "date",
        "interval": "1m"
      }
    }
  }
}

{
  "error" : {
    "root_cause" : [
      {
        "type" : "x_content_parse_exception",
        "reason" : "[7:21] [histogram] failed to parse field [interval]"
      }
    ],
    "type" : "x_content_parse_exception",
    "reason" : "[7:21] [histogram] failed to parse field [interval]",
    "caused_by" : {
      "type" : "number_format_exception",
      "reason" : "For input string: \"1m\""
    }
  },
  "status" : 400
}

The reason 1d works is that we use Java's string-to-double conversion, and it accepts various numeric conventions: x introduces hexadecimal notation (e.g. 0x1p0), e or E represents scientific notation, and so on. Those conventions also say that a d/D suffix means "double" and an f/F suffix means "float".

So "1d" is a valid numerical interval, because it means "1.0 as a double".
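The breadth of those conventions is visible with a few more plain-Java inputs (again outside Elasticsearch):

```java
public class ParseConventions {
    public static void main(String[] args) {
        System.out.println(Double.parseDouble("1e3"));   // 1000.0 (scientific notation)
        System.out.println(Double.parseDouble("0x1p4")); // 16.0   (hex significand, binary exponent: 1 * 2^4)
        System.out.println(Double.parseDouble("1D"));    // 1.0    (uppercase suffix also accepted)
        System.out.println(Double.parseDouble("1F"));    // 1.0
    }
}
```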

I'm honestly not sure how we would fix this in a sane manner... and if it's even a thing we want to "fix" (since it'd break anyone who is using those legitimate conventions). I'll mark this as team-discuss... but I'm thinking we might just want to document the behavior and close.

@polyfractal
Contributor

We discussed this today in our team meeting:

  • We don't think we can "fix" this behavior, since it is also valid elsewhere (e.g. you can specify 1d when indexing a double field, and it is interpreted as 1 (a double)).
  • We decided to document the behavior with a big warning so that users are aware.
  • We considered deprecating/removing the ability to run histogram on date fields, which would fix the issue in a different way. But we decided there are more egregious combinations that need fixing first (date_histo on boolean fields), and people not uncommonly interchange dates and longs-as-millis, so this could be a controversial deprecation. It's not a huge maintenance burden, so it's probably not worth deprecating at this time.

@bzick

bzick commented Feb 4, 2021

Can you add a message to the log about a large number of buckets?

Some ES clients make it easy to send this kind of aggregation by mistake. Finding the error in a large cluster is very difficult and time-consuming; a log message could simplify diagnosing the problem.

This bug led to the failure of the entire cluster. :(

@not-napoleon
Member

We discussed this again on 2022-01-12. This behavior is confusing, but it's also essentially intended. There are many ways that aggregations can generate too many buckets and run into trouble, and we do our best to find and fix those (e.g. by adding more things to our memory circuit breakers). We aren't going to change the parsing behavior and we aren't going to drop support for running numeric histograms on date fields, so I don't see anything actionable on this issue. Closing based on that assessment.
