
Weird response with range agg on float field #81749

Closed
stratoula opened this issue Dec 15, 2021 · 9 comments
Labels
:Analytics/Aggregations Aggregations >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo)

Comments

@stratoula

stratoula commented Dec 15, 2021

Description of the problem

We found a bug that is easy to reproduce in Kibana with Lens ranges on a float field.
Specifically, we created a float field and applied a range aggregation to it. The request looks like this:

{
  "aggs": {
    "0": {
      "range": {
        "field": "test",
        "ranges": [
          {
            "to": 6,
            "from": 5
          },
          {
            "to": 10.6,
            "from": 6
          }
        ],
        "keyed": true
      }
    }
  },
  "size": 0,
  "fields": [],
  "script_fields": {},
  "stored_fields": [
    "*"
  ],
  "runtime_mappings": {},
  "_source": {
    "excludes": []
  },
  "query": {
    "bool": {
      "must": [],
      "filter": [],
      "should": [],
      "must_not": []
    }
  }
}

Note that the requested ranges include a decimal bound (10.6).
The response we get, however, looks like this:

{
  "id": "FkUzMjU2dEt0VEJpeDNnTmNlSUgwSkEfRnBBb1BSWDdSTzY5U2szU3pPT1hDQToxMjAzNjgwMg==",
  "rawResponse": {
    "took": 0,
    "timed_out": false,
    "_shards": {
      "total": 1,
      "successful": 1,
      "skipped": 0,
      "failed": 0
    },
    "hits": {
      "total": 3,
      "max_score": null,
      "hits": []
    },
    "aggregations": {
      "0": {
        "buckets": {
          "5.0-6.0": {
            "from": 5,
            "to": 6,
            "doc_count": 1
          },
          "6.0-10.600000381469727": {
            "from": 6,
            "to": 10.600000381469727,
            "doc_count": 2
          }
        }
      }
    }
  },
  "isPartial": false,
  "isRunning": false,
  "total": 1,
  "loaded": 1,
  "isRestored": false
}

Note that the second bucket in the response does not match the request: its key and `to` value are 10.600000381469727 instead of the requested 10.6.

Elasticsearch version (bin/elasticsearch --version):
7.16

Steps to reproduce:

  1. Create an index and map the field to float
PUT myindex 
{
 "mappings": {
   "properties": {
     "test": {
       "type": "float"
     }
   }
 }  
}
PUT myindex/_bulk
{"index": {"_id": "1"}}
{"test": 10}
{"index": {"_id": "2"}}
{"test": 5.5}
{"index": {"_id": "3"}}
{"test": 6.7}
  2. Get the range agg with ranges that have decimal numbers
  3. Check the response
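
The "check the response" step can be sketched in a few lines of Python. This is only an illustrative helper, not part of Kibana or Elasticsearch: the function name `mismatched_ranges` is hypothetical, and the data is copied from the request and response shown above.

```python
# Requested ranges and returned buckets, copied from the report above.
requested = [{"from": 5, "to": 6}, {"from": 6, "to": 10.6}]
buckets = {
    "5.0-6.0": {"from": 5, "to": 6, "doc_count": 1},
    "6.0-10.600000381469727": {"from": 6, "to": 10.600000381469727, "doc_count": 2},
}


def mismatched_ranges(requested, buckets):
    """Return the requested ranges whose exact bounds appear in no returned bucket.

    Hypothetical helper for illustration only.
    """
    returned = {(b.get("from"), b.get("to")) for b in buckets.values()}
    return [r for r in requested if (r.get("from"), r.get("to")) not in returned]


print(mismatched_ranges(requested, buckets))
# prints [{'from': 6, 'to': 10.6}]
```

Only the range with the 10.6 bound is flagged; the 5 to 6 range comes back unchanged.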

In 7.15.2 this worked fine. I think the problem was introduced in 7.16. (I reproduced it on a 7.16.1 instance.)

@stratoula stratoula added >bug needs:triage Requires assignment of a team area label labels Dec 15, 2021
@romseygeek romseygeek added :Analytics/Aggregations Aggregations and removed needs:triage Requires assignment of a team area label labels Dec 15, 2021
@elasticmachine elasticmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Dec 15, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

@salvatore-campagna
Contributor

salvatore-campagna commented Dec 15, 2021

Maybe this is a consequence of #78932?

@salvatore-campagna
Contributor

salvatore-campagna commented Dec 15, 2021

I tried to reproduce the issue with two unit tests, one using floats and one using doubles. With doubles the keys are "correct" ("5.0-6.0" and "6.0-10.6"), while with floats they are not.
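
The float/double difference can be illustrated outside Elasticsearch. The following is a minimal Python sketch, assuming the range bounds are rounded to 32-bit floats and later widened back to 64-bit doubles before the key is built; `to_float32` is a hypothetical helper and this is not the actual unit test.

```python
import struct


def to_float32(value: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float, then widen it back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


for bound in (5.0, 6.0, 10.6):
    widened = to_float32(bound)
    # 5.0 and 6.0 are exactly representable as 32-bit floats; 10.6 is not.
    print(bound, widened, widened == bound)
```

Under that assumption, 5.0 and 6.0 survive the round trip unchanged, while 10.6 comes back as 10.600000381469727, the value seen in the bucket key.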

salvatore-campagna added a commit to salvatore-campagna/elasticsearch that referenced this issue Dec 16, 2021
Keys derived from float values seem incorrect due to some rounding
issue and they do not always match the original ranges as specified
in the aggregation query. Ideally the user expects keys in the response
to match ranges in the request. When using double values, keys match
expected values.

These tests cover a bug reported in elastic#81749.
salvatore-campagna added a commit to salvatore-campagna/elasticsearch that referenced this issue Dec 17, 2021
The idea is to generate the key using the `from` and `to` values
from double values instead of using the stored float values.
The stored float values can still be used for all bucket comparisons.

This fix addresses issue elastic#81749.
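
The idea in the commit message above can be sketched as follows. This is a hypothetical Python illustration, not the Elasticsearch implementation: the class `FloatRange`, its fields, and `to_float32` are invented for the example. It builds the key from the original double bounds and uses the float-rounded bounds only for bucket comparisons.

```python
import struct
from dataclasses import dataclass


def to_float32(value: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float, then widen it back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


@dataclass(frozen=True)
class FloatRange:
    """Hypothetical range over a float field; keeps the bounds as requested."""

    original_from: float
    original_to: float

    @property
    def key(self) -> str:
        # Key is generated from the original (double) bounds in the request.
        return f"{float(self.original_from)}-{float(self.original_to)}"

    def contains(self, stored_value: float) -> bool:
        # Comparisons use float-rounded bounds, matching how the field is stored.
        # The range is inclusive of `from` and exclusive of `to`.
        return to_float32(self.original_from) <= stored_value < to_float32(self.original_to)


second = FloatRange(6, 10.6)
print(second.key)
print([v for v in (10, 5.5, 6.7) if second.contains(to_float32(v))])
```

With the documents from the reproduction steps, the key reads "6.0-10.6" and the bucket still matches the values 10 and 6.7, so the doc count of 2 is unaffected.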
@salvatore-campagna
Contributor

All tests are green now. I will proceed with the backport as soon as the review is completed. I will follow up with the team so that we can hopefully have the fix merged on Monday or Tuesday. I plan to backport this to 7.16, 7.17 and 8.0.

@imotov
Contributor

imotov commented Dec 24, 2021

Backporting to 7.17 and 8.0.0 is enough.

@prakx87

prakx87 commented Jan 5, 2022

We got the same issue in elasticsearch:7.16.2.

@knorpsiu

I have this issue too. Strangely, if I use a range of {"from": 0.0, "to": 0.5} (and another {"from": 0.5}) it works...
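
A likely explanation, assuming the bounds are rounded to 32-bit floats as discussed earlier in this thread: 0.0 and 0.5 are exactly representable in binary floating point, so they survive the rounding unchanged, while 10.6 does not. A small Python check (the helper name `survives_float32` is hypothetical):

```python
import struct


def survives_float32(value: float) -> bool:
    """True if rounding to a 32-bit float and widening back leaves the value unchanged."""
    return struct.unpack("<f", struct.pack("<f", value))[0] == value


print(survives_float32(0.0), survives_float32(0.5), survives_float32(10.6))
# prints True True False
```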

@salvatore-campagna
Contributor

salvatore-campagna commented Jan 12, 2022

Backporting to versions 7.17.0 and 8.0.0 completed 🥳

@salvatore-campagna
Contributor

Closing this now that the fix in #81801 has been merged and backported to versions 7.17.0 and 8.0.0.
