missing value not considered in min/max aggregations #48905

Closed
mattweber opened this issue Nov 7, 2019 · 8 comments · Fixed by #48970
@mattweber mattweber commented Nov 7, 2019

The missing value is not considered in min/max aggregations in es7 as it was in previous versions, and I don't see this documented as a breaking change. I believe this is due to the optimization that uses segment min/max values.

To reproduce, run the following against es6 and es7. The missing values should be returned as the min and max value of the aggregation.

curl -XPUT -H"Content-Type: application/json" 'localhost:9200/testmissing/_doc/1' -d '{"title": "with value", "value": 1}'
curl -XPUT -H"Content-Type: application/json" 'localhost:9200/testmissing/_doc/2?refresh' -d '{"title": "missing value"}'
curl -XPOST -H"Content-Type: application/json" 'localhost:9200/testmissing/_search?pretty' -d '{
    "size": 0,
    "aggs": {
        "min_missing": {
            "min": {
                "field": "value",
                "missing": -1
            }
        },
        "max_missinng": {
            "max": {
                "field": "value",
                "missing": 2
            }
        }
    }
}'
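
For reference, the expected behaviour of `missing` can be modeled in a few lines of Python. This is only an illustrative sketch, not Elasticsearch code: `expected_min_max` is a hypothetical helper, and it assumes that a document without the field is treated as if it held the `missing` value.

```python
def expected_min_max(docs, field, missing_for_min, missing_for_max):
    """Model min/max aggregations where docs lacking `field` use a `missing` value.

    Hypothetical helper for illustration only; not part of any Elasticsearch API.
    """
    def values(doc, missing):
        v = doc.get(field)
        if v is None:
            # Field absent: substitute the configured `missing` value.
            return [missing]
        # Accept both single values and arrays of values.
        return v if isinstance(v, list) else [v]

    lo = min(x for d in docs for x in values(d, missing_for_min))
    hi = max(x for d in docs for x in values(d, missing_for_max))
    return float(lo), float(hi)


# Same two documents as in the curl commands above.
docs = [
    {"title": "with value", "value": 1},
    {"title": "missing value"},
]
print(expected_min_max(docs, "value", -1, 2))  # (-1.0, 2.0)
```

Under that assumption the search above should report -1.0 for `min_missing` and 2.0 for `max_missinng`.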

@elasticmachine elasticmachine commented Nov 8, 2019

Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)

@imotov imotov commented Nov 8, 2019

Which version of Elasticsearch did you test it on? I just tried it on 7.4.2 and I am getting:

{
  "took" : 12,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "max_missinng" : {
      "value" : 2.0
    },
    "min_missing" : {
      "value" : -1.0
    }
  }
}

What am I missing? (some pun intended)

@mattweber mattweber commented Nov 8, 2019

You are right, it is fixed on 7.4.2. I was testing on 7.4.1, thanks!

@mattweber mattweber closed this Nov 8, 2019
@mattweber mattweber reopened this Nov 8, 2019

@mattweber mattweber commented Nov 8, 2019

@imotov Actually no, the issue still exists in 7.4.2. It appears to be inconsistent: run the following a couple of times, deleting the index between runs. Min should be -1 and max 10.

curl -XPUT -H"Content-Type: application/json" 'localhost:9200/testmissing/_doc/1' -d '{"title": "with values", "value": [2, 3, 5]}'
curl -XPUT -H"Content-Type: application/json" 'localhost:9200/testmissing/_doc/2' -d '{"title": "with values", "value": [7, 1]}'
curl -XPUT -H"Content-Type: application/json" 'localhost:9200/testmissing/_doc/3' -d '{"title": "with values", "value": [8, 2, 5, 4]}'
curl -XPUT -H"Content-Type: application/json" 'localhost:9200/testmissing/_doc/4?refresh' -d '{"title": "missing value"}'
curl -XPOST -H"Content-Type: application/json" 'localhost:9200/testmissing/_search?pretty' -d '{
    "size": 0,
    "aggs": {
        "min_missing": {
            "min": {
                "field": "value",
                "missing": -1
            }
        },
        "max_missinng": {
            "max": {
                "field": "value",
                "missing": 10
            }
        }
    }
}'

I actually have the above in code using ESIntegTestCase that reproduces every time I run it.

@imotov imotov commented Nov 8, 2019

Could you share this ESIntegTestCase? I cannot reproduce it with rest commands.

@mattweber mattweber commented Nov 9, 2019

@imotov imotov self-assigned this Nov 11, 2019
@imotov imotov added the >bug label Nov 11, 2019

@imotov imotov commented Nov 11, 2019

@mattweber Thanks! I was able to reproduce it using your test and @polyfractal suggested the fix. It turned out that reproduction depends on the distribution of documents across segments. That's why I wasn't able to reproduce it in Kibana. I am going to open a PR soon.
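
A toy Python model of why segment layout could matter. It rests on an assumption that this thread does not confirm in detail: that the optimized path takes a segment's indexed minimum directly and skips per-document collection whenever the segment holds any indexed values for the field, falling back to normal collection only for segments with none. The names `segment_min` and `agg_min` are hypothetical, and none of this is Elasticsearch code.

```python
def segment_min(segment, missing, use_shortcut):
    """Return the minimum that a single segment contributes.

    `segment` is a list of per-document value lists; [] means the field is absent.
    """
    indexed = [v for doc in segment for v in doc]
    if use_shortcut and indexed:
        # Assumed shortcut: trust the segment-level minimum of indexed values
        # and skip per-document collection, so `missing` is never applied.
        return min(indexed)
    # Normal path: visit every document, substituting `missing` when absent.
    return min(v for doc in segment for v in (doc or [missing]))


def agg_min(segments, missing, use_shortcut):
    """Combine per-segment minimums into the final aggregation value."""
    return min(segment_min(s, missing, use_shortcut) for s in segments)


# Values from the second reproduction; the last document has no value.
docs = [[2, 3, 5], [7, 1], [8, 2, 5, 4], []]

one_segment = [docs]                   # missing doc shares a segment with values
split_segments = [docs[:3], docs[3:]]  # missing doc sits alone in its own segment

print(agg_min(one_segment, -1, use_shortcut=True))     # 1
print(agg_min(split_segments, -1, use_shortcut=True))  # -1
print(agg_min(one_segment, -1, use_shortcut=False))    # -1
```

In this model the same four documents give 1 or -1 depending only on how they are grouped into segments, which would be consistent with the result changing between runs.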

@mattweber mattweber commented Nov 11, 2019

Great! Thanks for working on it!

imotov added a commit to imotov/elasticsearch that referenced this issue Nov 11, 2019
Fixes the issue when the missing values can be ignored in min/max
due to BKD optimization.

Fixes elastic#48905
imotov added a commit that referenced this issue Nov 12, 2019
Fixes the issue when the missing values can be ignored in min/max
due to BKD optimization.

Fixes #48905
imotov added a commit that referenced this issue Nov 13, 2019
Fixes the issue when the missing values can be ignored in min/max
due to BKD optimization.

Fixes #48905
debadair added a commit to debadair/elasticsearch that referenced this issue Nov 13, 2019
Fixes the issue when the missing values can be ignored in min/max
due to BKD optimization.

Fixes elastic#48905
imotov added a commit that referenced this issue Nov 15, 2019
Fixes the issue when the missing values can be ignored in min/max
due to BKD optimization.

Fixes #48905