java.lang.OutOfMemoryError: Java heap space #35482

Closed
koala775 opened this issue Nov 13, 2018 · 8 comments

koala775 commented Nov 13, 2018

Describe the feature:
When a nested aggregation in the query DSL returns a large amount of data, Elasticsearch fails with the error: java.lang.OutOfMemoryError: Java heap space.

Elasticsearch version (bin/elasticsearch --version):
6.3.0

Plugins installed: []

JVM version (java -version):
jdk_1.8.152

OS version (uname -a if on a Unix-like system):
linux
Description of the problem including expected versus actual behavior:
1. With no operations running, why is ramCurrent already full?
[image]
2. When I then run query DSL requests frequently, Elasticsearch throws the error: java.lang.OutOfMemoryError: Java heap space
3. What does ramCurrent include?
4. Which memory do aggregations use: the fielddata cache, the query cache, or the request cache?

Steps to reproduce:

Please include a minimal but complete recreation of the problem, including
(e.g.) index creation, mappings, settings, query etc. The easier you make for
us to reproduce it, the more likely that somebody will take the time to look at it.

1. Query the node state.
2. Run the query DSL request frequently.
3. Elasticsearch throws the error.

Provide logs (if relevant):

@elasticmachine
Collaborator

Pinging @elastic/es-search-aggs

@jpountz
Contributor

jpountz commented Nov 13, 2018

> When I use nested aggregation of dsl return large amount of data

Can you please share what the request looks like?

@koala775
Author

koala775 commented Nov 14, 2018

> > When I use nested aggregation of dsl return large amount of data
>
> Can you please share what the request looks like?

GET index/_search
{
  "query": {
    "bool": {
      "must_not": {
        "term": {"xxx.keyword": "xxx"}
      },
      "filter": [
        {
          "range": {
            "crt_dt": {"gt": "2018-01-14T09:00:00.000Z", "lte": "2018-01-14T10:00:00.000Z"}
          }
        }
      ]
    }
  },
  "size": 0,
  "aggregations": {
    "time": {
      "date_histogram": {
        "field": "date",
        "format": "yyyy-MM-dd kk:mm:ss",
        "interval": "1m",
        "time_zone": "Asia/Shanghai",
        "order": {"_key": "desc"},
        "offset": "+8h",
        "min_doc_count": 1
      },
      "aggregations": {
        "name": {
          "terms": {
            "script": {
              "source": "doc['xxx.keyword'].value + ';'+doc['xxx.keyword'].value + ';'+doc['xxx.keyword'].value+';'+doc['xxx.keyword'].value+';'+doc['xxx'].value",
              "lang": "painless"
            },
            "size": 1000000
          },
          "aggregations": {
            "ava": {
              "avg": {
                "field": "ava"
              }
            },
            "success": {
              "sum": {
                "field": "success"
              }
            },
            "fail": {
              "sum": {
                "field": "fail"
              }
            },
            "max": {
              "max": {
                "field": "max"
              }
            },
            "min": {
              "min": {
                "field": "min"
              }
            }
          }
        }
      }
    }
  }
}

@dliappis added the :Analytics/Geo label and removed the feedback_needed and :Analytics/Aggregations labels on Dec 19, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-analytics-geo

@imotov
Contributor

imotov commented Dec 20, 2018

That's a terms aggregation that is trying to get back 1 million buckets within a date histogram aggregation, so it seems like an aggs-related circuit-breaker issue to me. @dliappis was it relabeled as :Analytics/Geo by mistake? I cannot find anything geo-related here. Could you clarify?
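
To put a rough number on that, here is a back-of-the-envelope sketch in Python. It is an upper bound, not a measurement. It assumes the matched documents span a single hour, so that `"interval": "1m"` yields at most 60 histogram buckets. That is an assumption, because the request filters on `crt_dt` but builds the histogram on `date`. The helper name `worst_case_buckets` is made up for illustration.

```python
def worst_case_buckets(histogram_buckets: int, terms_size: int) -> int:
    """Upper bound on buckets when a terms agg is nested under a date_histogram.

    Counts the histogram buckets themselves plus up to `terms_size`
    terms buckets inside each of them.
    """
    return histogram_buckets + histogram_buckets * terms_size


# Assumption: one hour of data at a 1-minute interval -> at most 60 buckets.
minute_buckets = 60
terms_size = 1_000_000   # "size" of the terms aggregation in the request
metrics_per_bucket = 5   # avg, sum, sum, max, min sub-aggregations

buckets = worst_case_buckets(minute_buckets, terms_size)
metric_values = minute_buckets * terms_size * metrics_per_bucket

print(f"up to {buckets:,} buckets and {metric_values:,} metric values")
```

The real count depends on how many distinct script values occur in each minute, so it may be far lower. The point is only that the request permits tens of millions of buckets.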

@dliappis
Contributor

@imotov thanks for the clarification. I did have a chat with @mayya-sharipova about the right team, but I think the geo label was applied by mistake. I'll relabel it as :Analytics/Aggregations.

@dliappis added the :Analytics/Aggregations label and removed the :Analytics/Geo label on Dec 21, 2018
@polyfractal
Contributor

I suspect this is just a case of #28220, where our agg framework does a poor job estimating the memory cost of each agg type. It underestimates some aggs and overestimates others.

The newish search.max_buckets setting will help, but I agree we should spend some time making the circuit-breaker estimates for aggs better.
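
For anyone who wants to try the setting, here is a minimal Python sketch that builds a cluster settings update for `search.max_buckets`. It only constructs the request and does not send it. The base URL `http://localhost:9200`, the limit of 10000, and the helper name `build_max_buckets_request` are assumptions for illustration; pick a limit that fits your own heap and workload.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: a local, unsecured node
MAX_BUCKETS = 10000               # illustrative limit, not a recommendation


def build_max_buckets_request(base_url: str, max_buckets: int) -> urllib.request.Request:
    """Build (but do not send) a PUT _cluster/settings request."""
    # "transient" settings do not survive a full cluster restart.
    body = {"transient": {"search.max_buckets": max_buckets}}
    return urllib.request.Request(
        url=f"{base_url}/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_max_buckets_request(ES_URL, MAX_BUCKETS)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To apply it against a running cluster: urllib.request.urlopen(request)
```

With a limit in place, a search that tries to create more buckets than allowed should be rejected with an error rather than being left to exhaust the heap.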

@polyfractal
Contributor

Closing. I think this was another case of global ords doc value iterators (à la #36090). Recent PRs #43091 and #43339 should help in this situation, and the search.max_buckets setting should prevent overly abusive aggs from being run.
