The real memory usage of LRUQueryCache is 40 times larger than estimated value in _nodes/stats
#89715
Comments
Pinging @elastic/es-core-infra (Team:Core/Infra)
We'll try to reproduce it locally, but a memory dump would be useful to make sure we reproduce it in the same way. On the other hand, if you can reproduce it by running a bash script on an empty cluster, that would be best, because the reproduction would be smaller and easy to post publicly. If you can't, that's fine. I've just sent you an email with a place to upload the heap dump if you'd like to do that.
@nik9000 I have uploaded the heap dump.
OK! I've cracked open the heap dump. This has something to do with the low-level cancellation infrastructure. It may have been fixed in later versions. I'm investigating.
The heap dump appears to come from Elasticsearch 7.10.1. There's been quite a bit of memory work in this area since then. #61788 comes to mind, though it doesn't look quite right. I think there is another one I'm missing. One moment.
Also, #61788 is in 7.10.1, so it can't be that!
Most of the space seems to be going to something that looks like this:
But I don't see a
It looks like there's a member on
In one of our production clusters, the real memory usage of LRUQueryCache can reach 10GB, almost 40 times larger than the estimated value (247MB) in _nodes/stats.
I have run into this problem a few times. It is easy to reproduce when the index is large enough, with a data size of 1TB or more. With term queries that match a large number of docs in the index, LRUQueryCache accumulates entries and consumes much more memory than the estimated value in _nodes/stats.
Elasticsearch Version
7.14
Installed Plugins
none
Java Version
bundled
OS Version
CentOS 6.6 x86_64
Problem Description
Below are the stats of one ES node. indices.query_cache in _nodes/stats is 247MB.
Steps to Reproduce
LRUQueryCache slowly accumulates entries, and memory usage continues to rise to 80%. I have run into this problem a few times. It is easy to reproduce when the index is large enough, with a data size of 1TB or more. With term queries that match a large number of docs in the index, LRUQueryCache accumulates entries and consumes much more memory than the estimated value in _nodes/stats.
Besides (if relevant)
I can provide a memory dump if you need it.
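For anyone who wants to check the same discrepancy on their own node, here is a minimal Python sketch of the comparison described above. It assumes the JSON body returned by GET _nodes/stats/indices/query_cache has already been loaded into a dict, and that the "observed" figure comes from somewhere else, such as the retained size of LRUQueryCache in a heap dump. The helper names and the sample node entry are hypothetical; only the 247MB and 10GB figures come from this report.

```python
MB = 1024 ** 2
GB = 1024 ** 3


def reported_query_cache_bytes(nodes_stats: dict) -> dict:
    """Map node name -> indices.query_cache.memory_size_in_bytes.

    `nodes_stats` is the parsed JSON body of a nodes stats response.
    Nodes without a query_cache section are reported as 0 bytes.
    """
    result = {}
    for node_id, node in nodes_stats.get("nodes", {}).items():
        cache = node.get("indices", {}).get("query_cache", {})
        result[node.get("name", node_id)] = cache.get("memory_size_in_bytes", 0)
    return result


def underestimate_factor(observed_bytes: int, reported_bytes: int) -> float:
    """How many times larger the observed usage is than the reported estimate."""
    if reported_bytes <= 0:
        raise ValueError("reported_bytes must be positive")
    return observed_bytes / reported_bytes


# Hypothetical response fragment using the 247MB estimate from this report.
sample_stats = {
    "nodes": {
        "example-node-id": {
            "name": "example-node",
            "indices": {"query_cache": {"memory_size_in_bytes": 247 * MB}},
        }
    }
}

reported = reported_query_cache_bytes(sample_stats)
# 10GB is the LRUQueryCache usage observed on the production node.
factor = underestimate_factor(10 * GB, reported["example-node"])
print(f"query cache is {factor:.1f}x larger than the reported estimate")
```

With the numbers from this issue the factor comes out a little above 40, which matches the "almost 40 times" in the title.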