Possible memory leak in index query cache #18161
Elasticsearch version: 2.2.2 and 2.3.2
JVM version: 1.8_65 and 1.8_92
OS version: centos 7 (kernel 3.10.0-327.13.1.el7.x86_64)
Description of the problem including expected versus actual behavior:
Steps to reproduce:
After a couple of hours, the cluster becomes unresponsive and a restart is required.
Provide logs (if relevant):
Attached: memory reports.
@wfelipe yes, but what happens when you use the default setting for the search threadpool size, which is ((number of processors * 3) / 2) + 1? You don't mention how many processors you have, but simply unsetting this setting will restore the default. With a high size, if search is struggling for whatever reason, it will just use one of the many threads you have allowed it to use, which will bring the system to its knees. With a reasonable thread pool size, search requests will instead be queued or rejected, keeping the system healthy.
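The default search pool size quoted above can be sketched as a quick calculation. This is a minimal illustration of the formula only; the core counts used are placeholder assumptions, not values from this cluster:

```python
# Back-of-envelope calculation of the default Elasticsearch search
# threadpool size: ((number of processors * 3) / 2) + 1.

def default_search_pool_size(num_processors: int) -> int:
    # Integer arithmetic matching the formula quoted in the comment above.
    return (num_processors * 3) // 2 + 1

# Hypothetical examples (the core counts are assumptions):
print(default_search_pool_size(8))   # -> 13
print(default_search_pool_size(16))  # -> 25
```

The point is that the default scales modestly with hardware, so an explicit oversized setting is usually what lets search saturate a node.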
That's why I want to see what happens to memory usage when the threads setting is the default.
Everything you are describing is a side effect of having too many threads * segments per node. Lucene keeps state in a thread local per segment, which is why you are seeing so many instances of SegmentCoreReaders and CompressingStoredFieldsReader. You should try to reduce the size of the search/get pools and have fewer (larger) segments per node.
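To make the threads * segments effect concrete, here is a rough scaling model. All concrete numbers below (thread count, segment count, per-reader overhead) are invented placeholders for illustration, not measurements from this issue:

```python
# Rough scaling model: Lucene keeps per-segment reader state in a thread
# local, so in the worst case retained memory grows with
# (search threads) x (segments per node) x (state per reader instance).
# All concrete numbers here are illustrative assumptions.

def thread_local_footprint(threads: int, segments: int,
                           bytes_per_reader: int) -> int:
    # Worst case: every search thread has touched every segment.
    return threads * segments * bytes_per_reader

# Hypothetical: 200 search threads, 2,000 segments, ~10 KiB of reader
# state per (thread, segment) pair.
footprint = thread_local_footprint(200, 2_000, 10 * 1024)
print(f"{footprint / 1024**3:.1f} GiB")  # -> 3.8 GiB
```

Halving either the thread count or the segment count halves this worst case, which is why the advice targets both the pool sizes and the segment count.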