-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LUCENE-9478: Prevent DWPTDeleteQueue from referencing itself and leaking memory #1779
Conversation
…ing memory In LUCENE-9304 we introduced some fixes that unfortunately hold on to the previous DWPTDeleteQueue which is essentially leaking IW memory and cause applications to fail. This fixes the memory leak and adds a test to ensure it's not leaking memory.
@uschindler can you please take a look at the test @mikemccand you should also look please |
WeakAndNext weakAndNext = new WeakAndNext(); | ||
DocumentsWriterDeleteQueue next = weakAndNext.next; | ||
assertNotNull(next); | ||
System.gc(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is terrible but I don't know of a better way of doing this. this at least works on my machine :) and would have found the issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@uschindler @dweiss maybe you have better ideas
lucene/core/src/java/org/apache/lucene/index/DocumentsWriterDeleteQueue.java
Show resolved
Hide resolved
WeakAndNext weakAndNext = new WeakAndNext(); | ||
DocumentsWriterDeleteQueue next = weakAndNext.next; | ||
assertNotNull(next); | ||
System.gc(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked openjdk sources and they use System.gc() with a similar expectation that it will always try to flush all weak refs before returning, so this seems fine. The test itself could maybe be made more explicit by utilizing an explicit ReferenceQueue rather than asserting the reference has been cleared but this is a matter of taste - if this works, it will in both cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cool! I think the simpler the better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also use a ReferenceQueue to do the check. The weak.get() has a bad taste to me.
I did not check that System.gc() tries to cleanup weak refs, but if it does and we can rely on this, I am fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good lord what a subtle memory leak!!! Lambda's closure holding onto this
even though this.nextSeqNo
is already final!! Evil.
Can we bound the expected size of the leak? It is only the tail of the last delete queue on full flush? So you must have a number of deletes pending at full flush (commit, refresh) for this to matter?
Have we seen apps in practice where this is very costly?
I opened mikemccand/luceneutil#78 to figure out why our nightly benchmarks at least didn't notice this ...
Each delete queue would leak around 500 bytes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am OK with this code.
I would really like to see the lambda methods generated by javac. I will do a javap call before and after your patch.
IMHO, the test is fine, but it only tests if there's a reference to this
in order to access nextSeqNo
.
WeakAndNext weakAndNext = new WeakAndNext(); | ||
DocumentsWriterDeleteQueue next = weakAndNext.next; | ||
assertNotNull(next); | ||
System.gc(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also use a ReferenceQueue to do the check. The weak.get() has a bad taste to me.
I did not check that System.gc() tries to cleanup weak refs, but if it does and we can rely on this, I am fine.
This is a limitation of the Java Compiler javac. Whenever you access an instance field from a class (final or not doe snot matter), it created a lamda bound to
As you see it's a non-static method. The generated lambda class therefor has a reference to After your patch, the lambda is static and instead gets one parameter:
The generated anonymous class will have just be bound to the The method signature of both MethodHandles is identical, one parameter ( By analyzing the code we see that issue is fixed, test is not necessarily needed, but it's hard to check the bytecode :-) +1 for the fix! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…ing memory (#1779) In LUCENE-9304 we introduced some fixes that unfortunately hold on to the previous DWPTDeleteQueue which is essentially leaking IW memory and cause applications to fail. This fixes the memory leak and adds a test to ensure its not leaking memory.
…ing memory (#1779) In LUCENE-9304 we introduced some fixes that unfortunately hold on to the previous DWPTDeleteQueue which is essentially leaking IW memory and cause applications to fail. This fixes the memory leak and adds a test to ensure its not leaking memory.
…ing memory (apache#1779) In LUCENE-9304 we introduced some fixes that unfortunately hold on to the previous DWPTDeleteQueue which is essentially leaking IW memory and cause applications to fail. This fixes the memory leak and adds a test to ensure its not leaking memory.
Hello, I'm newbie in Apache Lucene-Solr. Can someone explain me this bug as for kid? How can I reproduce it on my computer? I want to start participate by bug fixing, but at first I wanna check how previous issues was patched. Thanks in advance! |
…ing memory (apache#1779) In LUCENE-9304 we introduced some fixes that unfortunately hold on to the previous DWPTDeleteQueue which is essentially leaking IW memory and cause applications to fail. This fixes the memory leak and adds a test to ensure its not leaking memory. cr https://code.amazon.com/reviews/CR-33310184
In LUCENE-9304 we introduced some fixes that unfortunately hold on to the previous
DWPTDeleteQueue which is essentially leaking IW memory and cause applications to fail.
This fixes the memory leak and adds a test to ensure its not leaking memory.