ByteBuf Memory Leak #466
@dfjones thanks for the report. You mentioned that you were able to repro this with a test; do you mind sharing it?
@NiteshKant I can't share that test because it relies on too much internal stuff. However, I will try to create an equivalent reproduction of the issue using just RxNetty components. I'll let you know if I'm able to do this or not.
Awesome, thanks!
@NiteshKant Good news: I was able to reproduce the memory leak with a test using just RxNetty. I'll link the modified files here. If it is helpful, the full branch with all the extra debug logging/trace info is here: https://github.com/Squarespace/RxNetty/tree/memleak-debug

I created my test in the rxnetty-examples project, using a modified HelloWorldServer because it appears the response needs to be sent with chunked transfer encoding. A sketch of that setup follows below.
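For reference, here is a minimal sketch of the kind of server the test uses (not the exact files from the branch; the API calls follow the stock 0.4.x HelloWorldServer example from memory, and the port and class name are placeholders). The relevant detail is that nothing sets a Content-Length, so the response goes out with chunked transfer encoding:

```java
import io.netty.buffer.ByteBuf;
import io.reactivex.netty.RxNetty;
import io.reactivex.netty.protocol.http.server.HttpServer;

public final class ChunkedHelloWorldServer {

    public static void main(String[] args) {
        HttpServer<ByteBuf, ByteBuf> server =
                RxNetty.createHttpServer(8080, (request, response) -> {
                    // No Content-Length is set, so the body is sent with
                    // Transfer-Encoding: chunked.
                    response.writeString("Hello ");
                    response.writeString("World!");
                    return response.close();
                });
        server.startAndWait();
    }
}
```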
Forgot to attach a leak report from this new test: |
…nto the ClientRequestResponseConverter so that it can create a UnicastContentSubject with a timeout. This fix addresses issue ReactiveX#466
@NiteshKant I put up the PR linked above. With this patch in place, the test I've shared here no longer produces a memory leak after running for a few minutes. Reading through the code... I couldn't see a reason not to set a timeout on creation of `UnicastContentSubject`.
Potential Fix for Memory Leak Issue #466
Fixed via #468
In production, I found indications of a memory leak, and I believe I have traced it down to the following situation.
In a project using the RxNetty 0.4.x client through Netflix Ribbon/Hystrix, on a box with high CPU contention, it appears that memory can leak when the Hystrix timeout fires before the client can deliver the ByteBuf to the subscriber. I believe this is a sort of race condition in a CPU-starved environment: imagine a race between the Hystrix timer thread and the RxNetty I/O thread. If the timer thread wins, the RxNetty thread eventually gets CPU time and produces a ByteBuf that is never consumed.
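To make the race concrete, here is a minimal, self-contained sketch (RxJava 1.x, matching RxNetty 0.4.x; the delays, allocator, and class name are illustrative stand-ins, not the actual Hystrix/RxNetty code):

```java
import io.netty.buffer.ByteBuf;
import io.netty.buffer.PooledByteBufAllocator;
import io.netty.util.CharsetUtil;

import java.util.concurrent.TimeUnit;

import rx.Observable;
import rx.schedulers.Schedulers;

public final class TimeoutRaceLeakDemo {

    public static void main(String[] args) throws InterruptedException {
        Observable<ByteBuf> content = Observable.create(subscriber ->
                // Simulate a CPU-starved I/O thread: the ByteBuf shows up late.
                Schedulers.io().createWorker().schedule(() -> {
                    ByteBuf buf = PooledByteBufAllocator.DEFAULT.buffer();
                    buf.writeBytes("payload".getBytes(CharsetUtil.UTF_8));
                    subscriber.onNext(buf); // dropped silently if the subscriber is gone
                    subscriber.onCompleted();
                }, 200, TimeUnit.MILLISECONDS));

        content.timeout(100, TimeUnit.MILLISECONDS) // stand-in for the Hystrix timeout
               .subscribe(
                       ByteBuf::release, // the consumer path releases the buffer
                       err -> { /* timeout won the race; the late ByteBuf is never released */ });

        Thread.sleep(500); // give the race time to play out
    }
}
```

When the timeout fires first, the subscriber is unsubscribed, the late `onNext` is silently dropped, and nobody ever calls `release()` on the pooled buffer.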
I noticed that there are safeguards in the form of timeouts in `UnicastContentSubject`; however, I believe there is a bug in establishing these timeouts when a new pooled connection is created.

I was able to reproduce this issue locally in a test that recreates the situation described above. I've augmented the latest 0.4.x RxNetty code to add additional logging/debugging information. I'm posting what I have here, starting with the ByteBuf leak report that is attached:
bytebuf-mem-leak-report.txt
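For context, the safeguard I mean is the subject's "no subscription" timeout: when it is armed, content buffered for a subscriber that never arrives gets disposed, and its ByteBufs released, once the timeout expires. A hedged sketch of the two creation modes (I'm recalling the 0.4.x factory-method names from the source, so treat them as approximate):

```java
import io.netty.buffer.ByteBuf;
import io.reactivex.netty.channel.UnicastContentSubject;

import java.util.concurrent.TimeUnit;

public final class SafeguardSketch {

    public static void main(String[] args) {
        // With the safeguard armed: if nobody subscribes within the timeout,
        // buffered content is disposed and any ByteBufs are released.
        UnicastContentSubject<ByteBuf> withTimeout =
                UnicastContentSubject.create(1, TimeUnit.SECONDS);

        // Without it: content buffered for a subscriber that never shows up
        // (e.g. because Hystrix already timed out) is never released.
        UnicastContentSubject<ByteBuf> withoutTimeout =
                UnicastContentSubject.createWithoutNoSubscriptionTimeout();
    }
}
```

My reading is that the pooled-connection code path ends up on the second variant, which is what lets the leak happen.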
The last touch of the ByteBuf comes from `onNext` in an augmented `UnicastContentSubject`, shown here:
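The linked snippet isn't reproduced in this thread, but the augmentation boils down to logging each `onNext` so it can be correlated with the leak report. A sketch of that idea as a standalone wrapper, with hypothetical names, since the real change lives inside `UnicastContentSubject` itself:

```java
import io.netty.buffer.ByteBuf;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import rx.Observer;

// Hedged sketch of the tracing augmentation; names are illustrative.
final class TracingObserver<T> implements Observer<T> {

    private static final Logger logger = LoggerFactory.getLogger(TracingObserver.class);

    private final Observer<T> delegate;

    TracingObserver(Observer<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public void onNext(T t) {
        if (t instanceof ByteBuf) {
            ByteBuf buf = (ByteBuf) t;
            // Log enough to correlate a log line with the leak report:
            // buffer identity plus its current reference count.
            logger.info("onNext buf={} refCnt={}",
                    System.identityHashCode(buf), buf.refCnt());
        }
        delegate.onNext(t);
    }

    @Override
    public void onCompleted() {
        delegate.onCompleted();
    }

    @Override
    public void onError(Throwable e) {
        delegate.onError(e);
    }
}
```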
Using this information, I'm including some log entries that will hopefully contain some useful trace information (I've added some log statements along the way). See the attached log file:
rxnetty-mem-leak-trace-3-filtered.txt
If there is any more information I can provide, please let me know. I'm also willing to contribute a PR to fix this issue, but I'm at a loss as to what the "right" fix should be. Always creating `UnicastContentSubject` with a subscription timeout seems to fix the issue, but I'm guessing that is undesirable. I haven't been able to figure out why the code that adds the timeout in `RequestProcessingOperator` isn't being called in this situation, but my hunch is that the actual fix lies in there.
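For reference, the merged fix (#468, per the commit message quoted above) plumbs a timeout into `ClientRequestResponseConverter` so that it always creates the `UnicastContentSubject` with the no-subscription timeout armed. A rough sketch of that shape (field and method names are illustrative, not the actual diff):

```java
import io.netty.buffer.ByteBuf;
import io.reactivex.netty.channel.UnicastContentSubject;

import java.util.concurrent.TimeUnit;

// Illustrative shape only; the real ClientRequestResponseConverter is a
// Netty channel handler with much more state than shown here.
final class ClientRequestResponseConverterSketch {

    private final long noSubscriptionTimeoutMs;

    ClientRequestResponseConverterSketch(long noSubscriptionTimeoutMs) {
        this.noSubscriptionTimeoutMs = noSubscriptionTimeoutMs;
    }

    UnicastContentSubject<ByteBuf> newContentSubject() {
        // Always arming the no-subscription timeout means buffered ByteBufs
        // are released even when the downstream (e.g. Hystrix) gave up before
        // the content arrived on the I/O thread.
        return UnicastContentSubject.create(noSubscriptionTimeoutMs, TimeUnit.MILLISECONDS);
    }
}
```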