[ISSUE #10395] Fix native memory leak on TLS certificate hot-reload #10396
qianye1001 wants to merge 1 commit into
Conversation
qianye1001
left a comment
Review by rocketmq-reviewer-bot
Summary
Fix native memory leak caused by old SslContext not being released during TLS certificate hot-reload when using the OpenSSL (netty-tcnative) provider. The fix captures the old SslContext before building the replacement and calls ReferenceCountUtil.release() after successful assignment.
Findings
- [Low] ProxyAndTlsProtocolNegotiator.java:114: loadSslContext() is not synchronized. If called concurrently (e.g., from multiple FileWatchService callbacks), two threads could both capture the same oldSslContext reference and both call release(), causing a double release (IllegalReferenceCountException). In practice FileWatchService uses a single callback thread, so this is low risk, but adding synchronized to the method or a lock would eliminate the race entirely.
- [Low] ProxyAndTlsProtocolNegotiator.java:81: Good that volatile was added to sslContext. The parent class NettyRemotingAbstract already declares sslContext as protected volatile, so NettyRemotingServer is covered. ✅
- [Info] NettyRemotingServer.java:188-191: Release ordering is correct: "build new → assign → release old" ensures no window where sslContext is null or prematurely freed.
- [Info] Both files: ReferenceCountUtil.release() is a safe no-op for JDK SslContext (non-ReferenceCounted), so this fix works correctly for both OpenSSL and JDK providers.
- [Suggestion] Missing automated test coverage. The PR checklist shows unchecked test boxes. Consider adding a unit test that:
  - Creates an OpenSSL SslContext via loadSslContext()
  - Calls loadSslContext() again (simulating cert reload)
  - Asserts the old context's refCnt() is 0 after reload
  This would prevent regression and validate the fix.
- [Suggestion] PR body mentions "manual RSS monitoring" for testing. For a memory leak fix, an automated test or at least a step-by-step reproduction/verification procedure would strengthen confidence.
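The first finding and the "build new → assign → release old" ordering can be illustrated with a minimal sketch. It deliberately does not use Netty: FakeNativeContext and ContextHolder are hypothetical stand-ins for a reference-counted OpenSSL context and for the class that owns the sslContext field, and the real loadSslContext() in this PR may be structured differently.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical stand-in for a reference-counted, native-backed context.
// It is NOT Netty's class; it only mimics "starts at 1, errors on double release".
class FakeNativeContext {
    private final AtomicInteger refCnt = new AtomicInteger(1);

    int refCnt() {
        return refCnt.get();
    }

    // Returns true when the count reaches 0; throws if it was already 0.
    boolean release() {
        while (true) {
            int current = refCnt.get();
            if (current == 0) {
                throw new IllegalStateException("already released");
            }
            if (refCnt.compareAndSet(current, current - 1)) {
                return current == 1;
            }
        }
    }
}

// Hypothetical owner of the context field, playing the role of the server/negotiator.
class ContextHolder {
    // volatile so reader threads always see the latest assigned context.
    private volatile FakeNativeContext current;
    private final Supplier<FakeNativeContext> builder;

    ContextHolder(Supplier<FakeNativeContext> builder) {
        this.builder = builder;
    }

    FakeNativeContext current() {
        return current;
    }

    // synchronized: two concurrent reloads cannot both capture and release the same old context.
    synchronized void reload() {
        FakeNativeContext old = current;          // 1. capture old
        FakeNativeContext fresh = builder.get();  // 2. build new; if this throws, old stays in place
        current = fresh;                          // 3. assign
        if (old != null) {
            old.release();                        // 4. release old only after successful assignment
        }
    }
}
```

Without synchronized on reload(), two threads could read the same old reference in step 1 and both reach step 4, which is the double-release scenario the finding describes.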
Cross-repo Note
No cross-repo impact. This change is internal to the broker/proxy TLS subsystem.
Verdict
Approve with minor suggestions — The fix is correct and addresses a real native memory leak. The "build new, then release old" ordering is sound. Thread safety is adequate (volatile on both field declarations). The main improvement would be adding automated test coverage.
Automated review by rocketmq-reviewer-bot
This PR was submitted without prior community approval via If this fix is not desired or the approach is wrong, please close this PR. If the fix direction is acceptable, please comment Apologies for the premature submission.
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@             Coverage Diff              @@
##            develop   #10396      +/-   ##
=============================================
- Coverage     49.09%   48.95%   -0.14%
+ Complexity    13507    13472      -35
=============================================
  Files          1376     1376
  Lines        100537   100543       +6
  Branches      12983    12985       +2
=============================================
- Hits          49357    49225     -132
- Misses        45184    45304     +120
- Partials       5996     6014      +18
Summary
Fix native memory leak caused by old SslContext not being released during TLS certificate hot-reload when using the OpenSSL (netty-tcnative) provider.
Fixes #10395
Root Cause
When TLS certificates are dynamically reloaded via FileWatchService, the loadSslContext() methods in NettyRemotingServer and ProxyAndTlsProtocolNegotiator directly overwrite the sslContext field without releasing the old instance. Since ReferenceCountedOpenSslContext allocates native off-heap memory (SSL_CTX, X509 chain, EVP_PKEY), this native memory is leaked on every certificate rotation cycle.
Changes
remoting/src/main/java/org/apache/rocketmq/remoting/netty/NettyRemotingServer.java
- Capture the old sslContext before building the replacement
- Call ReferenceCountUtil.release(oldSslContext) after successful assignment
- Add imports for SslContext and ReferenceCountUtil
proxy/src/main/java/org/apache/rocketmq/proxy/grpc/ProxyAndTlsProtocolNegotiator.java
- Make the sslContext field volatile for thread-safe reads from TLS handshake threads
- Add import for ReferenceCountUtil
Safety
- sslContext is never null or prematurely released
- ReferenceCountUtil.release() is safe for JDK provider SslContext (no-op for non-ReferenceCounted)
Testing
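A regression check along the lines the review suggests might look like the sketch below. It is not the PR's test and does not depend on Netty: Counted, NativeBackedContext, HeapOnlyContext and ReloadCheck are hypothetical names, and releaseIfCounted only mirrors the documented behavior of ReferenceCountUtil.release(Object) (release when the object is reference-counted, otherwise do nothing and return false).

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for Netty's ReferenceCounted interface.
interface Counted {
    int refCnt();
    boolean release();
}

// Stand-in for an OpenSSL-provider context that owns native memory.
class NativeBackedContext implements Counted {
    private final AtomicInteger refCnt = new AtomicInteger(1);

    @Override
    public int refCnt() {
        return refCnt.get();
    }

    @Override
    public boolean release() {
        while (true) {
            int current = refCnt.get();
            if (current == 0) {
                throw new IllegalStateException("already released");
            }
            if (refCnt.compareAndSet(current, current - 1)) {
                return current == 1;
            }
        }
    }
}

// Stand-in for a JDK-provider context: a plain heap object with no reference count.
class HeapOnlyContext {
}

class ReloadCheck {
    // Releases the object only if it is reference-counted; otherwise a no-op returning false.
    static boolean releaseIfCounted(Object ctx) {
        if (ctx instanceof Counted) {
            return ((Counted) ctx).release();
        }
        return false;
    }

    // Simulates one certificate reload on a single-slot "field": assign new, then release old.
    static Object swap(Object[] slot, Object fresh) {
        Object old = slot[0];
        slot[0] = fresh;
        releaseIfCounted(old);
        return old;
    }
}
```

A real test would replace the stand-ins with contexts built through loadSslContext() and assert that the previous context's refCnt() is 0 after the second load.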
This PR was automatically generated based on the approved fix proposal.