GEODE-3967: fix the offheap memory leak in serial gateway sender's un…#3044
gesterzhou merged 1 commit into apache:develop
141f984 to ed3faa5
| } | ||
| doUpdate = false; | ||
| } | ||
| if (ev.isConcurrencyConflict()) { |
This block of two ifs is already nested within a try > if > try > else.
For long-term readability of this function, can you refactor this else into its own function?
| || ((EntryEventImpl) event).isConcurrencyConflict() && !event.isOriginRemote())) { | ||
| senderEvent = | ||
| new GatewaySenderEventImpl(operation, event, substituteValue, false); // OFFHEAP | ||
| // ok |
What does this comment mean?
This is an old comment; I did not remove it.
| // 2 Special cases: | ||
| // 1) UPDATE_VERSION_STAMP: only enqueue to primary | ||
| // 2) CME && !originRemote: only enqueue to primary | ||
| if (!(event.getOperation().equals(Operation.UPDATE_VERSION_STAMP) |
Perhaps the two halves of the OR can be pulled into their own boolean variables for better readability?
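A minimal sketch of what that suggestion could look like. `Op`, `SketchEvent`, and `PrimaryOnlyCheck` are hypothetical stand-ins so the snippet compiles on its own; they are not the Geode classes, and the actual guard in the diff is the negation of `isPrimaryOnly`.

```java
// Hypothetical stand-in for org.apache.geode.cache.Operation (assumption).
enum Op { CREATE, UPDATE, UPDATE_VERSION_STAMP }

// Hypothetical stand-in for the event; only the three fields the condition reads.
final class SketchEvent {
  final Op operation;
  final boolean concurrencyConflict;
  final boolean originRemote;

  SketchEvent(Op operation, boolean concurrencyConflict, boolean originRemote) {
    this.operation = operation;
    this.concurrencyConflict = concurrencyConflict;
    this.originRemote = originRemote;
  }
}

final class PrimaryOnlyCheck {
  // True for the two special cases listed in the code comment:
  // 1) UPDATE_VERSION_STAMP, 2) CME && !originRemote.
  static boolean isPrimaryOnly(SketchEvent event) {
    boolean isUpdateVersionStamp = event.operation == Op.UPDATE_VERSION_STAMP;
    boolean isLocalConcurrencyConflict = event.concurrencyConflict && !event.originRemote;
    return isUpdateVersionStamp || isLocalConcurrencyConflict;
  }
}
```

With named booleans the guard would read `if (!isPrimaryOnly(event))`, and the precedence of `&&` over `||` no longer has to be worked out by the reader.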
| * Current sender should handle old events, the remove site should receive all of them | ||
| */ | ||
| @Test | ||
| public void oldEventShouldBeProcessedAtNewSender() { |
I like your comment for this function. It shows the 4 distinct steps of this method.
Maybe each step can be pulled out into its own function, since this function is already over 100 lines. It might help with readability.
Same with oldEventShouldBeProcessedAtTwoNewSender. Maybe a few helper functions can be created and shared by both oldEventShouldBeProcessedAtTwoNewSender and oldEventShouldBeProcessedAtNewSender.
| * them | ||
| */ | ||
| @Test | ||
| public void bothOldAndNewEventsShouldBeProcessedByOldSender() { |
Can this be broken up into 6 smaller functions to make the overall function easier to read, given it's already over 100 lines?
I realize that these are test cases, so maybe you can ignore this request.
jhuynh1 left a comment
Just to make reviewing this a bit easier: we reverted a similar change because it caused WAN inconsistencies. What is different in this diff from what was reverted last year, or what was causing the extra WAN inconsistencies then that is no longer a problem with this diff?
ed3faa5 to d668604
| } | ||
| } | ||
| } | ||
|
|
This part of the fix resolves the root cause that led to the revert.
d668604 to 7f2950c
jhuynh1 left a comment
As long as the previous issues that caused the revert are cleared and Peter's concerns are addressed, this looks good to me.
…processedEvents.
@jhuynh1 @boglesby
When a ConcurrentCacheModificationException (CME) happens, GatewaySenderEventImpl
should save that status and notify the gateway sender if it holds the primary queue,
because another member might already have put the event into the secondary queue.
An event with a CME should be enqueued only on the primary, and not dispatched. The old
logic did not allow a CME event to be enqueued at all. That is wrong, because the event
without a CME might already have been added to the secondary queue.
We should not dispatch the CME event; otherwise it will cause data inconsistency
at the remote site, since these CME events are misordered.
So we should enqueue it, but not dispatch it.
Also adds rollingUpgradeTests.
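To illustrate the "enqueue but do not dispatch" rule described above, here is a minimal sketch. `QueuedEvent` and `PrimaryQueueSketch` are hypothetical names invented for this example, not Geode classes, and the sketch assumes a single in-memory queue. It does not model secondary queues, batching, or how Geode actually clears an event from the secondaries.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical event: the CME status is saved on the event itself.
final class QueuedEvent {
  final String key;
  final boolean concurrencyConflict;

  QueuedEvent(String key, boolean concurrencyConflict) {
    this.key = key;
    this.concurrencyConflict = concurrencyConflict;
  }
}

// Hypothetical primary queue: every event is enqueued, CME events are never sent.
final class PrimaryQueueSketch {
  private final Queue<QueuedEvent> queue = new ArrayDeque<>();
  private final List<String> dispatchedKeys = new ArrayList<>();

  // CME events are accepted too, so they pass through the queue like any other event.
  void enqueue(QueuedEvent event) {
    queue.add(event);
  }

  // Drains the queue in order; CME events are removed but skipped at dispatch time.
  void drain() {
    QueuedEvent event;
    while ((event = queue.poll()) != null) {
      if (event.concurrencyConflict) {
        continue; // misordered event: do not send it to the remote site
      }
      dispatchedKeys.add(event.key);
    }
  }

  List<String> dispatchedKeys() {
    return dispatchedKeys;
  }

  int size() {
    return queue.size();
  }
}
```

The point of the sketch is only the ordering of decisions: the conflict flag no longer blocks the enqueue, it is checked at dispatch time instead.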
Thank you for submitting a contribution to Apache Geode.
In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:
For all changes:
Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
Has your PR been rebased against the latest commit within the target branch (typically develop)?
Is your initial contribution a single, squashed commit?
Does gradlew build run cleanly?
Have you written or updated unit tests to verify your changes?
If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
Note:
Please ensure that once the PR is submitted, you check travis-ci for build issues and
submit an update to your PR as soon as possible. If you need help, please send an
email to dev@geode.apache.org.