GEODE-3967: fix the offheap memory leak in serial gateway sender's un…#3044

Merged
gesterzhou merged 1 commit into apache:develop from gesterzhou:feature/GEODE-3967
Jan 9, 2019

@gesterzhou
Contributor

…processedEvents.

@jhuynh1 @boglesby

When a ConcurrentCacheModificationException (CME) occurs, GatewaySenderEventImpl
should save that status and notify the gateway sender if it holds the primary queue,
because another member might already have put the event into its secondary queue.

Let an event with a CME be enqueued on the primary but not dispatched. The old
logic did not allow a CME event to be enqueued. That is wrong, because the same
event might have been added, without a CME, to the secondary queue.

We should not dispatch the CME event; otherwise it will cause data inconsistency
at the remote site, since these CME events are misordered.

So we should enqueue it but not dispatch it.

Also adds rollingUpgradeTests.
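
As an illustration of the "enqueue but do not dispatch" rule described above, here is a minimal sketch. It is not the Geode implementation: `SketchEvent` and `SketchPrimarySender` are hypothetical stand-ins invented for this example, and the real sender has far more state (secondary queues, batching, off-heap values). It only shows that a CME event still passes through the primary queue and is then dropped at dispatch time.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for a gateway sender event; NOT the Geode class.
final class SketchEvent {
  final String key;
  // True when a ConcurrentCacheModificationException was recorded for this event.
  final boolean concurrencyConflict;

  SketchEvent(String key, boolean concurrencyConflict) {
    this.key = key;
    this.concurrencyConflict = concurrencyConflict;
  }
}

// Hypothetical stand-in for the member that holds the primary queue.
final class SketchPrimarySender {
  private final Queue<SketchEvent> queue = new ArrayDeque<>();
  // Keys that were actually sent to the (imaginary) remote site.
  final List<String> dispatched = new ArrayList<>();

  // Every event is enqueued, including events that carry a CME.
  void enqueue(SketchEvent event) {
    queue.add(event);
  }

  // Drains the queue; CME events are removed from the queue but never dispatched.
  // Returns the number of events removed from the queue.
  int drain() {
    int removed = 0;
    SketchEvent event;
    while ((event = queue.poll()) != null) {
      removed++;
      if (!event.concurrencyConflict) {
        dispatched.add(event.key);
      }
    }
    return removed;
  }
}
```

In this sketch, enqueueing three events of which one has a CME removes all three from the queue but dispatches only the two without a conflict.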

Thank you for submitting a contribution to Apache Geode.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

For all changes:

  • Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?

  • Has your PR been rebased against the latest commit within the target branch (typically develop)?

  • Is your initial contribution a single, squashed commit?

  • Does gradlew build run cleanly?

  • Have you written or updated unit tests to verify your changes?

  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?

Note:

Please ensure that once the PR is submitted, you check travis-ci for build issues and
submit an update to your PR as soon as possible. If you need help, please send an
email to dev@geode.apache.org.

@gesterzhou gesterzhou force-pushed the feature/GEODE-3967 branch 3 times, most recently from 141f984 to ed3faa5 Compare December 28, 2018 00:03
}
doUpdate = false;
}
if (ev.isConcurrencyConflict()) {
Contributor

This block of two ifs is already within a try > if > try > else.

For long-term readability of this function, can you refactor this else into its own function?

|| ((EntryEventImpl) event).isConcurrencyConflict() && !event.isOriginRemote())) {
senderEvent =
new GatewaySenderEventImpl(operation, event, substituteValue, false); // OFFHEAP
// ok
Contributor

What does this comment mean?

Contributor Author

This is an old comment; I did not remove it.

// 2 Special cases:
// 1) UPDATE_VERSION_STAMP: only enqueue to primary
// 2) CME && !originRemote: only enqueue to primary
if (!(event.getOperation().equals(Operation.UPDATE_VERSION_STAMP)
Contributor

Perhaps the two halves of the OR can be pulled into their own boolean variables for better readability?
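
A sketch of what that suggestion could look like. The types below (`SketchOperation`, `SketchEntryEvent`, `EnqueueCondition`) are hypothetical stand-ins rather than Geode classes, and the full condition is assumed to combine the two special cases listed in the code comment (UPDATE_VERSION_STAMP, and CME && !originRemote), since the quoted hunk only shows its first line.

```java
// Hypothetical stand-ins for the Geode types; names are invented for illustration.
enum SketchOperation { CREATE, UPDATE, UPDATE_VERSION_STAMP }

final class SketchEntryEvent {
  final SketchOperation operation;
  final boolean concurrencyConflict;
  final boolean originRemote;

  SketchEntryEvent(SketchOperation operation, boolean concurrencyConflict,
      boolean originRemote) {
    this.operation = operation;
    this.concurrencyConflict = concurrencyConflict;
    this.originRemote = originRemote;
  }
}

final class EnqueueCondition {
  // Shape of the condition as assumed from the quoted hunks: one long boolean expression.
  static boolean originalForm(SketchEntryEvent event) {
    return !(event.operation.equals(SketchOperation.UPDATE_VERSION_STAMP)
        || event.concurrencyConflict && !event.originRemote);
  }

  // Same logic with each half of the OR named, as the review comment suggests.
  static boolean refactoredForm(SketchEntryEvent event) {
    boolean isUpdateVersionStamp =
        event.operation.equals(SketchOperation.UPDATE_VERSION_STAMP);
    boolean isLocalConcurrencyConflict =
        event.concurrencyConflict && !event.originRemote;
    boolean isPrimaryOnlyCase = isUpdateVersionStamp || isLocalConcurrencyConflict;
    return !isPrimaryOnlyCase;
  }
}
```

Naming the halves also makes the precedence explicit: in Java `&&` binds tighter than `||`, so the unparenthesized original already groups as the refactored version does.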

* Current sender should handle old events, the remote site should receive all of them
*/
@Test
public void oldEventShouldBeProcessedAtNewSender() {
Contributor

I like your comment for this function. It shows the 4 distinct steps of this method.

Maybe each step can be pulled out into its own function, since this function is already over 100 lines. It might help with readability.

Contributor

Same with oldEventShouldBeProcessedAtTwoNewSender. Maybe a few helper functions can be created that can be used in both oldEventShouldBeProcessedAtTwoNewSender and oldEventShouldBeProcessedAtNewSender

* them
*/
@Test
public void bothOldAndNewEventsShouldBeProcessedByOldSender() {
Contributor

Can this be broken up into 6 smaller functions to make the overall function easier to read, given it's already over 100 lines?

Contributor

I realize that these are test cases so maybe you can ignore this request.

Contributor

@jhuynh1 jhuynh1 left a comment

Just to make reviewing this a bit easier: we reverted a similar change because it caused WAN inconsistencies. What is different in this diff from what was reverted last year, or what was causing the additional WAN inconsistencies that is no longer a problem in this diff?

}
}
}

Contributor Author

This part of the fix resolves the root cause that led to the revert.

Contributor

@jhuynh1 jhuynh1 left a comment

As long as the previous issues that caused the revert are cleared and Peter's concerns are addressed, then this looks good to me

@gesterzhou gesterzhou merged commit ea46d00 into apache:develop Jan 9, 2019
@gesterzhou gesterzhou deleted the feature/GEODE-3967 branch January 9, 2019 18:04
3 participants