Avoid using same OpAddEntry between different ledger handles #5942

Merged
merged 6 commits into from Jan 9, 2020

Conversation

codelipenghui (Contributor) commented Dec 26, 2019

Fixes #5588

Motivation

Avoid using the same OpAddEntry across different ledger handles.

Modifications

Add a state to OpAddEntry. If an op is taken over by a new ledger handle, the original op is set to the CLOSED state. When the legacy callback from the old ledger handle fires later, it checks the op state, and only an op in the INITIATED state is processed.

When a ledger rollover happens, pendingAddEntries is processed again. While processing pendingAddEntries, a new OpAddEntry is created from each old OpAddEntry, so that different ledger handles never use the same OpAddEntry.
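
The state check described above can be sketched roughly as follows. This is a minimal illustration, not the code from this pull request: `OpStateSketch`, `Op`, `onAddComplete` and `duplicate` are hypothetical names, and only the INITIATED / COMPLETED / CLOSED transitions are modeled.

```java
import java.util.concurrent.atomic.AtomicReferenceFieldUpdater;

class OpStateSketch {
    enum State { INITIATED, COMPLETED, CLOSED }

    // Hypothetical stand-in for OpAddEntry; only the state handling is modeled.
    static final class Op {
        private static final AtomicReferenceFieldUpdater<Op, State> STATE_UPDATER =
                AtomicReferenceFieldUpdater.newUpdater(Op.class, State.class, "state");
        private volatile State state = State.INITIATED;
        final String data;

        Op(String data) {
            this.data = data;
        }

        // Called during rollover, once a replacement op takes over this entry.
        void close() {
            STATE_UPDATER.compareAndSet(this, State.INITIATED, State.CLOSED);
        }

        // Called when an add callback fires; returns true only if the op was still INITIATED.
        boolean onAddComplete() {
            return STATE_UPDATER.compareAndSet(this, State.INITIATED, State.COMPLETED);
        }

        // Creates a fresh op for the same data, to be sent through the new ledger handle.
        Op duplicate() {
            return new Op(data);
        }

        State state() {
            return state;
        }
    }

    public static void main(String[] args) {
        Op oldOp = new Op("entry-0");
        Op newOp = oldOp.duplicate();
        oldOp.close();
        System.out.println("old op processed: " + oldOp.onAddComplete());
        System.out.println("new op processed: " + newOp.onAddComplete());
    }
}
```

Because both transitions start from INITIATED and go through a compare-and-set, a late callback on a closed op becomes a no-op instead of acting on state that now belongs to the replacement op.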

Verifying this change

Added a new unit test.

Does this pull request potentially affect one of the following parts:

If yes was chosen, please highlight the changes

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API: (no)
  • The schema: (no)
  • The default values of configurations: (no)
  • The wire protocol: (no)
  • The rest endpoints: (no)
  • The admin cli options: (no)
  • Anything that affects deployment: (no)

Documentation

  • Does this pull request introduce a new feature? (no)
@codelipenghui codelipenghui requested review from rdhabalia, sijie and merlimat and removed request for rdhabalia and sijie Dec 26, 2019
@codelipenghui codelipenghui requested a review from jiazhai Dec 26, 2019

internalAsyncAddEntry(addOperation);
}));
}

private synchronized void internalAsyncAddEntry(OpAddEntry addOperation) {
pendingAddEntries.add(addOperation);

rdhabalia (Contributor) commented Dec 31, 2019

Is this the real root cause of #5588, or is it just a patch to avoid such behavior?

sijie (Member) commented Jan 1, 2020

The root cause of #5588 is that an entry is "re-used" between ledgers. The code at line 1297 is the fix.

@@ -1294,9 +1293,24 @@ public synchronized void updateLedgersIdsComplete(Stat stat) {
log.debug("[{}] Resending {} pending messages", name, pendingAddEntries.size());
}

// Avoid use same OpAddEntry between different ledger handle
int pendingSize = pendingAddEntries.size();

sijie (Member) commented Jan 1, 2020

Hmm, this doesn't seem correct to me. You need to preserve the order when adding the newly created ops back to the queue.

What you need to do:

  • drain the pendingAddEntries queue;
  • create a new OpAddEntry for each entry;
  • add these ops to an intermediate list in the order in which they were drained;
  • after pendingAddEntries is drained, add the intermediate list back to the pendingAddEntries queue.
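
The four steps above could look roughly like this. It is only a sketch under stated assumptions: `RecreatePendingOps`, `Op` and `recreate` are hypothetical stand-ins rather than Pulsar classes, and the caller is assumed to hold whatever lock guards the queue during rollover, so nothing is added concurrently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

class RecreatePendingOps {
    // Hypothetical stand-in for OpAddEntry; "generation" only marks re-created ops.
    static final class Op {
        final String data;
        final int generation;

        Op(String data, int generation) {
            this.data = data;
            this.generation = generation;
        }

        Op recreate() {
            return new Op(data, generation + 1);
        }
    }

    // Assumption: the caller serializes access to the queue while this runs.
    static void recreateAll(Queue<Op> pending) {
        List<Op> recreated = new ArrayList<>();
        Op old;
        while ((old = pending.poll()) != null) { // drain from the head
            recreated.add(old.recreate());       // new op, kept in drain order
        }
        pending.addAll(recreated);               // put the new ops back in the same order
    }

    public static void main(String[] args) {
        Queue<Op> pending = new ConcurrentLinkedQueue<>();
        pending.add(new Op("a", 0));
        pending.add(new Op("b", 0));
        pending.add(new Op("c", 0));
        recreateAll(pending);
        for (Op op : pending) {
            System.out.println(op.data + " generation=" + op.generation);
        }
    }
}
```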

codelipenghui (Author, Contributor) commented Jan 2, 2020

pendingAddEntries is a ConcurrentLinkedQueue, and ConcurrentLinkedQueue has no drainTo method. Maybe we can use pendingAddEntries.toArray(), create a new OpAddEntry for each item of the array, and add the new entries to an intermediate list.
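
A rough sketch of that toArray() idea, for illustration only: `SnapshotPendingOps`, `Op` and `copy` are hypothetical names, and the sketch assumes no other thread modifies the queue between the snapshot and the refill. Unlike a drain, `toArray` leaves the queue untouched, so the old ops have to be cleared explicitly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

class SnapshotPendingOps {
    // Hypothetical stand-in for OpAddEntry.
    static final class Op {
        final String data;

        Op(String data) {
            this.data = data;
        }

        Op copy() {
            return new Op(data);
        }
    }

    // Assumption: the caller prevents concurrent modification of the queue while this runs.
    static void replaceWithCopies(ConcurrentLinkedQueue<Op> pending) {
        // toArray returns the elements in head-to-tail order without removing them.
        Op[] snapshot = pending.toArray(new Op[0]);
        List<Op> copies = new ArrayList<>(snapshot.length);
        for (Op old : snapshot) {
            copies.add(old.copy());
        }
        pending.clear();        // drop the old ops
        pending.addAll(copies); // re-add the new ops in the original order
    }

    public static void main(String[] args) {
        ConcurrentLinkedQueue<Op> pending = new ConcurrentLinkedQueue<>();
        pending.add(new Op("a"));
        pending.add(new Op("b"));
        replaceWithCopies(pending);
        for (Op op : pending) {
            System.out.println(op.data);
        }
    }
}
```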

sijie (Member) commented Jan 6, 2020

I see. It is a queue here.

LedgerHandle ledger;
private long entryId;

@SuppressWarnings("unused")
private volatile AddEntryCallback callback;
private static final AtomicReferenceFieldUpdater<OpAddEntry, AddEntryCallback> callbackUpdater =

sijie (Member) commented Jan 1, 2020

Do we need the change here?

codelipenghui (Author, Contributor) commented Jan 2, 2020

I just moved the updater declaration close to the field it updates.
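
For readers unfamiliar with the pattern in this fragment, here is a small self-contained sketch of a field updater declared right next to the volatile field it updates. `CallbackHolderSketch` and `fireOnce` are hypothetical names, and `Runnable` stands in for the callback type; this is not the OpAddEntry code.

```java
import java.util.concurrent.atomic.AtomicReferenceFieldUpdater;

class CallbackHolderSketch {
    // The updater sits next to the field it updates; the field must be volatile
    // and its declared type must match the class passed to newUpdater.
    private static final AtomicReferenceFieldUpdater<CallbackHolderSketch, Runnable> CALLBACK_UPDATER =
            AtomicReferenceFieldUpdater.newUpdater(CallbackHolderSketch.class, Runnable.class, "callback");
    @SuppressWarnings("unused")
    private volatile Runnable callback;

    CallbackHolderSketch(Runnable callback) {
        this.callback = callback;
    }

    // Runs the callback at most once, even if several threads call this method.
    boolean fireOnce() {
        Runnable cb = CALLBACK_UPDATER.getAndSet(this, null);
        if (cb == null) {
            return false;
        }
        cb.run();
        return true;
    }

    public static void main(String[] args) {
        CallbackHolderSketch holder = new CallbackHolderSketch(() -> System.out.println("callback ran"));
        System.out.println("first call fired: " + holder.fireOnce());
        System.out.println("second call fired: " + holder.fireOnce());
    }
}
```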


if (!STATE_UPDATER.compareAndSet(OpAddEntry.this, State.INITIATED, State.COMPLETED)) {
log.warn("[{}] The add op is terminal legacy callback for entry {}-{} adding.", ml.getName(), lh.getId(), entryId);
OpAddEntry.this.recycle();

sijie (Member) commented Jan 6, 2020

Since we are creating a new entry when retrying the ops on the new ledger, do we need to introduce the state field and recycle here?

Since this op is only used by the old ledger handle, it will not be reused across ledgers, so it should already be recycled correctly.

if (existsOp != null) {
// If op is used by another ledger handle, we need to close it and create a new one
if (existsOp.ledger != null) {
existsOp.close();

sijie (Member) commented Jan 6, 2020

I'm not sure we need to close here. Once we duplicate the operation, we can just let the original callback close the old op, so it seems to me that we don't need to introduce another state field here.

codelipenghui (Author, Contributor) commented Jan 6, 2020

We need to close the original op. Otherwise, when the old op's callback fires, it polls the first op in pendingAddEntries, but that first op is now the new op that replaced it.
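
The hazard described in this comment can be reproduced with a toy model. The sketch below is an assumption-laden illustration: `StaleCallbackSketch`, `Op` and `onComplete` are hypothetical, and the completion handler is simplified to "poll the head of the queue", as the comment describes.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;

class StaleCallbackSketch {
    // Hypothetical stand-in for OpAddEntry with a simple "closed" flag.
    static final class Op {
        final String name;
        final AtomicBoolean closed = new AtomicBoolean(false);

        Op(String name) {
            this.name = name;
        }
    }

    // Simplified completion handler: it assumes the finished op is the head of the queue.
    // Returns the op it removed, or null if the callback was ignored.
    static Op onComplete(Op finished, Queue<Op> pending, boolean checkClosed) {
        if (checkClosed && finished.closed.get()) {
            return null; // stale callback from the old ledger handle
        }
        return pending.poll();
    }

    public static void main(String[] args) {
        Queue<Op> pending = new ConcurrentLinkedQueue<>();
        Op oldOp = new Op("old");
        Op newOp = new Op("new");

        // Rollover: the old op has been replaced by the new op and closed.
        oldOp.closed.set(true);
        pending.add(newOp);

        // With the check, the late callback of the old op leaves the queue alone.
        System.out.println("removed with check: " + onComplete(oldOp, pending, true));

        // Without the check, the late callback removes the new op by mistake.
        Op removed = onComplete(oldOp, pending, false);
        System.out.println("removed without check: " + removed.name);
    }
}
```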

@sijie
sijie approved these changes Jan 8, 2020
@codelipenghui codelipenghui merged commit 7ec17b2 into apache:master Jan 9, 2020
13 of 18 checks passed

  • cpp-tests
  • backwards-compatibility
  • process
  • thread
  • unit-test-flaky
  • cli
  • function-state
  • messaging
  • schema
  • sql
  • standalone
  • tiered-filesystem
  • tiered-jcloud
  • License check
  • unit-tests
  • Jenkins: C++ / Python Tests (SUCCESS)
  • Jenkins: Integration Tests (SUCCESS)
  • Jenkins: Java 8 - Unit Tests (SUCCESS)
@sijie sijie added the release/2.5.1 label Jan 22, 2020
@sijie sijie added this to the 2.6.0 milestone Jan 22, 2020