Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Transaction] Fix the transaction marker doe not deleted as expect. #11126

Conversation

congbobo184
Copy link
Contributor

@congbobo184 congbobo184 commented Jun 28, 2021

Motivation

fix #11002

implement

Trigger transaction maker deletes again if there are new delete operations happen during the last delete task processing.

Verifying this change

Add the tests for it

Does this pull request potentially affect one of the following parts:
If yes was chosen, please highlight the changes

Dependencies (does it add or upgrade a dependency): (no)
The public API: (no)
The schema: (no)
The default values of configurations: (no)
The wire protocol: (no)
The rest endpoints: (no)
The admin cli options: (no)
Anything that affects deployment: (no)

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work,
I left only one comment about shutting down the thread pool

@@ -447,6 +448,13 @@ public void start() throws Exception {
}
}

if (pulsar.getConfiguration().isTransactionCoordinatorEnabled()) {
this.transactionBufferDeleteMarkerMonitor = Executors
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you please "shutdown" this Executor service on BrokerService close ?

I also cannot find the shutdown for these other threadpools: topicPublishRateLimiterMonitor, brokerPublishRateLimiterMonitor, deduplicationSnapshotMonitor

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch! I will follow your advice.

Copy link
Contributor

@codelipenghui codelipenghui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the regular task for checking the marker deletion can't fix #11002, if the check task have not triggered yet, we will peek a wrong message right?

I think the right fix is to skip the markers when peeking messages and we don't need a regular checker for the marker deletion, If we have many topics in a broker, the checker will waste more CPU cycles. Or we can say it's an expected behavior(Allow markers that have not been deleted within a period of time)

@@ -390,16 +391,7 @@ public void acknowledgeMessage(List<Position> positions, AckType ackType, Map<St
}
}

if (topic.getBrokerService().getPulsar().getConfig().isTransactionCoordinatorEnabled()
&& this.pendingAckHandle.isTransactionAckPresent()) {
Copy link
Contributor Author

@congbobo184 congbobo184 Jun 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@codelipenghui #11002 cause by this code, because delete marker judge the pending ack state, if one sub don't use pending ack, it will not delete marker. if we don't start a monitor to check this, how we fix this race condition.

@eolivelli
Copy link
Contributor

@codelipenghui the peekMessage problem is only a symptom (and it is good to fix it even by skipping messages)
The real problem IMHO is that the list of individually deleted messages will grow without bounds.

@codelipenghui
Copy link
Contributor

codelipenghui commented Jun 29, 2021

@eolivelli I have a discussion with @congbobo184 , I think the correct fix is to trigger the marker deletion test properly. The root cause is due to the race condition the marker deletion task has not been triggered due to the previous marker deletion task does not completed, so we need to re-trigger the marker deletion task if there are acks that happen during the last marker deletion task processing.

@eolivelli
Copy link
Contributor

@codelipenghui

the correct fix is to trigger the marker deletion test properly
makes sense

So IIUC we are not going to add an additional ThreadPool,
@congbobo184 regarding the shutdown of the executors I will be happy to send the patch
it is better to address the problem in a separate patch

@codelipenghui
Copy link
Contributor

@eolivelli

So IIUC we are not going to add an additional ThreadPool,

I think we don't need to add an additional thread pool, we just need to add a check when we complete the last delete marker task.

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great

I left one comment
PTAL

@@ -77,7 +81,7 @@ public void testTransactionMarkerDelete() throws Exception {
managedLedger.openCursor("test"), false);

Position position1 = managedLedger.addEntry("test".getBytes());
managedLedger.addEntry(Markers
Position position2 = managedLedger.addEntry(Markers
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we simulate a sequence with multiple transactions markers:

  • send
  • commit
  • send
  • commit
    ...
    and verify that there is no hole ?

also

  • send (with tx)
  • commit
  • commit
  • commit
  • send (no tx)
    ....

@congbobo184
Copy link
Contributor Author

@eolivelli add some test like your comment. please review again! thanks.

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me

@codelipenghui
you left some comments, please take a look again

@codelipenghui
Copy link
Contributor

@congbobo184 Could you please help update the description of the PR? And add more context about how the root cause is and how the PR can fix the issue, this will make others can understand it easily.

@codelipenghui codelipenghui changed the title [Transaction] Transaction delete marker monitor. [Transaction] Fix the transaction marker doe not deleted as expect. Jul 16, 2021
@codelipenghui codelipenghui merged commit 3fcfb22 into apache:master Jul 16, 2021
@eolivelli
Copy link
Contributor

@codelipenghui can you cherry pick to 2.8.1 ?

codelipenghui pushed a commit that referenced this pull request Jul 23, 2021
…11126)

Trigger transaction maker deletes again if there are new delete operations happen during the last delete task processing.


(cherry picked from commit 3fcfb22)
@codelipenghui codelipenghui added the cherry-picked/branch-2.8 Archived: 2.8 is end of life label Jul 23, 2021
eolivelli pushed a commit to datastax/pulsar that referenced this pull request Aug 16, 2021
…pache#11126)

Trigger transaction maker deletes again if there are new delete operations happen during the last delete task processing.

(cherry picked from commit 3fcfb22)
eolivelli pushed a commit to datastax/pulsar that referenced this pull request Aug 16, 2021
…pache#11126)

Trigger transaction maker deletes again if there are new delete operations happen during the last delete task processing.

(cherry picked from commit 3fcfb22)
bharanic-dev pushed a commit to bharanic-dev/pulsar that referenced this pull request Mar 18, 2022
…pache#11126)

Trigger transaction maker deletes again if there are new delete operations happen during the last delete task processing.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants