Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix] [ml] Fix orphan scheduled task for ledger create timeout check #21542

Conversation

poorbarcode
Copy link
Contributor

@poorbarcode poorbarcode commented Nov 8, 2023

Motivation

When an ML tries to create a new ledger, it will create a delay task to check if the ledger create request is timeout[1].

However, we should cancel this delay task after the request to create new ledgers is finished. Otherwise, these tasks will cost unnecessary CPU resources[2].

[1]: https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java#L4070-L4082

bookKeeper.asyncCreateLedger(config.getEnsembleSize(), config.getWriteQuorumSize(), config.getAckQuorumSize(), digestType, config.getPassword(), cb, ledgerCreated, finalMetadata);

scheduledExecutor.schedule(() -> {
    if (!ledgerCreated.get()) {
        cb.createComplete(BKException.Code.TimeoutException, null, ledgerCreated);
    }
}, config.getMetadataOperationsTimeoutSeconds(), TimeUnit.SECONDS);

[2]: Thousands delay tasks in memory

for_pr_1 for_pr_2

Modifications

Cancel the scheduled task after the create ledger request is finished

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: x

@poorbarcode poorbarcode self-assigned this Nov 8, 2023
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Nov 8, 2023
@poorbarcode poorbarcode added release/2.10.6 category/reliability The function does not work properly in certain specific environments or failures. e.g. data lost release/3.0.3 release/2.11.4 and removed doc-not-needed Your PR changes do not impact docs labels Nov 8, 2023
@poorbarcode poorbarcode added this to the 3.2.0 milestone Nov 8, 2023
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Nov 8, 2023
@poorbarcode poorbarcode changed the title [fix] [ml] Fix orphan task for ledger create timeout check [fix] [ml] Fix orphan scheduled task for ledger create timeout check Nov 8, 2023
Copy link
Contributor

@codelipenghui codelipenghui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! and Good Job!
Just left a minor comment.

} else {
if (log.isDebugEnabled()) {
log.debug("[{}] Ledger already created when timeout task is triggered", name);
}
}
}, config.getMetadataOperationsTimeoutSeconds(), TimeUnit.SECONDS);

ledgerFutureHook.whenComplete((ignore, ex) -> {
if (!timeoutChecker.isDone()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the API description

/**
     * Attempts to cancel execution of this task.  This method has no
     * effect if the task is already completed or cancelled, or could
     * not be cancelled for some other reason.  Otherwise, if this
     * task has not started when {@code cancel} is called, this task
     * should never run.  If the task has already started, then the
     * {@code mayInterruptIfRunning} parameter determines whether the
     * thread executing this task (when known by the implementation)
     * is interrupted in an attempt to stop the task.

It looks like we don't need this check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

@codecov-commenter
Copy link

Codecov Report

Merging #21542 (4c2477c) into master (44abba9) will decrease coverage by 0.07%.
Report is 9 commits behind head on master.
The diff coverage is 50.00%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #21542      +/-   ##
============================================
- Coverage     73.27%   73.20%   -0.07%     
+ Complexity    32676    32570     -106     
============================================
  Files          1892     1892              
  Lines        140632   140720      +88     
  Branches      15467    15479      +12     
============================================
- Hits         103044   103018      -26     
- Misses        29497    29588      +91     
- Partials       8091     8114      +23     
Flag Coverage Δ
inttests 24.12% <42.85%> (-0.22%) ⬇️
systests 24.70% <42.85%> (-0.28%) ⬇️
unittests 72.50% <50.00%> (-0.05%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
...che/bookkeeper/mledger/impl/ManagedLedgerImpl.java 80.64% <50.00%> (-0.39%) ⬇️

... and 100 files with indirect coverage changes

Copy link
Member

@mattisonchao mattisonchao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question:

What is the benefit of using CompletableFuture? Because it has an atomic cancel method, why not pass timeoutChecker directly?

@poorbarcode poorbarcode merged commit e7536a2 into apache:master Nov 11, 2023
48 checks passed
Technoboy- pushed a commit that referenced this pull request Nov 11, 2023
…21542)

### Motivation

When an ML tries to create a new ledger, it will create a delay task to check if the ledger create request is timeout<sup>[1]</sup>.

However, we should cancel this delay task after the request to create new ledgers is finished. Otherwise, these tasks will cost unnecessary CPU resources.

### Modifications

Cancel the scheduled task after the create ledger request is finished
Technoboy- pushed a commit that referenced this pull request Nov 11, 2023
…21542)

### Motivation

When an ML tries to create a new ledger, it will create a delay task to check if the ledger create request is timeout<sup>[1]</sup>.

However, we should cancel this delay task after the request to create new ledgers is finished. Otherwise, these tasks will cost unnecessary CPU resources.

### Modifications

Cancel the scheduled task after the create ledger request is finished
nborisov pushed a commit to nborisov/pulsar that referenced this pull request Nov 13, 2023
…pache#21542)

### Motivation

When an ML tries to create a new ledger, it will create a delay task to check if the ledger create request is timeout<sup>[1]</sup>.

However, we should cancel this delay task after the request to create new ledgers is finished. Otherwise, these tasks will cost unnecessary CPU resources.

### Modifications

Cancel the scheduled task after the create ledger request is finished
poorbarcode added a commit that referenced this pull request Dec 13, 2023
…21542)

When an ML tries to create a new ledger, it will create a delay task to check if the ledger create request is timeout<sup>[1]</sup>.

However, we should cancel this delay task after the request to create new ledgers is finished. Otherwise, these tasks will cost unnecessary CPU resources.

Cancel the scheduled task after the create ledger request is finished

(cherry picked from commit e7536a2)
poorbarcode added a commit that referenced this pull request Dec 13, 2023
…21542)

When an ML tries to create a new ledger, it will create a delay task to check if the ledger create request is timeout<sup>[1]</sup>.

However, we should cancel this delay task after the request to create new ledgers is finished. Otherwise, these tasks will cost unnecessary CPU resources.

Cancel the scheduled task after the create ledger request is finished

(cherry picked from commit e7536a2)
nikhil-ctds pushed a commit to datastax/pulsar that referenced this pull request Dec 20, 2023
…pache#21542)

### Motivation

When an ML tries to create a new ledger, it will create a delay task to check if the ledger create request is timeout<sup>[1]</sup>.

However, we should cancel this delay task after the request to create new ledgers is finished. Otherwise, these tasks will cost unnecessary CPU resources.

### Modifications

Cancel the scheduled task after the create ledger request is finished
srinath-ctds pushed a commit to datastax/pulsar that referenced this pull request Dec 20, 2023
…pache#21542)

### Motivation

When an ML tries to create a new ledger, it will create a delay task to check if the ledger create request is timeout<sup>[1]</sup>.

However, we should cancel this delay task after the request to create new ledgers is finished. Otherwise, these tasks will cost unnecessary CPU resources.

### Modifications

Cancel the scheduled task after the create ledger request is finished
nodece pushed a commit to nodece/pulsar that referenced this pull request Feb 23, 2024
…pache#21542)

When an ML tries to create a new ledger, it will create a delay task to check if the ledger create request is timeout<sup>[1]</sup>.

However, we should cancel this delay task after the request to create new ledgers is finished. Otherwise, these tasks will cost unnecessary CPU resources.

Cancel the scheduled task after the create ledger request is finished

(cherry picked from commit e7536a2)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants