Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Start with txpool disabled #5634

Merged
merged 6 commits into from Jun 28, 2023
Merged

Conversation

fab-10
Copy link
Contributor

@fab-10 fab-10 commented Jun 22, 2023

PR description

This PR review the way the txpool is enabled/disabled in a more intuitive way and also fix a bug that happens when enabling the layered txpool when not synced.

The main point of disabling the txpool, is to avoid its overhead during the initial sync (or resync) phases, since pending transactions are always evaluated and processed against the current head, so if we are syncing there is no value in processing them, and we can wait until the sync is done to enabled the txpool.
The current implementation is a bit convoluted, since the txpool is always enabled at startup, but if the initial sync is not done, then the txpool is disabled, to be re-enabled after the sync is done, while it is straightforward to just start with the txpool disabled, and enabled it when the sync is done.

Fixed Issue(s)

fixes #5636

@github-actions
Copy link

github-actions bot commented Jun 22, 2023

  • I thought about documentation and added the doc-change-required label to this PR if updates are required.
  • I thought about the changelog and included a changelog update if required.
  • If my PR includes database changes (e.g. KeyValueSegmentIdentifier) I have thought about compatibility and performed forwards and backwards compatibility tests

@fab-10 fab-10 force-pushed the txpool-disabled-at-startup branch 5 times, most recently from d25e7f8 to 7044798 Compare June 26, 2023 09:17
@fab-10 fab-10 marked this pull request as ready for review June 26, 2023 09:31
@@ -121,13 +129,13 @@ private void initLogForReplay() {
.log();
}

public void saveToDisk() {
private synchronized void saveToDisk(final PendingTransactions pendingTransactionsToSave) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why pick two arbitrary methods to place the synchronized keyword on. What fields are they looking to prevent unsynchronized access on? Do all accesses to those fields get locked?

Consider using com.google.errorprone.annotations.concurrent.GuardedBy annotations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just want that these 2 methods are executed serially, in case of different threads enabling/disabling the pool in rapid succession (that should never happen now).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's my point, we are treating synchronization lightly.

If they shouldn't be executed concurrently then you should be able to identify the fields that would conflict. Then we can make sure that no other public method will allow the same conflicts that executing these two methods in parallel would cause.

Consider using import java.util.concurrent.locks.ReentrantLock if your goal is to prevent writing and reading at the same time. This will limit the impact of the lock (not locking on the entire TransactionPool object, which may have unforseen impacts). It also has timeout-enabled lock acquisitions, which should be considered to ensure shutdown doesn't fully freeze in extreme circumstances, or take an extended time such as if multiple shutdown requests have been initiated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I see your point, will change to avoid locking the entire class, and will change the write to check and kill if an already running write, same for the read, so there is no unbounded queueing on shutdown and it can finish in few seconds as for my tests with >100k txs. At that point there should be no need to put a timeout but in case its place should be in the Runner::stop.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check the updated implementation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the timeout variant, with something large like a minute or ten minutes. A bug in the loading code could prevent orderly shutdown of the entire service. And this is the sort of bug that will sneak in as an unrelated change to some other feature, not that it exists today.

It's better to have a shutdown missing pending transactions than a service that refuses to shutdown. A non-timeout lock request creates that risk.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Managing the timeout on shutdown in Runner::stop where all timeout on services are managed then.

@fab-10 fab-10 requested a review from shemnon June 27, 2023 17:16
final Runnable operation,
final AtomicReference<CompletableFuture<Void>> operationInProgress) {
if (configuration.getEnableSaveRestore()) {
diskAccessLock.lock();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This call to lock should use tryLock(long time, TimeUnit unit) to ensure it doesn't wait forever getting a lock.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@fab-10 fab-10 force-pushed the txpool-disabled-at-startup branch 2 times, most recently from 0163aba to 289074d Compare June 27, 2023 21:46
@fab-10 fab-10 requested a review from shemnon June 27, 2023 21:47
@fab-10 fab-10 force-pushed the txpool-disabled-at-startup branch from 289074d to c802b21 Compare June 27, 2023 21:48
Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>
Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>
Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>
… waiting at max 30 seconds for the stop to complete

Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>
Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>
Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>
@fab-10 fab-10 force-pushed the txpool-disabled-at-startup branch from c802b21 to 4394296 Compare June 28, 2023 09:00
@fab-10 fab-10 merged commit 2dca2f5 into hyperledger:main Jun 28, 2023
19 checks passed
@fab-10 fab-10 deleted the txpool-disabled-at-startup branch June 28, 2023 10:40
davidkngo pushed a commit to liquichain/besu that referenced this pull request Jul 21, 2023
Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>
davidkngo added a commit to liquichain/besu that referenced this pull request Jul 21, 2023
elenduuche pushed a commit to elenduuche/besu that referenced this pull request Aug 16, 2023
Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>
eum602 pushed a commit to lacchain/besu that referenced this pull request Nov 3, 2023
Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Failure at startup when enabling layered txpool before initial sync done
2 participants