Pending blocks might not be submitted to DA #1548

tzdybal · 2024-02-16T14:53:05Z

Currently, blocks are submitted to DA via the pendingBlocks queue. This queue is stored only in memory, so if the aggregator node restarts, its contents are lost. After the restart, the aggregator continues to produce blocks and submits those new blocks to DA, but the blocks that were pending before the restart are gone and will never be submitted.
This breaks syncing: if any block is missing from DA, full nodes are not able to sync.

Additionally:

- blocks are safely stored in the database before submission to DA
- blocks are always pushed to DA in order (by height)
- DA submission of multiple blocks is atomic: it's impossible to submit only part of a batch
- Rollkit can handle duplicate blocks as well as out-of-order blocks during the sync process

Example:

Last produced block height: 1010.
pendingBlocks contains blocks between 1000 and 1010.
Node is restarted.
Block 1011 is produced and pushed to DA.
Blocks between 1000 and 1010 will never be submitted to DA.
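
The example above can be reproduced with a toy model. This is only a sketch: the `aggregator` type and the `produce`, `restart`, `submit`, `missingFromDA`, and `runScenario` names are hypothetical and are not Rollkit's actual API. Blocks are represented by their heights only.

```go
package main

import "fmt"

// aggregator is a toy model of the node; all names here are hypothetical.
type aggregator struct {
	store   []uint64 // heights persisted in the database (survives a restart)
	pending []uint64 // in-memory queue of heights awaiting DA submission
}

// produce saves the block to the store first, then queues it for DA.
func (a *aggregator) produce(h uint64) {
	a.store = append(a.store, h)
	a.pending = append(a.pending, h)
}

// restart keeps the store but drops everything held only in memory.
func (a *aggregator) restart() { a.pending = nil }

// submit flushes the in-memory queue and returns the submitted heights.
func (a *aggregator) submit() []uint64 {
	out := a.pending
	a.pending = nil
	return out
}

// missingFromDA lists stored heights that never reached DA.
func missingFromDA(store, submitted []uint64) []uint64 {
	seen := make(map[uint64]bool, len(submitted))
	for _, h := range submitted {
		seen[h] = true
	}
	var missing []uint64
	for _, h := range store {
		if !seen[h] {
			missing = append(missing, h)
		}
	}
	return missing
}

// runScenario replays the steps from the example.
func runScenario() (submitted, missing []uint64) {
	a := &aggregator{}
	for h := uint64(1000); h <= 1010; h++ {
		a.produce(h) // pendingBlocks now holds 1000..1010
	}
	a.restart()     // the node is restarted before anything is submitted
	a.produce(1011) // block 1011 is produced after the restart
	submitted = a.submit()
	return submitted, missingFromDA(a.store, submitted)
}

func main() {
	submitted, missing := runScenario()
	fmt.Println("submitted to DA:", submitted)
	fmt.Println("stored but never submitted:", missing)
}
```

In this model only height 1011 reaches DA, while heights 1000 through 1010 remain in the store without ever being submitted.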

tzdybal · 2024-02-16T15:13:52Z

A simple and effective solution is to persist the height of the latest block successfully submitted to DA in the store. After a restart, the node needs to resume DA submission from that height.

On the implementation level, there are several possibilities:

1. push blocks from the store to pendingBlocks when the node is (re)starting
2. get rid of pendingBlocks and always read blocks directly from the store (this seems to be the more generic approach).
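
A minimal sketch of the second possibility, under stated assumptions: `kvStore`, `lastSubmittedKey`, `pendingHeights`, and `submitPending` are hypothetical names for illustration, the store is modelled as an in-memory map, and blocks are represented by heights only. The pending set is derived from two persisted values (the last submitted height and the store height), so there is no in-memory queue to lose. The marker is advanced only after a successful submission, which relies on batch submission being atomic as stated in the issue.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// lastSubmittedKey is a hypothetical metadata key, not an actual Rollkit key.
const lastSubmittedKey = "last-submitted-height"

// kvStore is a hypothetical stand-in for the node's persistent store.
type kvStore struct {
	meta   map[string][]byte // persisted metadata
	height uint64            // height of the latest block saved in the store
}

// lastSubmitted reads the persisted marker; 0 means nothing was submitted yet.
func (s *kvStore) lastSubmitted() uint64 {
	raw, ok := s.meta[lastSubmittedKey]
	if !ok || len(raw) != 8 {
		return 0
	}
	return binary.LittleEndian.Uint64(raw)
}

// setLastSubmitted persists the marker.
func (s *kvStore) setLastSubmitted(h uint64) {
	buf := make([]byte, 8)
	binary.LittleEndian.PutUint64(buf, h)
	s.meta[lastSubmittedKey] = buf
}

// pendingHeights derives the pending blocks from persisted state alone.
func pendingHeights(s *kvStore) []uint64 {
	var out []uint64
	for h := s.lastSubmitted() + 1; h <= s.height; h++ {
		out = append(out, h)
	}
	return out
}

// submitPending sends all pending heights in one batch via the (hypothetical)
// submit callback and advances the marker only if the batch succeeded.
func submitPending(s *kvStore, submit func([]uint64) error) error {
	pending := pendingHeights(s)
	if len(pending) == 0 {
		return nil
	}
	if err := submit(pending); err != nil {
		return err // marker unchanged, so the same blocks are retried later
	}
	s.setLastSubmitted(pending[len(pending)-1])
	return nil
}

func main() {
	s := &kvStore{meta: map[string][]byte{}, height: 1010}
	s.setLastSubmitted(999)

	// A restart loses nothing here: both values live in the store.
	s.height = 1011 // block 1011 is produced after the restart
	fmt.Println("pending after restart:", pendingHeights(s))

	err := submitPending(s, func([]uint64) error { return errors.New("DA unavailable") })
	fmt.Println("failed attempt:", err, "marker:", s.lastSubmitted())

	err = submitPending(s, func([]uint64) error { return nil })
	fmt.Println("successful attempt:", err, "marker:", s.lastSubmitted())
}
```

If the marker cannot be saved after a successful submission, the same heights are submitted again on the next attempt. Resubmission costs extra but is tolerable, since duplicate blocks are handled during sync.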

arhamj · 2024-02-20T16:50:49Z

@tzdybal Can I pick this up? Option 2 sounds better. We could have a cron-like goroutine that runs every X seconds and queries Y blocks from the DB using a prefix filter. X and Y can vary based on the lag. What do you think?
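
The periodic approach proposed here could look roughly like the following. This is a sketch under assumptions, not the merged implementation: `nextBatch` and `submitLoop` are hypothetical names, `interval` plays the role of X and `limit` the role of Y, and the store query and DA call are passed in as callbacks. The prefix-filter query and adapting X and Y to the lag are left out.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// nextBatch returns the inclusive height range [from, to] to submit next,
// covering at most limit blocks. ok is false when nothing is pending.
func nextBatch(lastSubmitted, storeHeight, limit uint64) (from, to uint64, ok bool) {
	if limit == 0 || storeHeight <= lastSubmitted {
		return 0, 0, false
	}
	from = lastSubmitted + 1
	to = storeHeight
	if to-from+1 > limit {
		to = from + limit - 1
	}
	return from, to, true
}

// submitLoop wakes up every interval, asks state for the persisted heights,
// and submits at most limit blocks per tick until ctx is cancelled.
func submitLoop(ctx context.Context, interval time.Duration, limit uint64,
	state func() (lastSubmitted, storeHeight uint64),
	submit func(from, to uint64) error) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			last, height := state()
			if from, to, ok := nextBatch(last, height, limit); ok {
				if err := submit(from, to); err != nil {
					// State is unchanged, so the same range is retried next tick.
					fmt.Println("submission failed, will retry:", err)
				}
			}
		}
	}
}

func main() {
	// In-memory stand-ins for values that would be read from the store.
	last, height := uint64(999), uint64(1011)

	ctx, cancel := context.WithTimeout(context.Background(), 200*time.Millisecond)
	defer cancel()

	submitLoop(ctx, 10*time.Millisecond, 5,
		func() (uint64, uint64) { return last, height },
		func(from, to uint64) error {
			fmt.Printf("submitting blocks %d..%d\n", from, to)
			last = to // a real node would persist this in the store
			return nil
		})
}
```

Capping each tick at `limit` blocks keeps a single DA submission bounded when the node has fallen far behind.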

tzdybal · 2024-02-20T21:55:46Z

Hi @arhamj, I'm currently tackling this issue. Thanks for understanding!

TestPendingBlocks should fail, which proves the existence of the bug. This commit introduces a new test to check pending blocks as described in issue #1548. It mimics the behavior of a node producing blocks, stopping and restarting, then producing more blocks.

## Overview

This PR creates a new test to check pending blocks as described in issue #1548. It mimics the behavior of a node producing blocks, stopping and restarting, then producing more blocks. The commit history shows that the test was initially failing and was then fixed.

The new implementation of `pendingBlocks` is safer because its data is persisted in the store. If the node is restarted, block submission to DA resumes where it ended. In the worst case (very unlikely), where `pendingBlocks` can't save information to the store, blocks already submitted to DA are resubmitted, which only incurs extra cost. The solution ensures that all blocks are successfully submitted to DA.

This PR is bigger than I expected, so I will work on #1524 and on limiting the number of blocks returned by `getPendingBlocks` in a follow-up PR.

Resolves #1548
Resolves #457

## Checklist

- [x] New and updated code has appropriate documentation
- [x] New and updated code has new and/or updated testing
- [x] Required CI checks are passing
- [x] Visual proof for any user facing features like CLI or documentation updates
- [x] Linked issues closed with keywords

## Summary by CodeRabbit

- **New Features**
  - Introduced a new mock generation command for enhanced testing capabilities.
  - Added functionality for improved block management and submission to Data Availability layers.
  - Enhanced metadata management in the store for better data handling and retrieval.
  - Implemented new testing functions and improved existing ones for better coverage and reliability.
- **Bug Fixes**
  - Fixed handling of empty block submissions to prevent errors during block publishing.
- **Refactor**
  - Refactored block management to improve code efficiency and readability.
  - Consolidated mock application creation logic in integration tests for better maintainability.
- **Tests**
  - Expanded testing suite with new tests for block submission, metadata operations, and mock DA interactions.
- **Chores**
  - Updated dependencies and imports across various files to enhance functionality and testing.

tzdybal added T:bug Something isn't working P:high Priority: High va labels Feb 16, 2024

tzdybal self-assigned this Feb 16, 2024

Manav-Aggarwal removed the P:high Priority: High label Feb 23, 2024

tzdybal mentioned this issue Feb 27, 2024

Refactor pending blocks handling #1568

Merged

tzdybal added the P:high Priority: High label Feb 29, 2024

MSevey closed this as completed in #1568 Mar 6, 2024
