indexer: remove silent failures and block_heads_rx lagged error #452

Merged
ppca merged 7 commits into develop from xiangyi/fix_indexer
Jul 22, 2025

@ppca ppca commented Jul 16, 2025

Our previous testnet release had 3 nodes whose eth indexer silently failed to run.

  1. This PR changes the run function of both the eth and solana indexers so that it no longer fails silently.
  2. This PR changes catchup to run as a tokio task instead of running before the main loop. This is necessary because, with the current dev logic, a long catchup (node offline for a long time) leaves two bad options. If block_heads_rx is initialized before catchup, it will be lagged by the time we receive from it, so we miss the older blocks from around the time catchup began and need another catchup. If block_heads_rx is initialized after catchup, there is a gap between the end block of catchup and the current block, and we need to catch up again. Either way, we might need to catch up several times until the gap between catchup's end block and the current block is well within block_heads_rx's limit, so that it is not lagged. If we instead run catchup as a tokio task while the main loop is running, and initialize block_heads_rx before the catchup task is spawned, the gap problem disappears and no further catchup is needed.
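The lag argument in point 2 can be illustrated with a small std-only model. This is a sketch under stated assumptions, not the indexer's code: `BoundedHeads`, `Recv`, `drain`, and `simulate` are hypothetical names, and the ring buffer only imitates the way a bounded broadcast receiver (such as tokio's, which reports `RecvError::Lagged(n)`) drops the oldest items when nobody drains it.

```rust
use std::collections::VecDeque;

/// The three outcomes a bounded receiver has to handle (hypothetical model type).
#[derive(Debug, PartialEq)]
enum Recv {
    Block(u64),
    /// The receiver fell behind and this many blocks were dropped.
    Lagged(u64),
    Empty,
}

/// Hypothetical stand-in for a bounded block-heads receiver.
struct BoundedHeads {
    capacity: usize,
    buf: VecDeque<u64>,
    dropped: u64,
}

impl BoundedHeads {
    fn new(capacity: usize) -> Self {
        Self { capacity, buf: VecDeque::new(), dropped: 0 }
    }

    /// A new head arrives; if the buffer is full, the oldest one is lost.
    fn push(&mut self, block: u64) {
        if self.buf.len() == self.capacity {
            self.buf.pop_front();
            self.dropped += 1;
        }
        self.buf.push_back(block);
    }

    /// Report any lag first, then hand out buffered blocks in order.
    fn recv(&mut self) -> Recv {
        if self.dropped > 0 {
            let n = self.dropped;
            self.dropped = 0;
            return Recv::Lagged(n);
        }
        match self.buf.pop_front() {
            Some(block) => Recv::Block(block),
            None => Recv::Empty,
        }
    }
}

/// Drain everything currently buffered and return how many blocks were missed.
fn drain(rx: &mut BoundedHeads) -> u64 {
    let mut missed = 0;
    loop {
        match rx.recv() {
            Recv::Block(_) => {} // the main loop would process the block here
            Recv::Lagged(n) => missed += n,
            Recv::Empty => return missed,
        }
    }
}

/// `new_heads` blocks arrive while catchup is in progress. With
/// `drain_while_catching_up == false` the receiver is only read once catchup
/// has finished (catchup before the main loop); with `true` the main loop
/// keeps reading while catchup runs as a separate task.
fn simulate(capacity: usize, new_heads: u64, drain_while_catching_up: bool) -> u64 {
    let mut rx = BoundedHeads::new(capacity);
    let mut missed = 0;
    for block in 1..=new_heads {
        rx.push(block);
        if drain_while_catching_up {
            missed += drain(&mut rx);
        }
    }
    missed + drain(&mut rx)
}
```

In this model, `simulate(16, 100, false)` misses 84 blocks, which would force another catchup, while `simulate(16, 100, true)` misses none because the receiver is drained as heads arrive.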

@ppca ppca requested a review from Copilot July 16, 2025 20:38

@ppca ppca requested review from ChaoticTempest and volovyks July 21, 2025 17:02
@ppca ppca force-pushed the xiangyi/fix_indexer branch from 886e1d3 to bdfd8a1 Compare July 21, 2025 17:34
@ppca ppca requested a review from Copilot July 21, 2025 17:34

@ppca ppca requested a review from Copilot July 21, 2025 17:35

@ppca ppca requested a review from Copilot July 21, 2025 17:37

@ppca ppca changed the title indexer: reduce silent failures indexer: remove silent failures and block_heads_rx error Jul 21, 2025
@ppca ppca changed the title indexer: remove silent failures and block_heads_rx error indexer: remove silent failures and block_heads_rx lagged error Jul 21, 2025
@ppca ppca requested a review from Copilot July 21, 2025 19:04

Copilot AI left a comment

Pull Request Overview

This PR removes silent failures from Ethereum and Solana indexers and fixes block heads stream lagging issues. The changes ensure better error handling and prevent silent failures that occurred in the previous testnet release.

  • Converts indexer run functions from returning Result to returning () while adding explicit error logging
  • Changes catchup to run as a spawned tokio task instead of blocking the main loop
  • Improves error handling for broadcast channel lagged and closed errors

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| chain-signatures/node/src/indexer_sol.rs | Removes silent failures by changing return type and adding error logging for program address parsing and client creation |
| chain-signatures/node/src/indexer_eth.rs | Extensive refactoring to remove silent failures, spawn catchup as async task, and improve error handling throughout the indexer |

};
let end_block_number = latest_block.header.number;
// helios can only go back maximum 8191 blocks, so we need to adjust the start block number if it's too far behind
let helios_oldest_block_number = latest_block.header.number - 8191;

Copilot AI Jul 21, 2025

The magic number 8191 should be defined as a named constant to improve code maintainability and make the limitation clear.

Suggested change
- let helios_oldest_block_number = latest_block.header.number - 8191;
+ let helios_oldest_block_number = latest_block.header.number - HELIOS_MAX_BLOCKS_BACKWARD;
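
A minimal sketch of how the suggested constant could be used when adjusting the start block. The constant name comes from the suggestion above and the value 8191 from the code comment; `clamp_start_block` is a hypothetical helper, and block numbers are assumed to be `u64`. The sketch uses `saturating_sub` so that a chain shorter than 8191 blocks does not underflow, which is an addition beyond the suggestion.

```rust
/// Maximum number of blocks helios can look back (value taken from the code comment).
const HELIOS_MAX_BLOCKS_BACKWARD: u64 = 8191;

/// Hypothetical helper: move the start block forward if it is older than
/// the oldest block helios can still serve.
fn clamp_start_block(start_block: u64, latest_block: u64) -> u64 {
    // saturating_sub avoids underflow when latest_block < 8191
    let oldest_reachable = latest_block.saturating_sub(HELIOS_MAX_BLOCKS_BACKWARD);
    start_block.max(oldest_reachable)
}
```

For example, with the latest block at 20,000, a requested start of 100 would be moved up to 11,809, while a requested start of 15,000 would be left unchanged.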

add_failed_block(blocks_failed_tx.clone(), block_number, block_hash).await;
continue;
}
if block_number % 10 == 0 {

Copilot AI Jul 21, 2025

The magic number 10 for progress logging frequency should be defined as a named constant.

Suggested change
- if block_number % 10 == 0 {
+ if block_number % PROGRESS_LOGGING_FREQUENCY == 0 {

  node_near_account_id: AccountId,
- ) {
+ ) -> anyhow::Result<BlockNumber> {
  let Some(start_block_number) = start_block_number else {

Copilot AI Jul 21, 2025

[nitpick] The function signature change from () to -> anyhow::Result<BlockNumber> while the early return case returns an error seems inconsistent with the PR's goal of removing silent failures. Consider logging this case and returning a success value instead.

Suggested change
  let Some(start_block_number) = start_block_number else {
+     tracing::error!("Failed to start catch-up: no start block number provided");
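
The pattern under discussion (log the missing start block, then return an error instead of silently returning) can be sketched with std only. This is an illustration under assumptions, not the PR's code: `CatchupError` and this `catchup` signature are hypothetical, `BlockNumber` is assumed to be `u64`, and `eprintln!` stands in for `tracing::error!`.

```rust
/// Assumed alias; the real type comes from the node's Ethereum client crate.
type BlockNumber = u64;

/// Hypothetical error type standing in for `anyhow::Error`.
#[derive(Debug, PartialEq)]
enum CatchupError {
    MissingStartBlock,
}

/// Returns the last block covered by catchup, or an error if there is no
/// start block. The failure is both logged and returned, so the caller
/// cannot miss it.
fn catchup(
    start_block_number: Option<BlockNumber>,
    latest_block_number: BlockNumber,
) -> Result<BlockNumber, CatchupError> {
    let Some(start_block_number) = start_block_number else {
        // stand-in for tracing::error!
        eprintln!("Failed to start catch-up: no start block number provided");
        return Err(CatchupError::MissingStartBlock);
    };
    // A real implementation would process start_block_number..=latest_block_number here.
    let _range = start_block_number..=latest_block_number;
    Ok(latest_block_number)
}
```

Whether the early return should be an error or a logged success is the open question in the nitpick; the sketch shows the error variant, which keeps the failure visible to the caller.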


@volovyks volovyks left a comment

LGTM

@ppca ppca merged commit ab2fa37 into develop Jul 22, 2025
3 checks passed
@ppca ppca deleted the xiangyi/fix_indexer branch July 22, 2025 22:16

4 participants