Skip to content

Conversation

@fracek
Copy link
Contributor

@fracek fracek commented Mar 21, 2025

The DNA server does not include the endCursor.uniqueKey for backfilled blocks since they cannot be part of a chain reorganization.
This means the value is undefined for these blocks.

In some circumstances this caused an issue with indexers that persist state to Drizzle.

  • Indexer receives "live" block X, this block contains a unique key.
  • Indexer disconnects for some time, enough for the block to become finalized.
  • Indexer reconnects and now receives block Y > X. Since this block is finalized, the unique key is undefined.
  • Drizzle persist the indexer's state, but the unique key is not overwritten since Drizzle ignores undefined values.
  • On restart, the indexer tries to reconnect from block with order key Y and unique key X.
  • This results in an error message like the following: starting cursor Y/X not found. canonical: YYYY, reorged: none

This PR makes sure we overwrite the unique key to null if it's undefined.

@coderabbitai
Copy link

coderabbitai bot commented Mar 21, 2025

📝 Walkthrough

Walkthrough

This pull request introduces new JSON configuration files for several packages specifying prerelease versions with patch changes. Modifications have been made to enhance state persistence and message generation by introducing optional properties and refining the handling of the uniqueKey in both the persistence and testing logic. Several tests have been added or removed to better reflect the new state management behavior.

Changes

File(s) Summary
change/@apibara-indexer-*.json
change/@apibara-plugin-drizzle-*.json
change/@apibara-plugin-mongo-*.json
change/@apibara-plugin-sqlite-*.json
New JSON configuration files added to declare prerelease versions and patch changes, including comments on state persistence fixes and conditional handling of the uniqueKey.
packages/indexer/src/internal/testing.ts Updated MockMessagesOptions to include optional uniqueKey and baseBlockNumber properties; added the helper function uniqueKeyFromOrderKey and modified message generation logic accordingly.
packages/plugin-drizzle/src/persistence.ts
packages/plugin-mongo/src/persistence.ts
Modified persistState functions to conditionally assign uniqueKey as null when falsy, ensuring proper signaling for deletions in the database.
packages/plugin-drizzle/tests/persistence.test.ts
packages/plugin-drizzle/tests/storage.test.ts
Added a new comprehensive test suite for state persistence and removed several cases from the storage tests to streamline validation of persistence, finalization, invalidation, and schema version checks.

Sequence Diagram(s)

sequenceDiagram
    participant Indexer
    participant MessageGenerator
    participant Tester
    Tester->>MessageGenerator: Call generateMockMessages with options
    MessageGenerator->>MessageGenerator: Calculate currentBlockNumber using baseBlockNumber
    MessageGenerator->>MessageGenerator: Compute uniqueKey via uniqueKeyFromOrderKey if enabled
    MessageGenerator-->>Tester: Return messages with updated orderKey and uniqueKey
Loading
sequenceDiagram
    participant Plugin
    participant PersistState
    participant Database
    Plugin->>PersistState: Invoke persistState with endCursor
    alt endCursor.uniqueKey is truthy
        PersistState->>Database: Store uniqueKey value
    else
        PersistState->>Database: Store uniqueKey as null (indicating deletion)
    end
    Database-->>PersistState: Acknowledge update
Loading

Possibly related PRs

Suggested reviewers

  • jaipaljadeja

Poem

I'm a little rabbit, nimble and keen,
Hopping through JSON files so clean.
Test cases and persistence now dance in the light,
With unique keys handled just right.
Code carrots crunch as our features take flight! 🐇💻


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 02dca40 and d31120c.

📒 Files selected for processing (5)
  • change/@apibara-indexer-795d3289-02e0-4d36-bfca-345341984941.json (1 hunks)
  • change/@apibara-plugin-drizzle-13ed35ed-0dd1-478f-9722-6501d99b9f31.json (1 hunks)
  • change/@apibara-plugin-mongo-85e9b5c6-310b-478d-b775-9335d4b7a04d.json (1 hunks)
  • change/@apibara-plugin-sqlite-af8ce0db-e7d3-440e-94d8-d1723495c5bb.json (1 hunks)
  • packages/plugin-sqlite/src/index.ts (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • change/@apibara-plugin-sqlite-af8ce0db-e7d3-440e-94d8-d1723495c5bb.json
🚧 Files skipped from review as they are similar to previous changes (3)
  • change/@apibara-plugin-drizzle-13ed35ed-0dd1-478f-9722-6501d99b9f31.json
  • change/@apibara-plugin-mongo-85e9b5c6-310b-478d-b775-9335d4b7a04d.json
  • change/@apibara-indexer-795d3289-02e0-4d36-bfca-345341984941.json
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: test
🔇 Additional comments (1)
packages/plugin-sqlite/src/index.ts (1)

220-224:

✅ Verification successful

Verify if removing the cursor validation check is sufficient

According to the PR objectives, the issue involves handling undefined uniqueKey values for backfilled blocks. While removing the strict validation is a step toward allowing these blocks, I don't see where the PR explicitly sets undefined uniqueKey values to null as mentioned in the objectives.

Please confirm that this change, along with potential changes in other files (like persistence.ts), properly addresses the issue with endCursor.uniqueKey for backfilled blocks as described in the PR objectives.


🏁 Script executed:

#!/bin/bash
# Search for uniqueKey handling in relevant persistence files
echo "Searching for uniqueKey handling in persistence.ts:"
grep -n "uniqueKey" packages/plugin-sqlite/src/persistence.ts || echo "No direct references to uniqueKey found"

# Search for cursor persistence code that might handle the undefined uniqueKey
echo -e "\nSearching for cursor persistence implementation:"
grep -n -A 10 "persistState" packages/plugin-sqlite/src/persistence.ts || echo "persistState implementation not found"

Length of output: 754


Cursor Validation & Undefined uniqueKey Handling Verified

The persistence code in packages/plugin-sqlite/src/persistence.ts (line 62) clearly sets uniqueKey to null when not present, which meets the PR objectives. Removing the strict cursor validation in index.ts (lines 220–224) is therefore sufficient.

  • In persistence.ts:
    • Line 25: References endCursor.uniqueKey
    • Line 62: Sets uniqueKey: storedCursor.unique_key ? storedCursor.unique_key : null,

Given this handling, the changes addressing undefined uniqueKey values for backfilled blocks are implemented as expected.

✨ Finishing Touches
  • 📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
packages/indexer/src/internal/testing.ts (3)

54-55: Fix minor typo in variable naming.

Consider correcting “fianlizedToBlock” to “finalizedToBlock” to maintain clarity:

-      const fianlizedToBlock =
+      const finalizedToBlock =

72-73: Duplicate field definitions.

These lines appear to mirror lines 22-23. If unintentional duplication, consider refactoring or confirming they’re required in two locations. Otherwise, it’s consistent with the prior addition.


87-89: Clarify prefix for generated unique key.

Appending 0xff00 is effective, but ensure there’s no confusion about its purpose or potential collisions. Could revise or document the prefix if used more widely.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1365408 and 02dca40.

📒 Files selected for processing (8)
  • change/@apibara-indexer-795d3289-02e0-4d36-bfca-345341984941.json (1 hunks)
  • change/@apibara-plugin-drizzle-13ed35ed-0dd1-478f-9722-6501d99b9f31.json (1 hunks)
  • change/@apibara-plugin-mongo-85e9b5c6-310b-478d-b775-9335d4b7a04d.json (1 hunks)
  • packages/indexer/src/internal/testing.ts (3 hunks)
  • packages/plugin-drizzle/src/persistence.ts (1 hunks)
  • packages/plugin-drizzle/tests/persistence.test.ts (1 hunks)
  • packages/plugin-drizzle/tests/storage.test.ts (1 hunks)
  • packages/plugin-mongo/src/persistence.ts (1 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
packages/indexer/src/internal/testing.ts (2)
packages/protocol/src/stream.ts (4) (4)
  • Invalidate (88-93)
  • Invalidate (95-95)
  • Finalize (97-102)
  • Finalize (104-104)
packages/protocol/src/proto/stream.ts (4) (4)
  • Invalidate (188-199)
  • Invalidate (753-823)
  • Finalize (202-209)
  • Finalize (829-882)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: test
🔇 Additional comments (17)
change/@apibara-indexer-795d3289-02e0-4d36-bfca-345341984941.json (1)

1-7: Configuration change for indexer package

This new JSON configuration file correctly identifies the patch release needed for fixing state persistence issues when switching back to backfill. The change is related to the problem described in the PR objectives.

change/@apibara-plugin-mongo-85e9b5c6-310b-478d-b775-9335d4b7a04d.json (1)

1-7: Configuration change for MongoDB plugin

This configuration file correctly describes the purpose of the change: explicitly setting uniqueKey to null when it's undefined. This aligns with the implementation change in the persistence.ts file and addresses the issue described in the PR objectives.

change/@apibara-plugin-drizzle-13ed35ed-0dd1-478f-9722-6501d99b9f31.json (1)

1-7: Configuration change for Drizzle plugin

This configuration file properly documents the prerelease patch for fixing state persistence when switching back to backfill in the Drizzle plugin. This is consistent with the overall fix described in the PR objectives.

packages/plugin-mongo/src/persistence.ts (1)

47-47: Fix for undefined uniqueKey handling

This change properly implements the solution described in the PR objectives. By explicitly setting uniqueKey to null when endCursor.uniqueKey is falsy, it prevents the persistence layer from ignoring undefined values. This addresses the issue that occurs when an indexer processes a live block with a unique key, disconnects, and then reconnects to a finalized block without a unique key.

packages/plugin-drizzle/src/persistence.ts (1)

177-179: Ensure proper handling of uniqueKey updates.

Setting uniqueKey to null when the incoming value is falsy correctly signals that it should be cleared. This is consistent with your inline comment and solves the issue where Drizzle otherwise won't update the column. It's advisable to confirm that all queries or downstream logic properly differentiate between a null key and a truly absent key.

packages/indexer/src/internal/testing.ts (4)

22-23: New optional fields look good.

Introducing uniqueKey?: boolean and baseBlockNumber?: bigint is a straightforward approach to customize block numbering and key generation in tests. No immediate concerns.


34-35: Good use of fallback for baseBlockNumber.

Using options?.baseBlockNumber ?? BigInt(5_000_000) provides a sensible default, enhancing flexibility for test scenarios.


37-38: Efficient unique key derivation.

Generating a unique key from currentBlockNumber centralizes logic and keeps test code clear. No further issues noted.


40-41: No issues with invalidateToBlock.

This calculation is consistent and clearly offsets from baseBlockNumber. The approach is straightforward.

packages/plugin-drizzle/tests/storage.test.ts (1)

17-17: Renaming the describe block.

Renaming the top-level describe to "Drizzle storage" is clear, matching the file’s purpose. No concerns.

packages/plugin-drizzle/tests/persistence.test.ts (7)

19-69: Test “should persist state” gracefully handles errors.

  1. The test ensures the DB records remain consistent up to the failure point (line 37).
  2. Throwing an error to confirm partial persistence is a solid approach to verifying rollback vs. commit logic.

Looks robust for validating partial updates and error handling.


71-122: Test “should override the persisted uniqueKey.”

  1. Verifies that new runs can clear or update the uniqueKey field in the checkpoint table.
  2. Adequately checks end state with two separate mock clients.

Well-structured coverage for unique key override behavior.


124-308: Test “should persist the filters and latest block number (factory mode).”

  1. Exercises dynamic filter creation via a factory function.
  2. Tracks transitions from block 100 to 108 and ensures filters table updates accordingly.

Comprehensive test for multi-filter state management.


310-484: Test “should invalidate state (factory mode).”

  1. Checks partial invalidation using _tag: "invalidate", reverting to the correct block.
  2. Ensures filters table is updated to reflect the truncated range.

Covers reorg/invalidate logic thoroughly.


486-660: Test “should finalize state (factory mode).”

  1. Validates finality at block 107, ensuring upstream blocks remain unaffected.
  2. Confirmed updates to filters and checkpoints align with finality logic.

Solid coverage of finalize scenarios.


662-701: Test “persistence schema version check.”

  1. Ensures schemaVersion is set upon initialization.
  2. Straightforward confirmation of expected version (0).

No issues; demonstrates appropriate schema setup.


702-845: Test “should not persist state for pending blocks.”

  1. Correctly only commits data for accepted blocks; pending blocks remain ephemeral.
  2. The final checkpoint is the last accepted block, highlighting partial acceptance logic.

This is a key scenario ensuring correctness of final vs. pending states.

@fracek fracek force-pushed the fix-drizzle-persistence branch from 02dca40 to d31120c Compare March 21, 2025 15:50
@jaipaljadeja jaipaljadeja self-requested a review March 21, 2025 15:54
Copy link
Member

@jaipaljadeja jaipaljadeja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Thank you.

@jaipaljadeja jaipaljadeja merged commit f6caa47 into apibara:main Mar 21, 2025
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants