
plugin-drizzle: add drizzle plugin and persistence #130


Merged
merged 14 commits into from Jan 1, 2025

Conversation

jaipaljadeja
Member

No description provided.

@jaipaljadeja jaipaljadeja requested a review from fracek December 30, 2024 09:54

coderabbitai bot commented Dec 30, 2024

📝 Walkthrough

This pull request introduces comprehensive updates across multiple packages in the Apibara ecosystem, focusing on enhancing the Drizzle, Mongo, and SQLite storage plugins. The changes primarily involve adding finalization configurations, improving state management, and refactoring persistence mechanisms. Key modifications include adding default indexer names, updating transaction handling, and introducing more robust error management for various storage operations.

Changes

| File | Change Summary |
| --- | --- |
| change/ JSON files | Added prerelease configurations for @apibara/indexer, @apibara/plugin-drizzle, @apibara/plugin-mongo, and @apibara/plugin-sqlite |
| packages/indexer/src/internal/testing.ts | Added finalize option to MockMessagesOptions and updated generateMockMessages function |
| packages/plugin-drizzle/ | Significant updates to storage, persistence, and utility modules with enhanced database transaction and rollback mechanisms |
| packages/plugin-mongo/src/ | Modified finalization logic and made indexerName a required parameter |
| packages/plugin-sqlite/src/ | Updated storage functions to require indexerName and modified finalization SQL statements |

Sequence Diagram

sequenceDiagram
    participant Client
    participant StoragePlugin
    participant Database
    participant TransactionManager

    Client->>StoragePlugin: Initialize storage
    StoragePlugin->>Database: Create schema
    StoragePlugin->>TransactionManager: Begin transaction
    Client->>StoragePlugin: Persist state
    StoragePlugin->>Database: Insert/Update records
    StoragePlugin->>TransactionManager: Commit transaction
    Client->>StoragePlugin: Finalize state
    StoragePlugin->>Database: Clean up old records

Suggested Reviewers

  • fracek

Poem

🐰 Indexing Rabbit's Delight

With Drizzle, Mongo, and SQLite bright,
Transactions dance, cursors take flight,
Finalize, persist with all my might,
Code flows smooth, future shines light!

Hop hop, code on! 🚀


Comment on lines 143 to 152
await withTransaction(db, async (tx) => {
  if (endCursor && request.filter[1]) {
    await persistState({
      tx,
      endCursor,
      filter: request.filter[1],
      indexerName,
    });
  }
});
Contributor

I think connect:factory should be executed in the same transaction as handler:middleware. Access it through the context.

@@ -1,6 +1,6 @@
 import { describe, expect, it } from "vitest";

-describe("drizzleStorage", () => {
+describe("Drizzle test", () => {
Contributor

Here we should add end-to-end tests like in the sqlite plugin using pglite.

@fracek
Contributor

fracek commented Dec 30, 2024

Looks good for now! I wonder if we should also create the persistence tables "manually" like we do for the reorg tables? Call them __indexer_checkpoints and __indexer_filters.

@jaipaljadeja
Member Author

jaipaljadeja commented Dec 30, 2024

Looks good for now! I wonder if we should also create the persistence tables "manually" like we do for the reorg tables? Call them __indexer_checkpoints and __indexer_filters.

I think it's a good idea to create them manually, so the user has one less thing to do during setup. But I have some concerns: if we need to modify these table structures in future releases, we'd need a strategy to handle users who have already migrated using the older table definitions, which could make schema updates more complex to manage.
Same concern with reorg tables.

@fracek
Contributor

fracek commented Dec 30, 2024

We'd need a strategy to handle users who have already migrated using the older table definitions, which could make schema updates more complex to manage. Same concern with reorg tables.

Yes, we will need to handle migrations if the table schema changes. Usually this is done like this:

  • create a schema_version table with just two columns (the current schema version, e.g. 0, and a key used in the upsert statement)
  • create a list of migrations (sql statements) to execute to upgrade each schema version to the next.

For example, once we reach schema version 3:

// constants based on the current release.
const SCHEMA_VERSION = 3;
const migrations = [
  [`/* migrations from v0 to v1 */`],
  [`/* migrations from v1 to v2 */`],
  [`/* migrations from v2 to v3 */`],
];

// In the initialization code.
const currentVersion = `select version from schema_version where k = 0`.run();
let v = currentVersion ?? 0;

if (v > SCHEMA_VERSION) {
  throw new Error("version mismatch");
}

// run all migrations
`begin transaction`.run();
while (v < SCHEMA_VERSION) {
  // a migration can have multiple statements
  for (const statement of migrations[v]) {
    statement.run();
  }
  v += 1;
}
// bump version stored in the schema
`update schema_version set version = ${SCHEMA_VERSION} where k = 0`.run();
`commit transaction`.run();

@fracek
Contributor

fracek commented Dec 30, 2024

I'm happy to make this change in this PR. You just need to create the table __indexer_schema_version with k = 0 and version = 0. Then you can implement most of the code above, but without the need to run migrations for now (so simply check that the stored version is not greater than the current version).
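
A minimal sketch of that check, assuming a Drizzle `tx.execute` handle and the table/column names suggested above (the exact statements are illustrative, not the final plugin code):

import { sql } from "drizzle-orm";

const CURRENT_SCHEMA_VERSION = 0;

// create the version table and seed it with k = 0, version = 0.
await tx.execute(sql`
  CREATE TABLE IF NOT EXISTS __indexer_schema_version (
    k INTEGER NOT NULL PRIMARY KEY,
    version INTEGER NOT NULL
  );
`);
await tx.execute(sql`
  INSERT INTO __indexer_schema_version (k, version)
  VALUES (0, 0)
  ON CONFLICT (k) DO NOTHING;
`);

// no migrations yet: only reject databases created by a newer release.
const { rows } = (await tx.execute(
  sql`SELECT version FROM __indexer_schema_version WHERE k = 0;`,
)) as { rows: { version: number }[] };

if ((rows[0]?.version ?? 0) > CURRENT_SCHEMA_VERSION) {
  throw new DrizzleStorageError("schema version mismatch");
}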

db,
persistState: enablePersistence = true,
indexerName = "default",
schema,
Contributor

I like that it's possible to override the schema!

TSchema extends
TablesRelationalConfig = ExtractTablesWithRelations<TFullSchema>,
>(
_db: PgDatabase<TQueryResult, TFullSchema, TSchema>,
Contributor

Can we make this parameter optional? The schema is only needed to access the db like the following:

db.query.users.findMany({ ... });

as opposed to

db.select().from(users).where(...)

Member Author

Alright will make it optional.

@jaipaljadeja jaipaljadeja marked this pull request as ready for review December 31, 2024 12:37
Comment on lines 203 to 238
const { endCursor, finality } = context as {
  endCursor: Cursor;
  finality: DataFinality;
};

if (!endCursor) {
  throw new DrizzleStorageError("end cursor is undefined");
}

await withTransaction(db, async (tx) => {
  context[DRIZZLE_PROPERTY] = { db: tx } as DrizzleStorage<
    TQueryResult,
    TFullSchema,
    TSchema
  >;

  if (finality !== "finalized") {
    await registerTriggers(tx, tableNames, endCursor, idColumn);
  }

  await next();
  delete context[DRIZZLE_PROPERTY];

  if (enablePersistence) {
    await persistState({
      tx,
      endCursor,
      indexerName,
    });
  }
});

if (finality !== "finalized") {
  // remove trigger outside of the transaction or it won't be triggered.
  await removeTriggers(db, tableNames);
}
Contributor

@fracek fracek Dec 31, 2024

Probably want to wrap this in a try { ... } finally { ... } to remove the triggers in all cases.
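
For example, a sketch of that shape, reusing the names from the snippet above:

try {
  await withTransaction(db, async (tx) => {
    // ... handler body from the snippet above ...
  });
} finally {
  if (finality !== "finalized") {
    // runs even when the handler or persistState throws.
    await removeTriggers(db, tableNames);
  }
}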

);
}
} catch (error) {
console.error(error);
Contributor

We should not log the error like this. We can add a drizzle: Error property to DrizzleStorageError and change the constructor to be like new DrizzleStorageError("message...", { error })
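
One possible shape, consistent with the { cause: error } pattern that appears later in this PR (the details are illustrative):

class DrizzleStorageError extends Error {
  constructor(message: string, options?: ErrorOptions) {
    super(message, options);
    this.name = "DrizzleStorageError";
  }
}

// instead of console.error(error):
throw new DrizzleStorageError("failed to invalidate", { cause: error });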

)) as { rows: ReorgRollbackRow[] };

if (!Array.isArray(result)) {
throw new Error("Invalid result format from reorg_rollback query");
Contributor

Should be DrizzleStorageError

case "I":
try {
// For inserts, delete the row
if (op.row_id) {
Contributor

Throw if no op.row_id

case "D":
try {
// For deletes, reinsert the row using json_populate_record
if (op.row_value) {
Contributor

Throw if no row_value

case "U":
try {
// For updates, restore previous values
if (op.row_value && op.row_id) {
Contributor

Throw if either row_value or row_id is missing

`);

await tx.execute(query);
}
Contributor

Add a default: case to throw since the op is an unknown value.
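
For example (the error message is illustrative):

default: {
  throw new DrizzleStorageError(`unknown reorg operation: ${op.op}`);
}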

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (13)
packages/plugin-drizzle/src/utils.ts (1)

32-38: Potential malformed JSON catch
deserialize parses JSON and attempts to convert bigint-like strings. Consider wrapping in a try/catch if you expect malformed or user-provided JSON. This would help return more descriptive errors rather than letting a SyntaxError bubble up.
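
A possible guard, assuming the "n"-suffix convention used by serialize (the helper shown here is a sketch, not the module's exact code):

export function deserialize<T>(str: string): T {
  try {
    return JSON.parse(str, (_, value) =>
      typeof value === "string" && /^-?\d+n$/.test(value)
        ? BigInt(value.slice(0, -1))
        : value,
    ) as T;
  } catch (error) {
    throw new DrizzleStorageError("failed to deserialize state", {
      cause: error,
    });
  }
}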

packages/plugin-drizzle/src/index.ts (3)

61-72: DrizzleStorageOptions
The configuration options are clear and flexible. Consider clarifying in docs that specifying schema is optional if db._.schema is sufficient.


134-149: Data invalidation logic
On connect:after, invalidate is triggered if the new run must remove partially processed blocks. This is essential for reorg handling.

Consider logging or metrics to confirm how often reorgs or partial runs occur, which can help debug environment issues.


169-183: Finalize logic
message:finalize correctly finalizes the block and updates the persistent state. Consider whether any extra logging is needed to detect stuck finalization attempts. Otherwise, it seems correct.

packages/plugin-drizzle/src/storage.ts (2)

32-85: Initialization logic
Creating the __reorg_rollback table and the trigger function reorg_checkpoint is well structured, allowing consistent auditing of INSERT, UPDATE, and DELETE. Consider also addressing potential concurrency constraints if multiple processes or indexers do writes simultaneously.


136-241: invalidate function
The rollback logic is thorough, narrowing the rows after a cursor point. It systematically undoes INSERT, DELETE, and UPDATE. This ensures a correct reorg scenario. Keep an eye on performance if large rollbacks are frequent.

packages/plugin-drizzle/src/persistence.ts (1)

44-51: CURRENT_SCHEMA_VERSION and MIGRATIONS
You have a placeholder for incremental schema changes. Smart forward-thinking for versioned migrations.

packages/plugin-drizzle/tests/helper.ts (3)

1-5: Check for TypeScript path alias usage or alternative import strategies.

Imports for Drizzle and related dependencies look consistent. However, if your project often references similar modules across packages, consider implementing path aliases in your tsconfig for improved maintainability.


6-16: Schema definition validation and naming clarity.

Defining the test table with common fields like blockNumber, key, count, and data is straightforward. Ensure that naming is intuitive and consistent across the codebase (e.g., consistently naming columns block_number or blockNumber) and that all fields will be used in tests.


37-54: Migration error handling and forward compatibility.

This migration approach is minimal but solid for initial setups. If future schema changes are expected, consider adding a versioning strategy or dedicated migrations directory. Throwing a DrizzleStorageError is good, but it may be beneficial to provide a more user-friendly fallback or automated migration steps if the schema expands over time.

packages/indexer/src/internal/testing.ts (1)

29-29: Readable variable naming for finalizeAt.

The variable name finalizeAt conveys its meaning. Consider grouping invalidation and finalization logic in a single variable or object if they often appear together, ensuring clarity in test setups.

packages/plugin-mongo/src/index.ts (2)

57-57: LGTM! Good default value addition.

Adding a default value for indexerName improves robustness and aligns with the persistence strategy discussed in the PR comments.

Consider documenting this default value in:

  1. README to help users understand the default behavior
  2. Migration guide for users who might be relying on undefined behavior

Line range hint 1-190: Consider enhancing the persistence strategy.

Based on the PR discussion about schema migrations, consider implementing:

  1. A schema version tracking mechanism
  2. Migration utilities for future schema changes

Following fracek's suggestion, consider:

  1. Creating a __indexer_schema_version table
  2. Implementing version check logic
  3. Preparing for future migration capabilities

This would make the plugin more maintainable and upgrade-friendly.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 03729dd and 1760958.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (21)
  • change/@apibara-indexer-4f8762fe-e45f-4fa5-8b17-453abb123f8c.json (1 hunks)
  • change/@apibara-plugin-drizzle-916f8f1b-3fc8-423e-9268-52790d29903c.json (1 hunks)
  • change/@apibara-plugin-mongo-a083deaa-e6d0-428a-b3b6-633d2a9a44ed.json (1 hunks)
  • change/@apibara-plugin-sqlite-4c3f1380-bbe7-4007-904d-b75edf438fd7.json (1 hunks)
  • packages/indexer/src/internal/testing.ts (3 hunks)
  • packages/plugin-drizzle/docker-compose.yml (1 hunks)
  • packages/plugin-drizzle/package.json (1 hunks)
  • packages/plugin-drizzle/src/index.ts (1 hunks)
  • packages/plugin-drizzle/src/persistence.test.ts (0 hunks)
  • packages/plugin-drizzle/src/persistence.ts (2 hunks)
  • packages/plugin-drizzle/src/storage.ts (1 hunks)
  • packages/plugin-drizzle/src/utils.ts (1 hunks)
  • packages/plugin-drizzle/tests/helper.ts (1 hunks)
  • packages/plugin-drizzle/tests/storage.test.ts (1 hunks)
  • packages/plugin-drizzle/tsconfig.json (1 hunks)
  • packages/plugin-mongo/src/index.ts (1 hunks)
  • packages/plugin-mongo/src/mongo.ts (1 hunks)
  • packages/plugin-mongo/src/persistence.ts (4 hunks)
  • packages/plugin-sqlite/src/index.ts (1 hunks)
  • packages/plugin-sqlite/src/kv.ts (1 hunks)
  • packages/plugin-sqlite/src/persistence.ts (5 hunks)
💤 Files with no reviewable changes (1)
  • packages/plugin-drizzle/src/persistence.test.ts
✅ Files skipped from review due to trivial changes (5)
  • packages/plugin-drizzle/docker-compose.yml
  • change/@apibara-plugin-sqlite-4c3f1380-bbe7-4007-904d-b75edf438fd7.json
  • change/@apibara-plugin-mongo-a083deaa-e6d0-428a-b3b6-633d2a9a44ed.json
  • change/@apibara-indexer-4f8762fe-e45f-4fa5-8b17-453abb123f8c.json
  • change/@apibara-plugin-drizzle-916f8f1b-3fc8-423e-9268-52790d29903c.json
🔇 Additional comments (54)
packages/plugin-drizzle/src/utils.ts (3)

1-9: Imports look good.
These imports from drizzle-orm and drizzle-orm/pg-core provide strong database typing, ensuring better type safety. No issues found.


18-30: Ensure consistent transaction error handling
The withTransaction function properly wraps the callback in a transaction. Consider whether you need additional error logging/handling inside the callback to avoid partial commits. Otherwise, this looks solid.


40-46: Serialization approach is clear
The serialize function properly handles bigint values by adding an "n" suffix. This is a straightforward approach for JSON representation of large integers.

packages/plugin-drizzle/src/index.ts (8)

1-29: Imports and constants
These import statements and constants (DRIZZLE_PROPERTY) are straightforward. They provide clarity in distinguishing the drizzle-specific context.


33-41: Transaction-bound storage interface
The DrizzleStorage type associating PgTransaction with typed schemas is well-structured. It makes the plugin usage more predictable for consumers.


42-59: Dependency check in useDrizzleStorage
Throwing a DrizzleStorageError when the plugin isn’t registered is a good safeguard. The function’s usage pattern looks correct.


84-100: Initial plugin setup
The logic in drizzleStorage hooking into defineIndexerPlugin is well structured. On run:before, the plugin uses transactions to set up tables if needed. This approach ensures a clean environment on every run.


101-132: State retrieval on connect:before
Properly fetching the stored cursor and filter with getState is a good design. It automatically resumes from the last known point.


151-167: Factory-mode hook usage
Using the connect:factory hook to persist state is a convenient approach. It ensures the transaction from the context is used. This code looks correct.


185-199: Reorg handling
message:invalidate triggers the reorg logic. Ensure large-scale reorg scenarios remain performant. Currently, it’s adequate, but you may want to measure if reorgs can trigger excessively large rollbacks.


201-240: Transaction-bound middleware
Wrapping the handler with a transaction inside handler:middleware is a neat approach to ensure that state changes are atomic. The final code snippet removing triggers after the transaction is also a good safety measure.

packages/plugin-drizzle/src/storage.ts (6)

1-11: Import statements
Everything is well organized for managing PostgreSQL concurrency with Drizzle. No issues found.


19-20: ReorgOperation type
Defining these operations as "I" | "U" | "D" clarifies the possible states for auditing reorg changes. This is helpful for strict type-checking.


21-30: Rollback table definition
Storing op, table_name, and cursor is a robust approach to track historical rows. Indicating row_value as JSONB is flexible for capturing old data.


87-116: Registering triggers
Registering triggers for each table with reorg_checkpoint ensures all changes are tracked. The usage of deferrable constraints is a nice detail so that constraints only apply at transaction commit.


118-134: Removing triggers
Dropping triggers is straightforward here. Consider verifying that the triggers are not used by another indexer instance in parallel. Otherwise, your approach is consistent.


242-259: finalize function
Deleting rollback entries at or below the finalized cursor is a neat way to cap the rollback history. This ensures the system doesn’t store unbounded historical data.

packages/plugin-drizzle/src/persistence.ts (9)

2-3: Query builder usage
The usage of and, eq, gt, lt, etc., from drizzle-orm is consistent. Code reads clearly.


11-13: Defining table names
Using __indexer_checkpoints, __indexer_filters, and __indexer_schema_version as constants clarifies that these tables are reserved for indexing logic. Good naming.


Line range hint 15-36: checkpoints and filters tables
Defining separate tables for checkpoint metadata and filters helps keep the data well structured. This organization will ease any future expansions.


39-42: Tracking schema version
schemaVersion table provides a future-proof path for migrations. This aligns with the user’s conversation about managing schema updates seamlessly.


53-131: initializePersistentState
This function orchestrates schema creation and version checks. Good job verifying the stored version is not ahead of the code’s known version. The fallback for running or skipping migrations is also cleanly implemented.


134-193: persistState
Handles checkpoint insertion/upsertion and filter updates in a single transaction. Ensure that large volumes of filters on the same indexer keep the logic consistent.


195-234: getState
Returns the last cursor and active filter. This is exactly what’s needed to restore or continue indexing. Implementation is straightforward.


236-271: invalidateState
Deletes or adjusts filters whose fromBlock or toBlock is greater than the rollback point. This is key for correct partial reorg. The logic appears correct.


273-296: finalizeState
Cleaning out filter rows below a certain finalized block helps ensure data cleanliness. Good approach.

packages/plugin-drizzle/tests/storage.test.ts (10)

19-106: Test: “should store data along with audit”
Verifies the insertion and corresponding rows in __reorg_rollback. This covers the simplest scenario thoroughly. The snapshot usage is spot-on.


108-204: Test: “should update data along with audit”
Covers the insert-if-missing or update-if-found pattern. The rollback table gets U operations for updates, confirming the old state is stored.


206-291: Test: “should delete data along with audit”
Ensures the rollback table gets a D operation with the old row. This is essential to later restore that row during invalidation scenarios.


293-414: Test: “should invalidate data”
Checks correct rollback logic for partial progress. This ensures that lines with artificial blocks or updated entries can be undone.


490-607: Test: “should finalize data”
Verifies the rollback table is emptied after finalizing. This test ensures no leftover entries linger once blocks become final.


609-661: Test: “should persist state”
Intentionally throws an error to test whether data up to the failing point are persisted. This is a strong test scenario that verifies resilience.


663-839: Tests for “factory mode”
Covers the composited filter approach, stepping from empty filter to “B” to “BC.” It’s a good real-world scenario for incremental modifications. The final verification of filters and checkpoints is correct.


841-1008: Invalidate in factory mode
Ensures partial reorg logic is triggered while using dynamic filters. This is an essential coverage case for advanced use of multiple filters.


1010-1177: Finalize in factory mode
Ensures that once the final cursor is marked finalized, the rollback data for that indexer is cleared. Good parallel to the earlier finalize test.


1179-1220: Schema version check
Ensures the schemaVersion table was initialized to 0. This covers the initial state and supports incremental migrations in the future.

packages/plugin-mongo/src/mongo.ts (1)

46-46: Use of $lte for finalization condition.

Changing the finalization condition from < to <= includes boundary records when finalizing. If the intention is to ensure that all records up to and including a certain cursor value are considered finalized, this adjustment is correct. Be sure to confirm that no unintended records (e.g., borderline cases) are accidentally removed.
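
For reference, the inclusive form of the condition looks roughly like this (the collection handle and cursor field names are illustrative, not the plugin's actual code):

// remove data finalized up to and including the cursor.
await collection.deleteMany({
  "_cursor.to": { $lte: Number(cursor.orderKey) },
});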

packages/plugin-drizzle/tests/helper.ts (1)

18-35: Potential collision risk in memory-based databases with random UUIDs.

The code uses a randomly generated UUID for the in-memory database name. Though collisions are rare, carefully handle potential name conflicts or provide fallback logic if collisions occur.

Would you like assistance generating a fallback approach if db initialization fails due to a name collision?

packages/plugin-sqlite/src/kv.ts (1)

108-108: Finalization condition inclusive of boundary block.

Switching from < to <= ensures that entries with a to_block value exactly matching the finalizing block are removed. Verify that this meets the intended requirement, as it could alter the historical data boundary when finalizing.
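
Schematically, assuming a better-sqlite3-style handle (table and column names are illustrative):

// before: the boundary block was excluded
db.prepare(`DELETE FROM kvs WHERE to_block < ?`).run(blockNumber);
// after: entries whose to_block equals the finalized block are removed too
db.prepare(`DELETE FROM kvs WHERE to_block <= ?`).run(blockNumber);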

packages/indexer/src/internal/testing.ts (2)

18-21: Flexible finalization options in MockMessagesOptions.

Adding the optional finalize property allows more comprehensive mock scenarios. Confirm that tests fully cover the boundary cases, such as defining only one of the nested properties or setting conflicting values.


41-50: Proper finalize message generation.

Using orderKey: 5_000_000 + finalizeAt.finalizeToIndex is consistent, but ensure that scenarios with overlapping or interleaving invalidate and finalize triggers are tested. Confirm the sequence of messages is well-defined if both are triggered in close succession.

packages/plugin-mongo/src/persistence.ts (4)

41-43: Mandatory indexerName now required—ensure all calls provide it.
By making indexerName required, callers must supply a value. This helps prevent unexpected runtime issues but requires all existing references to be updated accordingly.


89-91: Confirm consistent usage of indexerName in getState.
This change aligns with other functions in this file, ensuring that the same indexer name is used for both checkpoint and filter retrieval.


128-130: Required indexerName parameter in invalidateState.
Previously optional, now mandatory. Verify that all references in your codebase pass the correct indexer name to avoid null or undefined parameters.


152-159: Switch from $lt to $lte in finalization logic.
Using $lte changes finalization to include the boundary block. Verify that this meets the intended behavior, as it will remove records up to and including cursor.orderKey.

packages/plugin-sqlite/src/persistence.ts (5)

16-18: indexerName is now mandatory in persistState.
All invocations must supply an indexer name. This ensures data is stored under a clear, non-ambiguous key.


43-45: Enforcing indexerName in getState.
Consistent with other changes, this enforces a unique identifier for retrieving indexed data.


76-78: Mandatory indexerName for finalizeState.
Previously, a default or optional value was allowed. Now, ensure you provide a valid string for finalization.


89-91: Mandatory indexerName for invalidateState.
This standardizes how the indexer name is handled across all functions. Make sure your code provides a correct indexer identifier.


150-150: Use of <= for finalizing filters.
Changing from < to <= means more inclusive finalization. This might alter the logic if the boundary block was previously excluded.

packages/plugin-sqlite/src/index.ts (1)

71-71: Assigned default 'default' to indexerName.
This ensures a consistent fallback if no indexerName is provided, preventing undefined behavior.

packages/plugin-drizzle/tsconfig.json (1)

8-8: Simplified TypeScript config by removing extraneous properties.
Confirm that removing rootDir and include fits your build structure to avoid compilation issues.

packages/plugin-drizzle/package.json (2)

28-31: LGTM! Good practice moving to peerDependencies.

Moving drizzle-orm and pg to peerDependencies is a good architectural decision as it:

  • Allows consumers to manage their own versions
  • Prevents duplicate installations
  • Makes external requirements explicit

35-37: Verify version compatibility with latest releases.

The devDependencies mirror the peer dependencies, which is good. However, let's verify these versions are still current and don't have known vulnerabilities.

blockNumber: integer("block_number").notNull(),
key: text("key"),
count: integer("count"),
data: text("data"),
Contributor

Can we add a datetime column to test if it's serialized/deserialized correctly?

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (5)
packages/plugin-drizzle/src/utils.ts (1)

40-46: Consider making indentation configurable.

The function correctly handles BigInt serialization, but the hardcoded tab indentation might not suit all use cases.

-export function serialize<T>(obj: T): string {
+export function serialize<T>(obj: T, indent: string | number = "\t"): string {
   return JSON.stringify(
     obj,
     (_, value) => (typeof value === "bigint" ? `${value.toString()}n` : value),
-    "\t",
+    indent,
   );
 }
packages/plugin-drizzle/tests/helper.ts (2)

19-35: Consider making debug mode configurable.

The debug option is currently commented out. Consider making it configurable through an environment variable for easier troubleshooting.

   const db = drizzle({
     schema: {
       testTable,
     },
     connection: {
-      // debug: true,
+      debug: process.env.DEBUG_DB === 'true',
       dataDir: `memory://${dbName}`,
     },
   });

37-56: Consider adding error logging before throwing.

While the error handling is good, adding logging before throwing would help with debugging.

   } catch (error) {
+    console.error('Migration failed:', error);
     throw new DrizzleStorageError("Migration failed", {
       cause: error,
     });
   }
packages/plugin-drizzle/src/persistence.ts (1)

39-51: Well-designed schema versioning system

Good implementation with:

  • Version tracking table
  • Placeholder for future migrations
  • Clear version constants

Consider adding documentation for the migration process and version update guidelines.

packages/plugin-drizzle/tests/storage.test.ts (1)

19-1241: Comprehensive test suite with room for improvement

The test suite effectively covers:

  • Basic CRUD operations
  • State persistence
  • Schema versioning
  • Factory mode operations

Consider adding:

  • Edge cases for concurrent operations
  • Error scenarios for network failures
  • Performance tests for large datasets
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1760958 and 263d8bf.

📒 Files selected for processing (6)
  • packages/plugin-drizzle/src/index.ts (1 hunks)
  • packages/plugin-drizzle/src/persistence.ts (2 hunks)
  • packages/plugin-drizzle/src/storage.ts (1 hunks)
  • packages/plugin-drizzle/src/utils.ts (1 hunks)
  • packages/plugin-drizzle/tests/helper.ts (1 hunks)
  • packages/plugin-drizzle/tests/storage.test.ts (1 hunks)
🔇 Additional comments (15)
packages/plugin-drizzle/src/utils.ts (3)

1-14: LGTM! Well-structured error handling.

The error class implementation follows best practices by:

  • Properly extending the Error class
  • Setting a custom name for better error identification
  • Supporting error cause through options

18-30: LGTM! Clean transaction abstraction.

The transaction wrapper is well-implemented with:

  • Strong type safety through generics
  • Clean async/await usage
  • Proper error propagation

32-38: Consider adding runtime type validation.

While the BigInt handling is good, the type assertion to T could be unsafe if the JSON structure doesn't match the expected type.

Consider using a validation library like zod or io-ts to ensure type safety at runtime. This would prevent potential runtime errors from mismatched types.
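
A sketch of that approach with zod (the schema shape is illustrative; the real persisted state may differ):

import { z } from "zod";

// hypothetical shape of a persisted filter row.
const persistedFilter = z.object({
  fromBlock: z.coerce.bigint(),
  toBlock: z.coerce.bigint().nullish(),
});

export function parseFilter(value: unknown) {
  // throws a descriptive ZodError instead of silently mistyping.
  return persistedFilter.parse(value);
}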

packages/plugin-drizzle/tests/helper.ts (1)

6-13: LGTM! Comprehensive test table schema.

The schema includes all necessary column types including timestamp, addressing the previous review comment about datetime testing.

packages/plugin-drizzle/src/index.ts (5)

33-59: LGTM! Well-structured storage access pattern.

The implementation provides:

  • Strong type safety through generics
  • Clear error messages
  • Clean context-based storage access

74-97: LGTM! Well-documented and flexible configuration.

The implementation:

  • Has clear JSDoc documentation
  • Provides sensible defaults
  • Makes schema optional as requested in previous reviews

120-178: LGTM! Robust state management implementation.

The implementation:

  • Properly handles cursor and filter state
  • Implements clean invalidation logic
  • Uses transactions consistently as discussed in previous reviews

179-259: LGTM! Robust message handling and cleanup.

The implementation:

  • Has comprehensive error handling
  • Properly cleans up triggers in finally block as suggested
  • Maintains transaction consistency

98-118: Verify behavior with empty schema.

While the schema handling is generally good, verify the behavior when no tables are defined.

✅ Verification successful

Empty schema handling is implemented correctly

The code safely handles empty schemas through multiple fallback mechanisms:

  • Uses nullish coalescing to fall back to db._.schema if provided schema is null/undefined
  • Falls back to empty object {} as the final fallback
  • Wraps the operation in try-catch with a specific DrizzleStorageError
  • Empty schema will result in empty tableNames array, which is a valid state

The implementation ensures that no runtime errors occur with empty schemas, making the current implementation robust and safe.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if the code handles empty schema gracefully
ast-grep --pattern 'tableNames = Object.values($schema ?? $fallback ?? {}).map($_)'

Length of output: 85


Script:

#!/bin/bash
# Let's try a different pattern to match the schema extraction
ast-grep --pattern 'tableNames = Object.values($_).map((table) => table.dbName)'

# Also search for any tests or error handling related to empty schema
rg -A 3 "Failed to get table names" 

Length of output: 630


Script:

#!/bin/bash
# Let's check the DrizzleStorageError definition and any related error handling
rg -B 2 -A 2 "class DrizzleStorageError"

# Check for any tests related to schema handling
fd "test|spec" -e ts -e js --exec rg -l "schema.*empty|empty.*schema|Failed to get table names"

Length of output: 491

packages/plugin-drizzle/src/storage.ts (4)

41-51: Verify SQL injection protection in table creation

The table creation SQL is using raw SQL execution. While the table name is hardcoded and safe, consider using parameterized queries or escaping mechanisms for future modifications where table names might be dynamic.


54-78: Well-implemented trigger function with comprehensive audit logging

The trigger function effectively captures:

  • Operation type (INSERT/UPDATE/DELETE)
  • Old and new values
  • Table context
  • Proper error handling

165-257: Well-structured error handling and operation processing

The switch statement effectively handles different operation types with:

  • Proper validation of required fields
  • Specific error messages for each operation
  • Comprehensive error handling

261-279: Clean implementation of finalization logic

The function effectively:

  • Cleans up processed records
  • Uses proper transaction handling
  • Includes appropriate error handling
packages/plugin-drizzle/src/persistence.ts (2)

134-194: Robust state persistence implementation

Excellent implementation with:

  • Strong typing using generics
  • Proper transaction handling
  • Comprehensive error handling
  • Clean parameter structure using props object

196-301: Well-structured state management functions

The implementation demonstrates:

  • Consistent patterns across functions
  • Proper transaction handling
  • Comprehensive error handling
  • Good separation of concerns

Comment on lines +176 to +178
DELETE FROM ${op.table_name}
WHERE ${idColumn} = '${op.row_id}'
`),

⚠️ Potential issue

SQL injection risk in DELETE query

Direct string concatenation of table name and column values poses a SQL injection risk.

Quote identifiers instead of interpolating them directly (PostgreSQL cannot bind identifiers such as table or column names as query parameters, so sql.identifier quoting plus a bound value is the practical fix):

-sql.raw(`
-    DELETE FROM ${op.table_name}
-    WHERE ${idColumn} = '${op.row_id}'
-  `),
+sql`
+    DELETE FROM ${sql.identifier(op.table_name)}
+    WHERE ${sql.identifier(idColumn)} = ${op.row_id}
+  `,

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +100 to +101
sql.raw(`DROP TRIGGER IF EXISTS ${table}_reorg ON ${table};`),
);

⚠️ Potential issue

Potential SQL injection vulnerability in trigger name

The table name is directly concatenated into the SQL string. While the current usage is safe with validated table names, consider proper identifier quoting for future-proofing (identifiers cannot be bound as query parameters in PostgreSQL).

Consider quoting via sql.identifier:

-sql.raw(`DROP TRIGGER IF EXISTS ${table}_reorg ON ${table};`),
+sql`DROP TRIGGER IF EXISTS ${sql.identifier(`${table}_reorg`)} ON ${sql.identifier(table)};`,

Committable suggestion skipped: line range outside the PR's diff.

@jaipaljadeja jaipaljadeja requested a review from fracek December 31, 2024 14:07