
plugin-drizzle: add drizzle plugin and persistence #130


Merged
merged 14 commits into from Jan 1, 2025

Conversation

jaipaljadeja
Member

No description provided.

@jaipaljadeja jaipaljadeja requested a review from fracek December 30, 2024 09:54

coderabbitai bot commented Dec 30, 2024

📝 Walkthrough

This pull request introduces comprehensive updates across multiple packages in the Apibara ecosystem, focusing on enhancing the Drizzle, Mongo, and SQLite storage plugins. The changes primarily involve adding finalization configurations, improving state management, and refactoring persistence mechanisms. Key modifications include adding default indexer names, updating transaction handling, and introducing more robust error management for various storage operations.

Changes

| File | Change Summary |
| --- | --- |
| change/ JSON files | Added prerelease configurations for @apibara/indexer, @apibara/plugin-drizzle, @apibara/plugin-mongo, and @apibara/plugin-sqlite |
| packages/indexer/src/internal/testing.ts | Added finalize option to MockMessagesOptions and updated generateMockMessages function |
| packages/plugin-drizzle/ | Significant updates to storage, persistence, and utility modules with enhanced database transaction and rollback mechanisms |
| packages/plugin-mongo/src/ | Modified finalization logic and made indexerName a required parameter |
| packages/plugin-sqlite/src/ | Updated storage functions to require indexerName and modified finalization SQL statements |

Sequence Diagram

sequenceDiagram
    participant Client
    participant StoragePlugin
    participant Database
    participant TransactionManager

    Client->>StoragePlugin: Initialize storage
    StoragePlugin->>Database: Create schema
    StoragePlugin->>TransactionManager: Begin transaction
    Client->>StoragePlugin: Persist state
    StoragePlugin->>Database: Insert/Update records
    StoragePlugin->>TransactionManager: Commit transaction
    Client->>StoragePlugin: Finalize state
    StoragePlugin->>Database: Clean up old records

Suggested Reviewers

  • fracek

Poem

🐰 Indexing Rabbit's Delight

With Drizzle, Mongo, and SQLite bright,
Transactions dance, cursors take flight,
Finalize, persist with all my might,
Code flows smooth, future shines light!

Hop hop, code on! 🚀


Comment on lines 143 to 152
await withTransaction(db, async (tx) => {
  if (endCursor && request.filter[1]) {
    await persistState({
      tx,
      endCursor,
      filter: request.filter[1],
      indexerName,
    });
  }
});
Contributor

I think connect:factory should be executed in the same transaction as handler:middleware. Access it through the context.

@@ -1,6 +1,6 @@
 import { describe, expect, it } from "vitest";

-describe("drizzleStorage", () => {
+describe("Drizzle test", () => {
Contributor

Here we should add end-to-end tests like in the sqlite plugin using pglite.

@fracek
Contributor

fracek commented Dec 30, 2024

Looks good for now! I wonder if we should also create the persistence tables "manually" like we do for the reorg tables? Call them __indexer_checkpoints and __indexer_filters.

@jaipaljadeja
Member Author

jaipaljadeja commented Dec 30, 2024

Looks good for now! I wonder if we should also create the persistence tables "manually" like we do for the reorg tables? Call them __indexer_checkpoints and __indexer_filters.

I think it's a good idea to create them manually, so the user has one less thing to do during setup. But I have some concerns: if we need to modify these table structures in future releases, we'd need a strategy to handle users who have already migrated using the older table definitions, which could make schema updates more complex to manage.
Same concern with reorg tables.

@fracek
Contributor

fracek commented Dec 30, 2024

We'd need a strategy to handle users who have already migrated using the older table definitions, which could make schema updates more complex to manage. Same concern with reorg tables.

Yes, we will need to handle migrations if the table schema changes. Usually this is done like this:

  • create a schema_version table with just two columns (the current schema version, e.g. 0, and a key used in the upsert statement)
  • create a list of migrations (sql statements) to execute to upgrade each schema version to the next.

For example, once we reach schema version 3:

// constants based on the current release.
const SCHEMA_VERSION = 3;
const migrations = [
  [`/* migrations from v0 to v1 */`],
  [`/* migrations from v1 to v2 */`],
  [`/* migrations from v2 to v3 */`],
];

// In the initialization code.
const currentVersion = `select version from schema_version where k = 0`.run();
let v = currentVersion ?? 0;

if (v > SCHEMA_VERSION) {
  throw new Error("version mismatch");
}

// run all migrations
`begin transaction`.run();
while (v < SCHEMA_VERSION) {
  // a migration can have multiple statements
  for (const statement of migrations[v]) {
    statement.run();
  }
  v += 1;
}
// bump version stored in the schema
`update schema_version set version = ${SCHEMA_VERSION} where k = 0`.run();
`commit transaction`.run();

@fracek
Contributor

fracek commented Dec 30, 2024

I'm happy to make this change in this PR. You just need to create the table __indexer_schema_version with k = 0 and version = 0. Then you can implement most of the code above, but without the need to run migrations for now (so simply check that the stored version is not greater than the current version).
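
A minimal sketch of that check, assuming a Drizzle `tx.execute` handle and the table/column names suggested above (the exact statements are illustrative, not the final plugin code):

import { sql } from "drizzle-orm";

const CURRENT_SCHEMA_VERSION = 0;

// create the version table and seed it with k = 0, version = 0.
await tx.execute(sql`
  CREATE TABLE IF NOT EXISTS __indexer_schema_version (
    k INTEGER NOT NULL PRIMARY KEY,
    version INTEGER NOT NULL
  );
`);
await tx.execute(sql`
  INSERT INTO __indexer_schema_version (k, version)
  VALUES (0, 0)
  ON CONFLICT (k) DO NOTHING;
`);

// no migrations yet: only reject databases created by a newer release.
const { rows } = (await tx.execute(
  sql`SELECT version FROM __indexer_schema_version WHERE k = 0;`,
)) as { rows: { version: number }[] };

if ((rows[0]?.version ?? 0) > CURRENT_SCHEMA_VERSION) {
  throw new DrizzleStorageError("schema version mismatch");
}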

db,
persistState: enablePersistence = true,
indexerName = "default",
schema,
Contributor

I like that it's possible to override the schema!

TSchema extends
TablesRelationalConfig = ExtractTablesWithRelations<TFullSchema>,
>(
_db: PgDatabase<TQueryResult, TFullSchema, TSchema>,
Contributor

Can we make this parameter optional? The schema is only needed to access the db like the following:

db.query.users.findMany({ ... });

as opposed to

db.select().from(users).where(...)

Member Author

Alright will make it optional.

@jaipaljadeja jaipaljadeja marked this pull request as ready for review December 31, 2024 12:37
Comment on lines 203 to 238
const { endCursor, finality } = context as {
  endCursor: Cursor;
  finality: DataFinality;
};

if (!endCursor) {
  throw new DrizzleStorageError("end cursor is undefined");
}

await withTransaction(db, async (tx) => {
  context[DRIZZLE_PROPERTY] = { db: tx } as DrizzleStorage<
    TQueryResult,
    TFullSchema,
    TSchema
  >;

  if (finality !== "finalized") {
    await registerTriggers(tx, tableNames, endCursor, idColumn);
  }

  await next();
  delete context[DRIZZLE_PROPERTY];

  if (enablePersistence) {
    await persistState({
      tx,
      endCursor,
      indexerName,
    });
  }
});

if (finality !== "finalized") {
  // remove trigger outside of the transaction or it won't be triggered.
  await removeTriggers(db, tableNames);
}
Contributor

@fracek fracek Dec 31, 2024

Probably want to wrap this in a try { ... } finally { ... } to remove the triggers in all cases.
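
For example, a sketch of that shape, reusing the names from the snippet above:

try {
  await withTransaction(db, async (tx) => {
    // ... handler body from the snippet above ...
  });
} finally {
  if (finality !== "finalized") {
    // runs even when the handler or persistState throws.
    await removeTriggers(db, tableNames);
  }
}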

);
}
} catch (error) {
console.error(error);
Contributor

We should not log the error like this. We can add a drizzle: Error property to DrizzleStorageError and change the constructor to be like new DrizzleStorageError("message...", { error })
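
One possible shape, consistent with the { cause: error } pattern that appears later in this PR (the details are illustrative):

class DrizzleStorageError extends Error {
  constructor(message: string, options?: ErrorOptions) {
    super(message, options);
    this.name = "DrizzleStorageError";
  }
}

// instead of console.error(error):
throw new DrizzleStorageError("failed to invalidate", { cause: error });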

)) as { rows: ReorgRollbackRow[] };

if (!Array.isArray(result)) {
throw new Error("Invalid result format from reorg_rollback query");
Contributor

Should be DrizzleStorageError

case "I":
try {
// For inserts, delete the row
if (op.row_id) {
Contributor

Throw if no op.row_id

case "D":
try {
// For deletes, reinsert the row using json_populate_record
if (op.row_value) {
Contributor

Throw if no row_value

case "U":
try {
// For updates, restore previous values
if (op.row_value && op.row_id) {
Contributor

Throw if either row_value or row_id is missing

`);

await tx.execute(query);
}
Contributor

Add a default: case to throw since the op is an unknown value.
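
For example (the error message is illustrative):

default: {
  throw new DrizzleStorageError(`unknown reorg operation: ${op.op}`);
}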

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (13)
packages/plugin-drizzle/src/utils.ts (1)

32-38: Potential malformed JSON catch
deserialize parses JSON and attempts to convert bigint-like strings. Consider wrapping in a try/catch if you expect malformed or user-provided JSON. This would help return more descriptive errors rather than letting a SyntaxError bubble up.
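
A possible guard, assuming the "n"-suffix convention used by serialize (the helper shown here is a sketch, not the module's exact code):

export function deserialize<T>(str: string): T {
  try {
    return JSON.parse(str, (_, value) =>
      typeof value === "string" && /^-?\d+n$/.test(value)
        ? BigInt(value.slice(0, -1))
        : value,
    ) as T;
  } catch (error) {
    throw new DrizzleStorageError("failed to deserialize state", {
      cause: error,
    });
  }
}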

packages/plugin-drizzle/src/index.ts (3)

61-72: DrizzleStorageOptions
The configuration options are clear and flexible. Consider clarifying in docs that specifying schema is optional if db._.schema is sufficient.


134-149: Data invalidation logic
On connect:after, invalidate is triggered if the new run must remove partially processed blocks. This is essential for reorg handling.

Consider logging or metrics to confirm how often reorgs or partial runs occur, which can help debug environment issues.


169-183: Finalize logic
message:finalize correctly finalizes the block and updates the persistent state. Consider whether any extra logging is needed to detect stuck finalization attempts. Otherwise, it seems correct.

packages/plugin-drizzle/src/storage.ts (2)

32-85: Initialization logic
Creating the __reorg_rollback table and the trigger function reorg_checkpoint is well structured, allowing consistent auditing of INSERT, UPDATE, and DELETE. Consider also addressing potential concurrency constraints if multiple processes or indexers do writes simultaneously.


136-241: invalidate function
The rollback logic is thorough, narrowing the rows after a cursor point. It systematically undoes INSERT, DELETE, and UPDATE. This ensures a correct reorg scenario. Keep an eye on performance if large rollbacks are frequent.

packages/plugin-drizzle/src/persistence.ts (1)

44-51: CURRENT_SCHEMA_VERSION and MIGRATIONS
You have a placeholder for incremental schema changes. Smart forward-thinking for versioned migrations.

packages/plugin-drizzle/tests/helper.ts (3)

1-5: Check for TypeScript path alias usage or alternative import strategies.

Imports for Drizzle and related dependencies look consistent. However, if your project often references similar modules across packages, consider implementing path aliases in your tsconfig for improved maintainability.


6-16: Schema definition validation and naming clarity.

Defining the test table with common fields like blockNumber, key, count, and data is straightforward. Ensure that naming is intuitive and consistent across the codebase (e.g., consistently naming columns block_number or blockNumber) and that all fields will be used in tests.


37-54: Migration error handling and forward compatibility.

This migration approach is minimal but solid for initial setups. If future schema changes are expected, consider adding a versioning strategy or dedicated migrations directory. Throwing a DrizzleStorageError is good, but it may be beneficial to provide a more user-friendly fallback or automated migration steps if the schema expands over time.

packages/indexer/src/internal/testing.ts (1)

29-29: Readable variable naming for finalizeAt.

The variable name finalizeAt conveys its meaning. Consider grouping invalidation and finalization logic in a single variable or object if they often appear together, ensuring clarity in test setups.

packages/plugin-mongo/src/index.ts (2)

57-57: LGTM! Good default value addition.

Adding a default value for indexerName improves robustness and aligns with the persistence strategy discussed in the PR comments.

Consider documenting this default value in:

  1. README to help users understand the default behavior
  2. Migration guide for users who might be relying on undefined behavior

Line range hint 1-190: Consider enhancing the persistence strategy.

Based on the PR discussion about schema migrations, consider implementing:

  1. A schema version tracking mechanism
  2. Migration utilities for future schema changes

Following fracek's suggestion, consider:

  1. Creating a __indexer_schema_version table
  2. Implementing version check logic
  3. Preparing for future migration capabilities

This would make the plugin more maintainable and upgrade-friendly.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 03729dd and 1760958.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (21)
  • change/@apibara-indexer-4f8762fe-e45f-4fa5-8b17-453abb123f8c.json (1 hunks)
  • change/@apibara-plugin-drizzle-916f8f1b-3fc8-423e-9268-52790d29903c.json (1 hunks)
  • change/@apibara-plugin-mongo-a083deaa-e6d0-428a-b3b6-633d2a9a44ed.json (1 hunks)
  • change/@apibara-plugin-sqlite-4c3f1380-bbe7-4007-904d-b75edf438fd7.json (1 hunks)
  • packages/indexer/src/internal/testing.ts (3 hunks)
  • packages/plugin-drizzle/docker-compose.yml (1 hunks)
  • packages/plugin-drizzle/package.json (1 hunks)
  • packages/plugin-drizzle/src/index.ts (1 hunks)
  • packages/plugin-drizzle/src/persistence.test.ts (0 hunks)
  • packages/plugin-drizzle/src/persistence.ts (2 hunks)
  • packages/plugin-drizzle/src/storage.ts (1 hunks)
  • packages/plugin-drizzle/src/utils.ts (1 hunks)
  • packages/plugin-drizzle/tests/helper.ts (1 hunks)
  • packages/plugin-drizzle/tests/storage.test.ts (1 hunks)
  • packages/plugin-drizzle/tsconfig.json (1 hunks)
  • packages/plugin-mongo/src/index.ts (1 hunks)
  • packages/plugin-mongo/src/mongo.ts (1 hunks)
  • packages/plugin-mongo/src/persistence.ts (4 hunks)
  • packages/plugin-sqlite/src/index.ts (1 hunks)
  • packages/plugin-sqlite/src/kv.ts (1 hunks)
  • packages/plugin-sqlite/src/persistence.ts (5 hunks)
💤 Files with no reviewable changes (1)
  • packages/plugin-drizzle/src/persistence.test.ts
✅ Files skipped from review due to trivial changes (5)
  • packages/plugin-drizzle/docker-compose.yml
  • change/@apibara-plugin-sqlite-4c3f1380-bbe7-4007-904d-b75edf438fd7.json
  • change/@apibara-plugin-mongo-a083deaa-e6d0-428a-b3b6-633d2a9a44ed.json
  • change/@apibara-indexer-4f8762fe-e45f-4fa5-8b17-453abb123f8c.json
  • change/@apibara-plugin-drizzle-916f8f1b-3fc8-423e-9268-52790d29903c.json
🔇 Additional comments (54)
packages/plugin-drizzle/src/utils.ts (3)

1-9: Imports look good.
These imports from drizzle-orm and drizzle-orm/pg-core provide strong database typing, ensuring better type safety. No issues found.


18-30: Ensure consistent transaction error handling
The withTransaction function properly wraps the callback in a transaction. Consider whether you need additional error logging/handling inside the callback to avoid partial commits. Otherwise, this looks solid.


40-46: Serialization approach is clear
The serialize function properly handles bigint values by adding an "n" suffix. This is a straightforward approach for JSON representation of large integers.

packages/plugin-drizzle/src/index.ts (8)

1-29: Imports and constants
These import statements and constants (DRIZZLE_PROPERTY) are straightforward. They provide clarity in distinguishing the drizzle-specific context.


33-41: Transaction-bound storage interface
The DrizzleStorage type associating PgTransaction with typed schemas is well-structured. It makes the plugin usage more predictable for consumers.


42-59: Dependency check in useDrizzleStorage
Throwing a DrizzleStorageError when the plugin isn’t registered is a good safeguard. The function’s usage pattern looks correct.


84-100: Initial plugin setup
The logic in drizzleStorage hooking into defineIndexerPlugin is well structured. On run:before, the plugin uses transactions to set up tables if needed. This approach ensures a clean environment on every run.


101-132: State retrieval on connect:before
Properly fetching the stored cursor and filter with getState is a good design. It automatically resumes from the last known point.


151-167: Factory-mode hook usage
Using the connect:factory hook to persist state is a convenient approach. It ensures the transaction from the context is used. This code looks correct.


185-199: Reorg handling
message:invalidate triggers the reorg logic. Ensure large-scale reorg scenarios remain performant. Currently, it’s adequate, but you may want to measure if reorgs can trigger excessively large rollbacks.


201-240: Transaction-bound middleware
Wrapping the handler with a transaction inside handler:middleware is a neat approach to ensure that state changes are atomic. The final code snippet removing triggers after the transaction is also a good safety measure.

packages/plugin-drizzle/src/storage.ts (6)

1-11: Import statements
Everything is well organized for managing PostgreSQL concurrency with Drizzle. No issues found.


19-20: ReorgOperation type
Defining these operations as "I" | "U" | "D" clarifies the possible states for auditing reorg changes. This is helpful for strict type-checking.


21-30: Rollback table definition
Storing op, table_name, and cursor is a robust approach to track historical rows. Indicating row_value as JSONB is flexible for capturing old data.


87-116: Registering triggers
Registering triggers for each table with reorg_checkpoint ensures all changes are tracked. The usage of deferrable constraints is a nice detail so that constraints only apply at transaction commit.


118-134: Removing triggers
Dropping triggers is straightforward here. Consider verifying that the triggers are not used by another indexer instance in parallel. Otherwise, your approach is consistent.


242-259: finalize function
Deleting rollback entries at or below the finalized cursor is a neat way to cap the rollback history. This ensures the system doesn’t store unbounded historical data.

packages/plugin-drizzle/src/persistence.ts (9)

2-3: Query builder usage
The usage of and, eq, gt, lt, etc., from drizzle-orm is consistent. Code reads clearly.


11-13: Defining table names
Using __indexer_checkpoints, __indexer_filters, and __indexer_schema_version as constants clarifies that these tables are reserved for indexing logic. Good naming.


Line range hint 15-36: checkpoints and filters tables
Defining separate tables for checkpoint metadata and filters helps keep the data well structured. This organization will ease any future expansions.


39-42: Tracking schema version
schemaVersion table provides a future-proof path for migrations. This aligns with the user’s conversation about managing schema updates seamlessly.


53-131: initializePersistentState
This function orchestrates schema creation and version checks. Good job verifying the stored version is not ahead of the code’s known version. The fallback for running or skipping migrations is also cleanly implemented.


134-193: persistState
Handles checkpoint insertion/upsertion and filter updates in a single transaction. Ensure that large volumes of filters on the same indexer keep the logic consistent.


195-234: getState
Returns the last cursor and active filter. This is exactly what’s needed to restore or continue indexing. Implementation is straightforward.


236-271: invalidateState
Deletes or adjusts filters whose fromBlock or toBlock is greater than the rollback point. This is key for correct partial reorg. The logic appears correct.


273-296: finalizeState
Cleaning out filter rows below a certain finalized block helps ensure data cleanliness. Good approach.

packages/plugin-drizzle/tests/storage.test.ts (10)

19-106: Test: “should store data along with audit”
Verifies the insertion and corresponding rows in __reorg_rollback. This covers the simplest scenario thoroughly. The snapshot usage is spot-on.


108-204: Test: “should update data along with audit”
Covers the insert-if-missing or update-if-found pattern. The rollback table gets U operations for updates, confirming the old state is stored.


206-291: Test: “should delete data along with audit”
Ensures the rollback table gets a D operation with the old row. This is essential to later restore that row during invalidation scenarios.


293-414: Test: “should invalidate data”
Checks correct rollback logic for partial progress. This ensures that lines with artificial blocks or updated entries can be undone.


490-607: Test: “should finalize data”
Verifies the rollback table is emptied after finalizing. This test ensures no leftover entries linger once blocks become final.


609-661: Test: “should persist state”
Intentionally throws an error to test whether data up to the failing point are persisted. This is a strong test scenario that verifies resilience.


663-839: Tests for “factory mode”
Covers the composited filter approach, stepping from empty filter to “B” to “BC.” It’s a good real-world scenario for incremental modifications. The final verification of filters and checkpoints is correct.


841-1008: Invalidate in factory mode
Ensures partial reorg logic is triggered while using dynamic filters. This is an essential coverage case for advanced use of multiple filters.


1010-1177: Finalize in factory mode
Ensures that once the final cursor is marked finalized, the rollback data for that indexer is cleared. Good parallel to the earlier finalize test.


1179-1220: Schema version check
Ensures the schemaVersion table was initialized to 0. This covers the initial state and supports incremental migrations in the future.

packages/plugin-mongo/src/mongo.ts (1)

46-46: Use of $lte for finalization condition.

Changing the finalization condition from < to <= includes boundary records when finalizing. If the intention is to ensure that all records up to and including a certain cursor value are considered finalized, this adjustment is correct. Be sure to confirm that no unintended records (e.g., borderline cases) are accidentally removed.
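
For reference, the inclusive form of the condition looks roughly like this (the collection handle and cursor field names are illustrative, not the plugin's actual code):

// remove data finalized up to and including the cursor.
await collection.deleteMany({
  "_cursor.to": { $lte: Number(cursor.orderKey) },
});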

packages/plugin-drizzle/tests/helper.ts (1)

18-35: Potential collision risk in memory-based databases with random UUIDs.

The code uses a randomly generated UUID for the in-memory database name. Though collisions are rare, carefully handle potential name conflicts or provide fallback logic if collisions occur.

Would you like assistance generating a fallback approach if db initialization fails due to a name collision?

packages/plugin-sqlite/src/kv.ts (1)

108-108: Finalization condition inclusive of boundary block.

Switching from < to <= ensures that entries with a to_block value exactly matching the finalizing block are removed. Verify that this meets the intended requirement, as it could alter the historical data boundary when finalizing.
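
Schematically, assuming a better-sqlite3-style handle (table and column names are illustrative):

// before: the boundary block was excluded
db.prepare(`DELETE FROM kvs WHERE to_block < ?`).run(blockNumber);
// after: entries whose to_block equals the finalized block are removed too
db.prepare(`DELETE FROM kvs WHERE to_block <= ?`).run(blockNumber);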

packages/indexer/src/internal/testing.ts (2)

18-21: Flexible finalization options in MockMessagesOptions.

Adding the optional finalize property allows more comprehensive mock scenarios. Confirm that tests fully cover the boundary cases, such as defining only one of the nested properties or setting conflicting values.


41-50: Proper finalize message generation.

Using orderKey: 5_000_000 + finalizeAt.finalizeToIndex is consistent, but ensure that scenarios with overlapping or interleaving invalidate and finalize triggers are tested. Confirm the sequence of messages is well-defined if both are triggered in close succession.

packages/plugin-mongo/src/persistence.ts (4)

41-43: Mandatory indexerName now required—ensure all calls provide it.
By making indexerName required, callers must supply a value. This helps prevent unexpected runtime issues but requires all existing references to be updated accordingly.


89-91: Confirm consistent usage of indexerName in getState.
This change aligns with other functions in this file, ensuring that the same indexer name is used for both checkpoint and filter retrieval.


128-130: Required indexerName parameter in invalidateState.
Previously optional, now mandatory. Verify that all references in your codebase pass the correct indexer name to avoid null or undefined parameters.


152-159: Switch from $lt to $lte in finalization logic.
Using $lte changes finalization to include the boundary block. Verify that this meets the intended behavior, as it will remove records up to and including cursor.orderKey.

packages/plugin-sqlite/src/persistence.ts (5)

16-18: indexerName is now mandatory in persistState.
All invocations must supply an indexer name. This ensures data is stored under a clear, non-ambiguous key.


43-45: Enforcing indexerName in getState.
Consistent with other changes, this enforces a unique identifier for retrieving indexed data.


76-78: Mandatory indexerName for finalizeState.
Previously, a default or optional value was allowed. Now, ensure you provide a valid string for finalization.


89-91: Mandatory indexerName for invalidateState.
This standardizes how the indexer name is handled across all functions. Make sure your code provides a correct indexer identifier.


150-150: Use of <= for finalizing filters.
Changing from < to <= means more inclusive finalization. This might alter the logic if the boundary block was previously excluded.

packages/plugin-sqlite/src/index.ts (1)

71-71: Assigned default 'default' to indexerName.
This ensures a consistent fallback if no indexerName is provided, preventing undefined behavior.

packages/plugin-drizzle/tsconfig.json (1)

8-8: Simplified TypeScript config by removing extraneous properties.
Confirm that removing rootDir and include fits your build structure to avoid compilation issues.

packages/plugin-drizzle/package.json (2)

28-31: LGTM! Good practice moving to peerDependencies.

Moving drizzle-orm and pg to peerDependencies is a good architectural decision as it:

  • Allows consumers to manage their own versions
  • Prevents duplicate installations
  • Makes external requirements explicit

35-37: Verify version compatibility with latest releases.

The devDependencies mirror the peer dependencies, which is good. However, let's verify these versions are still current and don't have known vulnerabilities.

blockNumber: integer("block_number").notNull(),
key: text("key"),
count: integer("count"),
data: text("data"),
Contributor

Can we add a datetime column to test if it's serialized/deserialized correctly?

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (5)
packages/plugin-drizzle/src/utils.ts (1)

40-46: Consider making indentation configurable.

The function correctly handles BigInt serialization, but the hardcoded tab indentation might not suit all use cases.

-export function serialize<T>(obj: T): string {
+export function serialize<T>(obj: T, indent: string | number = "\t"): string {
   return JSON.stringify(
     obj,
     (_, value) => (typeof value === "bigint" ? `${value.toString()}n` : value),
-    "\t",
+    indent,
   );
 }
packages/plugin-drizzle/tests/helper.ts (2)

19-35: Consider making debug mode configurable.

The debug option is currently commented out. Consider making it configurable through an environment variable for easier troubleshooting.

   const db = drizzle({
     schema: {
       testTable,
     },
     connection: {
-      // debug: true,
+      debug: process.env.DEBUG_DB === 'true',
       dataDir: `memory://${dbName}`,
     },
   });

37-56: Consider adding error logging before throwing.

While the error handling is good, adding logging before throwing would help with debugging.

   } catch (error) {
+    console.error('Migration failed:', error);
     throw new DrizzleStorageError("Migration failed", {
       cause: error,
     });
   }
packages/plugin-drizzle/src/persistence.ts (1)

39-51: Well-designed schema versioning system

Good implementation with:

  • Version tracking table
  • Placeholder for future migrations
  • Clear version constants

Consider adding documentation for the migration process and version update guidelines.

packages/plugin-drizzle/tests/storage.test.ts (1)

19-1241: Comprehensive test suite with room for improvement

The test suite effectively covers:

  • Basic CRUD operations
  • State persistence
  • Schema versioning
  • Factory mode operations

Consider adding:

  • Edge cases for concurrent operations
  • Error scenarios for network failures
  • Performance tests for large datasets
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1760958 and 263d8bf.

📒 Files selected for processing (6)
  • packages/plugin-drizzle/src/index.ts (1 hunks)
  • packages/plugin-drizzle/src/persistence.ts (2 hunks)
  • packages/plugin-drizzle/src/storage.ts (1 hunks)
  • packages/plugin-drizzle/src/utils.ts (1 hunks)
  • packages/plugin-drizzle/tests/helper.ts (1 hunks)
  • packages/plugin-drizzle/tests/storage.test.ts (1 hunks)
🔇 Additional comments (15)
packages/plugin-drizzle/src/utils.ts (3)

1-14: LGTM! Well-structured error handling.

The error class implementation follows best practices by:

  • Properly extending the Error class
  • Setting a custom name for better error identification
  • Supporting error cause through options

18-30: LGTM! Clean transaction abstraction.

The transaction wrapper is well-implemented with:

  • Strong type safety through generics
  • Clean async/await usage
  • Proper error propagation

32-38: Consider adding runtime type validation.

While the BigInt handling is good, the type assertion to T could be unsafe if the JSON structure doesn't match the expected type.

Consider using a validation library like zod or io-ts to ensure type safety at runtime. This would prevent potential runtime errors from mismatched types.
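
A sketch of that approach with zod (the schema shape is illustrative; the real persisted state may differ):

import { z } from "zod";

// hypothetical shape of a persisted filter row.
const persistedFilter = z.object({
  fromBlock: z.coerce.bigint(),
  toBlock: z.coerce.bigint().nullish(),
});

export function parseFilter(value: unknown) {
  // throws a descriptive ZodError instead of silently mistyping.
  return persistedFilter.parse(value);
}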

packages/plugin-drizzle/tests/helper.ts (1)

6-13: LGTM! Comprehensive test table schema.

The schema includes all necessary column types including timestamp, addressing the previous review comment about datetime testing.

packages/plugin-drizzle/src/index.ts (5)

33-59: LGTM! Well-structured storage access pattern.

The implementation provides:

  • Strong type safety through generics
  • Clear error messages
  • Clean context-based storage access

74-97: LGTM! Well-documented and flexible configuration.

The implementation:

  • Has clear JSDoc documentation
  • Provides sensible defaults
  • Makes schema optional as requested in previous reviews

120-178: LGTM! Robust state management implementation.

The implementation:

  • Properly handles cursor and filter state
  • Implements clean invalidation logic
  • Uses transactions consistently as discussed in previous reviews

179-259: LGTM! Robust message handling and cleanup.

The implementation:

  • Has comprehensive error handling
  • Properly cleans up triggers in finally block as suggested
  • Maintains transaction consistency

98-118: Verify behavior with empty schema.

While the schema handling is generally good, verify the behavior when no tables are defined.

✅ Verification successful

Empty schema handling is implemented correctly

The code safely handles empty schemas through multiple fallback mechanisms:

  • Uses nullish coalescing to fall back to db._.schema if provided schema is null/undefined
  • Falls back to empty object {} as the final fallback
  • Wraps the operation in try-catch with a specific DrizzleStorageError
  • Empty schema will result in empty tableNames array, which is a valid state

The implementation ensures that no runtime errors occur with empty schemas, making the current implementation robust and safe.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if the code handles empty schema gracefully
ast-grep --pattern 'tableNames = Object.values($schema ?? $fallback ?? {}).map($_)'

Length of output: 85


Script:

#!/bin/bash
# Let's try a different pattern to match the schema extraction
ast-grep --pattern 'tableNames = Object.values($_).map((table) => table.dbName)'

# Also search for any tests or error handling related to empty schema
rg -A 3 "Failed to get table names" 

Length of output: 630


Script:

#!/bin/bash
# Let's check the DrizzleStorageError definition and any related error handling
rg -B 2 -A 2 "class DrizzleStorageError"

# Check for any tests related to schema handling
fd "test|spec" -e ts -e js --exec rg -l "schema.*empty|empty.*schema|Failed to get table names"

Length of output: 491

packages/plugin-drizzle/src/storage.ts (4)

41-51: Verify SQL injection protection in table creation

The table creation SQL is using raw SQL execution. While the table name is hardcoded and safe, consider using parameterized queries or escaping mechanisms for future modifications where table names might be dynamic.


54-78: Well-implemented trigger function with comprehensive audit logging

The trigger function effectively captures:

  • Operation type (INSERT/UPDATE/DELETE)
  • Old and new values
  • Table context
  • Proper error handling

165-257: Well-structured error handling and operation processing

The switch statement effectively handles different operation types with:

  • Proper validation of required fields
  • Specific error messages for each operation
  • Comprehensive error handling

261-279: Clean implementation of finalization logic

The function effectively:

  • Cleans up processed records
  • Uses proper transaction handling
  • Includes appropriate error handling
packages/plugin-drizzle/src/persistence.ts (2)

134-194: Robust state persistence implementation

Excellent implementation with:

  • Strong typing using generics
  • Proper transaction handling
  • Comprehensive error handling
  • Clean parameter structure using props object

196-301: Well-structured state management functions

The implementation demonstrates:

  • Consistent patterns across functions
  • Proper transaction handling
  • Comprehensive error handling
  • Good separation of concerns

Comment on lines +176 to +178
DELETE FROM ${op.table_name}
WHERE ${idColumn} = '${op.row_id}'
`),

⚠️ Potential issue

SQL injection risk in DELETE query

Direct string concatenation of table name and column values poses a SQL injection risk.

Quote identifiers instead of interpolating them directly (PostgreSQL cannot bind identifiers such as table or column names as query parameters, so sql.identifier quoting plus a bound value is the practical fix):

-sql.raw(`
-    DELETE FROM ${op.table_name}
-    WHERE ${idColumn} = '${op.row_id}'
-  `),
+sql`
+    DELETE FROM ${sql.identifier(op.table_name)}
+    WHERE ${sql.identifier(idColumn)} = ${op.row_id}
+  `,

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +100 to +101
sql.raw(`DROP TRIGGER IF EXISTS ${table}_reorg ON ${table};`),
);

⚠️ Potential issue

Potential SQL injection vulnerability in trigger name

The table name is directly concatenated into the SQL string. While the current usage is safe with validated table names, consider proper identifier quoting for future-proofing (identifiers cannot be bound as query parameters in PostgreSQL).

Consider quoting via sql.identifier:

-sql.raw(`DROP TRIGGER IF EXISTS ${table}_reorg ON ${table};`),
+sql`DROP TRIGGER IF EXISTS ${sql.identifier(`${table}_reorg`)} ON ${sql.identifier(table)};`,

Committable suggestion skipped: line range outside the PR's diff.

@jaipaljadeja jaipaljadeja requested a review from fracek December 31, 2024 14:07