Note: Reviews paused. This branch appears to be under active development, so to avoid overwhelming you with review comments from an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings.
📝 Walkthrough

Adds a Bulldozer feature suite: a new storage engine table and migration, extensive Bulldozer table operators and SQL helpers, a standalone "Bulldozer Studio" dev HTTP server + UI, fuzz/perf tests, and supporting scripts/packaging and minor editor/docs tweaks.
Sequence Diagram

    sequenceDiagram
        participant Browser
        participant StudioServer as "Bulldozer Studio\n(HTTP Server)"
        participant BulldozerLib as "Bulldozer Module\n(SQL helpers & Table ops)"
        participant Postgres
        Browser->>StudioServer: GET / (SPA) / API requests
        StudioServer->>BulldozerLib: toExecutableSqlTransaction / handler (init/list/set/delete)
        BulldozerLib->>Postgres: Execute SQL (storage engine reads/writes, temp helpers)
        Postgres-->>BulldozerLib: Rows / success / errors
        BulldozerLib-->>StudioServer: Results / errors
        StudioServer-->>Browser: JSON response / HTML
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
🚥 Pre-merge checks: ❌ 1 failed (1 warning) | ✅ 2 passed
nams1570 left a comment:
Partial review: some bugs in the ConcatTable implementation.
    (gen_random_uuid(), ${getStorageEnginePath(options.tableId, [])}::jsonb[], 'null'::jsonb),
    (gen_random_uuid(), ${getStorageEnginePath(options.tableId, ["metadata"])}::jsonb[], '{ "version": 1 }'::jsonb)
    `],
    delete: () => [sqlStatement`
Potential bugs: I see the recursive CTE here deletes the bulldozer engine objects, but the triggers aren't deleted, right?

What if, hypothetically, I create a ConcatTable c_table_a, and two downstream tables p and q have it as an input table? That means p and q would call c_table_a's registerRowChangeTriggers, right?

Now, let's delete c_table_a. Would anything break on p and q?

What if you then create a new ConcatTable c_table_a (reusing the same name)? Would p and q now listen for changes to the new c_table_a?

From my limited testing, I see that the above case does happen.

Also, say c_table_a had sourceA as an input table, so sourceA -> c_table_a -> p. Naturally, changes to sourceA would affect p via triggers. If you delete c_table_a and then create another c_table_a that DOESN'T take sourceA as an input, changes to sourceA would still affect the v2 of c_table_a AND p.
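The stale-trigger scenario above can be reproduced in miniature with a plain in-memory pub/sub model. This is a sketch only, with hypothetical names; it is not the repo's Postgres trigger machinery, just an illustration of why registrations that outlive a delete leak into a recreated table:

```typescript
// Miniature model of the stale-trigger scenario; all names are hypothetical
// and this is an in-memory sketch, not the repo's Postgres trigger machinery.
type Callback = (sourceTable: string) => void;

class TriggerRegistry {
  private listeners = new Map<string, Callback[]>();

  // Analogous to registerRowChangeTriggers: downstream tables subscribe by name.
  register(sourceTable: string, callback: Callback): void {
    const list = this.listeners.get(sourceTable) ?? [];
    list.push(callback);
    this.listeners.set(sourceTable, list);
  }

  // A delete that removes the table's storage but NOT its trigger registrations.
  deleteTableStorageOnly(_sourceTable: string): void {
    // intentionally leaves this.listeners untouched
  }

  emitChange(sourceTable: string): void {
    for (const cb of this.listeners.get(sourceTable) ?? []) cb(sourceTable);
  }
}

const registry = new TriggerRegistry();
const received: string[] = [];

// p and q subscribe to c_table_a
registry.register("c_table_a", () => { received.push("p"); });
registry.register("c_table_a", () => { received.push("q"); });

// delete c_table_a without cleaning up triggers, then recreate it
registry.deleteTableStorageOnly("c_table_a");
registry.register("c_table_a", () => { received.push("p_v2"); });

// a change to the recreated table still reaches the OLD p and q callbacks
registry.emitChange("c_table_a");
console.log(received.join(",")); // p,q,p_v2
```

Because the delete never touches the listener map, the recreated table inherits subscriptions from its deleted namesake, which is exactly the cross-version leakage described for sourceA.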
The same bug applies across all tables.
    const referenceCompareSortKeysSql = firstTable.compareSortKeys(sqlExpression`$1`, sqlExpression`$2`).sql;
    for (const table of tables) {
      const compareGroupKeysSql = table.compareGroupKeys(sqlExpression`$1`, sqlExpression`$2`).sql;
      const compareSortKeysSql = table.compareSortKeys(sqlExpression`$1`, sqlExpression`$2`).sql;
      if (compareGroupKeysSql !== referenceCompareGroupKeysSql || compareSortKeysSql !== referenceCompareSortKeysSql) {
        throw new StackAssertionError("declareConcatTable requires comparator-compatible input tables", {
          tableId: options.tableId,
          tableDebugId: tableIdToDebugString(table.tableId),
        });
      }
    }
Discussion: isn't the compareSortKeys check here a little too strict? As I see it, later in the file the rows' sortKeys are set to null anyway, and differing sort order upstream isn't going to affect a union's output, right?
    RD extends RowData,
    >(options: {
      tableId: TableId,
      tables: Table<GK, any, RD>[],
Potential bug: if the same table is passed twice here, wouldn't there be duplicated rows?
Also, the same change event would be subscribed to twice, right? You'd have two callbacks run for the same change to the same table.
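If duplicate inputs are indeed a problem, one defensive option is to reject them up front, before any rows are read or triggers are wired. A minimal sketch (the helper name is hypothetical, not from the codebase):

```typescript
// Hypothetical guard a declareConcatTable-style function could run on its
// input list before doing any work. Sketch only; real code would compare
// structured TableIds rather than plain strings.
function assertNoDuplicateInputs(tableIds: string[]): void {
  const seen = new Set<string>();
  for (const id of tableIds) {
    if (seen.has(id)) {
      throw new Error(`duplicate input table: ${id}`);
    }
    seen.add(id);
  }
}

// Distinct inputs pass; a repeated input is rejected before any triggers exist.
assertNoDuplicateInputs(["source_a", "source_b"]);

let rejected = false;
try {
  assertNoDuplicateInputs(["source_a", "source_a"]);
} catch {
  rejected = true;
}
console.log(rejected); // true
```

Failing fast here avoids both symptoms at once: no duplicated rows in the concatenated output and no double-subscribed change callbacks.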
    "sourceRows"."groupkey" AS "groupKey",
    ${createConcatenatedRowIdentifierSql(tableIndex, `"sourceRows"."rowidentifier"`)} AS "rowIdentifier",
    'null'::jsonb AS "rowSortKey",
    "sourceRows"."rowdata" AS "rowData"
    FROM (${table.listRowsInGroup({
      start: "start",
      end: "end",
      startInclusive: true,
      endInclusive: true,
    }).sql}) AS "sourceRows"
    WHERE ${getInputInitializedSql(table)}
Potential bug: here, we read sourceRows from each input table's listRowsInGroup. But the listRowsInGroup contract defined in index.ts doesn't require it to return the groupKey. The type looks like this:

    listRowsInGroup(options: {
      groupKey?: SqlExpression<GK>,
      start: SqlExpression<SK> | "start",
      end: SqlExpression<SK> | "end",
      startInclusive: boolean,
      endInclusive: boolean,
    }): SqlQuery<Iterable<{ rowIdentifier: RowIdentifier, rowSortKey: SK, rowData: RD }>>

For example, the StoredTable implementation of listRowsInGroup doesn't return groupKey.
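One way to make the mismatch explicit is to widen the contract at the type level, so every implementation is forced to return the group key alongside each row. The shapes below are simplified stand-ins for the real generics, sketched only to show the relationship between the two row types:

```typescript
// Simplified stand-in for the row shape the current index.ts contract
// promises: no groupKey field, so callers cannot rely on a "groupkey" column.
type RowFromContract = { rowIdentifier: string, rowSortKey: unknown, rowData: unknown };

// A hypothetical widened contract that also requires the group key.
type RowWithGroupKey = RowFromContract & { groupKey: unknown };

// Every RowWithGroupKey is still a valid RowFromContract, so widening the
// contract is backward compatible for readers of the old shape.
const row: RowWithGroupKey = {
  groupKey: "g1",
  rowIdentifier: "r1",
  rowSortKey: null,
  rowData: { value: 42 },
};
const narrowed: RowFromContract = row;
void narrowed;
console.log(Object.keys(row).length); // 4
```

Under that widened type, an implementation like StoredTable's would fail to compile until its SELECT list actually emits the group key, surfacing the bug statically instead of at query time.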
    model BulldozerStorageEngine {
      id String @id @default(uuid()) @db.Uuid
      keyPath Json[]
      keyPathParent Unsupported("jsonb[]")
nit: if this is nullable, shouldn't it be `Unsupported("jsonb[]")?` (with Prisma's optional `?` marker)?
      if (current === "\"") inDoubleQuote = false;
      index++;
      continue;
    }
splitSqlStatements mishandles escaped double-quote identifiers
Low Severity
The splitSqlStatements parser treats every " as toggling inDoubleQuote mode, but PostgreSQL allows "" inside double-quoted identifiers to represent a literal quote character. A SQL identifier like "column""name" would be incorrectly parsed as two separate quoted regions, causing any semicolons following the premature quote-exit to be treated as statement terminators. No currently generated SQL triggers this, but it breaks the parser contract if future SQL uses escaped identifiers.
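For reference, PostgreSQL treats `""` inside a double-quoted identifier as a literal quote character (and `''` likewise inside single-quoted string literals), so a conforming splitter has to look one character ahead before leaving quote mode. A standalone sketch of that handling (not the repo's actual parser):

```typescript
// Standalone sketch of a statement splitter that honors PostgreSQL's
// doubled-quote escapes ("" in identifiers, '' in string literals).
// Not the repo's splitSqlStatements; it ignores comments and dollar quoting.
function splitSqlStatements(sql: string): string[] {
  const statements: string[] = [];
  let current = "";
  let inDoubleQuote = false;
  let inSingleQuote = false;
  let index = 0;
  while (index < sql.length) {
    const ch = sql[index];
    if (inDoubleQuote) {
      current += ch;
      if (ch === '"') {
        // "" is an escaped quote inside the identifier, not the end of it
        if (sql[index + 1] === '"') {
          current += '"';
          index += 2;
          continue;
        }
        inDoubleQuote = false;
      }
      index++;
      continue;
    }
    if (inSingleQuote) {
      current += ch;
      if (ch === "'") {
        if (sql[index + 1] === "'") { current += "'"; index += 2; continue; }
        inSingleQuote = false;
      }
      index++;
      continue;
    }
    if (ch === '"') inDoubleQuote = true;
    else if (ch === "'") inSingleQuote = true;
    else if (ch === ";") {
      if (current.trim()) statements.push(current.trim());
      current = "";
      index++;
      continue;
    }
    current += ch;
    index++;
  }
  if (current.trim()) statements.push(current.trim());
  return statements;
}

// The semicolon inside "column""name;x" must not split the statement.
const parts = splitSqlStatements('SELECT "column""name;x" FROM t; SELECT 2;');
console.log(parts.length); // 2
```

The key difference from the flagged code is the one-character lookahead before clearing `inDoubleQuote`; without it, the escaped quote prematurely ends the quoted region and the embedded `;` becomes a bogus statement boundary.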
Reviewed by Cursor Bugbot for commit 5b69a18.
    `);
    });
    sendJson(response, 200, { ok: true });
    return;
Raw upsert lacks protection for reserved root paths
Medium Severity
The /api/raw/upsert endpoint allows modifying the value of reserved root paths like [] and ["table"], while the /api/raw/delete endpoint explicitly protects them. This inconsistency means a user could accidentally overwrite the root or table-namespace node values, potentially disrupting the hierarchical storage structure that all bulldozer tables depend on.
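One way to close the gap is to share a single reserved-path check between the upsert and delete endpoints. A minimal sketch, assuming the reserved paths are `[]` and `["table"]` as described above (the helper names are hypothetical):

```typescript
// Hypothetical shared guard for both /api/raw/upsert and /api/raw/delete.
// The reserved paths here come from the review comment, not verified code.
const RESERVED_KEY_PATHS: unknown[][] = [[], ["table"]];

function isReservedKeyPath(keyPath: unknown[]): boolean {
  return RESERVED_KEY_PATHS.some(
    (reserved) =>
      reserved.length === keyPath.length &&
      reserved.every((segment, i) => segment === keyPath[i]),
  );
}

// Root and table-namespace nodes are protected; deeper paths are writable.
console.log(isReservedKeyPath([])); // true
console.log(isReservedKeyPath(["table"])); // true
console.log(isReservedKeyPath(["table", "users", "rows"])); // false
```

With both handlers calling the same predicate, the upsert endpoint can return an error for reserved paths exactly as delete does, instead of silently letting the hierarchy's root nodes be overwritten.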
Reviewed by Cursor Bugbot for commit a7f999f.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 3 total unresolved issues (including 2 from previous reviews).
Reviewed by Cursor Bugbot for commit 9cb5f5b.
      queued_row."reducerSql"
    )
    INTO next_state, next_rows_data, next_timestamp
    USING current_state, queued_row."rowData", current_timestamp_value;
SQL injection via format %s with reducerSql
Medium Severity
The bulldozer_timefold_process_queue function uses EXECUTE format(... %s ..., queued_row."reducerSql") to run SQL stored in the BulldozerTimeFoldQueue table's reducerSql TEXT column. This directly interpolates unparameterized SQL from a database column into executable code. While the column is currently only populated by application code, any path that allows writing to BulldozerTimeFoldQueue (including the studio's raw upsert endpoint) could inject arbitrary SQL that runs with the function's privileges.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit 9cb5f5b.


Note: High Risk

High risk due to new database tables/migrations (including a PL/pgSQL worker + `pg_cron` scheduling) and a large new dev-only HTTP server script that mutates data via raw SQL. Also touches passkey registration verification and cron job startup timing, which can affect auth and background processing behavior.

Overview

Introduces Bulldozer DB persistence by adding Prisma models + migrations for `BulldozerStorageEngine` (hierarchical `jsonb[]` key paths with generated parent links) and a timefold queue (`BulldozerTimeFoldQueue`/`BulldozerTimeFoldMetadata`) with a `bulldozer_timefold_process_queue()` worker that replays reducers, emits rows/state into storage, and schedules itself via `pg_cron`.

Adds Bulldozer Studio as a backend dev tool: a local-only HTTP UI (`run-bulldozer-studio.ts`) to visualize table graphs, inspect/edit raw storage, and run init/delete/set-row actions; wired into `pnpm dev` and backed by a new dependency (`elkjs`).

Also includes small robustness tweaks: delay starting cron job polling to allow server startup, harden passkey registration options handling around empty hints, and assert `registrationInfo` is present after passkey verification. Migration tests are added to validate storage hierarchy behavior and timefold queue processing.

Reviewed by Cursor Bugbot for commit e3c3865.