Skip to content

fix(tables): compare order_key bytewise (COLLATE "C") to stop insert collation errors#4908

Merged
TheodoreSpeaks merged 3 commits into
stagingfrom
fix/insert-row-error
Jun 8, 2026
Merged

fix(tables): compare order_key bytewise (COLLATE "C") to stop insert collation errors#4908
TheodoreSpeaks merged 3 commits into
stagingfrom
fix/insert-row-error

Conversation

@TheodoreSpeaks

Copy link
Copy Markdown
Collaborator

Summary

  • Pin user_table_rows.order_key to COLLATE "C" (migration 0228) so Postgres compares fractional-index keys bytewise, matching the in-house fractional-indexing library. The prod DB default en_US.UTF-8 sorts lowercase before uppercase (the opposite of the Z < a ASCII boundary the keys rely on), so max()/neighbor lookups returned out-of-order bounds and generateKeyBetween threw a >= b on insert (and rows displayed out of order). Local DBs default to C.UTF-8, so it never reproduced locally.
  • Add a one-off repair script repair-table-order-key-collation.ts that re-keys any table with bytewise inversions/duplicates from (position, id) order, under the per-table advisory lock (idempotent, --dry-run). Separate from the NULL backfill, which already ran.
  • Harden resolveInsertByNeighbor to anchor on the next/prev distinct key so a stray duplicate can never throw.
  • Add fractional-indexing tests covering the Z/a boundary that en_US inverts.

Type of Change

  • Bug fix

Testing

Tested manually. bun run lint clean, bun run check:api-validation:strict passed, tsc --noEmit clean, fractional-indexing tests (5) + table suite (196) pass, drizzle-kit check clean.

Operational

Run after deploy: apply migration 0228, then repair-table-order-key-collation.ts --dry-run to size, then for real. Migration + neighbor fix stop the error independent of the TABLES_FRACTIONAL_ORDERING flag; the repair is pre-cleaning needed before the flag is enabled.

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

@vercel

vercel Bot commented Jun 8, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
docs Skipped Skipped Jun 8, 2026 8:03pm

Request Review

@cursor

cursor Bot commented Jun 8, 2026

Copy link
Copy Markdown

PR Summary

Medium Risk
Schema migration takes ACCESS EXCLUSIVE on user_table_rows and rewrites the order_key index; the repair script mutates row keys in production and must run after deploy with dry-run sizing first.

Overview
Fixes table row ordering by aligning Postgres order_key comparison with the fractional-indexing library’s bytewise (ASCII) sort, which disagreed with prod’s default en_US.UTF-8 locale at the Z/a boundary and caused wrong neighbors, generateKeyBetween a >= b failures, and mis-sorted rows.

Migration 0228 sets user_table_rows.order_key to COLLATE "C" so max, neighbor queries, and ORDER BY order_key match JS string order.

resolveInsertByNeighbor now finds the next/previous row by distinct key (gt/lt on order_key only) instead of (order_key, id) tuple comparison, so duplicate keys cannot pass invalid bounds into keyBetween.

Adds repair-table-order-key-collation.ts to detect tables whose keys are not strictly increasing in position, id order under COLLATE "C", then re-mint keys from position under the per-table advisory lock (--dry-run supported). Run after the migration.

Adds fractional-indexing tests that lock in bytewise ordering (append, prepend across Z/a, batch mint, invalid bounds).

Reviewed by Cursor Bugbot for commit e537aa5. Bugbot is set up for automated code reviews on this repo. Configure here.

@TheodoreSpeaks

Copy link
Copy Markdown
Collaborator Author

@greptile review

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit f14bde6. Configure here.

Comment thread apps/sim/scripts/repair-table-order-key-collation.ts
@greptile-apps

greptile-apps Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR fixes a production insert crash caused by en_US.UTF-8 locale sorting lowercase before uppercase — the opposite of the ASCII byte order the fractional-indexing library relies on — by pinning user_table_rows.order_key to COLLATE "C" (migration 0228) and hardening the neighbor-lookup logic in resolveInsertByNeighbor.

  • Migration 0228 issues a single ALTER TABLE … SET DATA TYPE text COLLATE "C" that rebuilds the dependent order_key index under byte order, making every ORDER BY, max(), and comparison agree with the JS library.
  • resolveInsertByNeighbor is hardened to find the next/prev distinct key (strict gt/lt on the key column) rather than the next row in (order_key, id) tuple order, so duplicate keys can never hand keyBetween equal bounds and trigger the a >= b assertion.
  • repair-table-order-key-collation.ts is a one-off idempotent script that re-keys any table with bytewise inversions/duplicates; dry-run intentionally skips the advisory lock so sizing passes don't queue behind live inserts.

Confidence Score: 5/5

Safe to merge — the migration, neighbor-lookup hardening, and repair script all address the root cause cleanly with no regressions in existing call paths.

The migration is a single ALTER with no data movement; the service change is a narrowly scoped query logic fix with correct null handling in both the flag-on and flag-off code paths; the repair script is idempotent and isolated behind the same per-table advisory lock the app uses; and the new tests directly cover the Z/a boundary that the en_US locale inverts.

No files require special attention. The schema.ts comment correctly documents the Drizzle collation limitation so future maintainers won't be surprised if drizzle-kit generate omits the COLLATE "C" clause.

Important Files Changed

Filename Overview
packages/db/migrations/0228_order_key_binary_collation.sql Single ALTER TABLE to pin order_key to COLLATE "C"; correct and minimal, Postgres rebuilds the dependent index automatically.
packages/db/schema.ts Comment update only — correctly documents that Drizzle can't express column collation and that the collation is migration-only.
apps/sim/lib/table/service.ts resolveInsertByNeighbor now finds the next/prev DISTINCT key instead of the next row in (key, id) tuple order, preventing keyBetween from receiving equal bounds when duplicate keys exist; logic is correct and index-friendly.
apps/sim/scripts/repair-table-order-key-collation.ts Well-structured one-off repair script; dry-run skips advisory lock, detection query uses explicit COLLATE "C" so it works before migration is applied, chunked UPDATEs are safe, and dry/live log labels are distinct.
apps/sim/lib/fractional-indexing/fractional-indexing.test.ts New test suite locks in bytewise ordering assumptions; covers append, prepend across Z/a boundary, bulk run, and the a >= b contract — directly exercises the failure mode that triggered this bug.

Reviews (3): Last reviewed commit: "improvement(tables): make repair-script ..." | Re-trigger Greptile

Comment thread apps/sim/scripts/repair-table-order-key-collation.ts Outdated
Comment thread apps/sim/scripts/repair-table-order-key-collation.ts
@TheodoreSpeaks

Copy link
Copy Markdown
Collaborator Author

Good catch — fixed in 075f3fe. The detection query was ordering the window by order_key COLLATE "C" and then comparing consecutive keys, which is monotonic by construction, so it only ever caught duplicates. It now walks rows in authoritative position, id order and flags any table where order_key isn't strictly increasing along that walk — so swapped keys (e.g. pos 0 = a0, pos 1 = Zz) are detected, and it's still idempotent after a re-key.

@greptile-apps

greptile-apps Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR fixes a production-only collation bug where user_table_rows.order_key was sorted under en_US.UTF-8 (locale order), which inverts the Z < a ASCII boundary that the fractional-indexing library depends on. The three-part fix is: (1) migration 0228 pins the column to COLLATE "C" (bytewise), (2) resolveInsertByNeighbor is hardened to look for the next/prev distinct key (skipping duplicate-key siblings) so a stale duplicate can never pass inverted bounds to generateKeyBetween, and (3) a one-off repair script re-keys any table that accumulated mis-ordered or duplicate keys during the window before migration 0228.

  • Migration: single ALTER TABLE … SET DATA TYPE text COLLATE "C" that rebuilds the user_table_rows_table_order_key_idx index under the new collation; carries an ACCESS EXCLUSIVE lock but is acceptable here because the feature is flag-gated and the table is low-volume at this stage.
  • Service fix: neighbour queries are simplified from row-tuple comparisons to plain key comparisons (gt/lt), which correctly handles any residual duplicate keys and is identical to the row-tuple form when all keys are distinct.
  • Repair script + tests: the idempotent repair script correctly acquires the same per-table advisory lock the app uses; five new tests lock in the bytewise-ordering contract across the Z/a boundary.

Confidence Score: 4/5

The core fix — migration 0228 + the neighbour-query change — is correct and safe to deploy; the only rough edge is in the one-off repair script's dry-run output.

The migration, service change, and tests are all well-reasoned and internally consistent. The advisory lock key in the repair script matches the one in the application. The one concrete issue is that the repair script's --dry-run output says 're-keyed' and 'Repair complete.' with the same stat format as a live run — an operator following the PR's own instructions ('dry-run to size, then for real') could read the dry-run summary as confirmation the repair already completed and skip the real run. It does not affect the migration or service fix itself.

apps/sim/scripts/repair-table-order-key-collation.ts — the dry-run stat labels and completion message.

Important Files Changed

Filename Overview
packages/db/migrations/0228_order_key_binary_collation.sql Single ALTER TABLE that changes order_key to COLLATE "C"; correct and well-documented, rebuilds the dependent index atomically
apps/sim/lib/table/service.ts resolveInsertByNeighbor now uses plain key comparisons (gt/lt) instead of row-tuple comparisons, cleanly handling duplicate-key siblings; null-anchor fallback is preserved and equivalent to the prior behavior
apps/sim/scripts/repair-table-order-key-collation.ts Idempotent repair script; advisory lock key matches the app's lock; dry-run per-table log and summary stats use "re-keyed" language even when nothing was written, which can mislead operators sizing the repair
apps/sim/lib/fractional-indexing/fractional-indexing.test.ts New tests correctly exercise the Z/a boundary, append/prepend monotonicity, bulk generation, and the a>=b throw contract; good regression coverage
packages/db/schema.ts Comment-only change documenting COLLATE "C" lives in the migration since Drizzle cannot express column collation; accurate and useful for future maintainers
packages/db/migrations/meta/_journal.json Journal correctly registers migration 0228 as the latest entry with the right index and tag

Sequence Diagram

sequenceDiagram
    participant App
    participant DB as Postgres (COLLATE "C")
    participant Lib as fractional-indexing lib

    Note over DB: Migration 0228: order_key SET DATA TYPE text COLLATE "C"

    App->>DB: acquireRowOrderLock(tableId)
    App->>DB: SELECT anchor row (orderKey, position)
    
    alt afterRowId insert
        App->>DB: "SELECT orderKey WHERE orderKey > anchorKey ORDER BY orderKey ASC LIMIT 1"
        DB-->>App: nextKey (bytewise-correct neighbour)
        App->>Lib: keyBetween(anchorKey, nextKey)
        Lib-->>App: "newKey (always anchorKey < newKey < nextKey)"
    else beforeRowId insert
        App->>DB: "SELECT orderKey WHERE orderKey < anchorKey ORDER BY orderKey DESC LIMIT 1"
        DB-->>App: prevKey (bytewise-correct neighbour)
        App->>Lib: keyBetween(prevKey, anchorKey)
        Lib-->>App: "newKey (always prevKey < newKey < anchorKey)"
    end

    App->>DB: INSERT row with newKey

    Note over App,DB: Repair script (one-off, post-migration)
    App->>DB: "SELECT DISTINCT table_id WHERE order_key COLLATE C >= next_key COLLATE C"
    loop each mis-ordered table
        App->>DB: pg_advisory_xact_lock(user_table_rows_pos:tableId)
        App->>DB: SELECT id ORDER BY position, id
        App->>Lib: nKeysBetween(null, null, count)
        Lib-->>App: fresh evenly-spaced key run
        App->>DB: "UPDATE rows SET order_key = new_key (chunked)"
    end
Loading

Reviews (2): Last reviewed commit: "fix(tables): detect mis-keyed tables by ..." | Re-trigger Greptile

Comment thread apps/sim/scripts/repair-table-order-key-collation.ts
Comment thread apps/sim/scripts/repair-table-order-key-collation.ts
@TheodoreSpeaks

Copy link
Copy Markdown
Collaborator Author

@greptile review

@TheodoreSpeaks TheodoreSpeaks merged commit ec256d2 into staging Jun 8, 2026
14 checks passed
@waleedlatif1 waleedlatif1 deleted the fix/insert-row-error branch June 8, 2026 22:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant