fix: re-enable subgraph_domain.name indexes (hash + gin trigram)#1856
fix: re-enable subgraph_domain.name indexes (hash + gin trigram)#1856
Conversation
🦋 Changeset detectedLatest commit: dec0101 The changes in this PR will be included in the next version bump. This PR includes changesets to release 24 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
|
The latest updates on your projects. Learn more about Vercel for GitHub.
2 Skipped Deployments
|
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 8 minutes and 5 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (7)
📝 WalkthroughWalkthroughReplaced the disabled Changes
Sequence Diagram(s)sequenceDiagram
participant Ponder as Ponder setup
participant Logger as Logger
participant EnsDB as ensDbClient
participant Indexer as Index creation
Note over Ponder,Logger: initialization start
Ponder->>Logger: debug("starting setup")
Ponder->>EnsDB: execute("CREATE EXTENSION IF NOT EXISTS pg_trgm")
EnsDB-->>Ponder: success / error
alt success
Ponder->>Logger: info("pg_trgm ensured")
Ponder->>Indexer: proceed to create indexes (hash + gin_trgm_ops)
Indexer-->>Ponder: indexes created
else failure
EnsDB-->>Logger: error
Ponder-->>Logger: error("extension creation failed", cause)
Ponder-->>Ponder: throw Error(cause)
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related issues
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
Re-enables an index on subgraph_domains.name in the ENS subgraph-compatible Ponder/Drizzle schema, switching from a btree index to a hash index to avoid Postgres’ btree index-row size limit when spam/oversized names are present.
Changes:
- Add a hash index on
subgraph_domains.name(byExactName) to support exact-match lookups without hitting the btree 8191-byte limit. - Remove the prior commented-out/disabled
byNameindex block.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
The update looks good to me 👍 As discussed on Slack, it'd great to enable the |
lightwalker-eth
left a comment
There was a problem hiding this comment.
@shrugs Hey shared a suggestion. Please feel welcome to merge when ready 🫡
|
@shrugs feel free to have the required pg extension enabled with ENSDb migrations. It's a nice approach that ensures that the extension will be enabled just once for the single ENSDb instance. You can create the migration file like so: -- This migration enables the pg_trgm extension, which is used for
-- trigram-based indexing and searching in PostgreSQL.
CREATE EXTENSION IF NOT EXISTS pg_trgm; |
…ain.name Use the existing `initializeIndexingSetup` hook to install the `pg_trgm` extension before Ponder creates indexes, then re-introduce the GIN trigram index (`gin_trgm_ops`) on `subgraph_domain.name` to back the Subgraph GraphQL partial-match filters (`_contains`, `_starts_with`, `_ends_with`). The hash index added previously is retained for exact equality lookups; the trigram index covers partial matches.
Greptile SummaryThis PR re-enables the Confidence Score: 5/5Safe to merge — changes are additive (new indexes + extension install), idempotent, and integration-tested end-to-end. All findings are P2 or informational. The hash + GIN index combination correctly solves the btree row-size limit. The pg_trgm migration is idempotent and correctly placed in the Drizzle migrator so the extension is available before Ponder creates the GIN index. Documentation updates are consistent across all four relevant doc pages. No files require special attention. Important Files Changed
Sequence DiagramsequenceDiagram
participant API as api/index.ts
participant M as migrateEnsNodeSchema()
participant DB as Postgres (ensnode schema)
participant P as Ponder (createIndexes)
API->>M: call (async, fire-and-forget)
M->>DB: Drizzle migrate()<br/>(migrationsSchema: "ensnode")
DB->>DB: CREATE EXTENSION IF NOT EXISTS pg_trgm<br/>(lands in public schema)
DB-->>M: migration complete
M-->>API: .then(startEnsDbWriterWorker)
Note over P: Ponder index-creation phase
P->>DB: CREATE INDEX USING hash (name)
P->>DB: CREATE INDEX USING gin (name gin_trgm_ops)<br/>(resolves gin_trgm_ops from public)
DB-->>P: indexes created
Reviews (2): Last reviewed commit: "fix: address PR review nits" | Re-trigger Greptile |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Call out that any Postgres instance used as ENSDb must have the `pg_trgm` extension available for installation, since ENSIndexer runs `CREATE EXTENSION IF NOT EXISTS pg_trgm` at startup to back the partial-name search indexes on `subgraph_domain.name`.
The idempotency and error-propagation tests for the pg_trgm install re-exercise the shared `eventHandlerPreconditions` promise-caching mechanism, which is already covered by the ENSRainbow onchain-path tests. The SQL-content assertion tested an implementation detail that a source-level review catches more cheaply. Keep the ensDb singleton and logger mocks — they're needed for the module to load cleanly under vitest, not for coverage.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- import `sql` from ponder instead of drizzle-orm for consistency with other schema files in ensdb-sdk (ensv2.schema.ts does the same) - rephrase pg_trgm managed-provider note from "enabled by default" (inaccurate — pg_trgm is per-database `CREATE EXTENSION`) to "available for installation" to match the rest of the docs
|
@greptile re-review |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
lightwalker-eth
left a comment
There was a problem hiding this comment.
@shrugs Looks nice! Just 1 question. Please take the lead to merge when ready 👍
| // For more details, see: https://github.com/namehash/ensnode/issues/1819 | ||
| // byName: index().on(t.name), | ||
| // uses a hash index because some name values exceed the btree max row size (8191 bytes) | ||
| byExactName: index().using("hash", t.name), |
There was a problem hiding this comment.
I'm surprised no other code is updated in relation to us making the schema changes here? Ex: code in API handlers?
There was a problem hiding this comment.
none! postgres planner will use the indexes without specification
Summary
subgraph_domain.namewith a hash index (exact-match lookups that bypass the btree 8191-byte row size limit) and a gin trigram index (gin_trgm_ops) for partial-match filters_contains,_starts_with,_ends_with.pg_trgmextension via a new Drizzle migration (0001_enable_ext_pg_trgm.sql) that runs throughmigrateEnsNodeSchema()before Ponder starts up.pg_trgmrequirement.Why
closes the remaining work from #1819. measured on mainnet, the gin index is ~1.2 GB (~19% of heap) — within typical range for a text trigram index, not pathological.
Testing
pnpm -F ensindexer -F @ensnode/ensdb-sdk typecheck— cleanpnpm lint:ci— cleanpnpm test --project ensindexer --project @ensnode/ensdb-sdk— unit tests passastro build— clean (69 pages)Notes for Reviewer
public), so unqualifiedCREATE EXTENSION IF NOT EXISTS pg_trgmlands inpublicand Ponder's write-path connection resolves the unqualifiedgin_trgm_opsat index-creation time. doing this in the runtime setup hook was more fragile (see earlier commits in this PR for the search_path rabbit-hole)..op("gin_trgm_ops")gets dropped by ponder's index emitter, so the gin index is declared via a rawsqlfragment (index().using("gin", sql${t.name} gin_trgm_ops)).name, and pathological 8KB spam names add ~500-1000× the trigrams of a normal name. measured impact on mainnet is modest (~19% of heap, ~1.2 GB). a partial index (e.g.WHERE length(name) < 2048) is a reasonable follow-up if this grows.Checklist