Optimize traversal/query execution and add performance measurement guardrails #6

Merged
pdlug merged 1 commit into main from feat/performance-improvements on Feb 17, 2026

@pdlug pdlug commented Feb 14, 2026

Re-architects the SQL compiler and Drizzle backend into modular, dialect-aware layers while delivering performance improvements across query compilation, execution, traversal, and ingestion paths. Simplifies and tightens the public API surface, adds prepared query support, and introduces benchmark regression guardrails in CI.

Breaking API Changes

  • selectAggregate(...) renamed to aggregate(...)
  • includeImplyingEdges replaced with expand: "none" | "implying" | "inverse" | "all" (default: "inverse")
  • Recursive traversal methods (.maxHops(), .minHops(), .collectPath(), .withDepth()) consolidated into recursive({ minHops, maxHops, cyclePolicy, path, depth })
  • Cycle behavior now explicit via cyclePolicy: "prevent" | "allow" (default: "prevent")
  • Unbounded recursion capped at MAX_RECURSIVE_DEPTH = 100; explicit maxHops validated up to MAX_EXPLICIT_RECURSIVE_DEPTH = 1000
  • EdgeTypeNames / NodeTypeNames renamed to EdgeKinds / NodeKinds; getEdgeTypeNames() / getNodeTypeNames() renamed to getEdgeKinds() / getNodeKinds()
  • Store class exported as type-only (prevents new Store(), preserves Store<G> annotations)
  • StoreConfig replaced by StoreOptions (adds queryDefaults.traversalExpansion)
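
The recursion limits and cycle default listed above could be enforced roughly as follows. This is a minimal sketch, not the library's implementation: `resolveRecursiveOptions` and `ResolvedRecursive` are hypothetical names, the `minHops` default of 1 is an assumption, and the `path` and `depth` options are left out. Only the two constants, the option names, and the `"prevent"` default come from the description.

```typescript
// Constants named in the PR description.
const MAX_RECURSIVE_DEPTH = 100;
const MAX_EXPLICIT_RECURSIVE_DEPTH = 1000;

type CyclePolicy = "prevent" | "allow";

interface RecursiveOptions {
  minHops?: number;
  maxHops?: number;
  cyclePolicy?: CyclePolicy;
}

// Hypothetical resolved shape; the real internal type may differ.
interface ResolvedRecursive {
  minHops: number;
  maxHops: number;
  cyclePolicy: CyclePolicy;
}

// Hypothetical helper: applies defaults and validates explicit bounds.
function resolveRecursiveOptions(options: RecursiveOptions = {}): ResolvedRecursive {
  const minHops = options.minHops ?? 1; // assumed default, not stated in the PR
  // Unbounded recursion is capped rather than rejected.
  const maxHops = options.maxHops ?? MAX_RECURSIVE_DEPTH;
  if (!Number.isInteger(maxHops) || maxHops < 1 || maxHops > MAX_EXPLICIT_RECURSIVE_DEPTH) {
    throw new RangeError(`maxHops must be an integer in 1..${MAX_EXPLICIT_RECURSIVE_DEPTH}`);
  }
  if (!Number.isInteger(minHops) || minHops < 0 || minHops > maxHops) {
    throw new RangeError("minHops must be an integer in 0..maxHops");
  }
  return { minHops, maxHops, cyclePolicy: options.cyclePolicy ?? "prevent" };
}
```

Under these assumptions, omitting `maxHops` yields a cap of 100, an explicit `maxHops: 1000` is accepted, and `maxHops: 1001` is rejected.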

Public API Surface Reduction

Schema management extracted to new @nicia-ai/typegraph/schema sub-export:

  • Moved: serializeSchema, deserializeSchema, computeSchemaHash, initializeSchema, ensureSchema, migrateSchema, computeSchemaDiff, getMigrationActions, isBackwardsCompatible, all serialized types, and validation utilities

Removed from main entry point:

  • Compiler internals: QueryAst, SetOperation, ValueType, VariableLengthSpec, logical plan types, predicate/temporal/recursive compiler functions
  • getDialect, DialectAdapter (kept SqlDialect as type-only)
  • Profiler internals: extractPropertyAccesses, ProfileCollector, keyToPath/pathToKey, generateRecommendations, getUnindexedFilters
  • KindRegistry / buildKindRegistry
  • Result utilities (ok/err/isOk/isErr/unwrap/unwrapOr)
  • encodeDate / decodeDate
  • Validation utilities (createValidationError, validateProps, wrapZodError)

New Features

Prepared Queries

  • param(name) placeholders for scalar predicate positions
  • .prepare() precompiles AST → SQL text once; prepared.execute(bindings) runs with zero recompilation
  • Strict bindings validation (all declared params required, unknown bindings rejected)
  • Falls back to AST recompilation when raw execution is unavailable
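
The compile-once, bind-many behaviour and the strict bindings check can be illustrated with a small self-contained sketch. Everything here is hypothetical (`CompiledQuery`, `resolveBindings`, and this `prepare` signature are not the library's API); it only models the rules stated above: compilation happens once, every declared param must be bound, and unknown bindings are rejected.

```typescript
// Hypothetical stand-in for a compiled query: SQL text plus the ordered
// parameter names that appear in it.
interface CompiledQuery {
  sql: string;
  paramNames: readonly string[];
}

type Bindings = Record<string, string | number | boolean | null>;

// Strict validation: every declared param must be bound, no extras allowed.
function resolveBindings(compiled: CompiledQuery, bindings: Bindings): unknown[] {
  const declared = new Set(compiled.paramNames);
  compiled.paramNames.forEach((name) => {
    if (!(name in bindings)) throw new Error(`Missing binding: ${name}`);
  });
  Object.keys(bindings).forEach((name) => {
    if (!declared.has(name)) throw new Error(`Unknown binding: ${name}`);
  });
  // Positional values in the order the placeholders appear in the SQL text.
  return compiled.paramNames.map((name) => bindings[name]);
}

// Hypothetical prepare(): compile() runs once, execute() only binds values.
function prepare(
  compile: () => CompiledQuery,
  run: (sql: string, values: unknown[]) => unknown,
) {
  const compiled = compile(); // the only compilation
  return {
    execute: (bindings: Bindings) => run(compiled.sql, resolveBindings(compiled, bindings)),
  };
}
```

Calling `execute` repeatedly reuses the same SQL text; only the positional values change between calls.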

Collection API Additions

  • getByIds(ids) on node/edge collections — batched SELECT ... WHERE id IN (...), preserves input order, returns undefined for missing IDs
  • bulkInsert — void-returning fire-and-forget ingestion
  • bulkCreate — multi-row INSERT ... RETURNING instead of per-item inserts
  • bulkUpsert (edges) — batch getNodes lookup instead of N+1 sequential calls
  • Node find({ where }) — delegates to query builder with full predicate support
  • count() now accepts QueryOptions (temporal mode, asOf)
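
The `getByIds` contract (one batched lookup, input order preserved, `undefined` for missing IDs) amounts to re-projecting an unordered result set onto the caller's ID list. A minimal sketch, assuming a hypothetical `fetchByIds` callback that stands in for the batched `SELECT ... WHERE id IN (...)`; it is synchronous here for brevity, and the de-duplication step is an assumption rather than documented behaviour:

```typescript
interface Row {
  id: string;
}

// Hypothetical helper: `fetchByIds` represents one batched query whose
// result order is unspecified and which omits IDs that do not exist.
function getByIdsOrdered<T extends Row>(
  ids: readonly string[],
  fetchByIds: (uniqueIds: string[]) => T[],
): (T | undefined)[] {
  if (ids.length === 0) return [];
  // Assumption: duplicates in the input are sent to the database only once.
  const uniqueIds = Array.from(new Set(ids));
  const byId = new Map<string, T>();
  fetchByIds(uniqueIds).forEach((row) => byId.set(row.id, row));
  // Re-project onto the caller's order; missing IDs become undefined.
  return ids.map((id) => byId.get(id));
}
```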

Store & Type Additions

  • New hook types: StoreHooks, HookContext, OperationHookContext, QueryHookContext
  • TypedEdgeCollection, TypedNodeRef type exports
  • New error classes: CompilerInvariantError, DatabaseOperationError
  • New Result combinators: flatMap, map, mapErr, orElse

Compiler Architecture

SQL compilation restructured into a three-stage pipeline:

Query AST → Logical Plan → Compiler Passes → SQL Emitter → Drizzle SQL

  • plan/ — Intermediate representation with 9 plan node types (Scan, Filter, Join, Aggregate, Sort, Limit, VectorKnn, RecursiveExpand, SetOp); lowering functions for standard, recursive, and set-operation queries
  • passes/ — Multi-pass optimization framework with temporal, recursive, and vector passes
  • emitter/ — SQL generation from logical plans; plan-inspector.ts validates plan shape before emission; standard-builders.ts extracts CTE/projection/clause construction
  • standard-pass-pipeline.ts — Orchestrates passes, predicate indexing, and column pruning

Compiler introduces predicate pre-indexing, column pruning, and compilation caching to reduce per-query overhead.
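
To make the plan → passes → emitter split concrete, here is a toy version of the pipeline. It is a sketch under heavy simplification: it models only three of the nine plan node types, the `mergeFilters` pass is a hypothetical example (the PR's actual passes are temporal, recursive, and vector), and the emitter produces a plain string rather than Drizzle SQL.

```typescript
// Three of the nine plan node kinds, heavily simplified.
type Plan =
  | { kind: "Scan"; table: string }
  | { kind: "Filter"; input: Plan; predicate: string }
  | { kind: "Limit"; input: Plan; count: number };

type Pass = (plan: Plan) => Plan;

// Hypothetical pass: collapses directly nested filters into one.
const mergeFilters: Pass = (plan) => {
  if (plan.kind === "Scan") return plan;
  const input = mergeFilters(plan.input);
  if (plan.kind === "Limit") return { kind: "Limit", input, count: plan.count };
  if (input.kind === "Filter") {
    return {
      kind: "Filter",
      input: input.input,
      predicate: `(${input.predicate}) AND (${plan.predicate})`,
    };
  }
  return { kind: "Filter", input, predicate: plan.predicate };
};

// Emitter: turns a plan into SQL text, one subquery per operator so that
// filter-after-limit keeps its meaning.
function emit(plan: Plan): string {
  switch (plan.kind) {
    case "Scan":
      return `SELECT * FROM ${plan.table}`;
    case "Filter":
      return `SELECT * FROM (${emit(plan.input)}) AS t WHERE ${plan.predicate}`;
    case "Limit":
      return `SELECT * FROM (${emit(plan.input)}) AS t LIMIT ${plan.count}`;
  }
}

// Orchestration: run every pass in order, then emit.
function compilePlan(plan: Plan, passes: readonly Pass[]): string {
  return emit(passes.reduce((current, pass) => pass(current), plan));
}
```

Keeping passes as plain `Plan → Plan` functions means each optimization can be tested in isolation and the emitter never needs to know which passes ran.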

Backend Architecture

Backend refactored from monolithic operations.ts into concern-based modules with dialect-polymorphic strategy dispatch.

Benchmarks

Complete rewrite of @nicia-ai/typegraph-benchmarks — replaced tinybench with a custom measurement framework:

  • Modular architecture: cli.ts, config.ts, backend.ts, measurements.ts, guardrails.ts, seed.ts
  • Scenarios: forward/reverse/inverse traversals, multi-hop (2/3-hop), recursive (100/1000-hop), aggregates, cached vs prepared execution
  • Regression guardrails with backend-specific threshold overrides for SQLite and PostgreSQL
  • CI integration: new perf:check and perf:check:postgres steps in GitHub Actions
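
A regression guardrail with backend-specific overrides can be reduced to a threshold lookup plus a comparison. The sketch below is illustrative only: the `Guardrail` shape, the use of the median as the compared statistic, and the function names are assumptions, not the contents of guardrails.ts.

```typescript
type Backend = "sqlite" | "postgres";

// Hypothetical guardrail shape: a default limit plus optional per-backend overrides.
interface Guardrail {
  scenario: string;
  maxMedianMs: number;
  overrides?: Partial<Record<Backend, number>>;
}

function median(samples: readonly number[]): number {
  if (samples.length === 0) throw new Error("median of empty sample set");
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Returns one message per violated guardrail; an empty array means the check passes.
function checkGuardrails(
  backend: Backend,
  guardrails: readonly Guardrail[],
  samplesByScenario: Record<string, readonly number[]>,
): string[] {
  const failures: string[] = [];
  for (const rail of guardrails) {
    const samples = samplesByScenario[rail.scenario];
    if (!samples) {
      failures.push(`${rail.scenario}: no measurements`);
      continue;
    }
    // A backend-specific override wins over the default threshold.
    const limit = rail.overrides?.[backend] ?? rail.maxMedianMs;
    const observed = median(samples);
    if (observed > limit) {
      failures.push(`${rail.scenario}: median ${observed}ms exceeds ${limit}ms`);
    }
  }
  return failures;
}
```

A CI step in the spirit of perf:check would exit non-zero whenever the returned list is non-empty.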

Testing

Expanded test coverage for existing public APIs and added extensive new tests for the new APIs and the backend refactor.

Documentation

New pages: Schema Evolution guide, Testing guide, Query Execution reference

Major expansions: Performance overview (+411 lines), recursive queries, predicates reference, schemas & stores, traversal docs

Updated for new APIs: aggregate(), traversal expand modes, recursive options, bulkInsert, bulkCreate, getByIds, node find({ where }), prepared queries

@pdlug pdlug force-pushed the feat/performance-improvements branch 9 times, most recently from dccabe9 to 248739a on February 16, 2026 23:12
Re-architect SQL compilation into plan/passes/emitter pipeline and split
monolithic Drizzle operations into concern-based modules with strategy
dispatch. Add prepared query support, batch insert/lookup APIs,
compilation caching, and execution adapters with SQLite prepared
statement caching.

Breaking API changes:
- selectAggregate() → aggregate()
- includeImplyingEdges → expand: "none"|"implying"|"inverse"|"all"
- Recursive traversal consolidated into recursive({ minHops, maxHops,
  cyclePolicy, path, depth }) replacing chained methods
- EdgeTypeNames/NodeTypeNames → EdgeKinds/NodeKinds (and getter fns)
- Store class now type-only export (use createStore())
- StoreConfig replaced by StoreOptions
- Schema APIs extracted to new @nicia-ai/typegraph/schema entry point
- Removed from main: KindRegistry, Result utilities, date helpers,
  validation utilities, compiler/profiler internals

New features:
- param()/prepare()/execute() for zero-recompilation prepared queries
- getByIds(), bulkInsert, bulkCreate, bulkUpsert on collections
- Node find({ where }) with full predicate support
- Bind-limit-aware batch chunking per dialect

Adds new test suites, rewrites benchmark framework with CI
regression guardrails, and adds schema evolution, testing, and query
execution documentation.
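
The "bind-limit-aware batch chunking per dialect" item in the commit message above comes down to keeping rows × columns under each dialect's bind-parameter ceiling. A minimal sketch, assuming the limits shown: 999 is SQLite's default SQLITE_MAX_VARIABLE_NUMBER before 3.32.0 (32766 since), and 65535 is PostgreSQL's per-statement parameter cap. The values and the function name `chunkByBindLimit` are assumptions; the library may use different ones.

```typescript
// Assumed per-statement bind parameter limits (see the note above).
const BIND_LIMITS = { sqlite: 999, postgres: 65535 } as const;
type Dialect = keyof typeof BIND_LIMITS;

// Splits rows into chunks so that rows * columnsPerRow never exceeds the
// dialect's bind limit for a single multi-row INSERT.
function chunkByBindLimit<T>(
  rows: readonly T[],
  columnsPerRow: number,
  dialect: Dialect,
): T[][] {
  const limit = BIND_LIMITS[dialect];
  if (!Number.isInteger(columnsPerRow) || columnsPerRow < 1 || columnsPerRow > limit) {
    throw new RangeError(`columnsPerRow must be an integer in 1..${limit}`);
  }
  const rowsPerChunk = Math.floor(limit / columnsPerRow);
  const chunks: T[][] = [];
  for (let start = 0; start < rows.length; start += rowsPerChunk) {
    chunks.push(rows.slice(start, start + rowsPerChunk));
  }
  return chunks;
}
```

With 10 bound columns per row, 250 rows split into chunks of 99, 99, and 52 under the assumed SQLite limit, and fit in a single statement under the assumed PostgreSQL limit.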
@pdlug pdlug force-pushed the feat/performance-improvements branch from 248739a to 4553aed on February 16, 2026 23:56
@pdlug pdlug merged commit 9351542 into main Feb 17, 2026
15 of 16 checks passed
@pdlug pdlug deleted the feat/performance-improvements branch February 17, 2026 00:04
@github-actions github-actions Bot mentioned this pull request Feb 17, 2026
pdlug added a commit that referenced this pull request May 8, 2026
P1 — customizable uniques table for materializeRemovals cleanup:
- Extended SqlTableNames to include `uniques`. SQLite + Postgres
  backends populate it from the dialect-specific schema. Cleanup deletes
  from the customized table instead of the hardcoded default.

P2 #2 — reconciliation watermark for history walk:
- New optional backend primitives: ensureReconciliationMarkersTable,
  getReconciliationMarker, setReconciliationMarker. New table
  typegraph_reconciliation_markers (graphId PK, reconciledToVersion).
- reconcilePendingRemovals reads the marker, walks only transitions
  newer than it, then writes active.version as the new marker.
- Backends without the primitives fall back to walking from version 1
  (legacy behavior preserved for custom backends).
- A 100-version graph drops from 100 round-trips + 100 Zod parses to 0 on
  steady-state calls.
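
The watermark logic described in this item can be sketched as follows. The primitive names `getReconciliationMarker` and `setReconciliationMarker` come from the commit message, but their signatures, the synchronous style, and the `walkTransition` callback are assumptions made for illustration.

```typescript
// Optional backend primitives; signatures are assumed, not taken from the library.
interface MarkerStore {
  getReconciliationMarker(graphId: string): number | undefined;
  setReconciliationMarker(graphId: string, version: number): void;
}

// Walks only the transitions newer than the stored marker and returns how
// many were walked. Without marker primitives it walks from version 1.
function reconcilePendingRemovals(
  graphId: string,
  activeVersion: number,
  walkTransition: (version: number) => void,
  markers?: MarkerStore,
): number {
  const reconciledTo = markers?.getReconciliationMarker(graphId) ?? 0;
  let walked = 0;
  for (let version = reconciledTo + 1; version <= activeVersion; version++) {
    walkTransition(version);
    walked++;
  }
  // Record the active version so the next call starts after it.
  markers?.setReconciliationMarker(graphId, activeVersion);
  return walked;
}
```

Under these assumptions the first call on a 100-version graph walks 100 transitions and every later call at the same version walks none, which matches the steady-state saving described above.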

P2 #3 — drop first validateGraphExtension call in mergeGraphExtension:
- Single union-validate covers both input shape and cross-document
  invariants. Genuine evolves now pay one validator walk instead of
  two. Callers wanting input-precise errors call
  validateGraphExtension(document) themselves first.

P2 #4 — per-kind compile cache:
- WeakMap<ExtensionNodeDef, CompiledNode> scoped per kindName outer Map.
- WeakMap<ExtensionEdgeDef, ZodObject> for edge schemas (from/to
  resolution stays per-call since it depends on the broader compile
  context's nodeTypeByName).
- Partial-overlap evolves reuse compiled output for unchanged kinds.
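
The two-level cache shape described here (an outer Map keyed by kind name, an inner WeakMap keyed by the definition object) might look like the sketch below. The `ExtensionNodeDef` and `CompiledNode` shapes and the `compileNodeCached` helper are placeholders invented for the example; only the type names and the Map/WeakMap layering come from the commit message.

```typescript
// Placeholder shapes; the real definitions are richer.
interface ExtensionNodeDef {
  fields: Record<string, string>;
}
interface CompiledNode {
  kindName: string;
  fieldNames: string[];
}

// Outer Map scoped by kind name, inner WeakMap keyed by definition identity.
// WeakMap entries do not keep a definition object alive once it is unreachable.
const nodeCompileCache = new Map<string, WeakMap<ExtensionNodeDef, CompiledNode>>();

function compileNodeCached(
  kindName: string,
  def: ExtensionNodeDef,
  compile: (kindName: string, def: ExtensionNodeDef) => CompiledNode,
): CompiledNode {
  let perKind = nodeCompileCache.get(kindName);
  if (!perKind) {
    perKind = new WeakMap();
    nodeCompileCache.set(kindName, perKind);
  }
  let compiled = perKind.get(def);
  if (!compiled) {
    compiled = compile(kindName, def);
    perKind.set(def, compiled);
  }
  return compiled;
}
```

Because the inner key is object identity, a kind whose definition object is carried over unchanged between evolves hits the cache, while a structurally equal but newly created object compiles again.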

P2 #5 — strict mode threaded into index validation:
- validateIndexEntry now accepts strict and applies rejectUnknownObjectKeys
  with NODE_INDEX_KEYS or EDGE_INDEX_KEYS allowlist.
- Typos like `coveringField` (singular) now surface as
  INVALID_INDEX_DECLARATION instead of compiling to a weaker index.
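
A sketch of how strict mode can catch such typos with a key allowlist. The names `validateIndexEntry`, `rejectUnknownObjectKeys`, `NODE_INDEX_KEYS`, and the `INVALID_INDEX_DECLARATION` code come from the commit message; the allowlist contents, the function signatures, and the error class are assumptions made for illustration.

```typescript
// Assumed allowlist; only the constant's name comes from the commit message.
const NODE_INDEX_KEYS = ["fields", "coveringFields", "unique"] as const;

// Hypothetical error type carrying the code named in the commit message.
class IndexDeclarationError extends Error {
  readonly code = "INVALID_INDEX_DECLARATION";
}

function rejectUnknownObjectKeys(
  value: Record<string, unknown>,
  allowed: readonly string[],
): void {
  const unknownKeys = Object.keys(value).filter((key) => !allowed.includes(key));
  if (unknownKeys.length > 0) {
    throw new IndexDeclarationError(`Unknown index keys: ${unknownKeys.join(", ")}`);
  }
}

function validateIndexEntry(entry: Record<string, unknown>, strict: boolean): void {
  // In strict mode a misspelled key fails loudly instead of being ignored.
  if (strict) rejectUnknownObjectKeys(entry, NODE_INDEX_KEYS);
  // Shape checks on the known keys would follow here.
}
```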

P2 #6 — vector-index signature uses actual embeddings table:
- materialize-indexes.ts now reads backend.tableNames?.embeddings
  for the signature input. Fixes false drift detection on backends
  with custom embeddings table names.

Test impact: custom-table-names.test.ts updated for the new `uniques`
slot in SqlTableNames. All 3060 SQLite + 588 Postgres tests pass.