feat: stabilize adaptive background indexing runtime #15

Merged
emremy merged 4 commits into main from v0.7.0_prepare_new_version
May 19, 2026

@emremy emremy commented May 19, 2026

Summary

This PR finalizes the v0.7.0 adaptive background indexing runtime work.

The focus of this release is not new query APIs, but runtime maturity:

  • adaptive background scheduling
  • sorted rebuild optimization
  • worker/runtime stabilization
  • transfer/memory reduction
  • observability stabilization
  • portability hardening

The public query API remains synchronous.


Key Changes

Adaptive Background Scheduling

Added internal auto scheduling for dirty indexes.

Current behavior:

  • equality indexes remain sync-preferred
  • small tables remain sync-preferred
  • large eligible dirty sorted indexes may schedule background rebuilds
  • terminal queries remain synchronous and scan-correct while rebuilds happen in the background
  • explain() never schedules rebuilds

No public indexing config was added in this release.
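
The rules above can be read as a small decision function. The following is a minimal sketch under stated assumptions, not ColQL's internal scheduler: the type names, the `decideRebuild` function, the reason strings, and the `LARGE_TABLE_ROWS` cutoff are all hypothetical, and the PR does not state the real threshold.

```typescript
// Hypothetical sketch: names and thresholds are illustrative, not ColQL internals.
type IndexKind = "equality" | "sorted";

interface DirtyIndexState {
  kind: IndexKind;
  rowCount: number;
  dirty: boolean;
  zeroCopyEligible: boolean; // e.g. column data backed by a SharedArrayBuffer
}

type SchedulerDecision =
  | { action: "none"; reason: "index-clean" | "explain-only" }
  | { action: "sync"; reason: "equality-index" | "small-table" | "not-eligible" }
  | { action: "background"; reason: "large-dirty-sorted" };

// Assumed cutoff for "large"; the real value is not given in the PR.
const LARGE_TABLE_ROWS = 1_000_000;

function decideRebuild(index: DirtyIndexState, isExplain: boolean): SchedulerDecision {
  if (!index.dirty) return { action: "none", reason: "index-clean" };
  // explain() only reports; it never schedules a rebuild.
  if (isExplain) return { action: "none", reason: "explain-only" };
  if (index.kind === "equality") return { action: "sync", reason: "equality-index" };
  if (index.rowCount < LARGE_TABLE_ROWS) return { action: "sync", reason: "small-table" };
  if (!index.zeroCopyEligible) return { action: "sync", reason: "not-eligible" };
  return { action: "background", reason: "large-dirty-sorted" };
}
```

Whatever the decision, the terminal query itself would still run synchronously; a "background" result only means the rebuild is handed off while the query falls back to a scan.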


Sorted Background Rebuild Optimization

Optimized the sorted apply/merge pipeline:

  • replaced linear chunk scanning with typed-array heap merge
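  • reduced compare count dramatically (a heap merge needs about log2(k) compares per output row for k chunks, versus roughly k for a linear scan)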
  • reduced sorted transfer/output buffers from ~80MB to ~40MB on 10M rows
  • preserved deterministic ordering and fail-closed validation
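
To illustrate the heap-merge idea, here is a minimal k-way merge over typed arrays. It is a sketch under assumptions rather than the code in this PR: the function name, the `keys`/`runs` layout (a numeric key column plus pre-sorted runs of row ids), and the row-id tie-break are hypothetical, and NaN keys are assumed absent.

```typescript
// Illustrative k-way merge; `keys`, `runs`, and the function name are assumptions.
// Each run holds row ids already sorted by (key, row id).
function mergeSortedRuns(keys: Float64Array, runs: Uint32Array[]): Uint32Array {
  const total = runs.reduce((n, r) => n + r.length, 0);
  const out = new Uint32Array(total);
  const cursor = new Uint32Array(runs.length); // next unread position per run
  const heap = new Int32Array(runs.length);    // binary min-heap of run indexes
  let size = 0;

  // True when the head of run `a` must be emitted before the head of run `b`.
  const less = (a: number, b: number): boolean => {
    const ra = runs[a][cursor[a]];
    const rb = runs[b][cursor[b]];
    const ka = keys[ra];
    const kb = keys[rb];
    return ka < kb || (ka === kb && ra < rb); // row id breaks ties deterministically
  };

  const siftDown = (start: number): void => {
    let i = start;
    for (;;) {
      const l = 2 * i + 1;
      const r = l + 1;
      let m = i;
      if (l < size && less(heap[l], heap[m])) m = l;
      if (r < size && less(heap[r], heap[m])) m = r;
      if (m === i) return;
      const t = heap[i];
      heap[i] = heap[m];
      heap[m] = t;
      i = m;
    }
  };

  // Seed the heap with every non-empty run, then heapify bottom-up.
  for (let r = 0; r < runs.length; r++) if (runs[r].length > 0) heap[size++] = r;
  for (let i = (size >> 1) - 1; i >= 0; i--) siftDown(i);

  let o = 0;
  while (size > 0) {
    const r = heap[0];
    out[o++] = runs[r][cursor[r]++];
    if (cursor[r] === runs[r].length) heap[0] = heap[--size]; // run exhausted
    siftDown(0);
  }
  return out;
}
```

Keeping the heap and cursors in typed arrays avoids per-element object allocation, and emitting row ids into a single `Uint32Array` uses 4 bytes per row, which would be consistent with the ~40MB figure at 10M rows, though the PR does not spell out the buffer layout.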

10M Benchmark

Before:

  • sorted apply ~7900ms

After:

  • sorted apply ~600ms
  • total real-worker sorted rebuild ~1.1s

Worker Runtime Stabilization

  • stabilized ESM/CJS worker resolution
  • lazily load NodeBackgroundWorkerExecutor
  • kept worker_threads out of top-level public runtime entrypoints
  • added runtime smoke coverage for auto-scheduled rebuilds
  • validated packaged worker artifacts through npm pack dry-runs
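
Lazy loading of the executor can be sketched with a cached loader. This is a generic illustration, not the PR's implementation: the `lazy` helper and the fake module are hypothetical, and in Node the loader passed in would typically be a dynamic `import()` of the module that pulls in `worker_threads`, so that importing the public entrypoint never touches it.

```typescript
// Hypothetical helper: defers a module load until first use and reuses the result.
function lazy<T>(load: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) cached = load(); // first call triggers the load
    return cached;                             // later calls reuse the same promise
  };
}

// Stand-in for the real executor module so the sketch runs anywhere.
interface FakeExecutorModule {
  createExecutor(): { kind: string };
}

let loadCount = 0;
const getExecutorModule = lazy<FakeExecutorModule>(async () => {
  loadCount++;
  return { createExecutor: () => ({ kind: "fake-worker-executor" }) };
});
```

Nothing is loaded until `getExecutorModule()` is first called, which in this design would happen only when a background rebuild is actually scheduled. One trade-off of caching the promise is that a failed load is cached too; a real implementation may want to reset the cache on rejection.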

Observability Stabilization

Expanded and stabilized additive observability surfaces:

QueryInfo / explain()

  • indexState
  • fallbackReason
  • backgroundRebuildScheduled
  • backgroundRebuildState
  • selectedIndex
  • schedulerDecision metadata

Table-level diagnostics remain intentionally internal (__debug*).
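
Because these surfaces are additive, consumers are safest treating every field as optional. The sketch below shows one way to read them defensively. Only the field names come from this PR; the interface name, every type, and the sample values are assumptions, not ColQL's published typings.

```typescript
// Field names are from the PR; all types here are guesses for illustration.
interface QueryInfoSketch {
  indexState?: string;
  fallbackReason?: string;
  backgroundRebuildScheduled?: boolean;
  backgroundRebuildState?: string;
  selectedIndex?: string;
  schedulerDecision?: Record<string, unknown>;
}

const OBSERVABILITY_FIELDS = [
  "indexState",
  "fallbackReason",
  "backgroundRebuildScheduled",
  "backgroundRebuildState",
  "selectedIndex",
  "schedulerDecision",
] as const;

// Builds a one-line summary from whichever fields are present.
function summarizeQueryInfo(info: QueryInfoSketch): string {
  const parts: string[] = [];
  for (const field of OBSERVABILITY_FIELDS) {
    const value = info[field];
    if (value !== undefined) parts.push(`${field}=${JSON.stringify(value)}`);
  }
  return parts.join(" ");
}

// Hypothetical values; the real enum members are not listed in the PR.
console.log(summarizeQueryInfo({ indexState: "dirty", backgroundRebuildScheduled: true }));
```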


Benchmark Highlights

100k rows

  • sorted: sync-preferred
  • equality: sync-preferred

1M rows

Sorted:

  • sync query ~316ms
  • real-worker total ~112ms

10M rows

Equality:

  • sync query ~142-165ms
  • real-worker ~258-303ms
  • policy remains sync-preferred

Sorted:

  • sync query ~3360-3430ms
  • real-worker total ~1080-1110ms
  • rebuild ~290-306ms
  • apply ~608-650ms
  • fallback query ~165-170ms
  • transfer/output ~40MB
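
The figures above explain the policy split. As a quick worked check, the ratios can be computed directly from the reported timings; the helper name is illustrative and the inputs are the numbers quoted in this PR.

```typescript
// Ratio > 1 means the second path is faster than the first.
const speedup = (baselineMs: number, candidateMs: number): number => baselineMs / candidateMs;

const rows = [
  { label: "1M sorted: sync vs real-worker total", ratio: speedup(316, 112) },
  { label: "10M sorted: sync vs real-worker total", ratio: speedup(3360, 1080) },
  { label: "10M equality: sync vs real-worker", ratio: speedup(142, 258) },
  { label: "10M sorted apply: before vs after", ratio: speedup(7900, 600) },
];

for (const row of rows) {
  console.log(`${row.label}: ${row.ratio.toFixed(2)}x`);
}
```

Sorted rebuilds come out roughly 3x faster through the worker at 1M and 10M rows, while the equality ratio falls below 1, meaning the worker path is slower there, which matches equality remaining sync-preferred.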

Important Runtime Philosophy

Background indexing is intentionally narrow.

It is designed as:

  • latency isolation for large dirty sorted rebuilds

It is NOT:

  • a universal worker speedup system
  • a distributed cache/database
  • an async query engine

Validation

Passed:

  • npm test
  • npm run test:types
  • npm run build
  • npm run test:worker-runtime
  • npm run bench:codspeed
  • npm run benchmark:background-indexing -- --json
  • COLQL_BENCH_LARGE=1 npm run benchmark:background-indexing -- --json
  • COLQL_BENCH_10M=1 npm run benchmark:background-indexing -- --json
  • npm pack --dry-run

Known Limitations

  • background indexing currently targets only large dirty sorted indexes that are SharedArrayBuffer (SAB) / zero-copy eligible
  • equality rebuilds remain sync-preferred
  • browser worker runtime is not officially supported yet
  • no public indexing scheduler config exists yet
  • table diagnostics remain internal/debug-only

emremy added 4 commits May 19, 2026 01:31
… sorted indexes

This update enhances ColQL v0.7.0 by implementing conservative auto scheduling for eligible large dirty sorted indexes while maintaining synchronous public query APIs. Equality indexes and small sorted indexes continue to prefer synchronous rebuild behavior. The `query.explain()` method now includes diagnostics for background rebuild eligibility and scheduling decisions, improving overall query performance and diagnostics.
@emremy emremy self-assigned this May 19, 2026
@emremy emremy added the release label May 19, 2026

codspeed-hq Bot commented May 19, 2026

Merging this PR will improve performance by 31.7%

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

⚡ 1 improved benchmark
✅ 24 untouched benchmarks

Performance Changes

Benchmark | BASE | HEAD | Efficiency
background/sorted/numeric-encode-merge/10k | 50.4 ms | 38.3 ms | +31.7%

Comparing v0.7.0_prepare_new_version (b7e42c8) with main (1eb57e6)

@emremy emremy merged commit bfd7cb1 into main May 19, 2026
5 checks passed
@emremy emremy deleted the v0.7.0_prepare_new_version branch May 19, 2026 13:37