perf(native): fix full-build regression from NativeDbProxy overhead#906

Open
carlos-alm wants to merge 2 commits into main from fix/native-fullbuild-regression

Conversation

@carlos-alm
Contributor

Summary

Fixes #903 — native full-build regression in 3.9.2 (9.4s vs 5.2s in 3.9.1, +81%).

The NativeDbProxy introduced in #897 has per-statement napi serialization overhead (2 round-trips per .run()) that dominates when multiplied across 667 files × thousands of SQL operations in structure/analysis phases.

Three fixes:

  • Enable bulkInsertCfg native fast path (cfg.ts): The Rust method handles delete-before-insert atomically on a single rusqlite connection, eliminating the dual-connection WAL concern that originally blocked it. Replaces hundreds of individual proxy SQL calls with a single napi bulk call.
  • Hand off to better-sqlite3 after orchestrator (pipeline.ts): After the Rust orchestrator completes, switch from proxy to direct better-sqlite3 for JS post-processing (structure + analysis). Properly wires suspendJsDb/resumeJsDb WAL callbacks and transfers the advisory lock to the new connection.
  • Optimize NativeDbProxy.run() (native-db-proxy.ts): Only query last_insert_rowid() for INSERT statements — DELETE/UPDATE skip the extra napi call, halving per-statement overhead.
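
The round-trip saving from the third fix can be illustrated with a small counting model. This is only a sketch: `FakeNativeDb`, `makeCountingDb`, and the standalone `run` function are hypothetical stand-ins that count boundary crossings, not the repository's NativeDbProxy or its napi bindings.

```typescript
// Hypothetical stand-in for the napi-backed native handle; the real
// native interface is not shown in this PR description.
interface FakeNativeDb {
  exec(sql: string, params: unknown[]): void;
  queryGet(sql: string, params: unknown[]): { rid: number } | null;
}

// Builds a fake handle that counts every call that would cross the
// JS/Rust boundary.
function makeCountingDb(): { db: FakeNativeDb; calls: () => number } {
  let calls = 0;
  let lastRowid = 0;
  const db: FakeNativeDb = {
    exec(sql) {
      calls++;
      // Pretend each INSERT creates one new row.
      if (/^\s*insert/i.test(sql)) lastRowid++;
    },
    queryGet() {
      calls++;
      return { rid: lastRowid };
    },
  };
  return { db, calls: () => calls };
}

// Mirrors the optimization described above: only INSERT pays for the
// second round-trip that fetches last_insert_rowid().
function run(db: FakeNativeDb, sql: string, params: unknown[] = []) {
  const isInsert = sql.trimStart().substring(0, 6).toUpperCase() === 'INSERT';
  db.exec(sql, params);
  if (!isInsert) return { changes: 0, lastInsertRowid: 0 };
  const row = db.queryGet('SELECT last_insert_rowid() AS rid', []);
  return { changes: 0, lastInsertRowid: row?.rid ?? 0 };
}
```

In this model a DELETE or UPDATE costs one crossing and an INSERT costs two, which is the halving the summary describes for non-INSERT statements.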

Expected impact

| Phase | Before (proxy) | After | Improvement |
|---|---|---|---|
| CFG (1-file) | 53.6ms | ~1ms | ~50x |
| Structure (1-file) | 127.9ms | ~35ms | ~3.5x |
| Dataflow (1-file) | 31.3ms | ~1ms | ~30x |
| Full build | 9.4s | ~5.5s | ~40% |
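
As a sanity check on the figures above, the ratios can be recomputed directly. This is a minimal sketch: the "after" values are the PR's estimates rather than measurements, and the variable names are illustrative only.

```typescript
// Figures copied from the "Expected impact" table; the "after" values are
// estimates from the PR description, not benchmark results.
const phases = [
  { phase: 'CFG (1-file)', beforeMs: 53.6, afterMs: 1 },
  { phase: 'Structure (1-file)', beforeMs: 127.9, afterMs: 35 },
  { phase: 'Dataflow (1-file)', beforeMs: 31.3, afterMs: 1 },
  { phase: 'Full build', beforeMs: 9400, afterMs: 5500 },
];

// speedup is the "Nx" figure; timeSavedPct is the percentage reduction.
const summary = phases.map(({ phase, beforeMs, afterMs }) => ({
  phase,
  speedup: beforeMs / afterMs,
  timeSavedPct: (1 - afterMs / beforeMs) * 100,
}));

// Size of the regression quoted in the summary: 5.2s in 3.9.1 vs 9.4s in 3.9.2.
const regressionPct = (9.4 / 5.2 - 1) * 100;
```

The per-phase rows are quoted as speedup factors while the full-build row is quoted as a percentage reduction, so the two kinds of figure are not directly comparable.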

Test plan

  • TypeScript compilation passes (tsc --noEmit)
  • Biome lint passes on all changed files
  • CI integration tests pass
  • Verify with next benchmark run that native full-build time ≤ WASM

…903)

The NativeDbProxy introduced in #897 has per-statement napi serialization
overhead (2 round-trips per .run()) that causes a full-build regression
when multiplied across 667 files (9.4s vs 5.2s in 3.9.1).

Three fixes:

1. Enable bulkInsertCfg native fast path in buildCFGData. The Rust method
   handles delete-before-insert atomically on a single rusqlite connection,
   eliminating the dual-connection WAL concern that originally blocked it.

2. After the Rust orchestrator completes, hand off to better-sqlite3 for
   JS post-processing (structure + analysis) instead of routing through
   the slow proxy. Wire suspendJsDb/resumeJsDb WAL callbacks and transfer
   the advisory lock to the new connection.

3. Optimize NativeDbProxy.run() to only query last_insert_rowid() for
   INSERT statements, halving per-statement overhead for DELETE/UPDATE.

Closes #903
@claude

claude bot commented Apr 9, 2026

Claude finished @carlos-alm's task —— View job


PR Review: Native Full-Build Performance Fix

Reviewing this performance regression fix according to the 5-phase review process.

Root Cause Analysis

Problem Being Solved: Native engine full-build regression from 5.2s (v3.9.1) to 9.4s (v3.9.2), an 81% slowdown caused by NativeDbProxy per-statement napi serialization overhead.

Why This Approach: The regression was introduced by #897's NativeDbProxy which adds 2 round-trips per SQL .run() operation. With 667 files × thousands of operations in structure/analysis phases, this serialization overhead dominates. The three fixes are targeted and justified:

  1. Enable bulkInsertCfg native fast path - The Rust method handles delete-before-insert atomically, eliminating the dual-connection WAL concern
  2. Hand off to better-sqlite3 after orchestrator - Switch from proxy to direct better-sqlite3 for JS post-processing phases where napi overhead hurts most
  3. Optimize NativeDbProxy.run() - Only query last_insert_rowid() for INSERT statements, halving per-statement overhead

Risk Assessment: LOW - These are surgical optimizations that don't change core logic, only reduce unnecessary overhead.

Code Quality Analysis

native-db-proxy.ts:30-48 ✅ Well-designed optimization

const isInsert = sql.trimStart().substring(0, 6).toUpperCase() === 'INSERT';
// ... later in run()
if (isInsert) {
  const row = ndb.queryGet('SELECT last_insert_rowid() AS rid', []) as {
    rid: number;
  } | null;
  return { changes: 0, lastInsertRowid: row?.rid ?? 0 };
}
return { changes: 0, lastInsertRowid: 0 };
  • Correct approach: Only INSERT statements need last_insert_rowid()
  • Safe string parsing: Handles whitespace with trimStart() and case normalization
  • Consistent interface: Still returns the expected { changes, lastInsertRowid } shape

pipeline.ts:652-668 ✅ Proper handoff logic

// Always hand off to better-sqlite3 for JS post-processing.
// The NativeDbProxy has per-statement napi serialization overhead that
// makes structure/analysis phases significantly slower than direct better-sqlite3.
const lockPath = ctx.nativeFirstProxy 
  ? (ctx.db as unknown as { __lockPath?: string }).__lockPath 
  : undefined;
if (ctx.nativeFirstProxy) ctx.nativeFirstProxy = false;
if (!handoffWalAfterNativeBuild(ctx)) {
  return formatNativeTimingResult(p, 0, analysisTiming);
}
// Transfer advisory lock ownership from proxy to new better-sqlite3 connection
if (lockPath) (ctx.db as unknown as { __lockPath?: string }).__lockPath = lockPath;
  • Correct WAL handling: Properly closes native connection and reopens JS connection
  • Lock transfer: Correctly preserves advisory lock ownership during handoff
  • Error handling: Returns partial result if DB reopen fails

cfg.ts:386-430 ✅ Native bulk-insert enablement

if (allNative && nativeDb?.bulkInsertCfg) {
  // ... build entries array
  let inserted: number;
  try {
    engineOpts?.suspendJsDb?.();
    inserted = nativeDb.bulkInsertCfg(entries);
  } finally {
    engineOpts?.resumeJsDb?.();
  }
  info(`CFG (native bulk): ${inserted} functions analyzed`);
}
  • Proper WAL coordination: Uses suspendJsDb/resumeJsDb to prevent dual-connection conflicts
  • Atomic batching: Replaces hundreds of individual SQL calls with single napi bulk call
  • Try/finally safety: Ensures resumeJsDb is called even if bulkInsertCfg throws
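
The suspend/resume bracket in the snippet above could be factored into a small helper. A minimal sketch, assuming only the two optional callbacks named in the PR; `WalCallbacks` and `withSuspendedJsDb` are hypothetical names, not code from the repository.

```typescript
// Shape of the two optional callbacks named in the PR. Everything else
// here is a hypothetical helper for illustration.
interface WalCallbacks {
  suspendJsDb?: () => void;
  resumeJsDb?: () => void;
}

// Runs `work` while the JS connection is suspended and always resumes it.
function withSuspendedJsDb<T>(opts: WalCallbacks | undefined, work: () => T): T {
  try {
    // Inside the try block, as in the PR, so resume also runs if suspend throws.
    opts?.suspendJsDb?.();
    return work();
  } finally {
    // Runs on success and on throw, so the JS connection is never left suspended.
    opts?.resumeJsDb?.();
  }
}
```

With such a helper the call site would reduce to something like `withSuspendedJsDb(engineOpts, () => nativeDb.bulkInsertCfg(entries))`, and the return value no longer needs a separately declared variable.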

Backlog Compliance

  • Zero-dep: ✅ - No new runtime dependencies
  • Foundation-aligned: ✅ - Improves performance without changing behavior
  • Problem-fit: 4 - Directly addresses build performance (core AI workflow efficiency)
  • Breaking: No - Purely internal optimizations, no API changes
  • Tier: 1 (zero-dep ✅ + foundation-aligned ✅)

Test Plan Assessment

The PR shows partial test completion:

  • ✅ TypeScript compilation passes
  • ✅ Biome lint passes on changed files
  • ❌ CI integration tests pending
  • ❌ Benchmark verification pending

Final Recommendation

  • Rating: ⭐⭐⭐⭐ (4/5)
  • Action: APPROVE WITH CONDITIONS
  • Reasoning: Well-analyzed performance fix with surgical optimizations. The three changes directly address the stated bottlenecks without risking correctness. Code quality is high with proper error handling and WAL coordination. Only condition is verifying the expected performance improvement through benchmarks.

Conditions for merge:

  1. CI tests must pass
  2. Benchmark verification should confirm ~40% improvement and native ≤ WASM performance
  3. Consider adding a regression test to prevent future NativeDbProxy overhead issues

@github-actions
Contributor

github-actions bot commented Apr 9, 2026

Codegraph Impact Analysis

6 functions changed, 11 callers affected across 7 files

  • NativeDbProxy in src/domain/graph/builder/native-db-proxy.ts:19 (5 transitive callers)
  • NativeDbProxy.prepare in src/domain/graph/builder/native-db-proxy.ts:28 (0 transitive callers)
  • NativeDbProxy.run in src/domain/graph/builder/native-db-proxy.ts:40 (0 transitive callers)
  • runPostNativeAnalysis in src/domain/graph/builder/pipeline.ts:456 (4 transitive callers)
  • tryNativeOrchestrator in src/domain/graph/builder/pipeline.ts:567 (5 transitive callers)
  • buildCFGData in src/features/cfg.ts:368 (4 transitive callers)

@greptile-apps
Contributor

greptile-apps bot commented Apr 9, 2026

Greptile Summary

Fixes the 3.9.2 full-build regression (~81% slowdown) caused by per-statement napi serialization overhead in NativeDbProxy. Three targeted changes: always hand off from the proxy to direct better-sqlite3 after the Rust orchestrator finishes, wire suspendJsDb/resumeJsDb WAL callbacks so the native bulk-insert path can safely interleave with the JS connection, and enable bulkInsertCfg in cfg.ts now that the Rust side handles delete-before-insert atomically on its own connection.

Confidence Score: 5/5

Safe to merge — the WAL handoff and bulk-insert fast path are correctly implemented; remaining findings are P2 cleanup items.

All findings are P2 (stale comment, theoretical INSERT-detection gap). The core WAL suspend/resume lifecycle, the proxy-to-better-sqlite3 handoff, and the bulkInsertCfg gating are logically sound. No data-integrity or correctness issues found.

pipeline.ts — stale comment on runPostNativeAnalysis (line 523) worth cleaning up before this pattern spreads.

Vulnerabilities

No security concerns identified.

Important Files Changed

| Filename | Overview |
|---|---|
| src/domain/graph/builder/native-db-proxy.ts | Optimizes .run() to skip last_insert_rowid query for non-INSERT statements; minor edge-case gap for CTE-prefixed inserts and REPLACE INTO. |
| src/domain/graph/builder/pipeline.ts | Always hands off from NativeDbProxy to better-sqlite3 after the orchestrator; correctly wires suspendJsDb/resumeJsDb; stale comment and dead !ctx.nativeFirstProxy guard remain in runPostNativeAnalysis. |
| src/features/cfg.ts | Enables the native bulkInsertCfg fast path (previously blocked), correctly scoped behind allNative && nativeDb?.bulkInsertCfg with WAL suspend/resume guards. |
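
The edge-case gap noted for native-db-proxy.ts (CTE-prefixed inserts and REPLACE INTO) could be narrowed with a slightly broader check. The following is a sketch only: `mayChangeLastInsertRowid` is a hypothetical helper, not part of this PR, and it is a keyword heuristic rather than a SQL parser.

```typescript
// Hypothetical replacement for the 6-character prefix check. It is a
// heuristic: leading SQL comments are not handled, and a CTE that merely
// calls the replace() function is reported as a possible insert.
function mayChangeLastInsertRowid(sql: string): boolean {
  const head = sql.trimStart().toUpperCase();
  // REPLACE INTO is SQLite shorthand for INSERT OR REPLACE INTO.
  if (head.startsWith('INSERT') || head.startsWith('REPLACE')) return true;
  // WITH ... INSERT / WITH ... REPLACE: look for the keyword after the CTE.
  if (head.startsWith('WITH')) return /\b(INSERT|REPLACE)\b/.test(head);
  return false;
}
```

The heuristic deliberately errs toward returning true: a false positive costs one extra last_insert_rowid() round-trip, whereas a false negative would report a lastInsertRowid of 0 for a statement that did insert a row.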

Sequence Diagram

sequenceDiagram
    participant O as tryNativeOrchestrator
    participant R as Rust buildGraph
    participant H as handoffWalAfterNativeBuild
    participant JS as better-sqlite3 (ctx.db)
    participant NA as NativeDb (analysis)
    participant CFG as buildCFGData

    O->>R: buildGraph() via nativeDb
    R-->>O: result (nodes/edges)
    O->>H: always hand off (nativeFirstProxy=false)
    H->>H: closeNativeDb (WAL checkpoint + close)
    H->>JS: openDb() — fresh better-sqlite3
    O->>NA: runPostNativeAnalysis()
    NA->>NA: openReadWrite (fresh nativeDb for analysis)
    NA->>NA: wire suspendJsDb / resumeJsDb
    NA->>CFG: buildCFGData(db, fileSymbols, engineOpts)
    CFG->>JS: getFunctionNodeId() reads
    CFG->>JS: suspendJsDb() — WAL checkpoint
    CFG->>NA: bulkInsertCfg(entries) — atomic delete+insert
    CFG->>NA: resumeJsDb() — WAL checkpoint
    NA->>NA: close nativeDb after analyses
    O->>O: closeDbPair(ctx.db, ctx.nativeDb)

Reviews (2): Last reviewed commit: "fix: address review feedback — remove re..."

Comment on lines +658 to +668
const lockPath = ctx.nativeFirstProxy
  ? (ctx.db as unknown as { __lockPath?: string }).__lockPath
  : undefined;
if (ctx.nativeFirstProxy) ctx.nativeFirstProxy = false;
if (!handoffWalAfterNativeBuild(ctx)) {
  // DB reopen failed — return partial result
  return formatNativeTimingResult(p, 0, analysisTiming);
}
// Transfer advisory lock ownership from the proxy to the new better-sqlite3
// connection so closeDbPair releases it at the end.
if (lockPath) (ctx.db as unknown as { __lockPath?: string }).__lockPath = lockPath;
Contributor

P2 Lock-path transfer is redundant

openDb() always calls acquireAdvisoryLock(dbPath) and then immediately sets db.__lockPath = `${dbPath}.lock` on the returned connection (see connection.ts:165–170). The transfer at line 668 re-assigns the same string that was already written by openDb, so it is a no-op. The lockPath capture and conditional assign can be removed to reduce confusion about ownership semantics.

Suggested change
- const lockPath = ctx.nativeFirstProxy
-   ? (ctx.db as unknown as { __lockPath?: string }).__lockPath
-   : undefined;
- if (ctx.nativeFirstProxy) ctx.nativeFirstProxy = false;
- if (!handoffWalAfterNativeBuild(ctx)) {
-   // DB reopen failed — return partial result
-   return formatNativeTimingResult(p, 0, analysisTiming);
- }
- // Transfer advisory lock ownership from the proxy to the new better-sqlite3
- // connection so closeDbPair releases it at the end.
- if (lockPath) (ctx.db as unknown as { __lockPath?: string }).__lockPath = lockPath;
+ if (ctx.nativeFirstProxy) ctx.nativeFirstProxy = false;
+ if (!handoffWalAfterNativeBuild(ctx)) {
+   // DB reopen failed — return partial result
+   return formatNativeTimingResult(p, 0, analysisTiming);
+ }

Contributor Author

Fixed — removed the redundant lockPath capture and conditional transfer. As noted, openDb() already calls acquireAdvisoryLock(dbPath) and sets __lockPath on the new connection (connection.ts:165-170), so the transfer was a no-op.

Comment on lines +420 to +429
if (entries.length > 0) {
  let inserted: number;
  try {
    engineOpts?.suspendJsDb?.();
    inserted = nativeDb.bulkInsertCfg(entries);
  } finally {
    engineOpts?.resumeJsDb?.();
  }
  info(`CFG (native bulk): ${inserted} functions analyzed`);
}
Contributor

P2 inserted is uninitialized if suspendJsDb throws before bulkInsertCfg runs

TypeScript accepts this because a throw inside the try block means info() is unreachable, so the compiler knows inserted is assigned when info() is actually executed. Runtime behaviour is correct. However, initialising inserted = 0 makes the intent explicit and avoids potential confusion if a catch block is ever added later.

Suggested change
- if (entries.length > 0) {
-   let inserted: number;
-   try {
-     engineOpts?.suspendJsDb?.();
-     inserted = nativeDb.bulkInsertCfg(entries);
-   } finally {
-     engineOpts?.resumeJsDb?.();
-   }
-   info(`CFG (native bulk): ${inserted} functions analyzed`);
- }
+ if (entries.length > 0) {
+   let inserted = 0;
+   try {
+     engineOpts?.suspendJsDb?.();
+     inserted = nativeDb.bulkInsertCfg(entries);
+   } finally {
+     engineOpts?.resumeJsDb?.();
+   }
+   info(`CFG (native bulk): ${inserted} functions analyzed`);
+ }

Contributor Author

Fixed — initialized inserted = 0 for explicit intent, matching the suggestion exactly.

…d dead branch (#906)

- Remove redundant lockPath capture and transfer in tryNativeOrchestrator
  (openDb already sets __lockPath on the new connection)
- Remove dead else-if branch in runPostNativeAnalysis (ctx.nativeFirstProxy
  is always false after the unconditional reset in tryNativeOrchestrator)
- Initialize inserted = 0 in buildCFGData for explicit intent
@carlos-alm
Contributor Author

Addressed Greptile's outside-diff comment (P2: dead branch at pipeline.ts:504-506): removed the else if (ctx.nativeFirstProxy) arm in runPostNativeAnalysis, since ctx.nativeFirstProxy is unconditionally set to false before this function is called from tryNativeOrchestrator.

@carlos-alm
Contributor Author

@greptileai

carlos-alm added a commit that referenced this pull request Apr 9, 2026
NativeDbProxy overhead causes native full build to regress +81%
(5206ms -> 9403ms). Fix tracked in PR #906. Add to KNOWN_REGRESSIONS
to unblock this benchmark data PR.

Development

Successfully merging this pull request may close these issues.

perf: native engine full-build regression in 3.9.2 (9.4s vs WASM 7.2s)