Phase 4: per-phase perf campaign vs native git on codex — TTLs, protocol levers (noflush/noopen/io_uring), bulk+streamed clone, POSIX and lifecycle fixes by factory-ain3sh · Pull Request #2 · Factory-AI/vfs

factory-ain3sh · 2026-07-03T03:36:36Z

What this is

77 commits closing the VFS performance gap against native git on the canonical openai/codex workload, plus the correctness and lifecycle work the campaign surfaced. Goal metric: per-phase wall-clock ratio <= 1.5x native.

Where we landed (codex, multi n=5)

phase	before	after	bar (<=1.5x)
read_search	~4.7x	1.37-1.41x	met
status	~1.9x	0.60-0.93x	met
diff	~80ms	18ms (0.05x)	met
checkout	—	0.42x	met
fsck	—	0.83x	met
clone (agentfs clone)	9.6x plain	2.22x (0.754s)	floored: whole-state double write (pack+worktree 2x43MB into SQLite)
edit	~8ms	6ms codex / 2.5-2.8ms micro floor	micro target met; residual = kernel close-inval + fsync txn floor
read-path warm	~4.7x	~2.1-2.4x	floored: kernel close-time STATX_BLOCKS invalidation (upstream patch written + VM-validated, see .agents/kernel/)

Every remaining miss carries a named, measured floor documented in .agents/specs/.
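As a rough illustration of the goal metric, the sketch below checks per-phase ratios against the 1.5x bar. The ratios are the upper ends of the "after" column for the phases reported as ratios; the helper name `phases_over_bar` is hypothetical and not part of the repository.

```python
# Hypothetical helper: flag phases whose agentfs/native ratio exceeds the bar.
BAR = 1.5

# Worst-case "after" ratios copied from the scoreboard above.
after_ratio = {
    "read_search": 1.41,
    "status": 0.93,
    "diff": 0.05,
    "checkout": 0.42,
    "fsck": 0.83,
    "clone": 2.22,
}

def phases_over_bar(ratios, bar=BAR):
    """Return the phases whose ratio is above the bar, sorted by name."""
    return sorted(name for name, ratio in ratios.items() if ratio > bar)

print(phases_over_bar(after_ratio))
```

With these inputs only clone remains above the bar, which matches the "floored" entry in the scoreboard.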

Highlights

Perf (default-on, each with a kill switch):

Kernel entry/attr TTLs 1s -> 10s; keep-cache with fingerprint revalidation; FOPEN_CACHE_DIR
ENOSYS-FLUSH (no close-time round trip) and ENOSYS-OPEN (kernel no_open: zero-message opens via shared per-inode file table)
FUSE-over-io_uring transport (vendored fuser ABI 7.31 -> 7.42; probe-gated, falls back to legacy channel)
Write batcher: cross-inode group commit, no drain on FORGET, self-invalidation suppression
agentfs clone: bulk ingest via SDK import_entries -> streamed ImportSession pipeline (cat-file parse overlapped with import), fabricated git index for clean first status

Correctness / lifecycle:

POSIX unlink-while-open: OpenInodes registry with deferred reaping, nlink=0 orphan sweep, integrity invariant amended
Signal handling: supervise_child + PDEATHSIG in exec, session teardown in mount — no more orphaned processes or stale mounts on TERM/INT
noopen-coherence gate (6 scenarios incl. full POSIX unlink-while-open), durability and integrity gates green; 168 SDK + 109 CLI tests

Measurement infrastructure:

git-workload benchmark pinned to the codex fixture by default (a synthetic-fixture drift mid-campaign was caught by session audit and voided; measurement contract now pinned in the roadmap spec)
Per-op FUSE dispatch counters, profile checkpoints, A/B multi-run harness

Upstream:

Root-caused the read-path floor to fuse_flush's unconditional STATX_BLOCKS invalidation; 17-line FUSE_I_BLOCKS_DIRTY kernel patch written and VM-validated (GETATTR storm 1095 -> 70, 2.2x cycle time; du/mmap correctness intact). Patch archived in .agents/kernel/, pending sign-off + submission.

Tighten v0.5 copy migration and FUSE coalescing after review: preserve overlay config, normalize whiteout parent paths, keep legacy migrate v0.4-only, stream sparse/large file migration and verification, lock/hash the source DB family, and flush FUSE writes across getattr/truncate/cross-handle ordering boundaries. Profiling coverage now records FUSE flush count/ranges/bytes so coalescer effectiveness is visible in AGENTFS_PROFILE summaries. Validation passed SDK fmt/clippy/tests, CLI fmt/check/clippy/tests, cli/tests/all.sh, phase0 smoke, replay smoke, and diff whitespace checks; pjdfstest skipped with exit 77 because pjdfstest is not installed. Benchmark results: the local bounded read smoke on /home/ain3sh/factory/factory-mono improved from the earlier Phase 3 baseline of ~125.8x native to 15.17x native with stdout-equivalent output; the synthetic phase0 smoke measured 16.53x native. This is a material profiling/benchmarking improvement but still above the north-star 1.5-2x target. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

Document the v0.5 schema/write-path architecture, copy-only migration, and pjdfstest operating model so Phase 5 starts from an explicit correctness baseline. Add a phase45-ci pjdfstest profile that passes under the current unprivileged FUSE contract, emits selected-test and known-gap report artifacts, and reserves exit 77 for missing prerequisites only. CI now builds pjdfstest and runs the supported profile, while full pjdfstest remains available for Phase 5 triage. Validation: bash -n scripts/validation/posix/run-pjdfstest.sh; run-pjdfstest.sh --list-profiles; run-pjdfstest.sh --profile phase45-ci (37 files, 142 tests, PASS); scripts/validation/phase0.sh; git diff --check. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

Resolve follow-up review findings across the Phase 5 prototype: partial-origin opens now resolve persisted base paths instead of volatile HostFS inodes, detect base-size drift, cover remount/readdir_plus/rename/truncate cases, and keep metadata-only regular-file updates on the partial-origin path. Tighten NFS write-handle semantics with random bounded write handles and SETATTR/truncate authorization tests. Clean up validation docs/manifests so supported chown tests do not overlap known gaps, selected pjdfstest manifests are reported with path/hash, large-edit benchmark reports partial-origin tables, and backend-risk commands reference existing replay tooling. Validation: SDK fmt/check/clippy plus focused partial-origin tests; CLI fmt/check/clippy plus NFS handler tests; phase45-ci and phase5-ci pjdfstest; phase0 smoke; validation helper syntax/self-tests; large-edit/backend-risk smoke; git diff --check. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

Record Phase 5.5 backend spike results and make the helper capture measured validation outcomes for future upgrade/fallback decisions. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

Ships the accumulated phase 6 / 6.5 / 7 / 8 north-star work that was staged on phase4-north-star-implementation but uncommitted:

- Phase 6: partial-origin overlay with portable/inline storage, secure read-only passthrough plumbing, materialize/integrity/backup production safety commands, encrypted-database key/cipher handling, fuse cache invalidation tests
- Phase 6.5: read-path fast path (cached inode attrs, read profiler, cache tuning, instrumentation counters, passthrough rules)
- Phase 7: principle-preserving git workload fast path (write batcher, FUSE concurrency lanes, cache plumbing, git workload gates)
- Phase 8: parallel FUSE dispatch + bounded worker pool with shared read lane and exclusive write lane, deferred kernel-cache invalidation infrastructure, writeback-cache configuration, phase 8 validation gates (concurrent-git-stress, writeback-durability, writeback-no-fsync-crash, fuse-serialization-stress)

Touches: cli/src/{cmd,fuser,mount,nfs,nfsserve,sandbox}/, cli/src/fuse.rs, cli/src/opts.rs, cli/src/main.rs, sdk/rust/src/{filesystem,profiling}.rs, sdk/rust/src/lib.rs, sandbox/src/vfs/sqlite.rs, sandbox/Cargo.lock, MANUAL.md, README.md, SPEC.md, TESTING.md, validation scripts, .agents/specs/ phase markers, and .agents/05_* session notes.

NOTE: a small portion of the Tier One delta (MutationAudit struct, the 3 kernel-cache default flips in fuse.rs, the rewritten fuse_sync_inval_enabled_from_env() body, the rewritten FuseKernelCacheConfig::from_env body, the 4 reworded warn messages, and the matching FUSE controls section in MANUAL.md/TESTING.md) is bundled into this commit because it is textually intermingled with prior phase 6-8 work in the same files.
The cleanly-separable Tier One CODE (the fuse-modern abi-7-* cascade in cli/Cargo.toml, the FuseDispatchMode::from_env auto default in cli/src/fuser/session.rs, and the clippy fix in cli/src/sandbox/linux.rs) lands in the next commit; Tier One artifacts (spec, RCA notes, multi-iter benchmark wrapper, baseline + post-impl aggregate JSONs) land in the commit after that. See .agents/specs/2026-05-24-tier-one-spec-enable-kernel-cache-by-default-37x-8-12x.notes.md for the full RCA covering both the ABI cascade bug and the sync_inval deadlock. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

Three small, cleanly-separable code changes that complete the Tier One default-on kernel cache. The intermingled remainder of the Tier One delta (default flips and MutationAudit infrastructure in fuse.rs, deferred-by-default invalidation, FUSE controls section in MANUAL.md and TESTING.md) is bundled into the preceding backlog commit because the same files also carry ~7,000 lines of phase 6-8 work; this commit contains the only Tier One edits that land in files we did not also modify for the backlog.

cli/Cargo.toml: add fuse-modern umbrella feature enabling abi-7-19 through abi-7-31 and add it to the default feature set. The vendored fuser dispatcher gates each FUSE opcode behind its abi-7-N cfg, so without this cascade the kernel sends opcode 44 (FUSE_READDIRPLUS) and the dispatcher returns ENOSYS, breaking any readdir on the mount once the kernel cache fast path is enabled.

cli/src/fuser/session.rs: change FuseDispatchMode::from_env()'s unset branch from Self::Serial to the same auto resolution used by AGENTFS_FUSE_WORKERS=auto, so the worker pool is on by default. This is the matching half of the kernel-cache fast-path default flip in cli/src/fuse.rs (which is in the backlog commit).

cli/src/sandbox/linux.rs: silence clippy::too_many_arguments on run_cmd() so cargo clippy -D warnings keeps passing under the lint profile we re-ran during Tier One ship validation.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

…pper

Adds the artifacts that accompany the default-on kernel cache work:

- .agents/specs/2026-05-24-tier-one-spec-*.md: approved spec describing the Tier One scope (env-var default flip + invalidation audit + abi-7-* feature cascade) and the 8-12x target.
- .agents/specs/2026-05-24-tier-one-spec-*.notes.md: implementation notes covering the RCA for both latent bugs surfaced by the default flip (FUSE_READDIRPLUS ENOSYS via missing abi-7-21, and the sync_invalidation + parallel-workers deadlock on git fork/fsync), plus the post-impl benchmark comparison vs the baselines below.
- .agents/benchmarks/baseline-current-default.agg.json: 5-iter median baseline of the current branch BEFORE Tier One (overall 4.46x).
- .agents/benchmarks/baseline-main-default.agg.json: 5-iter median baseline of origin/main 3a5ed2b AgentFS 0.6.4 (overall 3.85x).
- .agents/benchmarks/post-impl-default.agg.json: 3-iter median after Tier One (overall 2.92x; clone 7.21x, checkout 1.55x, status 1.10x, read_search 2.19x, edit 9.19x [native sub-ms], diff 0.79x [faster than native]). 21% improvement vs current baseline; 24% vs main.
- .agents/benchmarks/{baseline-*,run-*}.json: per-iteration raw JSON preserved for reproducibility.
- .agents/benchmarks/fixtures/README.md: reproduction notes; the ~63 MiB openai/codex bare clone itself is gitignored.
- scripts/validation/git-workload-benchmark-multi.py: non-invasive multi-iteration wrapper around git-workload-benchmark.py that reports median + p25/p75 + stdev per phase, used as the canonical performance measurement going forward.

Also updates .gitignore for Python __pycache__/ and the large benchmark fixture directory.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
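The per-phase aggregation that the multi-iteration wrapper is described as reporting (median, p25/p75, stdev) can be sketched as follows. This is not the code in git-workload-benchmark-multi.py; the function names, the input shape, and the choice of the inclusive quantile method are assumptions, and the timings are placeholders rather than measurements.

```python
import statistics

def summarize(samples):
    """Median, quartiles, and sample stdev for one phase (needs >= 2 samples)."""
    p25, median, p75 = statistics.quantiles(samples, n=4, method="inclusive")
    return {
        "median": median,
        "p25": p25,
        "p75": p75,
        "stdev": statistics.stdev(samples),
    }

def summarize_runs(runs):
    """runs: list of {phase: seconds} dicts, one dict per iteration."""
    phases = runs[0].keys()
    return {phase: summarize([run[phase] for run in runs]) for phase in phases}

# Placeholder timings for illustration only; these are not measured results.
runs = [
    {"status": s, "diff": d}
    for s, d in [(1.0, 0.2), (2.0, 0.4), (3.0, 0.6), (4.0, 0.8), (5.0, 1.0)]
]
report = summarize_runs(runs)
print(report["status"])
```

Reporting the quartiles next to the median makes run-to-run noise visible, which matters for the later tiers where stdev is of the same order as the effect being measured.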

Tier Two prep: direct head-to-head measurement of the same three workloads (read-heavy, copy-on-write, mixed git) against native, the original AgentFS at origin/main 3a5ed2b, and Tier One AgentFS at HEAD 9be0da4. Both agentfs binaries built from clean release profiles on the same machine, no AGENTFS_FUSE_* env vars set.

Headline (ratio of agentfs / native; lower is better):

| Workload | Original | Tier One | Delta |
| read-heavy (full run, w/ startup) | 2.51x | 3.03x | +21% |
| read-heavy (steady-state only) | 7.76x | 3.79x | -51% |
| copy-on-write 50 MiB edit | 8.19x | 5.42x | -34% |
| mixed git workload (median) | 5.16x | 3.21x | -38% |

Bonus: CoW delta DB growth for the single-byte edit dropped from 172.6 MiB to 50.4 MiB (-71%).

Tier One regressed read-heavy full-run startup by ~10-15 ms because the mount now negotiates parallel workers + readdirplus + writeback + ABI 7.31 at FUSE init; this is amortised on sustained workloads (see the steady-state row dropping 51%) but matters for short-lived sandboxes. Captured as a Tier Two focus item.

Files: COMPARISON.md (human-readable tables + Tier Two focus notes) plus the 6 raw per-run JSONs for reproducibility. Tracks at .agents/benchmarks/tier-two-prep/.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
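The Delta column is the relative change of the Tier One ratio against the original ratio. A minimal worked check, using only the headline numbers quoted in this commit message (the helper name `pct_delta` is hypothetical):

```python
def pct_delta(before, after):
    """Relative change from before to after, rounded to a whole percent."""
    return round((after - before) / before * 100)

# (original, tier_one) pairs copied from the headline numbers above.
headline = {
    "read-heavy (full run)": (2.51, 3.03),
    "read-heavy (steady-state)": (7.76, 3.79),
    "copy-on-write 50 MiB edit": (8.19, 5.42),
    "mixed git workload": (5.16, 3.21),
    "CoW delta DB growth (MiB)": (172.6, 50.4),
}

for name, (before, after) in headline.items():
    print(f"{name}: {pct_delta(before, after):+d}%")
```

Negative values are improvements because every figure is either a ratio against native or a size, so lower is better.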

Closes the read-path gap with a HostFS passthrough for unmodified partial-origin delta inodes (Axis C) and cuts clone-phase write overhead with both cross-inode batched commits (Axis A1) and a FUSE-layer per-fh write coalescer (Axis A2). Bundles the two Tier One cleanups (release-first agentfs binary resolver, feature-gated FUSE_DO_READDIRPLUS capability negotiation) that were noted during the Tier Two due-diligence pass. Net mixed-workload effect (codex fixture, 5-iter / 2-warmup median): agentfs total 2.91s → 2.51s (-14%); ratio 3.21x → 2.97x. CoW edit agentfs absolute 0.67s → 0.36s (-46%). Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

Spec + implementation notes + before/after benchmark JSONs for Tier Two (HostFS read passthrough, clone batching, FUSE coalescer). Adds tier-two-post/COMPARISON.md mirroring the tier-two-prep comparison so the read-heavy / CoW / mixed numbers across origin/main, Tier One, and Tier Two are all in one place. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

…ad-in-default

Profile-validated findings from Tier 3 due diligence:

- AGENTFS_FUSE_WRITEBACK defaults TRUE in cli (line 130) but FALSE in the SDK (env_flag_enabled). The cross-inode batched commit shipped in Tier 2 was dead code in the canonical workload.
- Axis C HostFS passthrough never fires (passthrough_attempted=0) even with AGENTFS_OVERLAY_PARTIAL_ORIGIN=1 explicitly set: the codex clone workload never modifies a base file, so partial-origin mappings are never created.
- Tier 2 diff/CoW wins were per-iteration noise, not attributable to A1 or C; the real Tier 2 deliverables were A2, the lock-fix refactor, and the cleanups.

Includes a 5-iter mixed-workload benchmark with AGENTFS_FUSE_WRITEBACK=1 forced to document what Tier 2 would have delivered if the gating had been correct (agentfs 2.51 s -> 2.29 s).

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

…forced Raw 5-iter / 2-warmup mixed-workload aggregate from the canonical codex fixture with the SDK batcher actually enabled (the env var the cli defaults to on but the SDK defaults to off). This is the comparison artifact for the Tier 2 retroactive correction. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

… worker pool, larger inline tier

Tier 3 ships the three low-risk perf moves whose RCAs were nailed down by Tier 3 due diligence:

* Axis D: align SDK 'AGENTFS_FUSE_WRITEBACK' default with cli (true when unset). The cli has defaulted FUSE writeback ON since Tier 1 but the SDK gated the cross-inode write batcher behind 'env_flag_enabled' (default off), making Tier 2's A1 dead in default config. Profile counters confirm: enqueues went 0 -> 4759 on the canonical workload after the fix.
* Axis F: default AGENTFS_FUSE_CPU_PERCENT 25 -> 50 so 'auto' worker resolution yields more parallelism on the typical machine. The previous 25% default saturated at 3 workers on a 14-core box with 570 ms of cumulative dispatch wait during clone.
* Axis I: DEFAULT_INLINE_THRESHOLD 4 KiB -> 16 KiB so the (4, 16] KiB tail of codex working-tree files avoids the chunked-storage path. fs_config persists per-DB so existing databases keep their 4 KiB threshold; only newly-initialised DBs adopt 16 KiB. chunk_write_chunks halved on the canonical workload.
* drain_due_timer enhancement: when the per-inode timer fires and the inode is ripe, route through drain_pending_batched to commit all pending inodes in one txn. Harmless when only one ino is ripe.

Net effect (5-iter / 2-warmup median, codex fixture): agentfs total 2.51 s -> 2.28 s (-9%); ratio 2.97x -> 2.73x.

Axis E (defer release/close drain) and Axis H (multi-row VALUES INSERT) were attempted and reverted; see Tier 3 notes for RCAs. Axis G (pack-aware streaming writer) deferred to Tier 4; it depends on the same 'consistent-without-drain' SDK read path that E needs.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
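To make Axis F and Axis I concrete, here is a small sketch of how a CPU-percent budget and an inline threshold might resolve. The real 'auto' resolution and storage-tier selection live in the Rust code and are not shown in this message; the floor-with-minimum-of-one rule and both function names below are assumptions, chosen because they reproduce the quoted "3 workers on a 14-core box" at 25%.

```python
KIB = 1024

def auto_workers(cores, cpu_percent):
    """Assumed rule: floor(cores * percent / 100), never fewer than one worker."""
    return max(1, cores * cpu_percent // 100)

def storage_tier(size_bytes, inline_threshold):
    """Assumed rule: files up to and including the threshold are stored inline."""
    return "inline" if size_bytes <= inline_threshold else "chunked"

# Axis F: old 25% default vs new 50% default on the 14-core box from the message.
print(auto_workers(14, 25), auto_workers(14, 50))

# Axis I: a 10 KiB file falls in the (4, 16] KiB tail described above.
print(storage_tier(10 * KIB, 4 * KIB), storage_tier(10 * KIB, 16 * KIB))
```

Under these assumed rules the 50% default would give 7 workers on the same box, and a 10 KiB file would move from the chunked path to the inline path in newly initialised databases.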

Includes axis-by-axis RCAs for D/F/I (shipped), H/E (attempted + reverted with profile evidence), G (deferred to Tier 4), and the C disposition decision. tier-three-post/ has the raw 5-iter JSON for the final mixed-workload run and the per-axis intermediate runs. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

Removes the synchronous SQLite drain from the SDK read path. AgentFSFile::pread now consults the in-memory write batcher (peek_pending) and merges over SQLite-resident bytes. Drains become a pure durability operation, triggered only by fsync / destroy / timer / bytes-threshold.

SDK changes (sdk/rust/src/filesystem/agentfs.rs):

- AgentFSWriteBatcher gains peek_pending, peek_pending_max_end, truncate_pending, and discard_pending. peek* are read-only snapshots that acquire state.lock briefly without touching the pool. truncate_pending shrinks pending in place for AgentFSFile::truncate. discard_pending drops all pending writes for an ino, used at unlink/rename/remove sites so a later batched drain doesn't try to INSERT into a missing fs_inode row.
- AgentFSFile::{pread,pwrite,pwrite_ranges,truncate,fsync} no longer call drain_writes on every op. pwrite routes through batcher.enqueue when the batcher is wired. pread peeks the batcher BEFORE acquiring the pool conn and drops the conn BEFORE the splice loop to keep timer-drain tasks un-starved on the single-conn ephemeral pool. fsync remains the explicit durability barrier.
- AgentFS::{getattr,lookup,lstat,stat} no longer call drain_inode_writes. New merge_pending_size helper ORs peek_pending_max_end into the SQLite size view. Fixes a 30-second ConnectionPoolTimeout deadlock that surfaces once the batcher actually holds pending data (lookup held the only permit, then drain_pending_batched waited for the same permit).
- AgentFS::{unlink,rename,remove} (both path-based and trait impls) now call batcher.discard_pending(ino) before deleting the inode row. Without this, the Explicit drain that bundles ALL pending inodes in one txn fails with Fs(NotFound) on the deleted ino.
- AgentFSWriteBatcher::enqueue now calls attr_cache.remove(ino) so consumers of cached attrs don't see pre-write state after a successful pwrite. getattr re-caches the OR'd size so cached_attr agrees with what getattr returned.

CLI changes:

- cli/src/fuse.rs: flush_pending_inode no longer calls drain_inode_writes; the per-fh FUSE WriteBuffer still flushes into the SDK batcher, but the batcher's pending writes now serve FUSE reads through the overlay.
- cli/src/cmd/fs.rs: write_filesystem (one-shot CLI op) calls drain_all before returning so the next opener (e.g. cat) sees the bytes.

Tests:

- 157 SDK lib tests pass (148 pre-existing + 9 new overlay tests covering read-after-write, partial overlap, hole reads, truncate clipping, getattr size growth, concurrent writers, unlink-during-pending, fsync-drains-to-sqlite).
- 106 CLI tests pass after the FUSE refactor.
- clippy clean; cargo fmt applied.
- Phase 8 smoke: all 7 gates pass.

Benchmark (9-iter median, codex fixture):

- Mixed median ratio 3.24x vs Tier 3's 2.73x; high variance dominates (stdev ~1.7x). agentfs absolute 2.47s vs Tier 3 2.28s. Checkout phase improved 40% (overlay paying off); diff/read_search regressed ~50% (state.lock acquires per pread).

Tier 4 is a foundation commit; Tier 5 is where the perf win actually lands.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
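The read-side overlay described above (pending batcher writes merged over SQLite-resident bytes, with the visible size extended by the furthest pending write) can be sketched in a few lines. This is an illustrative Python model, not the Rust implementation: `overlay_read` and `merged_size` are hypothetical names, pending writes are modelled as an ordered list of (offset, bytes) pairs with later entries winning, and the size merge is assumed to be a max.

```python
def merged_size(base_len, pending):
    """Visible file size: the larger of the stored size and any pending write end."""
    return max([base_len] + [off + len(data) for off, data in pending])

def overlay_read(base, pending, offset, length):
    """Read [offset, offset+length) with pending writes layered over base bytes."""
    end = min(offset + length, merged_size(len(base), pending))
    if offset >= end:
        return b""
    buf = bytearray(end - offset)          # unwritten holes read back as zeros
    stored = base[offset:end]
    buf[:len(stored)] = stored             # start from the durable bytes
    for off, data in pending:              # apply pending writes in enqueue order
        lo, hi = max(off, offset), min(off + len(data), end)
        if lo < hi:
            buf[lo - offset:hi - offset] = data[lo - off:hi - off]
    return bytes(buf)

print(overlay_read(b"hello world", [(6, b"WORLD!!")], 0, 100))
```

Because the read never forces a drain, durability stays tied to fsync and the other drain triggers, while read-after-write consistency comes from the in-memory merge.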

…omparison

Tier 4/5/6 spec lays out the full architectural arc to 1.5x mixed median with explicit go/no-go gates between tiers:

- Tier 4 (this commit's code): consistent-without-drain SDK overlay foundation. Target ~2.5x. Effort ~3 days, ~500 LOC. Risk: medium.
- Tier 5: defer release/forget drain (Axis E) + pack-aware streaming writer (Axis G), now structurally safe to ship on the Tier 4 foundation. Target ~2.0x. Effort ~3-5 days, ~600 LOC.
- Tier 6: shadow-tree pivot. Working-tree content moves to real HostFS files; SQLite keeps overlay metadata only. Reads return shadow fd via FOPEN_PASSTHROUGH (Linux 6.9+). Target ~1.5x. Effort ~2-3 weeks, ~2000 LOC. Risk: high (architectural break).

Tier 5 -> Tier 6 gate: if mixed median <=1.8x with tight variance, GO Tier 6. Otherwise re-spec before the shadow-tree pivot.

Honest scope limits called out in the spec:

- CoW (50 MiB single-byte edit) 1.5x is NOT in this stack; needs Tier 7 smaller-chunks-for-partial-origin work.
- Encrypted databases pay a fixed crypto overhead.
- Cold-mount startup not addressed.

Notes file logs the Tier 4 implementation honestly:

- 157 SDK tests + 106 CLI tests + 7 Phase 8 gates green.
- 9-iter benchmark median 3.24x vs Tier 3's 2.73x; high variance (stdev 1.72x). Per-phase: checkout -40% (overlay paying off); diff/read_search +50% (state.lock acquires per pread, ~50ms absolute).
- Three latent bugs surfaced and fixed: single-conn pool deadlock in lookup, orphan fs_data rows on unlink/rename/remove, and CLI write_filesystem durability for fresh openers.
- Recommendation: GO on Tier 5. Foundation is correct; Tier 5 is where perf actually moves.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

…budget + deviations recorded Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

…aching via FOPEN_CACHE_DIR A FLUSH that drained no writes invalidated the inode anyway, feeding the drift guard's sticky dropped set: the first close(2) of any file revoked FOPEN_KEEP_CACHE forever and every warm re-open paid full FUSE READs (64/1280 opens kept cache; now 1280/1280, READs 1280->64). opendir now grants FOPEN_CACHE_DIR|FOPEN_KEEP_CACHE (FUSE_NO_OPENDIR_SUPPORT only advertised when off), halving readdirplus on the git workload, and open() collapses three block_on hops into one. Read-path warm steady-state 12.7x -> ~4.0x (8/8 A/B pairs, paired wall median 0.744); git workload dispatches -7.9%, status phase 6.33x -> 1.99x. Kill switches: AGENTFS_FUSE_FLUSH_INVAL=1, AGENTFS_FUSE_CACHE_DIR=0. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

…r = OPEN+FLUSH round trips, next levers logged Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

…drop to fingerprint revalidation Upper/Delta files never got FOPEN_KEEP_CACHE (Layer::Base-only) and the sticky dropped set revoked eligibility forever after a file's first write, so every git-created file paid full FUSE READs on each re-open. Drop now clears the fingerprint and the next read-only open revalidates against fresh stats; AgentFS grants keep-cache for regular files and the overlay delegates Delta inodes. Git workload: grants 20->1694, READs 2548->519, dispatches -5.3%, paired wall 0.906; status 0.71x, diff sub-native, read_search 2.25x. Kill switches: AGENTFS_KEEPCACHE_DELTA=0, AGENTFS_FUSE_STICKY_KEEPCACHE_DROP=1. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

…t+fsck under 1.5x; residual = OPEN+FLUSH round trips Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

Per-CPU io_uring queues serve requests via REGISTER/COMMIT_AND_FETCH uring_cmds (raw SQE128 rings, no new deps), replacing the read/writev syscall ping-pong on /dev/fuse; the legacy loop keeps running for INIT, FORGET, INTERRUPT and notifications, and the kernel falls back to it on any registration failure. Requests are reassembled into the classic contiguous layout so the existing parse/dispatch/reply stack is reused; ChannelSender becomes Fd|Uring. INIT advertises FUSE_OVER_IO_URING only behind the env gate + kernel offer + ring-setup probe, with max_write clamped to 1MiB to bound ring memory. Requires fuse.enable_uring=1. Eval: phase8 repeated-read 3.00x -> 1.81x, base-read steady-state -34%, git workload parity (clone is SQLite-bound), all correctness gates and equivalence green. Knobs: AGENTFS_FUSE_URING_DEPTH, _SPIN_US. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

…ated-read 1.81x), opt-in pending idle-host A/B Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

The first FLUSH performs its normal drain work and replies ENOSYS, latching the kernel's connection-wide no_flush so every later close() skips the FLUSH round trip (the kernel pushes dirty writeback pages via write_inode_now before checking no_flush, so no data bypasses the adapter). The buffered tail a closed handle leaves behind until the async RELEASE is sealed by always-on pending-tail guards: a pending_dirty_handles atomic gives attr-bearing paths a free fast path, lookup drains and refetches, readdirplus intersects entries with pending inodes and refetches once, link drains before the SDK call, and setattr's drain is now unconditional. These guards also close the pre-existing pre-close staleness window. New gate scripts/validation/flush-coherence.py races stat / scandir / link-stat / read against RELEASE under {flush,noflush} x {default TTL, entry TTL 0}: 4/4 pass, one FLUSH op total (vs 242), zero mismatches. Eval: open/read/close cycle 61.7us -> 31.2us (-49%), 26.4us compound with uring (-57%); repeated-read gate 3.00x -> 1.96x; read-path paired wall 0.823; git workload parity over 7 pairs. Kill switch AGENTFS_FUSE_NOFLUSH=0; forced off under AGENTFS_DRAIN_ON_RELEASE=1. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

…lt on, coherence gate added Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

…ies stats keep_cache_for_read_open now returns the Stats it consulted so the adapter fingerprints the grant without a second getattr, and the adapter grants directly from its own epoch-guarded attr cache when the delta keep-cache gate is on, skipping the SDK probe entirely (SDK getattrs in the read_search phase: 207 -> 0 per run). Wall-time neutral: the eliminated calls were mostly SDK-LRU hits; the measured per-open floor is the two surviving SQLite SELECTs (overlay partial_origin + AgentFS::open existence check), carried as input to the ENOSYS-OPEN evaluation. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

…oor = 2 SELECTs; ENOSYS-OPEN is the lever Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

…n (AGENTFS_FUSE_NOOPEN=1) Replying ENOSYS to the first FUSE_OPEN latches the kernel's connection-wide no_open: every later open(2)/close(2) completes with no FUSE request (default fuse_file carries fh=0 + FOPEN_KEEP_CACHE) and FUSE_RELEASE is skipped for every file, including CREATE-opened ones. All fh=0 traffic resolves through a shared per-inode file table: read/fsync resolve O_RDONLY, writes resolve O_RDWR (upgrading a read-resolved entry replaces its file post-copy-up, strictly more coherent than per-fh stale base handles), CREATE seeds the entry and echoes fh=0, ftruncate's SETATTR fh path falls through to the same resolution, and FORGET drains the buffered tail and drops the entry (soft LRU cap AGENTFS_FUSE_INO_FILES_CAP, clean entries only). The per-inode WriteBuffer joins the WS7 pending machinery (guards, counter, flush_all_pending/destroy). Gated on the kernel offering FUSE_NO_OPEN_SUPPORT; forced off under AGENTFS_DRAIN_ON_RELEASE. New gate scripts/validation/noopen-coherence.py: close-race loop, ftruncate via fh=0, O_TRUNC reopen, mmap+msync, eviction-cap and overlay copy-up upgrade scenarios — 6/6 pass (1 open + 1 release vs 65 + 129 legacy). Light gates green; preliminary micro (loaded host): open/read/close 64.9 -> 18.8us/cycle (4.72x -> 1.72x). Full A/B and promotion decision deferred to an idle host. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

…open gap classified, eval pending idle host Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

…workloads or dead mount entries agentfs exec and agentfs mount had no signal handling, so SIGTERM/SIGINT killed the process without running MountHandle's unmount: the mount table kept a dead entry (ENOTCONN for every later visitor) and exec's workload child survived as an orphan still running inside it — interrupted benchmark harnesses leaked both on every kill. exec now supervises the child (select over child-exit vs SIGTERM/SIGINT/SIGHUP; forwards SIGTERM, 5s grace, then SIGKILL), sets PR_SET_PDEATHSIG=SIGKILL on the child so even SIGKILL on agentfs cannot orphan it, always unmounts and removes the temp mountpoint, and exits 128+signo. The mount command runs the FUSE session on its own thread and unmounts on the shared mount::shutdown_signal(); NFS foreground upgrades from ctrl_c-only to the same three signals. Kill matrix: TERM/INT fully clean (no procs, mounts, or dirs; exits 143/130), KILL reaps the child via PDEATHSIG (lazy mount entry is the uncatchable residual). auto_unmount was a dead end: the vendored fuser forces allow_other with it, which requires user_allow_other in /etc/fuse.conf. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
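The supervision policy described above (forward SIGTERM, wait out a 5s grace period, then SIGKILL, and exit with 128+signo) can be modelled with the Python standard library. This is only a sketch of the policy, not the Rust supervise_child code: the function names are hypothetical, and the PR_SET_PDEATHSIG step and the unmount/cleanup are deliberately left out.

```python
import signal
import subprocess
import sys

GRACE_SECONDS = 5  # grace period named in the commit message

def exit_code_for_signal(signo):
    """Shell-style exit status for a process that stopped because of signo."""
    return 128 + signo

def stop_child(child, grace=GRACE_SECONDS):
    """Forward SIGTERM, wait up to `grace` seconds, then escalate to SIGKILL."""
    child.send_signal(signal.SIGTERM)
    try:
        return child.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        child.kill()          # SIGKILL cannot be caught or ignored
        return child.wait()

if __name__ == "__main__":
    # Stand-in workload: a child that would otherwise sleep for a minute.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    stop_child(child)
    sys.exit(exit_code_for_signal(signal.SIGTERM))
```

On Linux, SIGTERM is 15 and SIGINT is 2, which gives the 143 and 130 exit codes quoted in the kill matrix.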

Idle-host A/B: micro open/read/close 47.3 -> 21.2us/cycle (paired median 0.469); git workload read_search -56..-83%, diff -57..-62%, status -20..-47%, checkout -22..-34%, fsck -18..-34%, edit and clone neutral; read-path benchmark neutral (same-run normalized 2.54x -> 2.25x). Correctness with the new default: noopen-coherence 6/6, flush-coherence 4/4, metadata-mutation, serialization stress, writeback durability, no-fsync crash, 275 unit tests. Still requires kernel FUSE_NO_OPEN_SUPPORT and stays off under AGENTFS_DRAIN_ON_RELEASE. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

…k RCA recorded Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

…t handle drops Unlink and rename-replace reaped fs_inode/fs_data rows the moment nlink hit 0, so any I/O on an unlinked-but-open file failed — even the close-time writeback mtime SETATTR — in both the per-fh and noopen fh paths. Every user-visible AgentFSFile now carries an RAII guard in a shared OpenInodes registry (the batcher's ephemeral internal handles opt out). All four deletion sites (public and trait unlink + rename overwrite) skip row deletion while handles are live, leaving nlink = 0 as the crash-safe orphan marker; the last handle drop queues the ino and process_deferred_reaps (hooked at trait unlink/rmdir/rename and finalize, nlink=0-guarded against rowid reuse) deletes the rows in one transaction. A mount-time sweep collects crash-stranded orphans. The integrity invariant namespace.non_root_inode_has_dentry now admits the orphan state (dentry-less iff nlink = 0). noopen-coherence scenario 5 restored to full POSIX assertions (read-back, write-through, fsync, st_nlink==0, clean close): 6/6 PASS in both modes. Two new SDK tests cover deferred reap and the mount sweep; test_delete_file_removes_all_chunks now closes its handle before remove, per the new contract. Documented residuals: ino_files LRU-cap eviction under noopen can drop the SDK handle before the kernel fd closes (>65k simultaneous inodes), and a second mount's sweep cannot see this process's handles — both equivalent-or-better than the pre-fix instant reap. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
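A toy model of the deferred-reap contract may help when reading the description above: rows survive while a handle is live, nlink = 0 marks the orphan, and the last handle drop queues the inode for a guarded reap. The classes below are a hypothetical Python illustration; the actual OpenInodes registry is an RAII guard in the Rust SDK, and a dict of link counts stands in for the fs_inode/fs_data rows.

```python
class OpenInodes:
    """Counts live handles per inode and queues inodes whose last handle closed."""

    def __init__(self):
        self.live = {}        # ino -> number of open handles
        self.deferred = []    # inos waiting for process_deferred_reaps

    def open(self, ino):
        self.live[ino] = self.live.get(ino, 0) + 1

    def close(self, ino):
        self.live[ino] -= 1
        if self.live[ino] == 0:
            del self.live[ino]
            self.deferred.append(ino)

    def is_open(self, ino):
        return ino in self.live


class ToyFs:
    def __init__(self):
        self.nlink = {}               # ino -> link count (stand-in for inode rows)
        self.registry = OpenInodes()

    def unlink(self, ino):
        self.nlink[ino] -= 1
        # Skip row deletion while handles are live; nlink == 0 marks the orphan.
        if self.nlink[ino] == 0 and not self.registry.is_open(ino):
            del self.nlink[ino]

    def process_deferred_reaps(self):
        queued, self.registry.deferred = self.registry.deferred, []
        for ino in queued:
            # Guard on nlink == 0 so a still-linked or reused ino is never reaped.
            if self.nlink.get(ino) == 0 and not self.registry.is_open(ino):
                del self.nlink[ino]
```

In this model, unlinking an open file leaves its row in place with nlink 0, and the row disappears only after the handle closes and the deferred reap runs.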

…thetic only via --synthetic

Bare invocations silently generated a 96x1KB synthetic repo, which twice produced scoreboard-incomparable ratios (most recently the 07-02 WS9 A/B, mis-attributed to a kernel baseline shift). The canonical fixture is now the no-flag default with a stderr note, --synthetic is the explicit opt-out (warning on missing-fixture fallback), and --read-bytes defaults to the canonical 4096.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
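The new default resolution can be sketched as a small pure function. Only the flags `--synthetic` and `--read-bytes` and the 4096 default come from the commit message; the type names, `resolve_config`, the `canonical_available` parameter, and the message strings are hypothetical, and the real bench script may be structured quite differently.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum Fixture {
    Canonical,
    Synthetic,
}

#[derive(Debug, PartialEq, Eq)]
pub struct BenchConfig {
    pub fixture: Fixture,
    pub read_bytes: usize,
    /// Lines the caller would print to stderr.
    pub notes: Vec<String>,
}

/// Hypothetical resolver: canonical fixture by default, synthetic only on
/// request or as a warned fallback when the canonical fixture is missing.
pub fn resolve_config(args: &[&str], canonical_available: bool) -> Result<BenchConfig, String> {
    let mut synthetic = false;
    let mut read_bytes: usize = 4096; // canonical default per the commit
    let mut it = args.iter();
    while let Some(arg) = it.next() {
        match *arg {
            "--synthetic" => synthetic = true,
            "--read-bytes" => {
                let v = it.next().ok_or("--read-bytes needs a value")?;
                read_bytes = v
                    .parse()
                    .map_err(|e| format!("bad --read-bytes value: {}", e))?;
            }
            other => return Err(format!("unknown argument: {}", other)),
        }
    }

    let mut notes: Vec<String> = Vec::new();
    let fixture = if synthetic {
        Fixture::Synthetic
    } else if canonical_available {
        notes.push(String::from("note: using canonical fixture"));
        Fixture::Canonical
    } else {
        notes.push(String::from(
            "warning: canonical fixture missing, falling back to synthetic",
        ));
        Fixture::Synthetic
    };
    Ok(BenchConfig { fixture, read_bytes, notes })
}
```

Returning the notes instead of printing them keeps the decision testable; a caller would forward them to stderr so that a fallback run cannot be mistaken for a scoreboard-comparable one.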

…URING=0

Codex A/B (n=5): equal-or-better on every phase (total 3.37x -> 2.92x, status 0.93x -> 0.60x, read_search 1.41x -> 1.37x, clone -3%); the synthetic-fixture write-phase regression that kept WS6 opt-in was a toy-workload artifact.

Safe as a default because INIT only advertises FUSE_OVER_IO_URING after the ring-setup probe succeeds (requires root sysctl fuse.enable_uring=1); everything else stays on the legacy /dev/fuse channel.

Gates green under the new default: noopen/flush coherence, serialization, durability, metadata-mutation, 109 CLI tests.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
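The probe-gated default reduces to a small decision: use io_uring only when the kill switch is unset and the ring-setup probe succeeds, otherwise stay on the legacy channel. The sketch below illustrates that shape only. `Transport`, `select_transport`, and the closure-based probe are hypothetical names, and the real probe (ring setup against the kernel) is replaced by a caller-supplied closure.

```rust
use std::io;

#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Transport {
    IoUring,
    LegacyDevFuse,
}

/// Hypothetical selector. `kill_switch_set` models the opt-out environment
/// variable; `probe` models the ring-setup attempt made before INIT.
pub fn select_transport<P>(kill_switch_set: bool, probe: P) -> Transport
where
    P: FnOnce() -> io::Result<()>,
{
    if kill_switch_set {
        // Explicit opt-out: never attempt ring setup.
        return Transport::LegacyDevFuse;
    }
    match probe() {
        // Only a successful probe allows advertising the io_uring capability.
        Ok(()) => Transport::IoUring,
        // Any failure (old kernel, knob disabled, no privilege) falls back.
        Err(_) => Transport::LegacyDevFuse,
    }
}
```

Because the fallback is taken on any probe error, a host without the kernel support behaves exactly as it did before the default changed, which is the property the commit relies on to call the promotion safe.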

…1s -> 0.754s (2.58x -> 2.22x) on codex

ImportSession holds one pooled connection and the dir-path->ino map across chunk calls; agentfs clone imports directories up front, then overlaps blob parsing with bounded-channel import chunks.

Also de-flakes overlay_reads_flag_off test (global counter -> per-inode has_pending).

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
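The overlap described above is a producer/consumer pipeline: one thread parses blobs into chunks while the session imports the previous chunk, with a bounded channel providing backpressure. The following is a minimal sketch using only the standard library. Only the name `ImportSession` comes from the commit message; its fields and methods, `streamed_clone`, the channel depth of 2, and the in-memory "import" are assumptions for illustration, and the real session writes through a pooled database connection.

```rust
use std::collections::HashMap;
use std::sync::mpsc::sync_channel;
use std::thread;

type Blob = (String, Vec<u8>); // (path, contents)

/// Hypothetical session: the dir-path -> ino map persists across chunk calls.
pub struct ImportSession {
    dir_inos: HashMap<String, u64>,
    next_ino: u64,
    pub files_imported: usize,
}

impl ImportSession {
    pub fn new() -> Self {
        let mut dir_inos = HashMap::new();
        dir_inos.insert(String::new(), 1); // root directory
        Self { dir_inos, next_ino: 2, files_imported: 0 }
    }

    /// Directories go in up front so every later chunk can resolve its parent.
    pub fn import_dirs(&mut self, dirs: &[&str]) {
        for d in dirs {
            self.dir_inos.insert(d.to_string(), self.next_ino);
            self.next_ino += 1;
        }
    }

    pub fn import_chunk(&mut self, chunk: &[Blob]) -> Result<(), String> {
        for (path, _data) in chunk {
            let parent = path.rsplit_once('/').map(|(p, _)| p).unwrap_or("");
            if !self.dir_inos.contains_key(parent) {
                return Err(format!("unknown parent dir: {}", parent));
            }
            self.next_ino += 1;
            self.files_imported += 1;
        }
        Ok(())
    }
}

/// Producer thread chunks the blobs; the caller's thread imports them.
pub fn streamed_clone(dirs: &[&str], blobs: Vec<Blob>, chunk_size: usize) -> Result<usize, String> {
    let mut session = ImportSession::new();
    session.import_dirs(dirs);

    // Bounded channel: at most 2 chunks in flight, so parsing cannot run
    // arbitrarily far ahead of the import.
    let (tx, rx) = sync_channel::<Vec<Blob>>(2);
    let producer = thread::spawn(move || {
        for chunk in blobs.chunks(chunk_size.max(1)) {
            if tx.send(chunk.to_vec()).is_err() {
                break; // consumer stopped early
            }
        }
    });

    let mut result = Ok(());
    for chunk in rx {
        result = session.import_chunk(&chunk);
        if result.is_err() {
            break; // dropping the receiver unblocks the producer
        }
    }
    producer.join().map_err(|_| String::from("producer panicked"))?;
    result.map(|_| session.files_imported)
}
```

The bounded depth is the design choice that matters: an unbounded queue would let the parser buffer the whole repository in memory, while a depth of zero would serialize the two stages again.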

factory-ain3sh and others added 30 commits May 9, 2026 21:53

feat(agentfs): add phase 4 profiling counters

f6b9fbd

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

feat(agentfs): add v0.5 inline storage

5853cb7

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

feat(agentfs): add v0.5 copy migration

cac6f51

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

feat(agentfs): coalesce fuse writes

f765fc2

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

test(agentfs): expand phase 5 POSIX gate

2ca8ad4

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

feat(agentfs): add phase 5 profiling scaffolding

a937919

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

feat(agentfs): prototype partial-origin overlay copy-up

41cacb9

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

fix(agentfs): honor NFS create write handles

3953aca

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

feat(agentfs): add phase 5.5 read profiler

b320686

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

feat(agentfs): cache inode attrs on read path

e18853e

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

test(agentfs): harden partial-origin overlay

a2d9ce4

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

test(agentfs): add macOS NFS git validation

4cfe515

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

spike(agentfs): evaluate Turso 0.5 backend upgrade

fd443bb

Record Phase 5.5 backend spike results and make the helper capture measured validation outcomes for future upgrade/fallback decisions.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

feat(agentfs): add production safety commands

caf308a

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

factory-ain3sh and others added 28 commits June 11, 2026 11:30

docs(roadmap): WS3 verdict — agentfs clone 2.34x (from 8.41x), stage …

f0d20f8

…budget + deviations recorded Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

docs(roadmap): read-path verdict — 12.7x -> 4.0x (GO 8/8 pairs); floo…

b49180e

…r = OPEN+FLUSH round trips, next levers logged Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

docs(roadmap): WS5 verdict — status 0.71x / diff sub-native / checkou…

7ac1e49

…t+fsck under 1.5x; residual = OPEN+FLUSH round trips Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

docs(roadmap): WS6 io_uring verdict — 25-40% on RT-bound shapes (repe…

c9616fe

…ated-read 1.81x), opt-in pending idle-host A/B Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

docs(roadmap): WS7 ENOSYS-FLUSH verdict — close-time RT halved, defau…

550829e

…lt on, coherence gate added Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

docs(roadmap): WS8 verdict — open fast path wall-neutral; per-open fl…

963952a

…oor = 2 SELECTs; ENOSYS-OPEN is the lever Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

docs(spec): WS9 ENOSYS-OPEN spec + notes — pre-existing unlink-while-…

e3e1b52

…open gap classified, eval pending idle host Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

docs(roadmap): WS9 verdict — noopen promoted default-on; teardown-lea…

3746d82

…k RCA recorded Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

docs(spec): uring+noopen compound verdict (opt-in stands) + unlink-wh…

d532207

…ile-open followup closed Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

docs(roadmap): correction — 07-02 git-workload numbers were synthetic…

03b06bb

…-fixture; measurement contract pinned; WS9 promotion provisional pending codex re-run Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

docs(roadmap): codex re-run — WS9 GO bar met (read_search 1.41x), pro…

de239eb

…motion final; uring equal-or-better on every codex phase Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

fix(bench): temp-tree cleanup survives output errors and stubborn fil…

4de454a

…es — no more /tmp husks Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

docs(roadmap): read-path residual root-caused — kernel close-time STA…

01340a8

…TX_BLOCKS invalidation under writeback cache; accepted as floor Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

docs(roadmap): clone and edit dig verdicts — clone floored at ~2x by …

a04bcff

…whole-state double write, edit micro floor already <=3ms; deferred SETATTR third codex parity Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

docs(kernel): FUSE STATX_BLOCKS flush-invalidation patch written and …

bb8c2ce

…VM-validated — GETATTR storm 1095 -> 70, storm cycle 2.2x faster, du/mmap correctness intact Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

chore(agents): drop dated session-tail scratch dirs — specs, benchmar…

e52e9b1

…ks, and kernel artifacts are the durable record Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

factory-ain3sh merged commit 059bd52 into main Jul 3, 2026
24 of 34 checks passed

factory-ain3sh deleted the phase4-north-star-implementation branch July 3, 2026 03:41

factory-ain3sh merged 77 commits into main from phase4-north-star-implementation
