Skip to content

fix(mount): shrink incremental event pages#201

Merged
khaliqgant merged 1 commit into
mainfrom
codex/issue-200-incremental-backlog
May 22, 2026
Merged

fix(mount): shrink incremental event pages#201
khaliqgant merged 1 commit into
mainfrom
codex/issue-200-incremental-backlog

Conversation

@khaliqgant
Copy link
Copy Markdown
Member

Summary

  • lower mount incremental event page size from 500 to 50 so page-level cursor persistence advances during heavy large-file backlogs
  • keep changed-event metadata through incremental apply and skip remote ReadFile when event contentHash already matches the local mirror
  • add coverage for the reduced page limit and incremental contentHash skip path

Fixes #200

Review

  • checked the clean branch diff is isolated to internal/mountsync/syncer.go and internal/mountsync/syncer_test.go
  • verified conflicted paths retain existing skip semantics before contentHash fast-path state updates
  • confirmed no HTTP handler/OpenAPI surface changed

Tests

  • go test ./internal/mountsync -run 'TestPullRemoteIncremental(PersistsAppliedPageCursorOnListEventsError|ReturnsDeadlineWhenNoPageProgress|ResumesWithinAppliedPage|SkipsReadFileWhenLocalHashMatchesContentHash|CheckpointPreservesChangedPath404Delete)|TestPullDetectsRevReuseViaContentHash|TestPullRemoteFullTreeSkipsReadFileWhenLocalHashMatchesContentHash'\n- go test ./internal/mountsync\n- go test ./...\n- git diff --check

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 22, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: a4c21dc6-3280-4cd9-a54f-c188281a2bce

📥 Commits

Reviewing files that changed from the base of the PR and between 0c0c60e and caf89f5.

📒 Files selected for processing (2)
  • internal/mountsync/syncer.go
  • internal/mountsync/syncer_test.go

📝 Walkthrough

Walkthrough

This PR implements three complementary optimizations to incremental sync performance: smaller pagination (50 events vs. 500), full event retention to enable content-hash matching, and skip-on-hash probing to avoid redundant remote reads. The changes flow from event fetching through apply logic and include regression and pagination tests.

Changes

Incremental Sync Optimization with Pagination and Skip-on-Hash

Layer / File(s) Summary
Pagination Configuration and Event Fetching
internal/mountsync/syncer.go
Introduces defaultIncrementalEventPageLimit (50) constant and updates pullRemoteIncremental to use it instead of hardcoded 500; changes changed map from map[string]struct{} to map[string]FilesystemEvent to preserve full event details including contentHash.
Apply Logic and Skip-on-Hash Optimization
internal/mountsync/syncer.go
Updates applyIncrementalChanges parameter type to accept changed map[string]FilesystemEvent, implements per-path skip logic that calls new trySkipIncrementalRead helper: when contentHash matches local snapshot hash, skips remote ReadFile, updates tracked state (revision/content type/permissions), and marks checkpoint progress.
Hash-Match Regression Test
internal/mountsync/syncer_test.go
Adds TestPullRemoteIncrementalSkipsReadFileWhenLocalHashMatchesContentHash verifying that when local tracked hash matches the event's ContentHash, Reconcile advances state without remote ReadFile calls and maintains consistent mirror content.
Pagination Test Parameterization
internal/mountsync/syncer_test.go
Updates TestPullRemoteIncrementalPersistsAppliedPageCursorOnListEventsError to parameterize page size by defaultIncrementalEventPageLimit rather than hardcoded constants, validating correct ListEvents limit usage, symbolic cursor checkpoints, partial-page file tracking, and post-retry cursor advancement.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

  • AgentWorkforce/relayfile#200: This PR directly implements fixes 1 (reduce page size to 50), 3 (skip-on-hash for incremental apply), and improves checkpoint consistency as requested in the issue to address cursor lag on large-file event backlogs.

Possibly related PRs

Poem

🐰 A faster hop through events so deep,
Pagination shrinks, no more 500-heap,
Hash probes skip reads when the mirrors align,
Cursor now races; no lag, no rewind!
Small pages make cycles—progress divine.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title 'fix(mount): shrink incremental event pages' accurately summarizes the main change—reducing the incremental event page size from 500 to 50 to improve cursor advancement during large-file backlogs.
Description check ✅ Passed The description clearly explains the three key changes (reduce page size, preserve event metadata, skip ReadFile on hash match), provides review checklist, and references the linked issue #200.
Linked Issues check ✅ Passed The PR implements three of the four suggested fixes from #200: reducing page size (500→50), preserving changed-event metadata, and skipping ReadFile on contentHash match. All coding requirements are met.
Out of Scope Changes check ✅ Passed All changes are scoped to internal/mountsync (syncer.go and syncer_test.go) and directly address the objectives in #200; no unrelated modifications detected.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch codex/issue-200-incremental-backlog

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown

Relayfile Eval Review

Run: .relayfile/evals/runs/2026-05-22T09-03-01-609Z-HEAD-provider
Mode: provider
Git SHA: 450782c

Passed: 4 | Needs human: 0 | Reviewable: 0 | Missing output: 0 | Failed: 0 | Skipped: 0

Human Review Cases

No reviewable human-review cases captured Relayfile output.

@khaliqgant
Copy link
Copy Markdown
Member Author

Reviewed against #200 — takes the highest-impact pair of the four candidate fixes, with correct, well-tested implementation.

Checklist vs #200

  • (1) Page size 500 → 50. defaultIncrementalEventPageLimit = 50 constant, used in pullRemoteIncremental's ListEvents call. Existing pagination test (TestPullRemoteIncrementalPersistsAppliedPageCursorOnListEventsError) refactored to use the constant + adds an explicit guard (if call == 2 && limit != defaultIncrementalEventPageLimit) so a future regression that bumps the limit back up fails loudly. Cursor now persists 10× more often, which is the whole point.
  • (3) Skip-on-hash for incremental apply. New trySkipIncrementalRead mirrors the existing trySkipBootstrapRead (fix: speed up mount bootstrap fallback #185). When event.ContentHash matches the local snapshot hash, ReadFile is skipped entirely, tracked state advances to the event's revision, and the checkpoint is marked so a restart doesn't re-attempt. Test TestPullRemoteIncrementalSkipsReadFileWhenLocalHashMatchesContentHash asserts 0 ReadFile calls + tracked.revision advances to rev_2 from the event.
  • (2) parallel apply and ❌ (4) per-event cursor — not in this PR. Sensible — (1)+(3) should give the biggest win without the cross-cutting complexity. Worth keeping as follow-ups if (1)+(3) don't fully close the gap on the worst workspaces.

Implementation notes (LGTM)

  • The changed map type bump from map[string]struct{}map[string]FilesystemEvent is what threads the per-event contentHash through to applyIncrementalChanges — clean, no API churn since the field already existed.
  • trySkipIncrementalRead correctly:
    • Refuses to skip on Dirty / Denied (would otherwise drop local pending work).
    • Calls applyLocalPermissions on skip — permissions stay in sync even without a write.
    • Inherits revision from tracked state when event omits it.
  • Conflicted-path handling is explicit and BEFORE the skip attempt — preserves existing skip semantics. The PR body's "verified conflicted paths retain existing skip semantics before contentHash fast-path state updates" matches the diff.
  • markIncrementalCheckpoint(... remotePath) fires on the skip path too, so a daemon restart mid-skipped-page doesn't replay already-handled events.

Expected impact on rw_fc7b534b

The first 500-event page that wedged the daemon was 491 file.created/updated mostly on Gmail. With page=50 the cycle persists cursor after each page (~30s of work), so even worst-case the events backlog now drains linearly. With skip-on-hash, the periodic full-tree pull (every 20 cycles) has likely already updated many of those files — when the events arrive, skip kicks in and ReadFile is bypassed entirely. Net: cursor should actually advance on this workspace post-merge.

Tiny nit (not blocking)

defaultIncrementalEventPageLimit is hardcoded with no --incremental-page-limit flag override. Probably fine; useful if you ever need to retune per environment without a release.

LGTM — would approve if I weren't the issue author.

@khaliqgant khaliqgant merged commit fafd3fe into main May 22, 2026
9 checks passed
@khaliqgant khaliqgant deleted the codex/issue-200-incremental-backlog branch May 22, 2026 09:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

mount: incremental apply per-page granularity too coarse — cursor falls behind on workspaces with large-file events

1 participant