fix(results): harden remote sync dogfood #1332

Merged: christso merged 1 commit into main from dogfood/av-fis-remote-sync on Jun 9, 2026

Conversation

christso (Collaborator) commented Jun 9, 2026

Summary

Remote result sync is now safer under adversarial production-readiness scenarios: when a sync fails for offline- or auth-style reasons, cached remote runs are still reported, and /api/runs no longer returns duplicate local and remote:: rows for the same synced run.
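
The first behavior can be illustrated with a minimal sketch. This is not the code from remote.ts: the `SyncStatus` shape and the `fetchRemote` and `countCachedRuns` callbacks are hypothetical stand-ins, and the sketch is kept synchronous for brevity.

```typescript
// Hypothetical status shape; only `run_count` and `last_error` are named in this PR.
interface SyncStatus {
  ok: boolean;
  run_count: number;
  last_error?: string;
}

// Sketch: always report the run count from the cached clone, even when the fetch fails.
// `fetchRemote` and `countCachedRuns` are illustrative callbacks, not AgentV APIs.
function syncWithCachedFallback(
  fetchRemote: () => void,
  countCachedRuns: () => number,
): SyncStatus {
  try {
    fetchRemote();
    return { ok: true, run_count: countCachedRuns() };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    let run_count = 0;
    try {
      // Error path: count runs already present in the cached clone.
      run_count = countCachedRuns();
    } catch {
      // The cache itself is unreadable, so 0 is the only honest answer.
    }
    return { ok: false, run_count, last_error: message };
  }
}
```

With this shape, an offline fetch that throws still yields `run_count` from the cache alongside the recorded error, instead of an empty listing.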

The durable av-fis dogfood report, throwaway-remote evidence, Dashboard screenshots, and production-readiness verdict are stored outside this public repository:

  • Private repo: EntityProcess/agentv-private
  • Private branch: dogfood/av-fis-remote-sync-evidence
  • Private commit: 5d81c8d docs(dogfood): add av-fis remote sync evidence
  • Private path: dogfood/av-fis/2026-06-08-remote-sync/

This public PR contains only product/test code changes.

Dogfood Coverage

Covered scenarios include:

  • Remote tag add/edit/clear overlays, sync persistence, dirty-state behavior, and clean-clone verification.
  • Local/remote discovery, local delete after sync, empty remote, missing metadata, corrupt index.jsonl, branch/default-branch mismatch, offline/missing remote, and multiple project isolation.
  • Local combine/delete lifecycle, remote combine/delete rejection, safe dirty metadata push, unsafe dirty blocks, behind/ahead/diverged/conflicted states, interrupted retry simulation, and concurrent sync guarding.
  • Dashboard/browser paths plus CLI-adjacent paths: eval auto-export, agentv results combine, agentv results delete, and confirmation that no direct agentv results remote sync/status command currently exists.

No production remote was touched. All remote dogfood used throwaway /tmp file remotes and temporary clones.
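
For readers unfamiliar with the behind/ahead/diverged terminology in the coverage list, the sketch below shows one common way to classify those states from two commit counts. The `SyncState` names and `classifySyncState` function are hypothetical, and how AgentV actually derives its states is not shown in this PR. Counts of this kind can be obtained with `git rev-list --left-right --count HEAD...@{upstream}`. The conflicted state is left out because it depends on working-tree state rather than commit counts.

```typescript
// Hypothetical state names, chosen for illustration only.
type SyncState = "up_to_date" | "ahead" | "behind" | "diverged";

// `ahead`: commits only on the local branch; `behind`: commits only on the remote branch.
function classifySyncState(ahead: number, behind: number): SyncState {
  if (ahead > 0 && behind > 0) return "diverged";
  if (ahead > 0) return "ahead";
  if (behind > 0) return "behind";
  return "up_to_date";
}
```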

Fixes

  • syncRemoteResults() now computes run_count from the cached results clone in its error path, so a failed fetch no longer makes the cached remote runs that are still available appear to be missing.
  • listMergedResultFiles() now dedupes merged local/remote run rows by canonical raw_filename, preferring the local copy. Deduplication happens before sorting and pagination.
  • Added focused serve.test.ts regressions for both behaviors.
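
The dedupe-before-pagination ordering can be sketched as follows. This is an illustration under stated assumptions, not the listMergedResultFiles() implementation: the `RunRow` shape, the `timestamp` sort key, and the idea that the canonical name is obtained by stripping a `remote::` prefix from `raw_filename` are all assumptions made for the example.

```typescript
// Hypothetical row shape; only `raw_filename` is named in the PR description.
interface RunRow {
  raw_filename: string;
  source: "local" | "remote";
  timestamp: string; // ISO 8601; assumed sort key
}

// Assumption: remote rows carry a "remote::" prefix that is stripped to canonicalize.
function canonicalName(rawFilename: string): string {
  const prefix = "remote::";
  return rawFilename.startsWith(prefix) ? rawFilename.slice(prefix.length) : rawFilename;
}

function dedupeSortPaginate(rows: RunRow[], offset: number, limit: number): RunRow[] {
  const byName = new Map<string, RunRow>();
  for (const row of rows) {
    const key = canonicalName(row.raw_filename);
    const existing = byName.get(key);
    // Keep the first row seen, but let a local row replace a remote one.
    if (!existing || (existing.source === "remote" && row.source === "local")) {
      byName.set(key, row);
    }
  }
  // Sort newest first only after duplicates are gone, then take the page.
  return Array.from(byName.values())
    .sort((a, b) => b.timestamp.localeCompare(a.timestamp))
    .slice(offset, offset + limit);
}
```

Deduplicating first matters for pagination: if duplicates were removed after slicing, a page could come back shorter than `limit` and totals would be inflated by rows that are later dropped.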

Follow-Ups

  • av-fis.1: Dashboard sync button can remain Syncing... after concurrent remote sync; data path is safe, presentation can be misleading until reload.
  • av-xqm: Dashboard remote sync status can show stale last_error after a later conflict state.
  • av-fis.2: Decide whether AgentV needs a first-class CLI contract for remote results sync/status.
  • av-fis.3: Add committed regression coverage for interrupted sync retry; dogfood passed a simulated .git/index.lock retry, but coverage should be made durable before documenting retry guarantees.

Verification

  • bun test apps/cli/test/commands/results/serve.test.ts
    • 83 pass, 0 fail, 271 expect() calls
  • Broader verification, run before the private evidence was relocated and the branch cleaned up, against substantively the same product fix:
    • bun test packages/core/test/evaluation/results-repo.test.ts apps/cli/test/commands/results/remote-metadata.test.ts apps/cli/test/commands/results/delete.test.ts apps/cli/test/commands/results/combine.test.ts apps/cli/test/commands/results/serve.test.ts apps/dashboard/src/lib/run-dedupe.test.ts apps/dashboard/src/lib/project-sync-status.test.ts apps/dashboard/src/lib/run-list-actions.test.ts
    • 121 pass, 0 fail, 381 expect() calls
    • bun run build passed.
  • git show --check HEAD passed.
  • Public diff hygiene: PR diff against current origin/main includes only apps/cli/src/commands/results/remote.ts and apps/cli/test/commands/results/serve.test.ts; no dogfood report/evidence files remain in the public branch.
  • Private evidence hygiene: dashboard-setup.env contains only throwaway /tmp fixture paths, and the private evidence tree was scanned for common secret/token patterns before commit.

Readiness verdict from the private report: the remote result data path is production-ready for controlled rollout after these fixes. The full Dashboard production experience still needs follow-up/waiver for av-fis.1 and av-xqm, and av-fis.3 should add durable retry regression coverage before documenting retry guarantees.


cloudflare-workers-and-pages (Bot) commented Jun 9, 2026

Deploying agentv with Cloudflare Pages

Latest commit: c81c89c
Status: ✅  Deploy successful!
Preview URL: https://ee0e9fdb.agentv.pages.dev
Branch Preview URL: https://dogfood-av-fis-remote-sync.agentv.pages.dev

Preserve cached remote run counts when sync fails so offline/auth errors do not hide available cached runs. Dedupe merged local and remote run listings in favor of local copies before pagination. Dogfood evidence for av-fis is kept in the private evidence repository, not in this public code branch.
christso force-pushed the dogfood/av-fis-remote-sync branch from e64364c to c81c89c on June 9, 2026 02:43
christso merged commit 083e08c into main on Jun 9, 2026
8 checks passed
christso deleted the dogfood/av-fis-remote-sync branch on June 9, 2026 03:16