fix: workspace safety net — no more /root EACCES, atomic writes, version handshake#11

Merged
OmGuptaIND merged 6 commits into main from OmGuptaIND/fix-harness-routine
Apr 24, 2026

@OmGuptaIND
Contributor

Summary

Fixes the EACCES: permission denied, scandir '/root' error in the Files tab and hardens the surrounding workspace/persistence stack. Builds a four-layer safety net so this class of silent failure can't recur:

  1. Server self-heal: ensureDefaultProject now backfills missing workspacePath, recreates absent workspace directories, writes the .anton.json marker when missing, and repairs the internal project dir. Logs console.warn on repair so drift is visible in logs.
  2. Atomic JSON writes + corruption recovery: all project-state writes in agent-config/projects.ts go through writeJsonAtomic (tmp + fsync + POSIX rename). loadIndex rebuilds from per-project records when index.json is unparseable. Startup sweep removes orphan .tmp files left by SIGKILL.
  3. Wire-protocol version handshake: new PROTOCOL_VERSION constant, server advertises it in auth_ok, desktop compares and surfaces a thin dismissible banner on mismatch (dismissal scoped per-version so new skews re-warn).
  4. Safe Files UI: startDir is now string | null with no /root fallback. Shows a spinner + "Loading projects…" during sync, "Select a project" after sync. Closes the one-frame flash of "This folder is empty" between startDir updates and the first fs_list response, while correctly allowing subdirectory navigation.
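
For readers unfamiliar with the tmp + fsync + rename pattern named in layer 2, here is a minimal sketch. It is an illustration only: the real `writeJsonAtomic` in agent-config/projects.ts may differ, and the `.tmp` suffix placement and the temp-directory usage below are assumptions.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Illustrative sketch, not the helper from this PR.
function writeJsonAtomic(filePath: string, data: unknown): void {
  const tmpPath = `${filePath}.tmp`; // assumed naming for the temp file
  const fd = fs.openSync(tmpPath, "w");
  try {
    fs.writeFileSync(fd, JSON.stringify(data, null, 2));
    fs.fsyncSync(fd); // flush the bytes to disk before the rename makes them visible
  } finally {
    fs.closeSync(fd);
  }
  // On POSIX, rename within one filesystem replaces the target in a single step,
  // so readers see either the old file or the complete new one.
  fs.renameSync(tmpPath, filePath);
}

// Usage against a throwaway directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "atomic-demo-"));
const target = path.join(dir, "index.json");
writeJsonAtomic(target, { projects: [] });
writeJsonAtomic(target, { projects: ["proj_1"] });
const roundTrip = JSON.parse(fs.readFileSync(target, "utf8"));
```

A kill between `openSync` and `renameSync` leaves only the `.tmp` file behind, which is the orphan the startup sweep is meant to remove.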

Commits

715f21b fix(agent-config) — crash-safe project persistence + default-project self-heal
aeb98d4 feat(protocol) — wire-protocol version handshake in auth_ok
b3bd269 feat(desktop) — protocol-mismatch banner
671aafa fix(desktop) — Files view no longer probes /root; proper loader during sync
5368009 chore(harness-routine) — pre-existing in-progress work bundled in

Note to reviewers: the last commit (5368009) bundles pre-existing uncommitted harness/routine work that was already on this branch when the Files fix started. It is unrelated to the workspace safety net — happy to split it into a follow-up PR if preferred.

What's fixed end-to-end

| Failure mode | Caught by |
| --- | --- |
| activeProject undefined on boot race | Layer 4 (safe UI) |
| activeProject missing workspacePath | Layer 4 + Layer 1 |
| Workspace dir deleted on disk | Layer 1 |
| .anton.json marker deleted | Layer 1 |
| Internal project dir deleted | Layer 1 |
| Process crash mid-write of JSON | Layer 2 (atomic) |
| SIGKILL leaves .tmp orphan | Layer 2 (sweep) |
| index.json corrupted (pre-fix crash) | Layer 2 (rebuild) |
| Desktop newer than server | Layer 3 (banner) |
| Server newer than desktop | Layer 3 (banner) |
| Server predates handshake entirely | Layer 3 (banner) |

Known limitations (intentionally not fixing here)

  • Stale fs_list response after project switch — pre-existing, requires per-request IDs.
  • Non-transactional pair writes across project.json + index.json — self-heal on next boot covers the worst cases; real fix is a write-ahead log.
  • Cross-process .tmp collision — single-process today; documented in the helper's docblock if Anton ever goes multi-process.

Test plan

  • Fresh install with an empty ~/.anton → Files tab shows "Loading projects…" spinner during sync, then the My Computer workspace with no errors
  • Deployed agent-server running as anton user → Files tab lists the workspace instead of EACCES
  • Manually delete workspacePath from a saved project.json → next connect logs [ensureDefaultProject] Repaired… and the file backfills
  • Manually corrupt index.json (truncate mid-record) → next boot logs [loadIndex] index.json unreadable… and rebuilds from proj_*/project.json
  • Navigate into subdirectory → list renders (no stuck loader regression)
  • Switch between projects quickly → no "empty folder" flash during transitions
  • Mismatched protocol version (temporarily bump local PROTOCOL_VERSION) → banner appears; dismissing persists only for that version
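
The self-heal items in the plan above exercise a check-and-repair pass over the default project. A rough sketch of that shape follows; the function name `healDefaultProject`, the record fields, the fallback path, and the marker contents are all hypothetical, and the real `ensureDefaultProject` may be structured differently.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical record shape; the real project schema has more fields.
interface DefaultProject {
  id: string;
  workspacePath?: string;
}

// Repairs whatever is missing and returns the list of repairs for logging.
function healDefaultProject(
  project: DefaultProject,
  fallbackWorkspace: string, // assumed default location for the workspace
  internalDir: string, // assumed location of the internal project dir
): string[] {
  const repairs: string[] = [];
  let workspacePath = project.workspacePath;
  if (!workspacePath) {
    workspacePath = fallbackWorkspace;
    project.workspacePath = workspacePath;
    repairs.push("backfilled workspacePath");
  }
  if (!fs.existsSync(workspacePath)) {
    fs.mkdirSync(workspacePath, { recursive: true });
    repairs.push("recreated workspace dir");
  }
  const marker = path.join(workspacePath, ".anton.json");
  if (!fs.existsSync(marker)) {
    // Placeholder contents; the real marker format is not shown in this PR text.
    fs.writeFileSync(marker, JSON.stringify({ projectId: project.id }));
    repairs.push("wrote .anton.json marker");
  }
  if (!fs.existsSync(internalDir)) {
    fs.mkdirSync(internalDir, { recursive: true });
    repairs.push("recreated internal project dir");
  }
  if (repairs.length > 0) {
    console.warn(`default project repaired: ${repairs.join(", ")}`); // illustrative wording
  }
  return repairs;
}

// Usage: a record with no workspacePath and nothing on disk yet.
const base = fs.mkdtempSync(path.join(os.tmpdir(), "heal-demo-"));
const project: DefaultProject = { id: "default" };
const firstPass = healDefaultProject(project, path.join(base, "workspace"), path.join(base, "internal"));
const secondPass = healDefaultProject(project, path.join(base, "workspace"), path.join(base, "internal"));
```

The second pass finds nothing to repair, which is what makes it safe to run the check on every connect.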

🤖 Generated with Claude Code

OmGuptaIND and others added 6 commits April 23, 2026 19:53
…elf-heal

- Add writeJsonAtomic() helper (tmp + fsync + rename); route all JSON
  state writes in projects.ts through it so a crash/OOM/power loss
  mid-write never leaves index.json or project.json truncated.
- sweepOrphanTmpFiles() on startup cleans up .tmp files left behind by a
  SIGKILL between openSync and renameSync.
- loadIndex() falls back to rebuildIndexFromDisk() when JSON.parse fails,
  scanning proj_*/project.json to reassemble the index. Previous behavior
  silently returned [] on parse error, which the next saveIndex would
  permanently overwrite.
- ensureDefaultProject() now validates and self-heals the existing
  default record: backfills missing workspacePath, recreates absent
  workspace dir, writes .anton.json marker when missing, recreates the
  internal project dir. Logs console.warn on repair so operators can
  detect drift.

Addresses the "EACCES: permission denied, scandir '/root'" symptom —
the client was falling back to /root when the stored default project
lacked workspacePath. Heal ensures that field is always populated on
every connect.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
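
To illustrate the `loadIndex` fallback described in the commit above, here is a small sketch that rebuilds the index from `proj_*/project.json` when `index.json` fails to parse. The record shape, the handling of a missing index, and the warning text are assumptions; only the function names and the directory pattern come from the commit message.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical record shape for illustration.
interface ProjectRecord {
  id: string;
  name: string;
}

// Scan proj_*/project.json and keep every record that still parses.
function rebuildIndexFromDisk(projectsDir: string): ProjectRecord[] {
  const records: ProjectRecord[] = [];
  for (const entry of fs.readdirSync(projectsDir, { withFileTypes: true })) {
    if (!entry.isDirectory() || !entry.name.startsWith("proj_")) continue;
    try {
      const raw = fs.readFileSync(path.join(projectsDir, entry.name, "project.json"), "utf8");
      records.push(JSON.parse(raw) as ProjectRecord);
    } catch {
      // Skip a missing or unreadable record instead of aborting the whole rebuild.
    }
  }
  return records;
}

function loadIndex(projectsDir: string): ProjectRecord[] {
  const indexPath = path.join(projectsDir, "index.json");
  if (!fs.existsSync(indexPath)) return []; // assumption: no index means a fresh install
  try {
    return JSON.parse(fs.readFileSync(indexPath, "utf8")) as ProjectRecord[];
  } catch {
    console.warn("index.json could not be parsed; rebuilding from project records"); // illustrative wording
    return rebuildIndexFromDisk(projectsDir);
  }
}

// Usage: one intact project record next to an index truncated mid-record.
const root = fs.mkdtempSync(path.join(os.tmpdir(), "index-demo-"));
fs.mkdirSync(path.join(root, "proj_a"));
fs.writeFileSync(
  path.join(root, "proj_a", "project.json"),
  JSON.stringify({ id: "proj_a", name: "Demo" }),
);
fs.writeFileSync(path.join(root, "index.json"), '[{"id":"proj_a","na');
const recovered = loadIndex(root);
```

Returning `[]` on a parse error, as the previous behavior did, would let the next save overwrite the index with an empty list; falling back to the per-project files avoids that.
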
- Add PROTOCOL_VERSION = 1 constant in @anton/protocol/version.ts with
  a bump-history comment for future breaking-change tracking.
- Extend AuthOkMessage with optional `protocolVersion?: number` so older
  servers that omit the field remain backwards-compatible.
- Server includes PROTOCOL_VERSION in its auth_ok payload.
- Desktop captures the server's reported version into
  connectionStore.serverProtocolVersion via controlHandler, cleared on
  disconnect so a reconnect to a different agent starts fresh.

Enables the mismatch banner (added in a follow-up commit) to surface
version skew between desktop and agent-server instead of letting it
silently produce confusing empty states.

Also contains unrelated harness/routine work (provider+model on
RoutineCreate/Update, server routing changes) that was already staged
on this branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
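
A compact sketch of the handshake plumbing described in the commit above, assuming a simplified `AuthOkMessage` and a plain object standing in for `connectionStore`. The handler names `onAuthOk` and `onDisconnect` are hypothetical; the real code routes this through `controlHandler`.

```typescript
const PROTOCOL_VERSION = 1;

// Simplified message shape; the real auth_ok payload carries more fields.
interface AuthOkMessage {
  type: "auth_ok";
  protocolVersion?: number; // optional so servers that omit it still parse
}

// Server side: advertise the compiled-in version.
function buildAuthOk(): AuthOkMessage {
  return { type: "auth_ok", protocolVersion: PROTOCOL_VERSION };
}

// Desktop side: a minimal stand-in for the connection store.
const connectionStore: { serverProtocolVersion: number | null } = {
  serverProtocolVersion: null,
};

function onAuthOk(msg: AuthOkMessage): void {
  // A missing field becomes null, meaning "server predates the handshake".
  connectionStore.serverProtocolVersion = msg.protocolVersion ?? null;
}

function onDisconnect(): void {
  // Clear on disconnect so a reconnect to a different agent starts fresh.
  connectionStore.serverProtocolVersion = null;
}

// Usage: a current server, then a legacy server that omits the field.
onAuthOk(JSON.parse(JSON.stringify(buildAuthOk())) as AuthOkMessage);
const fromCurrentServer = connectionStore.serverProtocolVersion;
onDisconnect();
onAuthOk({ type: "auth_ok" });
const fromLegacyServer = connectionStore.serverProtocolVersion;
```
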
- New ProtocolMismatchBanner component compares the server's reported
  protocolVersion against the compiled PROTOCOL_VERSION constant and
  surfaces a thin dismissible warning bar on mismatch. Copy flips based
  on whether the server is older ("update your server") or newer
  ("update your desktop"). Missing field (null) is treated as
  "server predates handshake".
- Dismissal state is scoped per server version, so connecting to a
  different server with a different skew surfaces the warning again
  instead of silently staying hidden.
- Gates rendering on `initPhase === 'syncing' || initPhase === 'ready'` so
  the banner doesn't flash during the pre-auth window.
- Mounted alongside the existing UpdateBanner / DesktopUpdateBanner in
  the workspace shell. Uses the thin `.reconnect-banner`-style layout,
  not the full-screen `.update-banner` overlay — soft-warn only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
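
The mismatch classification and per-version dismissal described in the commit above might look roughly like this. It is a sketch under assumptions: the `Skew` labels, the key format, and the in-memory `Set` (standing in for whatever persistence the banner uses) are all hypothetical.

```typescript
type Skew = "match" | "server-older" | "server-newer" | "server-predates-handshake";

function classifySkew(serverVersion: number | null, clientVersion: number): Skew {
  if (serverVersion === null) return "server-predates-handshake";
  if (serverVersion < clientVersion) return "server-older";
  if (serverVersion > clientVersion) return "server-newer";
  return "match";
}

// In-memory stand-in for persisted dismissal state.
const dismissedKeys = new Set<string>();

// Keying by server version means a different skew is not covered by an old dismissal.
function dismissalKey(serverVersion: number | null): string {
  return `protocol-mismatch-dismissed:${serverVersion ?? "none"}`;
}

function shouldShowBanner(serverVersion: number | null, clientVersion: number): boolean {
  if (classifySkew(serverVersion, clientVersion) === "match") return false;
  return !dismissedKeys.has(dismissalKey(serverVersion));
}

// Usage: dismiss the warning for server version 1, then meet other servers.
const desktopVersion = 2;
const shownBeforeDismiss = shouldShowBanner(1, desktopVersion);
dismissedKeys.add(dismissalKey(1));
const shownAfterDismiss = shouldShowBanner(1, desktopVersion);
const shownForNewSkew = shouldShowBanner(3, desktopVersion);
const shownForLegacyServer = shouldShowBanner(null, desktopVersion);
```
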
… sync

Root cause of the original "EACCES: permission denied, scandir '/root'"
error: ProjectFilesView and FileBrowser both fell back to the hardcoded
string '/root' whenever the active project's workspacePath was unknown.
On any non-root agent-server (i.e. every real deployment — ansible runs
it as `anton`), that path can't be scanned.

- startDir is now `string | null`. No `/root` fallback anywhere.
- When startDir is null: render a dedicated waiting state that shows a
  spinner + "Loading projects…" while connectionStore.initPhase !==
  'ready', then switches to a folder icon + "Select a project to
  browse its files." once sync is done but no project is active.
- When startDir becomes real but cwd hasn't caught up yet (initial
  null→real, or project switch), `awaitingInitialFetch` derives from
  "cwd is not within the startDir tree" and keeps the loader on. This
  closes the one-frame gap where the old code would flash "This folder
  is empty." The check explicitly treats subdirectories of startDir as
  already-loaded so normal navigation doesn't keep the loader stuck.
- All `listDir`/`refresh`/artifact-subscriber effects guard against
  null / empty paths so no fs_list is ever dispatched before
  workspacePath is known.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
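
The "cwd is not within the startDir tree" check from the commit above can be sketched as a pure function. This assumes POSIX-style absolute paths and a hypothetical helper name `isWithin`; the actual derivation in ProjectFilesView may be written differently.

```typescript
// True when cwd is startDir itself or any directory below it.
function isWithin(startDir: string, cwd: string): boolean {
  const root = startDir.endsWith("/") ? startDir.slice(0, -1) : startDir;
  // Compare with a trailing slash so "/home/anton/ws2" is not treated as inside "/home/anton/ws".
  return cwd === root || cwd.startsWith(`${root}/`);
}

function awaitingInitialFetch(startDir: string | null, cwd: string | null): boolean {
  if (startDir === null) return false; // the null case has its own waiting state
  if (cwd === null || cwd === "") return true; // nothing listed yet
  return !isWithin(startDir, cwd);
}

// Usage
const ws = "/home/anton/ws";
const beforeFirstList = awaitingInitialFetch(ws, null);
const atRoot = awaitingInitialFetch(ws, "/home/anton/ws");
const inSubdirectory = awaitingInitialFetch(ws, "/home/anton/ws/src");
const siblingPrefix = awaitingInitialFetch(ws, "/home/anton/ws2");
const afterProjectSwitch = awaitingInitialFetch("/home/anton/other", "/home/anton/ws");
```

Treating subdirectories as already loaded is what keeps normal navigation from leaving the loader on, while a cwd left over from a previous project still counts as stale.
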
Pre-existing uncommitted changes on this branch that are unrelated to
the Files/workspace/protocol fixes in the earlier commits. Bundled here
so nothing is lost, but reviewers should treat this as separate work:

- agent-config: config.ts updates
- agent-server: agent-manager, webhooks/agent-runner, workflows/workflow-installer
- protocol: projects + workflows schema extensions
- specs: PROVIDERS, HARNESS_ARCHITECTURE, agents/agents updates

Consider splitting to a follow-up PR if any of it isn't ready to ship
with the Files safety net.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Conflicts resolved in:
- packages/agent-server/src/server.ts:
  • createHarnessSession opts — kept both pre-existing branch fields
    (agentInstructions, agentMemory, background for routine/cron runs)
    and main's thinkingLevel field.
  • protocol import — kept PROTOCOL_VERSION (from this branch) alongside
    main's ThinkingLevel type import.
- packages/desktop/src/App.tsx:
  • kept both ProtocolMismatchBanner (this branch) and OnboardingTour
    (main) imports.

Auto-merges accepted in messages.ts, ProjectFilesView.tsx, index.css.

All packages build clean; desktop typecheck + biome check pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@OmGuptaIND OmGuptaIND merged commit f391106 into main Apr 24, 2026
OmGuptaIND added a commit that referenced this pull request Apr 24, 2026
### Features
- multi-format file attachments with @-mentions and preview renderers (#12)

### Fixes
- pnpm lock
- workspace safety net — no more /root EACCES, atomic writes, version handshake (#11)

### Chores
- clean up biome lint + format across repo

### Other
- feat(desktop): remove new-project attachments, add text-file creator in files view (#10)
- feat(desktop): reasoning effort pill + provider modal redesign + onboarding tour (#9)
- fix(desktop): honor real harness readiness in provider UI (#8)
- fix(harness): persist session title to meta.json across reloads (#7)