fix(update): guide EACCES manual recovery by brokemac79 · Pull Request #83757 · openclaw/openclaw

brokemac79 · 2026-05-18T20:56:04Z

Summary

add EACCES recovery hints that tell managed-Gateway operators to stop the Gateway before sudo/manual package recovery
document the root-owned Linux system-global recovery path: stop Gateway, run the system npm install, refresh the Gateway service, restart, then verify
add focused coverage for staged/global install EACCES hint text
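
The documented recovery order can be sketched as data, which makes the stop-first ordering easy to check. This is a minimal illustration only: `buildSystemRecoveryPlan` and `RecoveryStep` are hypothetical names, not part of the OpenClaw codebase. The command strings are the ones quoted in this PR thread, with the verification command taken from the maintainer verification comment, and `<system-npm>` is left as the same placeholder the hint uses.

```typescript
// Hypothetical helper (not from the OpenClaw source): it only assembles the
// command strings quoted in this PR so the required ordering is explicit.
type RecoveryStep = { label: string; command: string };

function buildSystemRecoveryPlan(systemNpm: string): RecoveryStep[] {
  return [
    // Stop first so the Gateway cannot load files mid-replacement.
    { label: "stop the managed Gateway", command: "openclaw gateway stop" },
    { label: "run the system npm install", command: "sudo " + systemNpm + " i -g openclaw@latest" },
    { label: "refresh the Gateway service", command: "openclaw gateway install --force" },
    { label: "restart the Gateway", command: "openclaw gateway restart" },
    { label: "verify service status", command: "openclaw gateway status --deep --json" },
  ];
}

// Render the plan as a numbered checklist; nothing is executed here.
const plan = buildSystemRecoveryPlan("<system-npm>");
const planText = plan
  .map((step, i) => (i + 1) + ". " + step.command + "  # " + step.label)
  .join("\n");
```

The sketch deliberately builds strings rather than spawning processes, since the PR itself changes guidance only and leaves the updater lifecycle untouched.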

Why

Current main can restart the old managed Gateway after a staged package update fails with EACCES. The existing recovery hint points operators toward sudo/manual package recovery, but it does not say to stop the Gateway first. That leaves a window where the running Gateway can try to load core/plugin files while npm is replacing the package tree.

This follows ClawSweeper's narrow guidance on #83747: improve recovery hints/docs and focused tests without changing the updater lifecycle policy.
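
To make the guidance gap concrete, here is a rough sketch of a hint builder that adds the stop-first line whenever a failure mentions EACCES. The review names `inferUpdateFailureHints` as the real code path, but its signature is not shown in this thread, so `sketchEaccesHints` below is a hypothetical stand-in, and matching on the literal token `EACCES` is an assumption. The hint strings are copied from the proof output in this PR.

```typescript
// Hypothetical stand-in for the real hint logic; the actual function's
// signature and matching rules are not shown in this PR thread.
function sketchEaccesHints(failureText: string): string[] {
  // Assumption: a permission failure is recognised by the literal token EACCES.
  if (!/\bEACCES\b/.test(failureText)) {
    return [];
  }
  return [
    "Detected permission failure (EACCES). Re-run with a writable global prefix or sudo (for system-managed Node installs).",
    // The line this PR adds: stop the Gateway before manual package replacement.
    "If you recover with sudo/manual package install on a managed Gateway, stop the Gateway first so it does not load files while the package tree is being replaced.",
    "Example: npm config set prefix ~/.local && npm i -g openclaw@latest",
    "System install outline: openclaw gateway stop -> sudo <system-npm> i -g openclaw@latest -> openclaw gateway install --force -> openclaw gateway restart.",
  ];
}

// Usage with the failure message captured in the proof transcript.
const sampleFailure =
  "EACCES: permission denied, mkdtemp '/tmp/openclaw-83757-proof-REDACTED/prefix/lib/node_modules/.openclaw-update-stage-S2S8Ob'";
const sampleHints = sketchEaccesHints(sampleFailure);
```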

Closes #83747

Real behavior proof

Behavior or issue addressed: EACCES recovery guidance for root-owned/system-global npm installs now fails closed at the operator level by instructing users to stop the managed Gateway before manual package replacement.
Real environment tested: Ubuntu VPS disposable npm-global proof environment, Node v22.22.0, npm 10.9.4, throwaway OPENCLAW_PROFILE=proof83757, throwaway NPM_CONFIG_PREFIX under /tmp. Also validated in a Windows desktop source checkout from upstream main 424c6d0a5, Node v24.13.0, pnpm 11.1.0.
Exact steps or command run after this patch: Installed openclaw@latest into a temp npm prefix on the VPS, applied this PR's EACCES hint change to the packaged update CLI bundle in that disposable install, made only the temp prefix's lib/node_modules unwritable to trigger the real global install stage EACCES path, then ran OPENCLAW_PROFILE=proof83757 NPM_CONFIG_PREFIX=<temp-prefix> openclaw update --no-restart --yes --tag latest --timeout 20.
Evidence after fix: Redacted terminal output from the disposable VPS OpenClaw setup:

ENV: VPS disposable proof, node=v22.22.0 npm=10.9.4 profile=proof83757
PATCH: disposable install now contains PR #83757 EACCES hint text
OpenClaw 2026.5.18 (50a2481)
COMMAND: OPENCLAW_PROFILE=proof83757 NPM_CONFIG_PREFIX=<temp-prefix> openclaw update --no-restart --yes --tag latest --timeout 20
EXIT_CODE: 1

Update Result: ERROR
  Root: /tmp/openclaw-83757-proof-REDACTED/prefix/lib/node_modules/openclaw
  Reason: global install stage
  Before: 2026.5.18
  After: 2026.5.18

Steps:
  x global install stage (0ms)
      EACCES: permission denied, mkdtemp '/tmp/openclaw-83757-proof-REDACTED/prefix/lib/node_modules/.openclaw-update-stage-S2S8Ob'

Recovery hints:
  - Detected permission failure (EACCES). Re-run with a writable global prefix or sudo (for system-managed Node installs).
  - If you recover with sudo/manual package install on a managed Gateway, stop the Gateway first so it does not load files while the package tree is being replaced.
  - Example: npm config set prefix ~/.local && npm i -g openclaw@latest
  - System install outline: openclaw gateway stop -> sudo <system-npm> i -g openclaw@latest -> openclaw gateway install --force -> openclaw gateway restart.

Total time: 473ms
ASSERT: proof output contains EACCES recovery hints from patched CLI/update path
CLEANUP: removed disposable proof directory

Observed result after fix: The real CLI/update global install stage EACCES path prints the new stop-before-manual-recovery guidance and the system install outline with gateway install --force and gateway restart.
What was not tested: I did not perform a destructive package replacement against the live production OpenClaw install; this PR intentionally changes guidance/tests only and does not alter package-manager lifecycle behavior. The VPS proof used a disposable temp npm prefix/cache/home/profile only. The temp proof directory was removed after capture. I did not hotfix, cherry-pick into, stop, restart, or otherwise mutate the live VPS OpenClaw install/gateway.
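
The `ASSERT` line in the transcript above checks that the captured output contains the new hints. One way such a check could look is sketched below; `missingHintFragments` and `REQUIRED_FRAGMENTS` are hypothetical names, and the choice of which fragments to require is an assumption rather than the exact assertion used for the proof.

```typescript
// Hypothetical check mirroring the "ASSERT" line in the proof transcript.
// Assumption: these four fragments are enough to identify the new guidance.
const REQUIRED_FRAGMENTS: string[] = [
  "Detected permission failure (EACCES)",
  "stop the Gateway first",
  "openclaw gateway install --force",
  "openclaw gateway restart",
];

// Returns the fragments that do NOT appear in the captured CLI output;
// an empty array means the output carries all of the expected hints.
function missingHintFragments(output: string): string[] {
  return REQUIRED_FRAGMENTS.filter(function (fragment) {
    return output.indexOf(fragment) === -1;
  });
}

// Usage with two hint lines copied from the transcript.
const capturedOutput = [
  "Recovery hints:",
  "  - Detected permission failure (EACCES). Re-run with a writable global prefix or sudo (for system-managed Node installs).",
  "  - If you recover with sudo/manual package install on a managed Gateway, stop the Gateway first so it does not load files while the package tree is being replaced.",
  "  - System install outline: openclaw gateway stop -> sudo <system-npm> i -g openclaw@latest -> openclaw gateway install --force -> openclaw gateway restart.",
].join("\n");
const missing = missingHintFragments(capturedOutput);
```

Returning the missing fragments, rather than a bare boolean, makes a failed proof run easier to diagnose.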

Validation

node scripts/run-vitest.mjs src/cli/update-cli/progress.test.ts -- --reporter=verbose
Test Files  1 passed (1)
Tests       6 passed (6)

git diff --check
# no output

codex-review
codex-review clean: no accepted/actionable findings reported

clawsweeper · 2026-05-18T20:57:08Z

Codex review: needs maintainer review before merge.

Workflow note: Future ClawSweeper reviews update this same comment in place.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

Summary
Adds managed-Gateway stop-first recovery hints for npm EACCES update failures, documents the root-owned Linux recovery sequence, adds focused hint assertions, and updates the changelog.

Reproducibility: yes, at the source-path level: npm global update or global install stage EACCES failures flow through inferUpdateFailureHints, and the linked report plus PR body show real EACCES output for that path. I did not run a destructive root-owned package replacement during this read-only review.

PR rating
Overall: 🦞 diamond lobster
Proof: 🦞 diamond lobster
Patch quality: 🦞 diamond lobster
Summary: Strong focused PR with supplied real CLI proof, targeted tests, and no blocking review findings.

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

PR egg
✨ Hatched: 🌱 uncommon Sunspot Crabkin

        /\     /\            
      _/  \___/  \_          
     /  ( o   o )  \         
    |      \_/      |        
    |   /\  ===  /\ |        
     \_/  \_____/  \_/       
        _/|_| |_|\_          
       /__| | | |__\         
          ' ' ' '            
         /_/     \_\         
       .-----------.         
      '-------------'

Rarity: 🌱 uncommon.
Trait: hums during re-review.

What is this egg doing here?

Eggs appear after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
How to hatch it: reach status: 👀 ready for maintainer look or status: 🚀 automerge armed; that usually means sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness.
The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

Real behavior proof
Sufficient (terminal): The PR body includes redacted after-patch terminal output from a disposable Ubuntu npm-global setup showing the updated EACCES recovery hint in the real CLI path.

Next step before merge
No repair lane is needed because the patch has no blocking findings; maintainers should use normal review and CI gating.

Security
Cleared: The diff changes docs, literal CLI recovery strings, a changelog entry, and focused tests; no concrete security or supply-chain regression was found.

Review details

Best possible solution:

Land the narrow recovery guidance and tests after normal CI and maintainer review, leaving any automatic fail-closed updater lifecycle change to a separate owner decision.

Do we have a high-confidence way to reproduce the issue?

Yes, at the source-path level: npm global update or global install stage EACCES failures flow through inferUpdateFailureHints, and the linked report plus PR body show real EACCES output for that path. I did not run a destructive root-owned package replacement during this read-only review.

Is this the best way to solve the issue?

Yes: updating the CLI recovery hint, docs, and focused tests is the narrow maintainable fix for the guidance gap. Changing whether the updater keeps the Gateway stopped after EACCES would be a separate availability and lifecycle policy decision.

Label justifications:

P2: This is a focused CLI/update recovery improvement for a limited Linux root-owned npm install path.

What I checked:

Current main CLI hint gap: Current main only prints the generic EACCES writable-prefix/sudo hint and the user-writable npm prefix example; it does not tell managed-Gateway operators to stop the Gateway before manual package replacement. (src/cli/update-cli/progress.ts:84, 583eb711ecb1)
Current main lifecycle context: The updater stops a running managed Gateway before package updates, but restarts it after a failed package update, matching the linked recovery-guidance gap rather than requiring this PR to change lifecycle policy. (src/cli/update-cli/update-command.ts:846, 583eb711ecb1)
Shipped release gap: The latest release tag v2026.5.18 has the same generic EACCES hints and manual-update docs as current main, so the PR is not obsolete on main or in the shipped release. (src/cli/update-cli/progress.ts:84, 50a2481652b6)
PR CLI change: The PR head adds a managed-Gateway stop-first warning and a system install outline to the existing npm global EACCES hint path. (src/cli/update-cli/progress.ts:88, a2883acb74c5)
PR test coverage: The PR head extends focused EACCES hint tests for both global update and staged package permission failures to assert the new Gateway and system-npm guidance. (src/cli/update-cli/progress.test.ts:67, a2883acb74c5)
PR docs change: The PR head documents stopping the managed Gateway before root-owned Linux npm recovery, reinstalling, refreshing the service, restarting, and verifying health. Public docs: docs/install/updating.md. (docs/install/updating.md:100, a2883acb74c5)

Likely related people:

Dallin Romney: Local blame for the current EACCES hint function, managed Gateway update lifecycle, and updating docs points to the same current-checkout baseline commit. (role: recent area contributor; confidence: medium; commits: cf194419c315; files: src/cli/update-cli/progress.ts, src/cli/update-cli/update-command.ts, docs/install/updating.md)
Josh Lehman: Recently changed npm managed install/update behavior around package freshness filters, adjacent to this PR's npm global update recovery path. (role: recent adjacent contributor; confidence: medium; commits: 85a3d5312f7d; files: src/infra/update-global.ts, src/infra/update-runner.test.ts, src/infra/npm-install-env.ts)
steipete: Prepared the v2026.5.18 release that the linked issue upgraded to and is assigned on this PR, making them a good route for update/release recovery review. (role: adjacent release owner; confidence: medium; commits: 50a2481652b6, 583eb711ecb1; files: CHANGELOG.md, package.json, scripts/notarize-mac-artifact.sh)

Codex review notes: model gpt-5.5, reasoning high; reviewed against 583eb711ecb1.

brokemac79 · 2026-05-18T21:28:53Z

@clawsweeper re-review

clawsweeper · 2026-05-18T21:30:30Z

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

State: Complete
Detail: The targeted re-review finished, the durable review comment was updated, and the synced verdict was routed.
Run: https://github.com/openclaw/clawsweeper/actions/runs/26061612768
Updated: 2026-05-18T21:37:13.467Z

steipete · 2026-05-18T23:13:46Z

Maintainer verification before landing.

Behavior addressed: EACCES during manual npm recovery now tells users to stop the managed Gateway before replacing the package, then restart and verify the service.
Real environment tested: local source checkout on macOS; GitHub Actions on PR head a2883ac.
Exact steps or command run after this patch:

pnpm docs:list
node scripts/run-vitest.mjs src/cli/update-cli/progress.test.ts -- --reporter=verbose
git diff --check
git diff --check origin/main...HEAD
AUTOREVIEW_AUTO_TESTS=0 OPENCLAW_TESTBOX=1 .agents/skills/autoreview/scripts/autoreview --mode branch
Evidence after fix:
vitest: 1 file, 6 tests passed
diff checks: clean
autoreview: clean after fixing the accepted docs verification issue
CI: passed on a2883ac
CI run: https://github.com/openclaw/openclaw/actions/runs/26065592995
Real behavior proof: https://github.com/openclaw/openclaw/actions/runs/26065601813
CodeQL Critical Quality: https://github.com/openclaw/openclaw/actions/runs/26065592970
Observed result after fix: docs now verify service status with openclaw gateway status --deep --json and lint separately with openclaw doctor --lint --json.
What was not tested: no live root-owned npm EACCES repro was created on this machine.

openclaw-barnacle Bot added labels on May 18, 2026: docs (Improvements or additions to documentation), cli (CLI command changes), size: XS, triage: mock-only-proof (Candidate: PR proof only shows tests, mocks, snapshots, lint, typecheck, or CI.)

clawsweeper Bot added the label proof: sufficient (ClawSweeper judged the real behavior proof convincing.) on May 18, 2026

steipete self-assigned this May 18, 2026

brokemac79 and others added 3 commits May 18, 2026 23:56

fix(update): guide EACCES manual recovery

f515a47

docs: update changelog for EACCES recovery (openclaw#83757) (thanks @brokemac79)

0b96130

docs: fix update recovery verification (openclaw#83757)

a2883ac

steipete force-pushed the fix/issue-83747-eacces-recovery branch from ad5155a to a2883ac May 18, 2026 23:05

openclaw-barnacle Bot removed the label proof: sufficient (ClawSweeper judged the real behavior proof convincing.) on May 18, 2026

clawsweeper Bot added the label proof: sufficient (ClawSweeper judged the real behavior proof convincing.) on May 18, 2026

steipete merged commit 0903fa6 into openclaw:main May 18, 2026
109 of 110 checks passed

steipete added a commit that referenced this pull request May 18, 2026

docs: update changelog for EACCES recovery (#83757) (thanks @brokemac79)

bcdfbb8

steipete mentioned this pull request May 18, 2026

[Bug]: system npm upgrade can leave running gateway in transient half-swapped package/plugin state #83747

Closed

github-actions Bot mentioned this pull request May 19, 2026

📡 Upstream Digest — 2026-05-19 02:33 UTC curtismercier/openclaw-mods#895

Open
