fix(update): guide EACCES manual recovery #83757
Codex review: needs maintainer review before merge. Workflow note: future ClawSweeper reviews update this same comment in place.
Summary
Reproducibility: yes, at the source-path level. npm global update or global install stage EACCES failures flow through inferUpdateFailureHints, and the linked report plus PR body show real EACCES output for that path. I did not run a destructive root-owned package replacement during this read-only review.
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.
PR egg rarity: 🌱 uncommon.
Review details
Best possible solution: Land the narrow recovery guidance and tests after normal CI and maintainer review, leaving any automatic fail-closed updater lifecycle change to a separate owner decision.
Do we have a high-confidence way to reproduce the issue? Yes, at the source-path level: npm global update or global install stage EACCES failures flow through inferUpdateFailureHints, and the linked report plus PR body show real EACCES output for that path. I did not run a destructive root-owned package replacement during this read-only review.
Is this the best way to solve the issue? Yes: updating the CLI recovery hint, docs, and focused tests is the narrow maintainable fix for the guidance gap. Changing whether the updater keeps the Gateway stopped after EACCES would be a separate availability and lifecycle policy decision.
Codex review notes: model gpt-5.5, reasoning high; reviewed against 583eb711ecb1.
@clawsweeper re-review
🦞🧹 I asked ClawSweeper to review this item again.
ad5155a to a2883ac
Maintainer verification before landing. Behavior addressed: EACCES during manual npm recovery now tells users to stop the managed Gateway before replacing the package, then restart and verify the service.
Summary
Why
Current main can restart the old managed Gateway after a staged package update fails with EACCES. The existing recovery hint points operators toward sudo/manual package recovery, but it does not say to stop the Gateway first. That leaves a window where the running Gateway can try to load core/plugin files while npm is replacing the package tree.
This follows ClawSweeper's narrow guidance on #83747: improve recovery hints/docs and focused tests without changing the updater lifecycle policy.
Closes #83747
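The ordering concern described under Why can be illustrated with a minimal TypeScript sketch. This is not the project's inferUpdateFailureHints implementation, which is not shown here. The UpdateFailure type, the sketchRecoveryHints function, and the hint wording are all hypothetical and exist only to show the intended order: stop the Gateway first, replace the package, then restart and verify.

```typescript
// Hypothetical shape for illustration; the real updater types are not shown in this PR text.
type UpdateFailure = {
  stage: "global update" | "global install stage" | "other";
  stderr: string;
};

// npm reports permission failures with the errno code EACCES in its error output.
function isEacces(stderr: string): boolean {
  return /\bEACCES\b/.test(stderr);
}

// Returns recovery hints in the order an operator should follow them.
// The hint strings are illustrative wording, not the strings this PR adds.
function sketchRecoveryHints(failure: UpdateFailure): string[] {
  const npmStage =
    failure.stage === "global update" || failure.stage === "global install stage";
  if (!npmStage || !isEacces(failure.stderr)) {
    return [];
  }
  return [
    // Stopping first closes the window in which a running Gateway could
    // load files from a package tree that npm is still replacing.
    "Stop the managed Gateway before replacing the package.",
    "Re-run the package install manually with sufficient permissions.",
    "Restart the Gateway and verify the service.",
  ];
}

const hints = sketchRecoveryHints({
  stage: "global install stage",
  stderr: "npm error code EACCES\nnpm error syscall mkdir",
});
console.log(hints.join("\n"));
```

Returning an ordered list rather than a single message keeps the stop step from being buried after the sudo guidance, which is the gap this PR describes.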
Real behavior proof
v22.22.0, npm 10.9.4, throwaway OPENCLAW_PROFILE=proof83757, throwaway NPM_CONFIG_PREFIX under /tmp. Also validated in a Windows desktop source checkout from upstream main 424c6d0a5, Node v24.13.0, pnpm 11.1.0.
openclaw@latest into a temp npm prefix on the VPS, applied this PR's EACCES hint change to the packaged update CLI bundle in that disposable install, made only the temp prefix's lib/node_modules unwritable to trigger the real global install stage EACCES path, then ran OPENCLAW_PROFILE=proof83757 NPM_CONFIG_PREFIX=<temp-prefix> openclaw update --no-restart --yes --tag latest --timeout 20.
global install stage EACCES path prints the new stop-before-manual-recovery guidance and the system install outline with gateway install --force and gateway restart.
Validation