fix: harden Telegram restart recovery on macOS (#484)

Merged: artemgetmann merged 1 commit into main from codex/gemini-api-key-runtime-20260417 on Apr 17, 2026

Conversation

@artemgetmann
Owner

Review Fast Path

  • User path fixed: Telegram /restart on the shared main bot no longer uses the lane-local detached helper that can unload ai.openclaw.gateway, and Telegram polling restarts now keep the learned IPv4 fallback instead of rediscovering the same failing network path.
  • Proof: targeted tests passed in this worktree: pnpm vitest run src/infra/restart-trigger.test.ts src/auto-reply/reply/commands-restart.test.ts src/media/fetch.telegram-network.test.ts
  • Shared-state footgun removed: the canonical shared macOS LaunchAgent ai.openclaw.gateway is now excluded from scripts/restart-local-gateway.sh and from helper eligibility in /restart.
  • Still hurts: the live shared runtime still shows an operational startup/launchd problem after manual recovery; this PR fixes the code paths we identified, but does not claim the current runtime is fully healed without deploying and re-verifying on main.
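
The helper-eligibility change described above can be sketched roughly as follows. This is a minimal illustration, not the code from the diff: `RestartContext`, `isDetachedHelperEligible`, and the lane label in the usage lines are hypothetical names, and only the `ai.openclaw.gateway` label comes from this PR description.

```typescript
// Hypothetical sketch; the real check lives in the restart-trigger code and may differ.
// Label of the canonical shared macOS LaunchAgent, as named in this PR.
const SHARED_GATEWAY_LABEL = "ai.openclaw.gateway";

interface RestartContext {
  platform: string;      // e.g. the value of process.platform
  launchdLabel?: string; // label of the LaunchAgent hosting this runtime, if known
}

// Returns false when the detached lane-local helper must not be used,
// because it could unload the shared service and fail to relaunch it.
function isDetachedHelperEligible(ctx: RestartContext): boolean {
  const isSharedGateway =
    ctx.platform === "darwin" && ctx.launchdLabel === SHARED_GATEWAY_LABEL;
  return !isSharedGateway;
}

// Usage: the shared main service is excluded, an isolated lane (made-up label) is not.
const shared = isDetachedHelperEligible({
  platform: "darwin",
  launchdLabel: "ai.openclaw.gateway",
});
const lane = isDetachedHelperEligible({
  platform: "darwin",
  launchdLabel: "ai.openclaw.lane-example",
});
console.log({ shared, lane });
```

Under this assumption, what the shared runtime does instead of the detached helper is left to the caller; the sketch only covers the eligibility decision.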

Why This Matters

  • /restart from Telegram could briefly bring the bot back and then take the shared main runtime fully offline.
  • Telegram polling was repeatedly stalling on network fallback and restarting into the same broken transport state.
  • Both failures amplify each other: transport wedges encourage restart use, and restart could brick the shared bot.

Scope Boundary

  • Changed: restart helper eligibility, detached restart guardrails, Telegram transport sticky IPv4 fallback persistence, and regression tests.
  • Not changed: launchd/watchdog operational recovery flow for the currently broken shared runtime instance, model/provider auth logic, or unrelated channel behavior.
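
The sticky IPv4 fallback persistence listed under "Changed" can be pictured with a process-level cache that outlives individual transport instances. The sketch below is an assumption about the general shape, not the implementation in this PR: `FamilyPolicy`, `TelegramTransportSketch`, and `stickyPolicyByHost` are hypothetical names.

```typescript
// Hypothetical sketch of process-scoped sticky fallback state.
type FamilyPolicy = "auto" | "ipv4";

// Module-level map: survives creation of fresh transport instances,
// but is lost when the process exits (matching the "within a process" caveat).
const stickyPolicyByHost = new Map<string, FamilyPolicy>();

class TelegramTransportSketch {
  readonly host: string;

  constructor(host: string) {
    this.host = host;
  }

  // A fresh instance starts from the learned policy, not from "auto".
  get policy(): FamilyPolicy {
    return stickyPolicyByHost.get(this.host) ?? "auto";
  }

  // Called after an auto-family attempt failed and an IPv4 retry worked.
  recordIpv4Fallback(): void {
    stickyPolicyByHost.set(this.host, "ipv4");
  }
}

// Test helper so regression tests can start from a clean state.
function resetStickyPolicyForTests(): void {
  stickyPolicyByHost.clear();
}

// Usage: a polling restart creates a new transport that keeps the learned fallback.
const first = new TelegramTransportSketch("api.telegram.org");
first.recordIpv4Fallback();
const afterRestart = new TelegramTransportSketch("api.telegram.org");
console.log(afterRestart.policy);
```

Keeping the state at module scope is one simple way to get the behavior described here; it also explains the risk noted in the commit message, since the cached policy persists until the process restarts.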

Verification

  • pnpm install --frozen-lockfile
  • pnpm vitest run src/infra/restart-trigger.test.ts src/auto-reply/reply/commands-restart.test.ts src/media/fetch.telegram-network.test.ts
  • Runtime evidence captured during investigation:
    • shared main runtime was killed by /restart after briefly recovering
    • logs showed repeated Telegram polling stalls plus repeated IPv4 fallback activation

AI Assistance

  • AI-assisted
  • Testing degree: targeted

- what: block the shared main launchd service from using the detached lane-local restart helper, and persist Telegram's learned sticky IPv4 fallback across fresh transport instances with regression tests.
- why: /restart could unload ai.openclaw.gateway and fail to relaunch it, while polling restarts kept re-entering the same failing IPv6/auto-family path and looping on stalls.
- risk: restart behavior changes for the canonical shared main runtime on macOS, and the Telegram transport now caches its IPv4 fallback policy for the lifetime of a process; isolated lane helpers still use the old detached path.
@artemgetmann artemgetmann merged commit 7469b1d into main Apr 17, 2026
3 of 10 checks passed