
refactor: introduce OpenCodeSupervisor for server lifecycle management #200

Merged
chriswritescode-dev merged 1 commit into main from refactor/health
Apr 25, 2026

@chriswritescode-dev
Owner

Summary

  • Introduce OpenCodeSupervisor with lifecycle state machine (idle → starting → healthy/unhealthy → recovering → failed)
  • Add operation locking to opencodeServerManager to prevent concurrent start/stop/restart/reload races
  • Implement 4-step recovery pipeline: restart, debug capture, rollback to last known good config, seed default config
  • Add health polling with configurable interval, threshold, and enable/disable toggle
  • Wire supervisor into all route factories with graceful fallback to direct manager calls
  • Add archiveBrokenConfig method to SettingsService for backup before rollback
  • Add health watch configuration defaults and env vars (HEALTH_POLL_MS, HEALTH_FAILURE_THRESHOLD, HEALTH_WATCH_ENABLED)
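
The lifecycle states and the failure threshold listed above can be sketched roughly as follows. This is an illustration only, not the code in `opencode-supervisor.ts`: the state names come from the summary, but the `HealthTracker` class, the `allowed` transition table, and the exact set of permitted transitions are assumptions.

```typescript
// Hypothetical sketch. State names are taken from the PR summary; the
// transition table and the HealthTracker class are illustrative assumptions.
type LifecycleState =
  | "idle"
  | "starting"
  | "healthy"
  | "unhealthy"
  | "recovering"
  | "failed";

// Assumed transition table: which states may follow each state.
const allowed: Record<LifecycleState, LifecycleState[]> = {
  idle: ["starting"],
  starting: ["healthy", "unhealthy"],
  healthy: ["unhealthy", "idle"],
  unhealthy: ["healthy", "recovering"],
  recovering: ["healthy", "failed"],
  failed: ["starting"],
};

class HealthTracker {
  private current: LifecycleState = "idle";
  private consecutiveFailures = 0;
  private readonly failureThreshold: number;

  constructor(failureThreshold: number) {
    this.failureThreshold = failureThreshold;
  }

  get state(): LifecycleState {
    return this.current;
  }

  // Move to the next state, rejecting transitions the table does not list.
  transition(next: LifecycleState): void {
    if (!allowed[this.current].includes(next)) {
      throw new Error(`invalid transition: ${this.current} -> ${next}`);
    }
    this.current = next;
  }

  // Record one poll result. A success resets the failure counter; only a run
  // of consecutive failures reaching the threshold marks the server unhealthy.
  recordProbe(ok: boolean): void {
    if (ok) {
      this.consecutiveFailures = 0;
      if (this.current === "starting" || this.current === "unhealthy") {
        this.transition("healthy");
      }
      return;
    }
    this.consecutiveFailures += 1;
    const canDegrade = this.current === "healthy" || this.current === "starting";
    if (canDegrade && this.consecutiveFailures >= this.failureThreshold) {
      this.transition("unhealthy");
    }
  }
}
```

With a threshold of 2, a single failed poll leaves a healthy server in `healthy`; the second consecutive failure moves it to `unhealthy`, at which point a supervisor could start recovery.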

Changes

  • backend/src/services/opencode-supervisor.ts - New supervisor class with lifecycle management
  • backend/src/services/opencode-single-server.ts - Operation locking and allowNested parameter
  • backend/src/services/settings.ts - New archiveBrokenConfig method
  • backend/src/index.ts - Wire up supervisor, update shutdown logic
  • backend/src/routes/health.ts - Use supervisor lifecycle state for health checks
  • backend/src/routes/oauth.ts - Delegate config reload to supervisor
  • backend/src/routes/providers.ts - Delegate config reload to supervisor
  • backend/src/routes/repos.ts - Delegate restart to supervisor
  • backend/src/routes/settings.ts - Delegate reload/restart to supervisor (~20 call sites)
  • shared/src/config/defaults.ts - Health watch config defaults
  • shared/src/config/env.ts - Health watch env vars
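
As a rough illustration of how the health watch env vars might be read, here is a minimal sketch. Only the variable names `HEALTH_POLL_MS`, `HEALTH_FAILURE_THRESHOLD`, and `HEALTH_WATCH_ENABLED` come from this PR; the default values, the `HealthWatchConfig` shape, and the parsing helpers are assumptions and may differ from what `shared/src/config` actually does.

```typescript
// Hypothetical sketch. Env var names come from the PR; the defaults below and
// the HealthWatchConfig shape are assumptions for illustration.
interface HealthWatchConfig {
  pollMs: number;
  failureThreshold: number;
  enabled: boolean;
}

const assumedDefaults: HealthWatchConfig = {
  pollMs: 5000,
  failureThreshold: 3,
  enabled: true,
};

// Accept only positive integers; anything else falls back to the default.
function parsePositiveInt(raw: string | undefined, fallback: number): number {
  if (raw === undefined) return fallback;
  const n = Number(raw);
  return Number.isInteger(n) && n > 0 ? n : fallback;
}

// Accept "true"/"1" and "false"/"0" (case-insensitive); otherwise fall back.
function parseBool(raw: string | undefined, fallback: boolean): boolean {
  if (raw === undefined) return fallback;
  const v = raw.trim().toLowerCase();
  if (v === "true" || v === "1") return true;
  if (v === "false" || v === "0") return false;
  return fallback;
}

function loadHealthWatchConfig(
  env: Record<string, string | undefined>,
): HealthWatchConfig {
  return {
    pollMs: parsePositiveInt(env.HEALTH_POLL_MS, assumedDefaults.pollMs),
    failureThreshold: parsePositiveInt(
      env.HEALTH_FAILURE_THRESHOLD,
      assumedDefaults.failureThreshold,
    ),
    enabled: parseBool(env.HEALTH_WATCH_ENABLED, assumedDefaults.enabled),
  };
}
```

In a Node backend the function would typically be called with `process.env`. Taking the environment as a parameter keeps the parsing easy to unit test.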

Type of Change

  • Bug fix
  • New feature
  • Refactor
  • Documentation

Checklist

  • Code follows project style (no comments, named imports)
  • TypeScript types are properly defined
  • Tests added/updated (80% coverage target)
  • pnpm lint passes locally
  • pnpm typecheck passes locally

Replace direct opencodeServerManager calls with a supervisor that
provides health polling, operation locking, and a 4-step recovery
pipeline (restart, debug capture, rollback, seed default config).

Add archiveBrokenConfig to SettingsService for backup before rollback.
Wire supervisor into all route factories with graceful fallback to
direct manager calls. Add health watch configuration defaults and
env vars.
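
The 4-step recovery pipeline named above could be structured along these lines. This is a sketch under stated assumptions: the step names follow the description, but the `RecoveryStep` shape, the `runRecovery` function, and the rule that the pipeline stops at the first step reporting a healthy server are hypothetical, not taken from the merged code.

```typescript
// Hypothetical sketch. Step names follow the PR description; RecoveryStep and
// runRecovery are illustrative and not the supervisor's real API.
type RecoveryStepName =
  | "restart"
  | "debug-capture"
  | "rollback-last-known-good"
  | "seed-default-config";

interface RecoveryStep {
  name: RecoveryStepName;
  // Resolves to true if the server is healthy after this step has run.
  run: () => Promise<boolean>;
}

// Run the steps in order and stop at the first one that restores health.
// Returns the name of that step, or null if every step failed, in which case
// a supervisor would presumably move to its "failed" state.
async function runRecovery(
  steps: RecoveryStep[],
): Promise<RecoveryStepName | null> {
  for (const step of steps) {
    try {
      if (await step.run()) return step.name;
    } catch {
      // A throwing step counts as a failed attempt; fall through to the next.
    }
  }
  return null;
}
```

Ordering the steps from least to most invasive means a plain restart is tried before any configuration is rolled back or replaced.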
@chriswritescode-dev merged commit 3dd8e34 into main on Apr 25, 2026
4 checks passed
@chriswritescode-dev deleted the refactor/health branch on April 25, 2026 23:04
