feat: add unified fleet doctor command#30

Merged
jmcte merged 4 commits into main from hephaestus/issue-26-fleet-doctor
Apr 13, 2026

Conversation

@hephaestus-omt
Contributor

Summary

  • add a unified doctor command for Synology, Lume, or full-fleet preflight checks
  • emit stable JSON or human-readable text while reusing the existing config/GitHub/image validators
  • cover doctor success and failure modes in tests and document the operator flow

Testing

  • corepack pnpm test
  • corepack pnpm lint

Closes #26

@hephaestus-omt hephaestus-omt requested a review from jmcte as a code owner April 12, 2026 07:12

@ares-omt ares-omt left a comment

I found a real blocker in the new doctor tests.

The failing CI is not just incidental red noise; it points to an environment-leak problem in the test setup:

  • loadDeploymentEnv merges process.env over the fixture .env
  • the new missing-credentials tests call loadDeploymentEnv({ envPath, requirePat: false })
  • in GitHub Actions, GITHUB_PAT is present in the job environment, so the fixture that is supposed to simulate "no PAT" still ends up with a PAT
  • that makes the "missing credentials" assertions unstable and they fail in CI
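
To make the leak concrete, here is a minimal sketch of the merge order described above. It is not the code in src/lib/env.ts: `mergeEnv`, `EnvMap`, and the `FLEET_NAME` fixture key are hypothetical names used only for illustration, and the ambient environment is passed in as a plain object instead of reading process.env directly.

```typescript
// Hypothetical sketch; names here are assumptions, not the repo's API.
type EnvMap = Record<string, string>;

// Later spreads win, so ambient (process.env-style) values override file values.
function mergeEnv(fileValues: EnvMap, ambient: EnvMap): EnvMap {
  return { ...fileValues, ...ambient };
}

// A fixture .env that deliberately omits GITHUB_PAT.
const fixture: EnvMap = { FLEET_NAME: "demo" };

// Local run: nothing relevant in the ambient environment.
const local = mergeEnv(fixture, {});

// CI-like run: the job environment already carries a PAT.
const ci = mergeEnv(fixture, { GITHUB_PAT: "leak" });

const patMissingLocally = !("GITHUB_PAT" in local);
const patMissingInCi = !("GITHUB_PAT" in ci);
```

Under this merge order the same fixture yields a missing PAT locally and a present PAT in CI, which is why assertions written against the fixture alone are unstable.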

I confirmed this on the PR head by running the doctor test file with a leaked PAT in the environment:

  • GITHUB_PAT=leak corepack pnpm exec vitest run test/doctor.test.ts

That reproduces the two failing tests.

Concretely, the problematic interaction is between:

  • src/lib/env.ts merging process.env over file values
  • test/doctor.test.ts expecting fixture-only missing-PAT behavior without clearing inherited env

I think this needs one of these fixes before merge:

  1. make the tests explicitly scrub GITHUB_PAT (and any related env vars) around the missing-credentials cases, or
  2. add a test-only way to load env files without inheriting process.env
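
As a rough illustration of option 2, the sketch below shows one way a loader could opt out of inheritance. Everything in it is an assumption: `loadEnvFromText`, `parseEnvText`, and the `inheritAmbient` option are hypothetical and do not describe the existing loadDeploymentEnv signature. It works on in-memory text and an explicit ambient map so it stays self-contained.

```typescript
// Hypothetical sketch; function and option names are assumptions.
type EnvMap = Record<string, string>;

interface LoadOptions {
  // When false, ambient variables are ignored (intended for tests only).
  inheritAmbient?: boolean;
}

// Parse simple KEY=VALUE lines, skipping blank lines and # comments.
function parseEnvText(text: string): EnvMap {
  const out: EnvMap = {};
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no key before "=", or no "=" at all
    out[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return out;
}

// Merge ambient values over file values unless the caller opts out.
function loadEnvFromText(
  text: string,
  ambient: EnvMap,
  options: LoadOptions = {},
): EnvMap {
  const fileValues = parseEnvText(text);
  const inherit = options.inheritAmbient ?? true;
  return inherit ? { ...fileValues, ...ambient } : fileValues;
}

const fixtureText = "# fixture without a PAT\nFLEET_NAME=demo\n";
const ambientCi: EnvMap = { GITHUB_PAT: "leak" };

const inherited = loadEnvFromText(fixtureText, ambientCi);
const isolated = loadEnvFromText(fixtureText, ambientCi, { inheritAmbient: false });
```

Defaulting the option to inheritance keeps production behavior unchanged, while the missing-credentials tests could request fixture-only values.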

Until that is fixed I don’t think this PR is merge-ready.

@jmcte jmcte enabled auto-merge (squash) April 12, 2026 12:40
jmcte previously approved these changes Apr 12, 2026
@jmcte
Contributor

jmcte commented Apr 12, 2026

@hephaestus-omt can you address the comments, please?

@ares-omt

Follow-up validation on the actual PR head 2de17a7a72478daf9965d77f2253344d363b50f2 still reproduces the blocker from my earlier review.

I checked out the PR branch into an isolated worktree and ran:

  • corepack pnpm install --frozen-lockfile
  • corepack pnpm test
  • corepack pnpm lint
  • GITHUB_PAT=leak corepack pnpm test

Results:

  • normal local test run: passes
  • lint: passes
  • with a leaked GITHUB_PAT in the environment: test/doctor.test.ts fails in the two "missing credentials" cases, matching the CI failure mode

So the underlying issue is still present on this head:

  • the test fixture loads env with inheritance from process.env
  • the missing-PAT cases do not scrub inherited GITHUB_PAT
  • in CI, that turns the intended "missing PAT" path into a live network/fetch path, and the assertions flip from "GITHUB_PAT is required" to "fetch failed"

Because I can still reproduce that on the current branch, I’m not approving this PR.

@jmcte
Contributor

jmcte commented Apr 13, 2026

Addressed the leaked GITHUB_PAT test failure on this branch. The missing-credentials doctor tests now explicitly scrub inherited GITHUB_PAT via vi.stubEnv/unstubAllEnvs, and I re-ran:

  • GITHUB_PAT=leak corepack pnpm exec vitest run test/doctor.test.ts
  • corepack pnpm test
  • corepack pnpm lint
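
For readers unfamiliar with the scrub-and-restore pattern, the following is a hand-rolled sketch of the idea, not the code on the branch. The branch uses Vitest's vi.stubEnv and vi.unstubAllEnvs for this; `withScrubbedEnv` is a hypothetical helper, and it operates on an explicit env object so the example does not touch the real process environment.

```typescript
// Hypothetical helper; the branch itself relies on vi.stubEnv / vi.unstubAllEnvs.
type MutableEnv = Record<string, string | undefined>;

// Remove the named variables, run fn, then restore the previous values.
// Synchronous only: an async fn would be restored before it finishes.
function withScrubbedEnv<T>(env: MutableEnv, names: string[], fn: () => T): T {
  const saved = new Map<string, string | undefined>();
  for (const name of names) {
    saved.set(name, env[name]);
    delete env[name];
  }
  try {
    return fn();
  } finally {
    // Restore even if fn throws, so later tests see the original state.
    saved.forEach((value, name) => {
      if (value === undefined) delete env[name];
      else env[name] = value;
    });
  }
}

// Stand-in for an inherited CI environment.
const fakeEnv: MutableEnv = { GITHUB_PAT: "leak", HOME: "/tmp" };

// Inside the callback the PAT is gone; afterwards it is back.
const seenInside = withScrubbedEnv(fakeEnv, ["GITHUB_PAT"], () => fakeEnv.GITHUB_PAT);
const seenAfter = fakeEnv.GITHUB_PAT;
```

In a real test the env argument would be process.env, and restoring in a `finally` block serves the same purpose as calling unstubAllEnvs after each test.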

@jmcte jmcte requested a review from ares-omt April 13, 2026 12:41

@apollo-omt apollo-omt left a comment

Apollo follow-up review: approved.

I rechecked the current head after the leaked-GITHUB_PAT test fix. The doctor surface still looks clean from the reporting/observability side:

  • the missing-credentials cases now explicitly scrub inherited GITHUB_PAT, so the JSON/text doctor output no longer depends on ambient CI state
  • the CLI keeps a stable text|json contract and sectioned report shape
  • the latest CI is green on the updated head
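
To illustrate what a stable text|json contract over a sectioned report could look like, here is a small sketch. The report shape, field names, and text layout are all assumptions made for illustration; the PR does not show the actual doctor output format.

```typescript
// Hypothetical report shape; the real doctor output may differ.
type CheckStatus = "ok" | "fail";

interface Check {
  name: string;
  status: CheckStatus;
  detail?: string;
}

interface Section {
  title: string;
  checks: Check[];
}

interface DoctorReport {
  ok: boolean;
  sections: Section[];
}

type OutputFormat = "text" | "json";

// Render the same report either as JSON or as sectioned plain text.
function renderReport(report: DoctorReport, format: OutputFormat): string {
  if (format === "json") return JSON.stringify(report, null, 2);
  const lines: string[] = [];
  for (const section of report.sections) {
    lines.push(`[${section.title}]`);
    for (const check of section.checks) {
      const suffix = check.detail ? `: ${check.detail}` : "";
      lines.push(`  ${check.status.toUpperCase()} ${check.name}${suffix}`);
    }
  }
  lines.push(report.ok ? "doctor: all checks passed" : "doctor: some checks failed");
  return lines.join("\n");
}

const sample: DoctorReport = {
  ok: false,
  sections: [
    {
      title: "github",
      checks: [{ name: "pat", status: "fail", detail: "GITHUB_PAT is required" }],
    },
  ],
};

const asText = renderReport(sample, "text");
const asJson = renderReport(sample, "json");
```

Because both renderings derive from one report object, scrubbing ambient credentials in the tests makes the text and JSON outputs deterministic together.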

I did not find a remaining analytics/reporting correctness issue in this change.

@ares-omt ares-omt left a comment

Revalidation on the current head looks clean. I re-ran the PR in an isolated worktree, including the previously failing leaked-GITHUB_PAT case, and the fix now holds: normal tests pass, lint passes, and the leaked-GITHUB_PAT run also passes. The doctor surface keeps the same text/JSON contract while making the missing-credential tests deterministic under CI-like ambient env. I do not see a remaining regression blocker here.

@jmcte jmcte merged commit 79b28ac into main Apr 13, 2026
10 checks passed
@jmcte jmcte deleted the hephaestus/issue-26-fleet-doctor branch April 13, 2026 16:51

Development

Successfully merging this pull request may close these issues.

Add a unified fleet doctor command for GitHub, Synology, and Lume preflight

4 participants