From c344d9949b6275691736bbe27426bf8038ef36f3 Mon Sep 17 00:00:00 2001
From: Vernon Stinebaker <vernon.stinebaker@gmail.com>
Date: Tue, 5 May 2026 14:21:01 +0800
Subject: [PATCH 1/3] docs(testing): add subsystem coverage map and roadmap

---
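Note on the "Recommended Validation By Change Type" section added by this patch: the sketch below shows one way those recommendations could be turned into a small helper. It is not part of the repository. The function name `suggest_validation` and the mapping from paths to change types (for example, treating `src/supervisor/*` as a lifecycle change) are assumptions made for illustration; the printed commands are the ones listed in TESTING.md.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): print the validation
# commands TESTING.md recommends for a list of changed paths.
# The path-to-category mapping below is an assumption for illustration.
suggest_validation() {
  backend=0; smoke=0; ui=0
  for path in "$@"; do
    case "$path" in
      tests/test_e2e.sh|src/supervisor/*) smoke=1 ;;   # smoke or lifecycle
      src/*|build.zig*)                   backend=1 ;; # backend code
      ui/*)                               ui=1 ;;      # future UI tests
    esac
  done
  if [ $((backend + smoke + ui)) -eq 0 ]; then
    # Nothing matched: treat the change as docs-only.
    echo "git diff --check"
    return 0
  fi
  # The UI command only applies once the frontend harness exists.
  if [ "$ui" -eq 1 ]; then echo "npm --prefix ui test -- --run"; fi
  echo "zig build test -Dbuild-ui=false --summary all"
  if [ "$smoke" -eq 1 ]; then echo "bash tests/test_e2e.sh"; fi
}
```

After sourcing the function, calling `suggest_validation $(git diff --name-only main)` would print the commands to run for the current branch, assuming no changed path contains whitespace.
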
 README.md  |   4 +
 TESTING.md | 374 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 378 insertions(+)
 create mode 100644 TESTING.md

diff --git a/README.md b/README.md
index 73f7c48..6b3c7e3 100644
--- a/README.md
+++ b/README.md
@@ -121,6 +121,10 @@ optional `NULLTICKETS_TOKEN`.
 
 ## Development
 
+Testing strategy and roadmap:
+
+- `TESTING.md`
+
 Backend:
 
 ```bash
diff --git a/TESTING.md b/TESTING.md
new file mode 100644
index 0000000..5d71cb7
--- /dev/null
+++ b/TESTING.md
@@ -0,0 +1,374 @@
+# Testing Strategy
+
+This document defines the path to bring NullHub's test discipline closer to NullClaw's while keeping each improvement shippable in small, isolated pull requests.
+
+The aim is not a single large testing rewrite. The aim is to improve confidence incrementally, with each PR standing on its own wherever possible.
+
+## Goals
+
+- make the existing backend test suite a reliable daily gate
+- expand coverage into the highest-risk backend areas
+- add the missing frontend unit-test layer
+- replace shell-only smoke reliance with structured integration coverage
+- keep browser E2E small and focused
+- adopt NullClaw-style expectations: every behavior change gets tests, every bug fix gets a regression test
+
+## Current Repository State
+
+As of the current `main` branch:
+
+- NullHub already has substantial Zig unit-test coverage in parts of the backend.
+- Coverage is concentrated heavily in API and routing code.
+- The project has a shell smoke script at `tests/test_e2e.sh`.
+- The project does not yet have a committed frontend unit-test harness.
+- CI currently runs backend tests, the shell smoke test on Linux, and release builds.
+
+This means the main gap is not "no tests". The gap is uneven coverage and missing layers.
+
+## Testing Principles
+
+NullHub should follow the same core discipline used by NullClaw.
+
+- Every code change must be accompanied by tests.
+- Every bug fix must include a regression test.
+- If a path is impractical to unit test, document why.
+- Keep tests as close as possible to the behavior they validate.
+- Prefer the smallest test that proves the contract.
+- Add test helpers only when they unlock repeated future coverage.
+- Keep fast tests fast; separate unit, integration, smoke, and browser E2E concerns.
+
+## Current Coverage Map
+
+The snapshot below is based on the current `src/` tree and the committed test distribution.
+
+| Area | Current assessment | Evidence in tree | Highest-value next work |
+|---|---|---|---|
+| API routing and instance endpoints | Strong | `src/api/instances.zig`, `src/server.zig`, `src/api/*` contain the densest test coverage | expand cross-module integration coverage instead of adding more narrow route parsing tests |
+| Installer | Medium | `src/installer/orchestrator.zig`, `registry.zig`, `downloader.zig`, `ui_modules.zig`, `builder.zig` | add rollback, partial-failure cleanup, and fixture-driven install/update scenarios |
+| Supervisor and process lifecycle | Medium | `src/supervisor/manager.zig`, `process.zig`, `health.zig`, `runtime_state.zig` | add restart/backoff, boot reconciliation, and deterministic lifecycle integration tests |
+| Config, state, and paths | Medium | `src/core/state.zig`, `src/api/config.zig`, `src/core/paths.zig` | add tests around persisted-state restoration and migration-sensitive behavior |
+| Auth and access control | Light | `src/auth.zig`, `src/access.zig` | add unauthorized origin, token failure, and sensitive-route boundary tests |
+| Service install/uninstall/status | Light | `src/service.zig` | add stronger platform-specific generation and failure-path tests |
+| Orchestration proxy | Light | `src/api/orchestration.zig` | add upstream error mapping, token/header forwarding, and store-vs-boiler routing tests |
+| Discovery, mDNS, and compat layers | Light | `src/discovery.zig`, `src/mdns.zig`, `src/compat/*` | add degraded-mode and missing-tool fallback coverage |
+| Frontend UI logic | Missing | no committed UI test harness in `ui/` | add Vitest and Testing Library first |
+| Structured backend integration tests | Light | shell smoke only in `tests/test_e2e.sh` | add a real HTTP/integration harness with fixtures |
+| Browser end-to-end | Missing | no Playwright or equivalent suite | add a very small critical-flow suite after UI unit tests land |
+
+## Current Test Distribution Snapshot
+
+The current backend suite is broad in file count but uneven in depth.
+
+Most heavily tested files on `main` include:
+
+- `src/api/instances.zig`
+- `src/server.zig`
+- `src/core/state.zig`
+- `src/cli.zig`
+- `src/api/logs.zig`
+- `src/api/wizard.zig`
+- `src/supervisor/manager.zig`
+- `src/api/providers.zig`
+- `src/api/config.zig`
+- `src/installer/orchestrator.zig`
+
+Refresh this snapshot with:
+
+```bash
+rg -n --glob '*.zig' '^test\s+"' src | awk -F: '{count[$1]++} END {for (f in count) print count[f], f}' | sort -nr
+```
+
+## Test Layers To Build Toward
+
+NullHub should converge on four layers.
+
+### 1. Backend Unit Tests
+
+Use for:
+
+- parsing and normalization
+- route matching
+- config and state transforms
+- installer decision logic
+- supervisor state transitions
+- auth and access rules
+
+Primary command:
+
+```bash
+zig build test -Dbuild-ui=false --summary all
+```
+
+### 2. Backend Integration Tests
+
+Use for:
+
+- HTTP route behavior across modules
+- boot and runtime lifecycle flows
+- managed-instance interactions
+- orchestration proxy behavior with fake upstreams
+- installer and update scenarios using fixtures
+
+These should not require a browser.
+
+### 3. Frontend Unit and Component Tests
+
+Use for:
+
+- API client helpers
+- stores and route transforms
+- form validation and state behavior
+- orchestration helpers and key UI components
+
+Recommended tooling:
+
+- `vitest`
+- `@testing-library/svelte`
+
+### 4. Browser End-to-End Tests
+
+Use for:
+
+- route loading and hydration sanity
+- critical user flows
+- embedded asset/runtime integration
+
+Recommended tooling:
+
+- Playwright
+
+Keep this layer intentionally small.
+
+## Default TDD Workflow
+
+Every testing PR should follow this pattern unless it is documentation-only.
+
+1. Pick one behavior, contract, or regression.
+2. Add a failing test that expresses the expected behavior.
+3. Make the smallest code change that makes the test pass.
+4. Run the smallest relevant validation first.
+5. Run the broader project gate before opening the PR.
+6. Document anything skipped.
+
+For bug fixes, name the test after the regression it guards against, or add a short comment that identifies the original bug.
+
+## Incremental PR Roadmap
+
+The sequence below is designed for clean, isolated PRs.
+
+### Phase 0: Policy and Documentation
+
+Purpose:
+
+- document the test contract
+- align contributor expectations with NullClaw's model
+
+Suggested PR:
+
+- `docs(testing): add testing strategy and contributor expectations`
+
+Dependencies:
+
+- none
+
+### Phase 1: Smoke Harness Hardening
+
+Purpose:
+
+- make the shell smoke test fail on real server crashes
+- keep smoke runs isolated from developer-local state
+
+Suggested PR:
+
+- `test(smoke): isolate E2E home and detect server exits`
+
+Dependencies:
+
+- none
+
+### Phase 2: Coverage Map and Gap Inventory
+
+Purpose:
+
+- make current strengths and weaknesses explicit
+- give later test PRs a scoped target list
+
+Suggested PR:
+
+- `docs(testing): add subsystem coverage map and gap inventory`
+
+Dependencies:
+
+- none
+
+### Phase 3: Backend Test Entry Stabilization
+
+Purpose:
+
+- make backend tests the undisputed daily gate
+- reduce confusion around UI asset coupling during test runs
+
+Suggested PR:
+
+- `build(test): make backend test entrypoint deterministic and documented`
+
+Dependencies:
+
+- none
+
+### Phase 4: Shared Backend Fixtures
+
+Purpose:
+
+- make installer, supervisor, and orchestration tests cheaper to write
+
+Suggested PR:
+
+- `test(fixtures): add reusable backend test helpers for state and upstream fakes`
+
+Dependencies:
+
+- Phase 3 preferred
+
+### Phase 5: High-Risk Backend Coverage
+
+Target order:
+
+1. supervisor and process lifecycle
+2. installer and updates
+3. auth and access control
+4. orchestration proxy behavior
+5. service generation and status behavior
+6. discovery and degraded-mode fallbacks
+
+Example PRs:
+
+- `test(supervisor): cover restart threshold and crash recovery transitions`
+- `test(installer): cover rollback and duplicate-instance failure paths`
+- `test(auth): cover unauthorized origin and bearer-token failure paths`
+- `test(orchestration): cover upstream error mapping and token forwarding`
+- `test(service): cover launchd/systemd generation and failure paths`
+
+Dependencies:
+
+- Phase 4 recommended for several of these areas
+
+### Phase 6: Structured Backend Integration Harness
+
+Purpose:
+
+- stop relying on a shell script as the only assembled-behavior check
+
+Suggested PRs:
+
+- `test(integration): add structured HTTP smoke harness`
+- `test(integration): cover instance lifecycle and config mutation flows`
+- `test(integration): cover orchestration proxy scenarios`
+
+Dependencies:
+
+- Phase 4 strongly recommended
+
+### Phase 7: Frontend Unit-Test Harness
+
+Purpose:
+
+- add the missing UI logic test layer
+
+Suggested PRs:
+
+- `test(ui): add Vitest and Testing Library harness`
+- `test(ui): cover API client and config-form helpers`
+- `test(ui): cover orchestration helpers and key components`
+
+Dependencies:
+
+- none
+
+### Phase 8: Minimal Browser E2E
+
+Purpose:
+
+- catch browser-only regressions without growing a large flaky suite
+
+Suggested PRs:
+
+- `test(e2e): add Playwright harness and dashboard smoke flow`
+- `test(e2e): cover instances and settings journeys`
+- `test(e2e): cover wizard happy path`
+
+Dependencies:
+
+- Phase 7 recommended
+
+### Phase 9: CI and Hook Enforcement
+
+Purpose:
+
+- make testing discipline the default workflow rather than tribal knowledge
+
+Suggested PRs:
+
+- `ci(test): split backend, smoke, and release jobs`
+- `hooks(test): add pre-push backend test enforcement`
+- `ci(ui): add frontend unit and browser E2E jobs`
+
+Dependencies:
+
+- each enforced suite depends on the earlier phase that introduces it
+
+### Phase 10: Coverage Visibility
+
+Purpose:
+
+- make gaps visible without optimizing for vanity percentages too early
+
+Suggested PR:
+
+- `ci(coverage): publish test suite summary and UI coverage artifacts`
+
+Dependencies:
+
+- Phase 7 frontend harness in place first
+
+## Recommended Validation By Change Type
+
+Docs-only changes:
+
+```bash
+git diff --check
+```
+
+Backend code changes:
+
+```bash
+zig build test -Dbuild-ui=false --summary all
+```
+
+Smoke or lifecycle changes:
+
+```bash
+zig build test -Dbuild-ui=false --summary all
+bash tests/test_e2e.sh
+```
+
+Future UI test changes after the harness exists:
+
+```bash
+npm --prefix ui test -- --run
+zig build test -Dbuild-ui=false --summary all
+```
+
+If any validation is skipped, the PR description should say exactly what was skipped and why.
+
+## Definition of Done
+
+NullHub should be considered aligned with NullClaw's testing model when all of the following are true:
+
+- contributor docs require tests for every code change
+- backend tests are reliable and treated as the primary local gate
+- high-risk backend subsystems have direct failure-mode coverage
+- structured backend integration tests exist beyond shell-only smoke
+- frontend unit tests run locally and in CI
+- a minimal browser E2E suite covers critical user journeys
+- CI and hooks reinforce the workflow

From 3dde31051a72abc8eab6d5a566706c2e770673bd Mon Sep 17 00:00:00 2001
From: Igor Somov <donprusne@gmail.com>
Date: Wed, 6 May 2026 12:48:57 -0300
Subject: [PATCH 2/3] docs(testing): align coverage map with main

---
 README.md                                     |  6 +-
 TESTING.md                                    | 58 ++++++++++++-------
 .../plans/2026-03-18-report-command.md        | 14 ++---
 3 files changed, 46 insertions(+), 32 deletions(-)

diff --git a/README.md b/README.md
index 6b3c7e3..db5367d 100644
--- a/README.md
+++ b/README.md
@@ -121,14 +121,12 @@ optional `NULLTICKETS_TOKEN`.
 
 ## Development
 
-Testing strategy and roadmap:
-
-- `TESTING.md`
+Testing strategy and roadmap live in [TESTING.md](TESTING.md).
 
 Backend:
 
 ```bash
-zig build test
+zig build test --summary all
 ```
 
 Frontend:
diff --git a/TESTING.md b/TESTING.md
index 5d71cb7..98eadb3 100644
--- a/TESTING.md
+++ b/TESTING.md
@@ -21,7 +21,7 @@ As of the current `main` branch:
 - Coverage is concentrated heavily in API and routing code.
 - The project has a shell smoke script at `tests/test_e2e.sh`.
 - The project does not yet have a committed frontend unit-test harness.
-- CI currently runs backend tests, the shell smoke test on Linux, and release builds.
+- CI currently runs backend tests, the shell smoke test on Linux, and ReleaseSmall binary builds.
 
 This means the main gap is not "no tests". The gap is uneven coverage and missing layers.
 
@@ -59,18 +59,18 @@ The snapshot below is based on the current `src/` tree and the committed test di
 
 The current backend suite is broad in file count but uneven in depth.
 
-Most heavily tested files on `main` include:
+Most heavily tested files on `main` currently are:
 
-- `src/api/instances.zig`
-- `src/server.zig`
-- `src/core/state.zig`
-- `src/cli.zig`
-- `src/api/logs.zig`
-- `src/api/wizard.zig`
-- `src/supervisor/manager.zig`
-- `src/api/providers.zig`
-- `src/api/config.zig`
-- `src/installer/orchestrator.zig`
+- `src/api/instances.zig` (90 tests)
+- `src/server.zig` (53 tests)
+- `src/api/providers.zig` (35 tests)
+- `src/core/state.zig` (34 tests)
+- `src/cli.zig` (28 tests)
+- `src/api/wizard.zig` (17 tests)
+- `src/api/logs.zig` (17 tests)
+- `src/installer/orchestrator.zig` (16 tests)
+- `src/supervisor/manager.zig` (15 tests)
+- `src/api/config.zig` (14 tests)
 
 Refresh this snapshot with:
 
@@ -93,7 +93,13 @@ Use for:
 - supervisor state transitions
 - auth and access rules
 
-Primary command:
+Primary local command:
+
+```bash
+zig build test --summary all
+```
+
+CI-style command after `ui/build` has already been generated:
 
 ```bash
 zig build test -Dbuild-ui=false --summary all
@@ -163,9 +169,9 @@ Purpose:
 - document the test contract
 - align contributor expectations with NullClaw's model
 
-Suggested PR:
+Status:
 
-- `docs(testing): add testing strategy and contributor expectations`
+- covered by this document
 
 Dependencies:
 
@@ -178,9 +184,13 @@ Purpose:
 - make the shell smoke test fail on real server crashes
 - keep smoke runs isolated from developer-local state
 
-Suggested PR:
+Landed scope:
+
+- `test(smoke): harden e2e server diagnostics`
+
+Status:
 
-- `test(smoke): isolate E2E home and detect server exits`
+- already landed on `main` in `tests/test_e2e.sh`; do not open a duplicate smoke-hardening PR unless new smoke gaps are identified
 
 Dependencies:
 
@@ -193,9 +203,9 @@ Purpose:
 - make current strengths and weaknesses explicit
 - give later test PRs a scoped target list
 
-Suggested PR:
+Status:
 
-- `docs(testing): add subsystem coverage map and gap inventory`
+- covered by this document
 
 Dependencies:
 
@@ -341,6 +351,12 @@ git diff --check
 
 Backend code changes:
 
+```bash
+zig build test --summary all
+```
+
+CI-style rerun after `ui/build` already exists:
+
 ```bash
 zig build test -Dbuild-ui=false --summary all
 ```
@@ -348,7 +364,7 @@ zig build test -Dbuild-ui=false --summary all
 Smoke or lifecycle changes:
 
 ```bash
-zig build test -Dbuild-ui=false --summary all
+zig build test --summary all
 bash tests/test_e2e.sh
 ```
 
@@ -356,7 +372,7 @@ Future UI test changes after the harness exists:
 
 ```bash
 npm --prefix ui test -- --run
-zig build test -Dbuild-ui=false --summary all
+zig build test --summary all
 ```
 
 If any validation is skipped, the PR description should say exactly what was skipped and why.
diff --git a/docs/superpowers/plans/2026-03-18-report-command.md b/docs/superpowers/plans/2026-03-18-report-command.md
index 4dfd066..9bbd122 100644
--- a/docs/superpowers/plans/2026-03-18-report-command.md
+++ b/docs/superpowers/plans/2026-03-18-report-command.md
@@ -6,7 +6,7 @@
 
 **Architecture:** Core logic lives in `report.zig` (enums, system data collection, issue body formatting, submission fallback chain). CLI interactive flow in `report_cli.zig`. API handlers in `api/report.zig`. Svelte form page at `ui/src/routes/report/`. Wired into existing CLI parser, server router, sidebar nav, and API client.
 
-**Tech Stack:** Zig 0.15.2, Svelte 5 + SvelteKit, GitHub API via `gh` CLI / curl fallback
+**Tech Stack:** Zig 0.16.0, Svelte 5 + SvelteKit, GitHub API via `gh` CLI / curl fallback
 
 **Spec:** `docs/superpowers/specs/2026-03-18-report-command-design.md`
 
@@ -251,7 +251,7 @@ test "ReportOptions defaults" {
 
 - [ ] **Step 8: Run tests to verify**
 
-Run: `zig build test 2>&1 | head -20`
+Run: `zig build test --summary all 2>&1 | head -20`
 Expected: all tests pass
 
 - [ ] **Step 9: Commit**
@@ -765,7 +765,7 @@ test "writeJsonEscaped" {
 
 - [ ] **Step 2: Run tests**
 
-Run: `zig build test 2>&1 | head -20`
+Run: `zig build test --summary all 2>&1 | head -20`
 Expected: all tests pass
 
 - [ ] **Step 3: Commit**
@@ -793,7 +793,7 @@ Add `_ = report;` in the test block after `_ = registry;`.
 
 - [ ] **Step 2: Run build and tests**
 
-Run: `zig build test 2>&1 | head -20`
+Run: `zig build test --summary all 2>&1 | head -20`
 Expected: all tests pass
 
 - [ ] **Step 3: Commit**
@@ -1061,7 +1061,7 @@ Add before `.help =>` in the switch:
 
 - [ ] **Step 3: Run build and tests**
 
-Run: `zig build test 2>&1 | head -20`
+Run: `zig build test --summary all 2>&1 | head -20`
 Expected: builds and tests pass
 
 - [ ] **Step 4: Commit**
@@ -1324,7 +1324,7 @@ Add `_ = report_api;` in test block.
 
 - [ ] **Step 4: Run build and tests**
 
-Run: `zig build test 2>&1 | head -20`
+Run: `zig build test --summary all 2>&1 | head -20`
 Expected: all tests pass
 
 - [ ] **Step 5: Commit**
@@ -1876,7 +1876,7 @@ Expected: 3 labels shown
 
 - [ ] **Step 1: Run full test suite**
 
-Run: `zig build test 2>&1`
+Run: `zig build test --summary all 2>&1`
 Expected: all tests pass
 
 - [ ] **Step 2: Build the binary**

From ff40f7b3ba85cf9854a6a03c81d263d41c656b89 Mon Sep 17 00:00:00 2001
From: Igor Somov <donprusne@gmail.com>
Date: Wed, 6 May 2026 12:51:53 -0300
Subject: [PATCH 3/3] docs(testing): avoid volatile test counts

---
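Note for reviewers: with the per-file counts removed, the numbers can still be regenerated on demand using the refresh command that TESTING.md keeps. For environments without `rg`, the sketch below is a rough equivalent built on `grep -r --include` (available in GNU and BSD grep) with the same `awk` and `sort` stages. The function name `count_tests` is hypothetical, and the pattern only matches top-level `test "..."` declarations, as the original command does.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count Zig `test "..."` declarations per file.
# Mirrors the rg-based refresh command in TESTING.md, using grep instead.
# Assumes file paths contain no ':' because awk splits on that character.
count_tests() {
  root="${1:-src}"
  # -r recurses, -H always prints the file name, -E enables the
  # extended regex; --include limits the search to Zig sources.
  grep -rHE --include='*.zig' '^test[[:space:]]+"' "$root" \
    | awk -F: '{count[$1]++} END {for (f in count) print count[f], f}' \
    | sort -nr
}
```

Running `count_tests src | head -n 10` from the repository root would list the ten files with the most test declarations, highest first.
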
 TESTING.md | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/TESTING.md b/TESTING.md
index 98eadb3..8e4bff8 100644
--- a/TESTING.md
+++ b/TESTING.md
@@ -59,18 +59,18 @@ The snapshot below is based on the current `src/` tree and the committed test di
 
 The current backend suite is broad in file count but uneven in depth.
 
-Most heavily tested files on `main` currently are:
-
-- `src/api/instances.zig` (90 tests)
-- `src/server.zig` (53 tests)
-- `src/api/providers.zig` (35 tests)
-- `src/core/state.zig` (34 tests)
-- `src/cli.zig` (28 tests)
-- `src/api/wizard.zig` (17 tests)
-- `src/api/logs.zig` (17 tests)
-- `src/installer/orchestrator.zig` (16 tests)
-- `src/supervisor/manager.zig` (15 tests)
-- `src/api/config.zig` (14 tests)
+Files with the highest test counts on `main` at the time of writing include:
+
+- `src/api/instances.zig`
+- `src/server.zig`
+- `src/api/providers.zig`
+- `src/core/state.zig`
+- `src/cli.zig`
+- `src/api/wizard.zig`
+- `src/api/logs.zig`
+- `src/installer/orchestrator.zig`
+- `src/supervisor/manager.zig`
+- `src/api/config.zig`
 
 Refresh this snapshot with: