Add badge branding, exec-plans, and Conductor setup script by WellDunDun · Pull Request #21 · selftune-dev/selftune

WellDunDun · 2026-03-05T07:41:41Z

Summary

Added selftune logo branding to badge SVG renderer with base64 data URI embedding
Re-added exec-plans to public repo (part of reins harness for contributors/agents)
Narrowed gitignore to only internal business strategy docs (symlinked from canonical location)
Created Conductor setup script to provide workspace context across environments
Updated ROADMAP with public-facing Badge Showcase section and removed internal infrastructure details

Changes

Badge logo: 14px embedded in label section with 3px padding, white fill from canonical SVG
Exec-plans: agent-first-skill-restructure, multi-agent-sandbox, tech-debt-tracker restored to public tracking
Gitignore: removed docs/exec-plans/ (now public), kept docs/strategy/ and Plans/ (internal)
Conductor setup: symlinks /Users/danielpetro/Documents/Projects/FOSS/selftune into workspace for docs/strategy/, .context/internal/
Tests: added logo image element and data URI verification tests

Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com

…ed service - Add `selftune badge` CLI command with --skill, --format (svg/markdown/url), --output options - Badge data computation: color-coded by pass rate (green >80%, yellow 60-80%, red <60%, gray no-data) - SVG renderer: shields.io flat-style with Verdana 11px char-width table, zero external deps - Dashboard: badge route (GET /badge/:skill) and report route (GET /report/:skill) - Dashboard header: inline selftune logo mark - README: centered logo above badge rows - Architecture lint: badge module registered with proper dependency boundaries - Hosted service scaffold: Fly.io deployment, badge/report/submit/health routes - 35 badge tests (unit + integration), all passing - Workflow documentation for badge command usage Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Remove service/ directory, deployment workflow, and related test files. Update lint-architecture.ts and tsconfig.json to remove service references. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Extract BADGE_THRESHOLDS, BADGE_COLORS, TREND_ARROWS as exported constants matching cloud app pattern - Embed selftune logo as base64 data URI in badge SVG label section - Update badge service URLs to badge.selftune.dev and selftune-api.fly.dev - Add 4 local badge CLI tests and 2 live badge.selftune.dev smoke tests to sandbox orchestrator - Add logo presence tests to badge-svg test suite - Clean up stale exec-plans and update ROADMAP Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Exec-plans are part of the reins harness that contributors and agents need for architectural context. Only business strategy docs (GTM, ICP) are internal and symlinked from the canonical location via Conductor setup script. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai · 2026-03-05T07:42:00Z

📝 Walkthrough

Summary by CodeRabbit

New Features
- Generate skill health badges (SVG, Markdown, URL) via new CLI command and dashboard endpoints.
- Contribution flow can submit to a hosted service with GitHub fallback; bundle schema now supports v1.1 and includes skill naming.
Documentation
- Updated README (logo, resized image, project badges) and added badge workflow docs and roadmap badge showcase.
Tests
- Extensive badge unit/integration tests and dashboard badge/report endpoint tests.

Walkthrough

Adds a badge system: data model, SVG renderer, CLI command, dashboard endpoints (/badge, /report), tests and docs; bumps contribution bundle schema to 1.1 and adds skill_name; removes two long strategy docs and updates lint/package/branding metadata.

Changes

Cohort / File(s)	Summary
Ignore & Package `\.gitignore`, `package.json`	Added `.test-data/` and `Plans/` to ignores; included `assets/` in published package files.
Branding & Docs Index `README.md`, `dashboard/index.html`, `docs/design-docs/index.md`	Replaced logo path/size and added badges in README; inlined SVG logo into dashboard HTML (duplicated in two spots); registered `badge-service.md`.
Roadmap & Strategy Docs `ROADMAP.md`, `docs/strategy/icp-gtm-strategy.md`, `docs/strategy/openclaw-integration.md`	Expanded ROADMAP with badge and badge-related infra; removed two large strategy docs (ICP/GTM and OpenClaw integration).
Badge Core (data & rendering) `cli/selftune/badge/badge-data.ts`, `cli/selftune/badge/badge-svg.ts`	New pure modules: computeBadgeData/findSkillBadgeData with thresholds, trend handling, and BadgeData types; SVG renderer with text measurement, XML escaping, embedded base64 logo, and format routing (svg/markdown/url).
Badge CLI & Wiring `cli/selftune/badge/badge.ts`, `cli/selftune/index.ts`	New `badge` CLI command (arg parsing, log ingestion, doctor/status pipeline) producing svg/markdown/url outputs; top-level CLI wired to dynamically import and run badge command; updated contribute to async.
Dashboard Server Integration `cli/selftune/dashboard-server.ts`, `tests/dashboard/badge-routes.test.ts`	Added `/badge/:skillName` and `/report/:skillName` routes, log-based computeStatusFromLogs, report HTML builder, proper headers (CORS/no-cache), and route tests.
Contribution & Types `cli/selftune/contribute/bundle.ts`, `cli/selftune/contribute/contribute.ts`, `cli/selftune/types.ts`	Bumped bundle schema to `1.1`, added `skill_name` to ContributionBundle, made contribute cliMain async, added service submission with GitHub fallback and submitToService.
Lint / Architecture `lint-architecture.ts`	Registered badge files and added BADGE_FORBIDDEN import rules preventing badge modules from importing several internal subsystems.
Workflows & Guides `skill/Workflows/Badge.md`	New workflow doc describing badge CLI usage, formats, embedding guidance, dashboard routes, and hosted badge service phases.
Tests: unit / integration / smoke `tests/badge/`, `tests/contribute/`, `tests/sandbox/`, `tests/dashboard/`	Added comprehensive badge unit and integration tests (data, SVG, markdown, URL), dashboard route tests, sandbox/CLI smoke tests, and updated contribute tests to expect schema 1.1.
Test Refactors / Fixes `tests/evolution/pareto.test.ts`, `tests/grading/pre-gates.test.ts`, `tests/sandbox/docker/run-with-llm.ts`	Minor test safety/refactor changes: optional chaining, helper lookup, and unused stdout/stderr variable renames.

Sequence Diagram(s)

sequenceDiagram
    participant CLI as Badge CLI
    participant LogReader as Log Reader
    participant Doctor as System Doctor
    participant StatusCompute as computeStatus
    participant BadgeData as computeBadgeData
    participant Renderer as formatBadgeOutput
    participant Output as File/stdout

    CLI->>LogReader: Read telemetry, skill usage, query, evolution logs
    LogReader-->>StatusCompute: Parsed JSONL data
    StatusCompute->>Doctor: Run health checks
    Doctor-->>StatusCompute: System health
    StatusCompute-->>BadgeData: StatusResult with skill metrics
    BadgeData->>BadgeData: Apply thresholds and attach trend arrow
    BadgeData-->>Renderer: BadgeData (color, message, trend)
    Renderer->>Renderer: Route by format (svg / markdown / url)
    alt svg
        Renderer->>Renderer: renderBadgeSvg (logo, gradients, text measurement)
    else markdown/url
        Renderer->>Renderer: Generate shields.io markdown/URL
    end
    Renderer-->>Output: Formatted badge string
    CLI->>Output: Write to file or stdout

sequenceDiagram
    participant Client as HTTP Client
    participant DashServer as Dashboard Server
    participant LogReader as Log Reader
    participant StatusCompute as computeStatus
    participant Finder as findSkillBadgeData
    participant Renderer as formatBadgeOutput
    participant HTMLBuilder as buildReportHTML
    participant Response as HTTP Response

    Client->>DashServer: GET /badge/:skillName?format=svg
    DashServer->>LogReader: Read telemetry & logs
    LogReader-->>StatusCompute: Parsed logs
    StatusCompute-->>Finder: StatusResult
    Finder->>Finder: Lookup skill by name
    alt Skill Found
        Finder-->>Renderer: BadgeData
        Renderer-->>Response: SVG/markdown/URL (200)
    else Skill Not Found
        Finder-->>Renderer: Neutral not-found BadgeData
        Renderer-->>Response: 404 SVG (neutral)
    end

    Client->>DashServer: GET /report/:skillName
    DashServer->>LogReader: Read telemetry & logs
    LogReader-->>StatusCompute: Parsed logs
    StatusCompute->>HTMLBuilder: Build report (pass rate, trend, snapshot)
    HTMLBuilder-->>Response: HTML report (200) or 404 if missing

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Add PAI Infrastructure patterns to selftune #15 — overlaps dashboard-server changes and badge/report endpoint integration.
Add replay and contribute: retroactive backfill and community contribution export (v0.7) #13 — touches contribution bundle schema and types; directly related to schema_version → 1.1 and skill_name.
Sync dev with master, remove duplicate README badges #14 — overlaps README badge/header edits and branding changes.

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Title check	⚠️ Warning	Title does not follow conventional commits format (must start with feat\|fix\|docs\|chore\|refactor\|test\|perf\|ci).	Prefix the title with a conventional commit type, e.g., 'feat: Add badge branding, exec-plans, and Conductor setup script'.
Docstring Coverage	⚠️ Warning	Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Description check	✅ Passed	Description is detailed and directly related to the changeset, covering badge logo embedding, exec-plans restoration, gitignore narrowing, Conductor setup, and ROADMAP updates.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings (stacked PR)
📝 Generate docstrings (commit on current branch)

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch custom/prefix/router-1772543690607

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Bug fixes: - Fix routing retries using wrong refiner in evolve-body.ts (#15) - Fix baseline gate evaluating wrong skill body in evolve.ts (#19) - Fix Gate 1 accepting routing tables with zero data rows (#20) - Fix section heading prefix match allowing false positives (#13) Security: - Add ReDoS protection for regex assertions (length cap, try-catch, u flag) (#10) - Harden YES/NO parsing to first-token extraction with punctuation stripping (#23) Type safety & validation: - Add type params to readJsonl calls in baseline.ts (#3) - Validate LLM JSON shape before treating as SkillUnitTest[] (#4) - Add JSON.parse validation in unit-test-cli.ts (#9) Code quality: - Replace console.warn with structured JSON logging (#5, #11) - Sort readdirSync for deterministic traversal (#6) - Tighten fuzzy match to prevent over-selection (#7) - Rename evalFailures to positiveEvals for clarity (#8) - Enforce timeout_ms via Promise.race in runUnitTest (#12) - Enrich deploy audit entry with gates snapshot (#16) - Add Pareto 5th dimension comment (#18) - Extract composability CLI into composability-cli.ts (#21) - Add missing eval files to lint-architecture.ts (#28) - Reformat AssertionType union (#22), template literal (#14), imports (#17) - Reformat baseline conditional (#2) Documentation: - Renumber scope expansion list (#24) - Add text language to fenced code blocks (#25, #27, #30) - Add rollback audit payload note (#26) - Clarify --dir expects corpus root (#29) - Make --tests optional in syntax (#31) Tests: - Remove unused afterEach import (#32) - Replace ! assertion with ?. optional chaining (#33) - Fix temp dir cleanup leak with afterAll (#34) - Add Gate 2 status assertion (#40) - Strengthen routing validation assertions (#41) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 12

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

cli/selftune/dashboard-server.ts (1)

14-26: ⚠️ Potential issue | 🟡 Minor

Fix import organization.

Pipeline failure indicates imports need to be sorted. Run the formatter/linter auto-fix.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@cli/selftune/dashboard-server.ts` around lines 14 - 26, The import block at
the top of dashboard-server.ts is not organized/sorted causing pipeline
failures; reorder and group the imports (built-ins first like node:fs/node:path,
then project-local imports) and alphabetize within groups so imports such as
existsSync, readFileSync, dirname, join, resolve, EVOLUTION_AUDIT_LOG,
QUERY_LOG, SKILL_LOG, TELEMETRY_LOG, getLastDeployedProposal, readDecisions,
computeMonitoringSnapshot, computeStatus, StatusResult, doctor,
findSkillBadgeData, BadgeData, renderBadgeSvg, formatBadgeOutput, and
BadgeFormat are consistently ordered; then run your formatter/linter auto-fix
(e.g., prettier and/or eslint --fix) to apply the change.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cli/selftune/badge/badge.ts`:
- Around line 60-69: The code assigns format a default "svg" before checking for
invalid input, which is confusing; change the order so you validate
values.format against VALID_FORMATS first and exit on invalid input, then
compute const format: BadgeFormat = values.format ? (values.format as
BadgeFormat) : "svg". Update the logic around values.format, VALID_FORMATS, the
format constant, and the existing error branch that calls process.exit(1) and
prints HELP (referencing values.format, VALID_FORMATS, format, and HELP) so
invalid values are rejected before any defaulting occurs.

In `@cli/selftune/dashboard-server.ts`:
- Around line 128-135: computeStatusFromLogs currently re-reads TELEMETRY_LOG,
SKILL_LOG, QUERY_LOG, and EVOLUTION_AUDIT_LOG on every call which duplicates the
I/O already performed by collectData; refactor so computeStatusFromLogs reuses
cached data or the result from collectData instead of calling readJsonl
repeatedly — either (a) change callers (e.g., badge/report handlers) to call
collectData once and pass the telemetry/skill/query/audit arrays into
computeStatus (or a new computeStatusFromData wrapper), or (b) implement a small
module-level memoization in computeStatusFromLogs (with optional TTL or explicit
invalidation) that stores the last-read telemetry, skillRecords, queryRecords,
and auditEntries and returns computeStatus using that cache; preserve existing
computeStatus(telemetry, skillRecords, queryRecords, auditEntries, doctorResult)
call and avoid duplicating readJsonl invocations.
- Line 417: The code unsafely asserts the query param into BadgeFormat by using
(url.searchParams.get("format") as BadgeFormat); instead, validate the raw
string returned by url.searchParams.get("format") against the allowed
BadgeFormat values (e.g., "svg", "png", etc.) before assigning to format, and
fallback to "svg" if the value is missing or invalid; update the logic around
the format variable declaration so it checks the string and only sets format to
a trusted BadgeFormat when it matches an allowed value.

In `@dashboard/index.html`:
- Around line 549-557: The SVG logo is decorative and should be removed from the
accessibility tree: update the <svg> element (the one with
xmlns="http://www.w3.org/2000/svg" width="28" height="28" viewBox="0 0 250 250"
fill="none" style="flex-shrink:0") to include aria-hidden="true" (and remove any
<title> or <desc> if present) so screen readers ignore it while the adjacent
<h1> provides the accessible text.

In `@README.md`:
- Line 12: The TypeScript badge in the README hardcodes "TypeScript-5.0" which
will get stale; update the badge markup (the line containing the
[![TypeScript-5.0] image badge) to use a non-versioned label such as
"TypeScript" (e.g., change the alt/label from "TypeScript-5.0" to "TypeScript")
or remove the version segment entirely, or alternatively replace the badge with
a note documenting that TypeScript is provided by the Bun runtime and recommend
documenting the Bun version elsewhere; edit the badge string in README.md where
the current TypeScript badge markup appears.

In `@tests/badge/badge-svg.test.ts`:
- Around line 9-16: Remove the unused type imports BadgeData and BadgeFormat
from the import statement in tests/badge/badge-svg.test.ts — keep the used
imports (computeBadgeData, findSkillBadgeData) and the other imports
(formatBadgeOutput, renderBadgeSvg, SkillStatus, StatusResult); update the
import line so it no longer imports the unused type symbols BadgeData and
BadgeFormat to satisfy the linter.

In `@tests/badge/badge.test.ts`:
- Around line 171-176: The assertions use non-null assertions on the test
variable badge (badge!.label, badge!.passRate, badge!.color, badge!.message);
replace those non-null assertions with optional chaining (badge?.label,
badge?.passRate, badge?.color, badge?.message) so the expectations use optional
access instead of the non-null operator while keeping the existing
expect(badge).not.toBeNull() check.
- Around line 60-74: The test currently uses non-null assertions when accessing
properties of apiBadge, dbBadge, and newBadge (e.g., apiBadge!.color) after
asserting not.toBeNull(); update the assertions to use optional chaining instead
of `!` (e.g., access via apiBadge?.color, dbBadge?.message, newBadge?.status) so
the pattern expect(x).not.toBeNull(); x!.prop becomes expect(x).not.toBeNull();
x?.prop across the checks for apiBadge, dbBadge, and newBadge in the
badge.test.ts file.

In `@tests/dashboard/badge-routes.test.ts`:
- Around line 32-65: Add happy-path tests to the existing describe("GET
/badge/:skillName") suite to exercise successful badge responses: create at
least one test that requests /badge/test-skill (or a known skill) and asserts
status 200, content-type includes "image/svg+xml", and the body starts with
"<svg" and contains the expected skill text; add another test for
?format=markdown that hits a known skill and asserts status 200, content-type
"text/plain" and that the body contains the expected markdown badge text. Ensure
these tests set up or mock the necessary skill data/resource (so the route
handler for GET /badge/:skillName returns a real badge) and reuse the same fetch
pattern used in the file (server.port) so they integrate with the existing
fixtures.
- Around line 67-82: Add a happy-path test in the describe("GET
/report/:skillName") block that requests a known existing skill and asserts a
200 status and that the Content-Type header contains "text/html"; locate the
block around the existing tests for GET /report/:skillName (in
tests/dashboard/badge-routes.test.ts) and add an it(...) that fetches
`http://localhost:${server.port}/report/<existing-skill>` (use whatever fixture
or seed skill your test harness provides, e.g., "test-skill" if created in
setup) and asserts res.status === 200 and
res.headers.get("content-type").includes("text/html").

In `@tests/sandbox/run-sandbox.ts`:
- Around line 669-677: The code block that defines const isNetworkError and the
returned object's properties has formatting that doesn't match the project's
formatter; run the project's automatic formatter to fix it by executing the
formatting script (run the lint fix command) so the return object and the
isNetworkError expression are reformatted (e.g., run "bun run lint:fix"); this
will correct spacing/line breaks around the isNetworkError declaration and the
multi-line return with the error property.
- Around line 700-701: The variable hasNoData is declared but never used; remove
the unused declaration (the line defining hasNoData) or incorporate it into the
pass/fail logic if intended (e.g., combine with isSvgOrValid or the existing
checks on res/body). Locate the declaration of hasNoData near the isSvgOrValid
assignment in the test harness (variables res and body) and either delete that
hasNoData line or update the conditional checks that determine pass/fail to
reference hasNoData as needed.

---

Outside diff comments:
In `@cli/selftune/dashboard-server.ts`:
- Around line 14-26: The import block at the top of dashboard-server.ts is not
organized/sorted causing pipeline failures; reorder and group the imports
(built-ins first like node:fs/node:path, then project-local imports) and
alphabetize within groups so imports such as existsSync, readFileSync, dirname,
join, resolve, EVOLUTION_AUDIT_LOG, QUERY_LOG, SKILL_LOG, TELEMETRY_LOG,
getLastDeployedProposal, readDecisions, computeMonitoringSnapshot,
computeStatus, StatusResult, doctor, findSkillBadgeData, BadgeData,
renderBadgeSvg, formatBadgeOutput, and BadgeFormat are consistently ordered;
then run your formatter/linter auto-fix (e.g., prettier and/or eslint --fix) to
apply the change.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 8e53f333-d5a5-4dba-b2ab-5e266b59846c

📥 Commits

Reviewing files that changed from the base of the PR and between 04fe24a and 1a89f26.

⛔ Files ignored due to path filters (2)

assets/logo.svg is excluded by !**/*.svg
assets/skill-health-badge.svg is excluded by !**/*.svg

📒 Files selected for processing (24)

.gitignore
README.md
ROADMAP.md
cli/selftune/badge/badge-data.ts
cli/selftune/badge/badge-svg.ts
cli/selftune/badge/badge.ts
cli/selftune/contribute/bundle.ts
cli/selftune/contribute/contribute.ts
cli/selftune/dashboard-server.ts
cli/selftune/index.ts
cli/selftune/types.ts
dashboard/index.html
docs/design-docs/index.md
docs/strategy/icp-gtm-strategy.md
docs/strategy/openclaw-integration.md
lint-architecture.ts
package.json
skill/Workflows/Badge.md
tests/badge/badge-svg.test.ts
tests/badge/badge.test.ts
tests/contribute/bundle.test.ts
tests/contribute/contribute.test.ts
tests/dashboard/badge-routes.test.ts
tests/sandbox/run-sandbox.ts

💤 Files with no reviewable changes (2)

docs/strategy/openclaw-integration.md
docs/strategy/icp-gtm-strategy.md

cli/selftune/badge/badge.ts

coderabbitai · 2026-03-05T07:46:20Z

cli/selftune/dashboard-server.ts

+function computeStatusFromLogs(): StatusResult {
+  const telemetry = readJsonl<SessionTelemetryRecord>(TELEMETRY_LOG);
+  const skillRecords = readJsonl<SkillUsageRecord>(SKILL_LOG);
+  const queryRecords = readJsonl<QueryLogRecord>(QUERY_LOG);
+  const auditEntries = readJsonl<EvolutionAuditEntry>(EVOLUTION_AUDIT_LOG);
+  const doctorResult = doctor();
+  return computeStatus(telemetry, skillRecords, queryRecords, auditEntries, doctorResult);
+}


🧹 Nitpick | 🔵 Trivial

Duplicate log reading in computeStatusFromLogs.

This function reads the same log files that collectData() reads. For /badge/ and /report/ endpoints, consider caching or sharing the data load to avoid redundant I/O on each request.

Not blocking, but worth considering if these endpoints see high traffic.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@cli/selftune/dashboard-server.ts` around lines 128 - 135, computeStatusFromLogs currently re-reads TELEMETRY_LOG, SKILL_LOG, QUERY_LOG, and EVOLUTION_AUDIT_LOG on every call which duplicates the I/O already performed by collectData; refactor so computeStatusFromLogs reuses cached data or the result from collectData instead of calling readJsonl repeatedly — either (a) change callers (e.g., badge/report handlers) to call collectData once and pass the telemetry/skill/query/audit arrays into computeStatus (or a new computeStatusFromData wrapper), or (b) implement a small module-level memoization in computeStatusFromLogs (with optional TTL or explicit invalidation) that stores the last-read telemetry, skillRecords, queryRecords, and auditEntries and returns computeStatus using that cache; preserve existing computeStatus(telemetry, skillRecords, queryRecords, auditEntries, doctorResult) call and avoid duplicating readJsonl invocations.

cli/selftune/dashboard-server.ts

dashboard/index.html

README.md

tests/badge/badge.test.ts

coderabbitai · 2026-03-05T07:46:20Z

tests/dashboard/badge-routes.test.ts

+  describe("GET /badge/:skillName", () => {
+    it("returns SVG content type for unknown skill", async () => {
+      const res = await fetch(`http://localhost:${server.port}/badge/nonexistent-skill`);
+      expect(res.status).toBe(404);
+      expect(res.headers.get("content-type")).toContain("image/svg+xml");
+      const body = await res.text();
+      expect(body).toContain("<svg");
+      expect(body).toContain("not found");
+    });
+
+    it("returns valid SVG badge (not JSON error)", async () => {
+      const res = await fetch(`http://localhost:${server.port}/badge/nonexistent-skill`);
+      const body = await res.text();
+      // Should be valid SVG, not a JSON error
+      expect(body.startsWith("<svg")).toBe(true);
+    });
+
+    it("includes Cache-Control no-cache header", async () => {
+      const res = await fetch(`http://localhost:${server.port}/badge/test-skill`);
+      expect(res.headers.get("cache-control")).toBe("no-cache, no-store");
+    });
+
+    it("includes CORS headers", async () => {
+      const res = await fetch(`http://localhost:${server.port}/badge/test-skill`);
+      expect(res.headers.get("access-control-allow-origin")).toBe("*");
+    });
+
+    it("returns text/plain for ?format=markdown", async () => {
+      const res = await fetch(`http://localhost:${server.port}/badge/nonexistent?format=markdown`);
+      // For unknown skills, still returns SVG 404 (badge not found)
+      // But for known skills would return text/plain
+      expect(res.status).toBe(404);
+    });
+  });


🧹 Nitpick | 🔵 Trivial

Tests only exercise error paths—consider adding happy-path coverage.

All /badge/:skillName tests use nonexistent skills. There's no test verifying a valid badge response (200 status, correct SVG content for a known skill). Similarly, the ?format=markdown test only checks 404 behavior, not the actual text/plain content-type when a skill exists.

Consider adding tests with mocked or real skill data to cover successful responses.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/dashboard/badge-routes.test.ts` around lines 32 - 65, Add happy-path tests to the existing describe("GET /badge/:skillName") suite to exercise successful badge responses: create at least one test that requests /badge/test-skill (or a known skill) and asserts status 200, content-type includes "image/svg+xml", and the body starts with "<svg" and contains the expected skill text; add another test for ?format=markdown that hits a known skill and asserts status 200, content-type "text/plain" and that the body contains the expected markdown badge text. Ensure these tests set up or mock the necessary skill data/resource (so the route handler for GET /badge/:skillName returns a real badge) and reuse the same fetch pattern used in the file (server.port) so they integrate with the existing fixtures.

coderabbitai · 2026-03-05T07:46:20Z

tests/dashboard/badge-routes.test.ts

+  describe("GET /report/:skillName", () => {
+    it("returns 404 for unknown skill", async () => {
+      const res = await fetch(`http://localhost:${server.port}/report/nonexistent-skill`);
+      expect(res.status).toBe(404);
+    });
+
+    it("includes CORS headers", async () => {
+      const res = await fetch(`http://localhost:${server.port}/report/test-skill`);
+      expect(res.headers.get("access-control-allow-origin")).toBe("*");
+    });
+
+    it("returns text/plain for missing skill", async () => {
+      const res = await fetch(`http://localhost:${server.port}/report/nonexistent`);
+      expect(res.headers.get("content-type")).toContain("text/plain");
+    });
+  });


🧹 Nitpick | 🔵 Trivial

Same observation: /report/:skillName tests lack happy-path coverage.

All tests use nonexistent skills. No test verifies that a valid skill returns 200 with text/html content-type.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/dashboard/badge-routes.test.ts` around lines 67 - 82, Add a happy-path test in the describe("GET /report/:skillName") block that requests a known existing skill and asserts a 200 status and that the Content-Type header contains "text/html"; locate the block around the existing tests for GET /report/:skillName (in tests/dashboard/badge-routes.test.ts) and add an it(...) that fetches `http://localhost:${server.port}/report/<existing-skill>` (use whatever fixture or seed skill your test harness provides, e.g., "test-skill" if created in setup) and asserts res.status === 200 and res.headers.get("content-type").includes("text/html").

tests/sandbox/run-sandbox.ts

…ation - Remove unused BadgeData/BadgeFormat imports (CI lint failure) - Replace non-null assertions with optional chaining in tests - Reorder format validation before defaulting in badge CLI - Validate format query param in dashboard server - Add aria-hidden to decorative SVG in dashboard - Remove hardcoded TypeScript version from README badge - Remove unused hasNoData variable in sandbox tests - Fix formatting in sandbox error handling Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (1)

cli/selftune/dashboard-server.ts (1)
128-135: 🧹 Nitpick | 🔵 Trivial

Duplicate log I/O on badge/report endpoints.

computeStatusFromLogs re-reads all four log files that collectData() also reads. For the /badge/ and /report/ endpoints, this means redundant disk I/O on every request. Consider caching or sharing the data load if these endpoints see traffic.

Not blocking, but worth noting for performance.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cli/selftune/dashboard-server.ts` around lines 128 - 135,
computeStatusFromLogs currently re-reads TELEMETRY_LOG, SKILL_LOG, QUERY_LOG and
EVOLUTION_AUDIT_LOG on every call causing duplicate I/O with collectData();
avoid this by sharing cached data: modify computeStatusFromLogs to accept
preloaded arrays (telemetry, skillRecords, queryRecords, auditEntries) or add a
module-level cache populated by collectData(); update the /badge/ and /report/
handlers to call collectData() once (or read cache) and pass that data into
computeStatusFromLogs (or use the cache) and implement a simple TTL or
invalidation after writes so cached reads stay fresh; keep doctor() call where
needed and ensure function names computeStatusFromLogs, collectData, and
computeStatus are updated accordingly.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/badge/badge.test.ts`:
- Around line 8-11: The imports at the top of the test file are not sorted
according to the project's lint rules; reorder them (or run the auto-fixer) so
the import groups and alphabetical order match the linter: external test helpers
(beforeEach, describe, expect, it), type imports (SkillStatus, StatusResult),
then internal module imports (computeBadgeData, findSkillBadgeData,
formatBadgeOutput, renderBadgeSvg). Fix by running the auto-fixer command `bun
run lint:fix` or manually sort the import lines referencing those symbols to
satisfy the import organization rule.

In `@tests/sandbox/run-sandbox.ts`:
- Around line 636-722: The CI failure is due to formatting in the live badge
tests around the async blocks that produce badgeLiveResult and
badgeLiveFallbackResult; run the project formatter (bun run lint:fix) or
reformat the code in those blocks so spacing/indentation matches the project's
style (ensure the try/catch blocks, object literals returned in both async
IIFEs, and the ternary/error lines in badgeLiveResult and
badgeLiveFallbackResult are correctly indented and wrapped); verify after
formatting that the symbols badgeLiveResult, badgeLiveFallbackResult and their
returned object shapes remain unchanged and re-run the lint/CI.

---

Duplicate comments:
In `@cli/selftune/dashboard-server.ts`:
- Around line 128-135: computeStatusFromLogs currently re-reads TELEMETRY_LOG,
SKILL_LOG, QUERY_LOG and EVOLUTION_AUDIT_LOG on every call causing duplicate I/O
with collectData(); avoid this by sharing cached data: modify
computeStatusFromLogs to accept preloaded arrays (telemetry, skillRecords,
queryRecords, auditEntries) or add a module-level cache populated by
collectData(); update the /badge/ and /report/ handlers to call collectData()
once (or read cache) and pass that data into computeStatusFromLogs (or use the
cache) and implement a simple TTL or invalidation after writes so cached reads
stay fresh; keep doctor() call where needed and ensure function names
computeStatusFromLogs, collectData, and computeStatus are updated accordingly.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: bb8b4342-4ebc-4c42-9976-f06e3bcd2926

📥 Commits

Reviewing files that changed from the base of the PR and between 1a89f26 and 35b7b88.

📒 Files selected for processing (7)

README.md
cli/selftune/badge/badge.ts
cli/selftune/dashboard-server.ts
dashboard/index.html
tests/badge/badge-svg.test.ts
tests/badge/badge.test.ts
tests/sandbox/run-sandbox.ts

coderabbitai · 2026-03-05T07:57:53Z

tests/badge/badge.test.ts

+import { beforeEach, describe, expect, it } from "bun:test";
+import type { SkillStatus, StatusResult } from "../../cli/selftune/status.js";
+import { computeBadgeData, findSkillBadgeData } from "../../cli/selftune/badge/badge-data.js";
+import { formatBadgeOutput, renderBadgeSvg } from "../../cli/selftune/badge/badge-svg.js";


⚠️ Potential issue | 🟡 Minor

Pipeline failure: Import organization required.

CI reports imports are not sorted. Run bun run lint:fix to fix.

🧰 Tools

🪛 GitHub Actions: CI

[error] 8-11: Assist/organizeImports: The imports and exports are not sorted.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/badge/badge.test.ts` around lines 8 - 11, The imports at the top of the test file are not sorted according to the project's lint rules; reorder them (or run the auto-fixer) so the import groups and alphabetical order match the linter: external test helpers (beforeEach, describe, expect, it), type imports (SkillStatus, StatusResult), then internal module imports (computeBadgeData, findSkillBadgeData, formatBadgeOutput, renderBadgeSvg). Fix by running the auto-fixer command `bun run lint:fix` or manually sort the import lines referencing those symbols to satisfy the import organization rule.

tests/sandbox/run-sandbox.ts

- Auto-format with biome 2.4.5 (was running 0.3.3 locally) - Sort imports in dashboard-server.ts, badge-svg.test.ts, badge.test.ts - Replace non-null assertions with helper/optional chaining in pre-gates and pareto tests - Prefix unused destructured vars in run-with-llm.ts - Format contribute.ts long lines and dashboard-server.ts ternaries Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…sability/unit-test commands Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 5

♻️ Duplicate comments (1)

cli/selftune/dashboard-server.ts (1)

128-135: 🧹 Nitpick | 🔵 Trivial

Avoid duplicated synchronous log reads in status computation.

computeStatusFromLogs() re-reads the same JSONL sources that are already loaded by collectData(). This duplicates I/O paths and keeps two parallel data-loading flows to maintain.

♻️ Refactor sketch

+function computeStatusFromData(data: Pick<DashboardData, "telemetry" | "skills" | "queries" | "evolution">): StatusResult {
+  const doctorResult = doctor();
+  return computeStatus(data.telemetry, data.skills, data.queries, data.evolution, doctorResult);
+}
+
 function computeStatusFromLogs(): StatusResult {
-  const telemetry = readJsonl<SessionTelemetryRecord>(TELEMETRY_LOG);
-  const skillRecords = readJsonl<SkillUsageRecord>(SKILL_LOG);
-  const queryRecords = readJsonl<QueryLogRecord>(QUERY_LOG);
-  const auditEntries = readJsonl<EvolutionAuditEntry>(EVOLUTION_AUDIT_LOG);
-  const doctorResult = doctor();
-  return computeStatus(telemetry, skillRecords, queryRecords, auditEntries, doctorResult);
+  const data = collectData();
+  return computeStatusFromData(data);
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@cli/selftune/dashboard-server.ts` around lines 128 - 135,
computeStatusFromLogs currently duplicates I/O by re-reading JSONL files instead
of reusing the data loaded by collectData; change computeStatusFromLogs to
accept the already-collected data (telemetry, skillRecords, queryRecords,
auditEntries) or call collectData() once and pass its results into computeStatus
so all reads happen in one place, and remove the direct readJsonl calls inside
computeStatusFromLogs (refer to computeStatusFromLogs, collectData,
computeStatus, and readJsonl to locate and update the code).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cli/selftune/dashboard-server.ts`:
- Around line 434-453: The missing-skill branch returns an SVG regardless of
requested format; update the handler that checks badgeData (the branch using
renderBadgeSvg and returning new Response(svg, ...)) to return the appropriate
not-found payload based on the incoming format parameter (e.g., when format ===
"markdown" return plain-text/markdown message, when format === "url" return the
plain URL/text, and for image formats continue returning the SVG). Locate the
code that builds notFoundData and the response (uses renderBadgeSvg, BadgeData,
and corsHeaders()) and add conditional response formatting to produce consistent
Content-Type and body for "markdown" and "url" formats (and reuse the same 404
status and cache/CORS headers).
- Line 238: Escape the value before injecting it into the HTML template: replace
the direct insertion of statusResult.lastSession with an HTML-escaped version
(e.g., use or add a small helper like escapeHtml and call
escapeHtml(statusResult.lastSession)) and preserve the fallback (the em-dash)
when lastSession is null/undefined. Update the template expression where
statusResult.lastSession is interpolated so it uses the escaped value; ensure
the helper properly encodes &, <, >, ", and ' to prevent HTML/script injection.

In `@README.md`:
- Around line 1-16: The README duplicates header elements; consolidate into a
single hero/header by removing duplicate badges and one logo: keep only one logo
(choose either assets/logo.svg or ./images/selftune-logo.svg) and remove
repeated badges for CI, npm version, License, and Zero Dependencies so each
badge appears once; update the remaining header block (where badges and logo
remain) to include all unique badges and a single logo and delete the alternate
header block to avoid visual/maintenance redundancy.

In `@tests/badge/badge.test.ts`:
- Around line 17-43: Extract the duplicated fixture factories makeSkillStatus,
makeStatusResult and the fixtureCounter into a shared module (e.g., export them
from a new fixtures module) and replace the in-file definitions with imports;
ensure the exported symbols retain the same names (makeSkillStatus,
makeStatusResult, fixtureCounter) and types (SkillStatus, StatusResult) so tests
like badge.test.ts and badge-svg.test.ts simply import { makeSkillStatus,
makeStatusResult, fixtureCounter } from the new module and remove the local
duplicates.

In `@tests/sandbox/run-sandbox.ts`:
- Around line 631-748: Both live smoke-test catch blocks (the IIFEs producing
badgeLiveResult and badgeLiveFallbackResult) mark transient network errors as
failures; change them to treat "Network unreachable (skippable)" as a
skipped/pass case by detecting the same network condition used elsewhere and
returning passed: true, exitCode: 0 (and keep the error text for diagnostics).
Concretely, in the first IIFE (badgeLiveResult) update the catch branch that now
computes isNetworkError to return exitCode: 0 and passed: true when
isNetworkError is true (keep stderr/fullStdout empty and preserve the error
string), and in the second IIFE (badgeLiveFallbackResult) add the same network
detection in its catch and return exitCode: 0 and passed: true for
network/unreachable errors instead of passed: false.

---

Duplicate comments:
In `@cli/selftune/dashboard-server.ts`:
- Around line 128-135: computeStatusFromLogs currently duplicates I/O by
re-reading JSONL files instead of reusing the data loaded by collectData; change
computeStatusFromLogs to accept the already-collected data (telemetry,
skillRecords, queryRecords, auditEntries) or call collectData() once and pass
its results into computeStatus so all reads happen in one place, and remove the
direct readJsonl calls inside computeStatusFromLogs (refer to
computeStatusFromLogs, collectData, computeStatus, and readJsonl to locate and
update the code).

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: a5f4d301-4ed4-4959-b3d8-523a87963053

📥 Commits

Reviewing files that changed from the base of the PR and between 35b7b88 and 5387fd4.

📒 Files selected for processing (13)

README.md
cli/selftune/contribute/contribute.ts
cli/selftune/dashboard-server.ts
cli/selftune/index.ts
cli/selftune/types.ts
docs/design-docs/index.md
lint-architecture.ts
tests/badge/badge-svg.test.ts
tests/badge/badge.test.ts
tests/evolution/pareto.test.ts
tests/grading/pre-gates.test.ts
tests/sandbox/docker/run-with-llm.ts
tests/sandbox/run-sandbox.ts

cli/selftune/dashboard-server.ts

README.md

tests/badge/badge.test.ts

tests/sandbox/run-sandbox.ts

…edup README, shared fixtures, flaky live tests - Escape statusResult.lastSession in report HTML to prevent injection - Return format-aware response for not-found badges (markdown/url) - Consolidate duplicate logo and badges in README into single header - Extract shared makeSkillStatus/makeStatusResult to tests/badge/fixtures.ts - Treat network errors in live smoke tests as skipped (not failed) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai

♻️ Duplicate comments (1)

tests/sandbox/run-sandbox.ts (1)

733-745: ⚠️ Potential issue | 🟠 Major

Do not treat all fallback fetch exceptions as skippable.

At Line 733, the catch path marks every exception as passed. That can hide real regressions (non-network failures). Only network/timeout errors should be skipped.

Suggested fix

       } catch (err) {
         const durationMs = Math.round(performance.now() - start);
+        const isNetworkError =
+          err instanceof Error && (err.name === "AbortError" || err.message.includes("fetch"));
         return {
           name,
           command,
-          exitCode: 0,
-          passed: true,
+          exitCode: isNetworkError ? 0 : 1,
+          passed: isNetworkError,
           durationMs,
           stdout: "",
           stderr: "",
           fullStdout: "",
-          error: `Network unreachable (skipped): ${err instanceof Error ? err.message : String(err)}`,
+          error: isNetworkError
+            ? `Network unreachable (skipped): ${err instanceof Error ? err.message : String(err)}`
+            : err instanceof Error
+              ? err.message
+              : String(err),
         };
       }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/sandbox/run-sandbox.ts` around lines 733 - 745, The catch currently
treats every exception as skippable and returns passed: true; change this to
only mark as skipped for network/timeout errors by detecting known
network/timeout codes/messages (e.g. ETIMEDOUT, ECONNREFUSED, ENOTFOUND,
EAI_AGAIN, "Network unreachable", "timeout") or by adding a small helper
isNetworkError(err) and calling it from the catch; if isNetworkError(err) return
the existing skipped response (passed: true, exitCode: 0, error message),
otherwise return a failure response (passed: false, non-zero exitCode, include
err.message and full error info in stderr/fullStdout) so real regressions are
not hidden. Ensure you reference and update the same returned fields (name,
command, exitCode, passed, durationMs, stdout, stderr, fullStdout, error) in
this catch block.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@tests/sandbox/run-sandbox.ts`:
- Around line 733-745: The catch currently treats every exception as skippable
and returns passed: true; change this to only mark as skipped for
network/timeout errors by detecting known network/timeout codes/messages (e.g.
ETIMEDOUT, ECONNREFUSED, ENOTFOUND, EAI_AGAIN, "Network unreachable", "timeout")
or by adding a small helper isNetworkError(err) and calling it from the catch;
if isNetworkError(err) return the existing skipped response (passed: true,
exitCode: 0, error message), otherwise return a failure response (passed: false,
non-zero exitCode, include err.message and full error info in stderr/fullStdout)
so real regressions are not hidden. Ensure you reference and update the same
returned fields (name, command, exitCode, passed, durationMs, stdout, stderr,
fullStdout, error) in this catch block.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: b973cefc-9ed0-4faf-b1d9-69b2614e9eaf

📥 Commits

Reviewing files that changed from the base of the PR and between 5387fd4 and bc6ceb9.

📒 Files selected for processing (6)

README.md
cli/selftune/dashboard-server.ts
tests/badge/badge-svg.test.ts
tests/badge/badge.test.ts
tests/badge/fixtures.ts
tests/sandbox/run-sandbox.ts

WellDunDun and others added 4 commits March 3, 2026 20:25

chore: Remove hosted service and clean up references

d1fc582

Remove service/ directory, deployment workflow, and related test files. Update lint-architecture.ts and tsconfig.json to remove service references. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai bot reviewed Mar 5, 2026

View reviewed changes

WellDunDun and others added 2 commits March 5, 2026 11:02

merge: Resolve conflicts merging origin/dev — keep badge + eval/compo…

5387fd4

…sability/unit-test commands Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai bot reviewed Mar 5, 2026

View reviewed changes

WellDunDun merged commit 03e3ea2 into dev Mar 5, 2026
5 checks passed

coderabbitai bot mentioned this pull request Mar 7, 2026

Implement dashboard audit: 4-state status, eval feed, search, data contract #23

Merged

WellDunDun deleted the custom/prefix/router-1772543690607 branch March 8, 2026 12:11

coderabbitai bot mentioned this pull request Mar 12, 2026

feat: add local dashboard SPA with React + Vite #39

Open

Conversation

WellDunDun commented Mar 5, 2026

Summary

Changes

Uh oh!

coderabbitai bot commented Mar 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

❌ Failed checks (2 warnings)

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

coderabbitai bot commented Mar 5, 2026 •

edited

Loading