devkade/ilchul

Kapi

Pi-native thin workflow harness for explicit, evidence-backed workflows.

Shout out to Ouroboros, OMX, pi-mono, and Chedex. Kapi borrows the spirit of durable workflow discipline from those projects while staying Pi-native and thin.

Kapi keeps ordinary Pi turns lightweight. It adds state, artifacts, worker awareness, and verification only after an explicit /kapi-* command or matching agent tool. The design is Chedex-inspired, but Pi-scoped: Kapi uses Pi extension surfaces instead of importing Chedex's Codex install/runtime machinery.

At A Glance

Use Kapi when a task needs one or more of these:

  • resumable requirements, planning, execution, or validation state;
  • durable artifacts such as context.md, interview.md, IMPLEMENTATION_PLAN.md, handoff.json, contract.md, benchmark.sh, ledger.jsonl, merge-plan.md, integration-report.md, and verify.md;
  • evidence-gated completion instead of narrative claims;
  • explicit worker planning for tmux terminals or isolated git worktrees;
  • lightweight safety rails around Ralph validation evidence and Integrate merge boundaries.

The operating model is simple:

  • ordinary work stays transparent;
  • one non-terminal workflow owns a workspace at a time;
  • human slash workflow commands intentionally switch workflows and hand off previous context;
  • service/tool workflow starts require explicit replace or /kapi-clear;
  • selected or terminal workflow inspection does not steal active ownership;
  • artifacts are durable checkpoints, not conversational scratchpads.
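
The ownership rules above can be read as a small decision function. The sketch below is illustrative only: the type names, the `replace` flag shape, and the `decideStart` helper are hypothetical and are not taken from Kapi's source.

```typescript
// Hypothetical types; Kapi's real state shape may differ.
type StartSource = "human-slash" | "tool";

interface ActiveWorkflow {
  slug: string;
  terminal: boolean; // true once completed, failed, or cancelled
}

interface StartRequest {
  slug: string;
  source: StartSource;
  replace?: boolean; // explicit opt-in for service/tool starts
}

type StartDecision =
  | { kind: "start" }
  | { kind: "switch"; handoffFrom: string }
  | { kind: "replace"; replaced: string }
  | { kind: "reject"; reason: string };

function decideStart(active: ActiveWorkflow | null, req: StartRequest): StartDecision {
  // No owner, or the recorded owner already reached a terminal state.
  if (active === null || active.terminal) return { kind: "start" };
  // Human slash commands switch intentionally and hand off prior context.
  if (req.source === "human-slash") return { kind: "switch", handoffFrom: active.slug };
  // Service/tool starts need an explicit replace (or a prior /kapi-clear).
  if (req.replace === true) return { kind: "replace", replaced: active.slug };
  return { kind: "reject", reason: "workflow " + active.slug + " is active; pass replace or run /kapi-clear" };
}
```

Keeping the decision pure makes the one-owner rule easy to test without touching .kapi/active.json.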

What Ships

Durable Modes

  • /kapi-deep-interview — high-rigor requirements work with context.md, interview.md, decision-report.md, and verify.md; it owns context extraction, not implementation specs.
  • /kapi-autoresearch — governed optimization setup and experiment ledger work with contract.md, benchmark.sh, ledger.jsonl, ideas.md, checks.sh, decision-report.md, and verify.md.
  • /kapi-ralph — skill-driven planning/build execution that interviews when context is insufficient, keeps state.json.ralphState as the loop state, performs one-task iterations, and records hard verification evidence.
  • /kapi-integrate — Kapi-owned branch/worktree integration toward dev with dependency, conflict, merge-plan, and verification reporting.

Support Commands

  • /kapi-status [workflow|slug] — inspect the active or selected workflow's status without taking active ownership; /kapi-status all lists recorded workflows in the current project.
  • /kapi-status validate [workflow|slug] — check lifecycle, evidence, worker, and artifact consistency.
  • /kapi-status artifacts [list [workflow|slug]] | read <name> [workflow|slug] | write <name> [--replace] [--required] -- <content> — inspect or update workflow artifacts.
  • /kapi-status evidence [list [workflow|slug]] | add ... — inspect or record validation evidence.
  • /kapi-status complete ... — complete the active workflow with evidence.
  • /kapi-status fail --reason <reason> ... — fail the active workflow with evidence.
  • /kapi-status prepare-worker --tmux — tmux-focused worker preparation shortcut.
  • /kapi-status prepare-worker --worktree — git-worktree-focused worker preparation shortcut.
  • /kapi-clear [--target workflow|slug] [--reason reason] — detach the active workflow attachment; with --target, detach only if the selected workflow is the active one.

Run /kapi-status help for compact command syntax. Use focused support commands for artifact, evidence, validation, completion, failure, and worker operations.

Local CLI Runtime

The package exposes a kapi bin for explicit runtime control outside Pi turns. From a repo checkout, use the package-local command without global setup: npm exec -- kapi --help.

For a normal shell command during local development, run the setup helper once so npm installs the portable bin shim on PATH:

npm run setup:cli
kapi --help

For an isolated install prefix, pass --prefix; POSIX npm shims live under <prefix>/bin, while Windows shims live at <prefix>.

The bin shim runs Kapi through the repo-local TypeScript entrypoint, so the old direct workaround (node --import tsx ./src/cli/kapi-cli.ts ...) is not required for normal use.

kapi start ralph "<goal>" [--from <repo>] [--slug <slug>] [--dry-run] [--json]
kapi start autoresearch "<goal>" [--from <repo>] [--slug <slug>] [--dry-run] [--json]
kapi status [slug] [--from <repo>] [--json]
kapi list [--from <repo>] [--json]
kapi attach <slug> [--from <repo>] [--json]
kapi probe <slug> [--from <repo>] [--json]
kapi report <slug> [--from <repo>] [--json] [--lines <n>]
kapi doctor [--from <repo>] [--json]

start creates a same-slug worktree, branch, tmux session, and registry entry under .kapi/registry/, launches Pi, waits for the tmux readiness marker, and dispatches the mode-specific planning prompt. The registry in the base repository is a control-plane pointer; execution truth and workflow artifacts live in the created worktree.

Kapi rejects recorded slug, worktree, branch, and tmux collisions while the recorded owner is still non-terminal and not probed as stale-registry. Entries with lifecycle completed, failed, cancelled, or inactive, and entries whose Pi launch status is stale-registry, remain inspectable until a same-slug retry replaces that registry record, and do not permanently block retries.

list --from <repo> --json shows all recorded workers for that supervised repo, while each slug remains independently inspectable through status, report, probe, and attach.

probe refreshes Pi readiness from tmux capture output, preserving exact KAPI_READY <slug> <nonce> success while distinguishing missing tmux sessions (stale-registry) from live panes without the marker (alive-but-unverified, running-with-output, or completed-output-present).

attach prints the tmux attach command, and doctor checks registry consistency such as recorded worktree paths and prompt dispatch state without deleting retained worktrees.
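
As a rough illustration of the probe behavior described above, the sketch below classifies captured pane output. Only the KAPI_READY <slug> <nonce> marker format and the stale-registry status name come from this README; the function name, the "ready" and "live-without-marker" labels, and the requirement that the marker sits on its own line are assumptions. It also does not try to tell alive-but-unverified, running-with-output, and completed-output-present apart.

```typescript
// Hypothetical probe classifier, not Kapi's implementation.
type ProbeStatus = "ready" | "stale-registry" | "live-without-marker";

function classifyProbe(
  sessionExists: boolean,
  paneOutput: string,
  slug: string,
  nonce: string,
): ProbeStatus {
  // A missing tmux session means the registry entry points at nothing.
  if (!sessionExists) return "stale-registry";
  // Require the exact marker on its own line (assumed), so a different
  // slug or nonce never counts as ready.
  const marker = "KAPI_READY " + slug + " " + nonce;
  const lines = paneOutput.split(/\r?\n/);
  const ready = lines.some((line) => line.trim() === marker);
  return ready ? "ready" : "live-without-marker";
}
```

Matching the nonce exactly is what keeps a stale marker from an earlier launch in the same pane from being mistaken for the current one.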

hermes/openclaw-supervised coding orchestration

Kapi's CLI runtime is the execution surface for coding work supervised by hermes/openclaw; it is not the final authority for accepting changes:

User request
→ hermes/openclaw classifies the work and starts bounded Kapi workers when useful
→ Kapi creates isolated worktrees, tmux sessions, registry entries, and workflow artifacts
→ workers produce candidate diffs, artifacts, and evidence
→ hermes/openclaw inspects status, tmux output, artifacts, git diff, and verification
→ hermes/openclaw accepts, rejects, integrates, or asks for another bounded worker slice
→ the human/project owner remains the final PR/merge/deploy authority

Supervisor contract:

  • --from <repo> selects the repository being supervised; Kapi behavior must not depend on this repository being Kapi itself.
  • Registry state is a control-plane pointer; source diffs and workflow artifacts in the worker worktree are the candidate output to inspect.
  • status, list, probe, doctor, attach, and future reporting commands are read-only supervisor inspection surfaces unless a command explicitly says otherwise.
  • Kapi worker output is incomplete until hermes/openclaw reviews the diff, checks evidence, runs verification, and decides how to integrate it.
  • Kapi does not create final PR decisions, merge, deploy, or grant review bots implementation authority.

Issue #37 is intentionally split into reviewable child slices: executable CLI setup (#38), repo-generic planning (#39), probe/readiness reliability (#40), doctor diagnostics (#41), worker reporting (#42), and multi-worker orchestration (#43).

Review CLI Harness

kapi-review github-pr emits non-posting JSON for kapi-agent review/check automation. The harness now reports semantic change metrics, a TRIVIAL/LOW_RISK/STANDARD/HIGH_RISK/CRITICAL risk profile, required evidence gates, structured finding validation, bundled repo-local review guidance, isolated read-only runner prompt provenance, optional --runner-output-file structured findings ingestion, optional --runner-command isolated runner invocation, and the legacy changed-line context.

Runner commands receive a temporary KAPI_REVIEW_RUNNER_INPUT JSON file containing the read-only runner metadata, risk/context, review body, and the bundled-guidance prompt material, execute from that temporary workspace with a minimal sanitized environment, then return structured findings JSON for deterministic validation.

Size is a risk signal rather than the only decision: docs/generated-heavy changes can pass the size gate when semantic source risk is low, while sensitive paths require stronger evidence even when small. Low-confidence BLOCKER findings are normalized to non-blocking QUESTION findings so uncertain reviewer signals do not masquerade as merge-blocking defects.

GitHub merge enforcement for formal kapi-agent approval lives in .github/workflows/kapi-agent-formal-approval-gate.yml; require "require formal kapi-agent approval" plus kapi-agent/review in branch protection/rulesets. Re-review requests after stale/non-approving kapi-agent reviews must put @kapi-agent review, the current head SHA, What changed, Why this closes the prior feedback, and Verification in the same author comment; see docs/kapi-agent-approval-gate.md.
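
The BLOCKER-to-QUESTION normalization described above could look roughly like the following. This is a sketch under stated assumptions: the Finding shape, the 0..1 confidence scale, the 0.7 cutoff, and the SUGGESTION severity are all hypothetical and may not match the harness.

```typescript
// Hypothetical finding shape; only BLOCKER and QUESTION come from the README.
type Severity = "BLOCKER" | "QUESTION" | "SUGGESTION";

interface Finding {
  severity: Severity;
  confidence: number; // assumed to be a 0..1 score
  message: string;
}

const MIN_BLOCKER_CONFIDENCE = 0.7; // assumed cutoff, not Kapi's actual value

// Downgrade uncertain blockers so they stay visible but do not block a merge.
function normalizeFinding(finding: Finding, minConfidence: number = MIN_BLOCKER_CONFIDENCE): Finding {
  if (finding.severity === "BLOCKER" && finding.confidence < minConfidence) {
    return { ...finding, severity: "QUESTION" };
  }
  return finding;
}

// A review blocks only when a BLOCKER survives normalization.
function blocksMerge(findings: Finding[]): boolean {
  return findings
    .map((finding) => normalizeFinding(finding))
    .some((finding) => finding.severity === "BLOCKER");
}
```

Normalizing before the blocking check keeps the validation deterministic: the same findings JSON always yields the same merge decision.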

Agent Tools

Kapi exposes the same conceptual operations as tools for AI agents:

  • kapi_start_workflow
  • kapi_get_status
  • kapi_resume_workflow
  • kapi_get_workflow_contract
  • kapi_list_workflows
  • kapi_list_artifacts
  • kapi_list_evidence
  • kapi_list_workers
  • kapi_validate_workflow
  • kapi_update_workflow
  • kapi_record_evidence
  • kapi_complete_workflow
  • kapi_fail_workflow
  • kapi_write_artifact
  • kapi_write_artifacts — batch checkpoint writes for multiple active-workflow artifacts.
  • kapi_read_artifact
  • kapi_get_worker_capabilities
  • kapi_plan_worker_strategy
  • kapi_prepare_worker
  • kapi_prepare_tmux_worker
  • kapi_prepare_worktree_worker
  • kapi_dispatch_worker_task
  • kapi_refresh_worker_status
  • kapi_clear_workflow

Artifact names reject / and \ path separators, tracked artifact paths must stay under the workflow artifact root, and reads distinguish missing files from existing empty artifacts.
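
A minimal sketch of those three rules, assuming an in-memory store in place of the filesystem. The helper names and the ArtifactRead shape are hypothetical.

```typescript
// Artifact names reject "/" and "\" path separators.
function isValidArtifactName(name: string): boolean {
  return name.length > 0 && !/[\\/]/.test(name);
}

// Tracked artifact paths must stay under the workflow artifact root:
// reject absolute paths and any ".." that climbs above the root.
function staysUnderRoot(relativePath: string): boolean {
  if (/^([\\/]|[A-Za-z]:)/.test(relativePath)) return false;
  let depth = 0;
  for (const segment of relativePath.split(/[\\/]+/)) {
    if (segment === "" || segment === ".") continue;
    depth += segment === ".." ? -1 : 1;
    if (depth < 0) return false;
  }
  return true;
}

// Reads distinguish a missing file from an existing empty artifact.
type ArtifactRead =
  | { status: "missing" }
  | { status: "found"; content: string }; // content may be ""

function readArtifact(store: Record<string, string>, name: string): ArtifactRead {
  if (!isValidArtifactName(name)) throw new Error("invalid artifact name: " + name);
  if (!Object.prototype.hasOwnProperty.call(store, name)) return { status: "missing" };
  return { status: "found", content: store[name] };
}
```

Returning a tagged result instead of an empty string lets callers tell "never written" apart from "written but empty", which matters for required-artifact validation.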

Kapi does not own subagent orchestration. Use pi-subagents for agent delegation, chains, parallel subagents, and async/forked-context work. Kapi may record those results as artifacts or evidence, but it does not create or manage subagents.

Workflow Map

Lightweight Or Artifact-Backed Workflows

  • Direct Pi turns: no Kapi state unless explicitly requested.
  • /kapi-deep-interview: durable requirements trail under .kapi/workflows/deep-interview/<slug>/; typically context.md, interview.md, decision-report.md, and verify.md; downstream Ralph or Autoresearch owns implementation specs/plans. Completion is proposal-gated by the Deep Interview readiness judge, which can run inline or through KAPI_DEEP_INTERVIEW_JUDGE=child-rpc to isolate snapshot review in a child process.
  • /kapi-autoresearch: durable optimization contract under .kapi/workflows/autoresearch/<slug>/; prepares contract.md, benchmark.sh, ledger.jsonl, ideas.md, checks.sh, decision-report.md, and verify.md for bounded experiment loops.
  • /kapi-ralph: durable planning/build state under .kapi/workflows/ralph/<slug>/; uses AGENTS.md, IMPLEMENTATION_PLAN.md, handoff.json, decision-report.md, and verify.md.
  • /kapi-integrate: durable integration state under .kapi/workflows/integrate/<slug>/; uses merge-plan.md, conflict-matrix.md, integration-report.md, decision-report.md, and verify.md.

Governed Workflows

  • /kapi-autoresearch: governed optimization loop with stable comparison boundaries and a durable ledger.
  • /kapi-ralph: uses the kapi-ralph skill to assess context, interview when needed, then plan/build exactly one highest-priority unfinished task at a time with verifier closeout.
  • /kapi-integrate: governed integration for Kapi-owned branches/worktrees, with dependency, conflict, and verification reporting before any merge toward dev.

Artifact Cadence

Artifacts are durable checkpoints, not conversational scratchpads. Workflow contracts expose an artifact cadence so agents write context/spec/plan/progress/verify files at meaningful checkpoints, phase boundaries, blockers, verification results, or completion gates instead of after every turn.

Typical cadence:

  • requirements workflows write after question batches or decision-critical answers;
  • planning workflows write after the plan stabilizes or is approved;
  • governed execution writes at phase boundaries, milestones, blockers, and verification points;
  • autoresearch loops write one bounded experiment result per iteration;
  • Ralph writes RED/GREEN or verifier evidence at task, blocker, validation, and closeout boundaries.

Use kapi_write_artifacts when one checkpoint updates multiple files.

State And Artifact Model

All workflows share the lifecycle vocabulary from GOAL.md:

inactive -> active -> blocked|verifying|completed|failed|cancelled
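
One way to encode that vocabulary is a transition table. The state names come from the line above; the edges out of blocked and verifying are assumptions made for illustration, and GOAL.md remains the authority.

```typescript
type Lifecycle =
  | "inactive" | "active" | "blocked" | "verifying"
  | "completed" | "failed" | "cancelled";

// Assumed transition table. The README only states
// inactive -> active -> blocked|verifying|completed|failed|cancelled.
const TRANSITIONS: Record<Lifecycle, Lifecycle[]> = {
  inactive: ["active"],
  active: ["blocked", "verifying", "completed", "failed", "cancelled"],
  blocked: ["active", "failed", "cancelled"],                // assumed
  verifying: ["active", "completed", "failed", "cancelled"], // assumed
  completed: [],
  failed: [],
  cancelled: [],
};

function canTransition(from: Lifecycle, to: Lifecycle): boolean {
  return TRANSITIONS[from].indexOf(to) !== -1;
}

// Terminal workflows no longer own the workspace.
function isTerminal(state: Lifecycle): boolean {
  return state === "completed" || state === "failed" || state === "cancelled";
}
```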

Workflow-specific phases live under the shared lifecycle. Artifacts are stored under .kapi/workflows/<short-workflow-name>/<slug>/, with the kapi- prefix removed from folder names; for example, /kapi-ralph uses .kapi/workflows/ralph/<slug>/. .kapi/active.json points at the current non-terminal workflow.

State behavior:

  • explicit slugs are sanitized before artifact paths are built;
  • active-workflow conflicts and mutation failures report as Kapi UI/tool feedback;
  • stale active pointers to terminal workflows, missing state files, corrupt JSON, or out-of-workspace state paths are cleared transparently;
  • corrupt recorded workflow state is skipped during history listing;
  • terminal workflow inspection does not reactivate the workflow or steal unrelated active ownership.
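
The path layout and slug sanitization can be sketched as follows. The directory shape comes from this section; the sanitizer rules (lowercase, non-alphanumeric runs collapsed to "-") are an assumption, since the README does not spell them out.

```typescript
// Assumed sanitizer; Kapi's actual rules may differ.
function sanitizeSlug(raw: string): string {
  const slug = raw
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse separators, dots, and slashes
    .replace(/^-+|-+$/g, "");    // trim leading and trailing dashes
  if (slug === "") throw new Error("slug is empty after sanitization");
  return slug;
}

// Folder names drop the leading slash and the "kapi-" prefix.
function workflowDir(command: string, rawSlug: string): string {
  const shortName = command.replace(/^\//, "").replace(/^kapi-/, "");
  return ".kapi/workflows/" + shortName + "/" + sanitizeSlug(rawSlug) + "/";
}
```

Sanitizing before the path is built means a slug such as ../etc cannot steer artifacts outside the workflow directory.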

Context behavior:

  • workflow start injects a project context snapshot into the prompt;
  • Kapi persists that snapshot to verify.md only for workflows that track verify.md;
  • brownfield discovery includes guidance headings, package scripts, verification scripts, dependencies, tsconfig, git state, source/test counts, architecture signals, sample source files, and common source/test directories;
  • generated/reference directories such as references/, .omx/, .kapi/, dist/, build/, and coverage/ are excluded from active project guidance;
  • greenfield work receives a checklist for purpose, users/stakeholders, constraints, success criteria, architecture direction, acceptance criteria, initial execution plan, and non-goals.

Governed artifacts:

  • verify.md is the resumable progress record, durable evidence log, and closeout verification record for governed workflows that track it;
  • handoff.json is the plan-to-execution ratchet for governed work;
  • governed handoff.json follows Chedex-style admission fields including execution_workflow, approved_at, acceptance_criteria, verification_targets, delegation_roster, source_artifacts, and structured approvals;
  • governed verify.md records closeout verification state plus a verifier review record when completion is satisfied;
  • /kapi-ralph uses the shared Kapi state.json with nested ralphState as its loop-state source of truth instead of a separate verify.md.
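
As a hedged illustration of the handoff.json admission fields listed above, the sketch below reports which fields a parsed handoff is missing. The field names are taken from the list, except approvals, which is an assumed key for the "structured approvals" entry; the helper names are hypothetical.

```typescript
// "approvals" is an assumed key name; the other names come from the README.
const REQUIRED_HANDOFF_FIELDS = [
  "execution_workflow",
  "approved_at",
  "acceptance_criteria",
  "verification_targets",
  "delegation_roster",
  "source_artifacts",
  "approvals",
];

// Returns the required fields that are absent or null in a parsed handoff.json.
function missingHandoffFields(handoff: Record<string, unknown>): string[] {
  return REQUIRED_HANDOFF_FIELDS.filter(
    (field) => handoff[field] === undefined || handoff[field] === null,
  );
}

// Execution is admitted only when nothing is missing.
function admitsExecution(handoff: Record<string, unknown>): boolean {
  return missingHandoffFields(handoff).length === 0;
}
```

A caller would typically pass the result of JSON.parse on the handoff.json contents and surface the missing field names as validation issues.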

Validation checks lifecycle schema consistency, required artifacts, state artifact freshness, evidence shape, worker consistency, and workflow-specific closeout rules. Completed governed workflows require verifier pass evidence plus required architect/verifier handoff approvals when handoff artifacts are tracked. Completed /kapi-ralph workflows require a pi-subagents reviewer subagent verdict plus RED and GREEN evidence.
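
The /kapi-ralph closeout rule can be sketched as a check over recorded evidence. The EvidenceRecord shape and the "red", "green", and "reviewer-verdict" kind labels are assumptions; only the three requirements come from the paragraph above.

```typescript
// Hypothetical evidence record shape.
interface EvidenceRecord {
  kind: string; // assumed labels: "red", "green", "reviewer-verdict"
  summary: string;
}

// Returns the blocking issues that would prevent completing a Ralph workflow.
function ralphCloseoutIssues(evidence: EvidenceRecord[]): string[] {
  const has = (kind: string) => evidence.some((record) => record.kind === kind);
  const issues: string[] = [];
  if (!has("red")) issues.push("missing RED evidence");
  if (!has("green")) issues.push("missing GREEN evidence");
  if (!has("reviewer-verdict")) issues.push("missing pi-subagents reviewer verdict");
  return issues;
}
```

Returning a list of issues rather than a boolean matches the evidence-gated style: the caller can report exactly what is still owed before completion.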

RunContract Harness Boundary

RunContract Harness is a generic, pre-workflow projection over existing Kapi run state. It does not create a second source of truth and it does not persist a durable contract.json; it reads the current WorkflowState plus WorkflowDefinition and exposes a compact run/contract/evidence/artifact/completion/quality view for supervisors.

Layer split:

  1. Kapi Core / RunContract Harness owns generic run state projection, evidence records, artifact references, contract preset shape, completion criteria, advisory quality hints, and generic steering hints.
  2. Runtime and presentation adapters decide how to show that generic status through CLI, tools, widgets, reports, or other read-only supervisor surfaces.
  3. External workflow adapters may interpret generic status for a specific operating environment, but those meanings stay outside core.

Core does not own repository review assumptions, GitHub issue or PR semantics, Discord lane semantics, kapi-agent policy, merge/tracker cleanup, or Kade/Ragna authority rules. Those are adapter interpretations layered on top of the generic run contract when needed.

The current GitHub workflow adapter is read-only and additive. It maps a projected RunContract plus existing worker registry issue/PR inspection into supervisor hints for linked issue context, PR state, kapi-agent review freshness, dev merge readiness, and post-merge tracker reconciliation. It must not mutate GitHub issues, merge PRs, close trackers, or write external workflow state from core; those actions remain explicit supervisor/runtime operations outside the generic RunContract model.

Candidate vocabulary is deliberately small and additive: ContractPreset, EvidenceExpectation, CompletionCriteria, and ScoringHint. RunContract must not start as a PolicyModule or plugin runtime. Ilchul remains product/documentation branding only; reusable code, API, tool, and serialized identifiers should use semantic names such as run, contract, preset, harness, evidence, score, and steer.

Naming and storage compatibility decisions are governed by the Ilchul naming and compatibility policy. In short: do not perform broad kapi -> ilchul replacements; keep current kapi commands, serialized identifiers, and .kapi storage compatible until a scoped migration issue explicitly changes them; use .ilchul only as the forward storage direction for reviewed migration work.

Implementation rhythm for the RunContract track is behavior-preserving: document the boundary first, add the generic projection second, add evidence/completion primitives third, add advisory quality hints fourth, render compact supervisor status fifth, and only then map optional external workflow adapter semantics. Each slice should keep existing workflow APIs, WorkflowState, WorkflowDefinition, artifacts, validation gates, and CLI output backward-compatible except for intentional additive fields or sections.

RunContract scoring, preset, and governance changes should use the docs/runcontract-harness-evaluator.md checklist to separate real harness quality from visible metric optimization. The checklist is advisory and does not add completion authority, runtime gates, kapi-agent policy, or score hard-blocking behavior.

Thin Harness Standard

Kapi is evaluated as a thin harness, not just a feature surface. When no workflow is active, Kapi should stay transparent: no hidden workflow activation, no workflow artifacts, no workers, no tool blocking, and no heavy UI ownership.

Thinness checks used during review:

  • default transparency for ordinary Pi turns;
  • explicit workflow activation only;
  • artifact-on-demand rather than universal logging;
  • proportional enforcement by workflow depth;
  • ownership restraint over only Kapi workflow state and Kapi-created workers;
  • Pi-native minimalism instead of external orchestration weight.

Pi-native Affordances

Kapi uses Pi extension surfaces as thin safety rails rather than a separate orchestration runtime:

  • resources_discover exposes Kapi's local skills/ and prompts/ resources.
  • before_agent_start injects hidden active-workflow context only while a workflow is active and preserves loaded Pi skill/prompt intent from systemPromptOptions.
  • tool_call blocks mutation tools in read-only review workflows and enforces RED evidence before mutation during /kapi-ralph red phase.
  • agent_end sends a follow-up message when active-workflow validation has blocking issues.
  • session_before_switch, session_before_fork, session_before_tree, session_before_compact, and session_shutdown keep workflow ownership visible across Pi session lifecycle operations without taking over ordinary turns; governed workflows get soft warnings on switch and shutdown.
  • Session labels mark workflow start/resume checkpoints for /tree, and @kapi: autocomplete suggests workflow references.

Architecture

  • src/domain: pure workflow definitions, lifecycle transitions, artifact naming, validation, and worker strategy. No Pi, filesystem, tmux, git, or process dependencies.
  • src/application: Kapi use cases, state artifact builders, and ports. Deep Interview readiness logic stays protocol-independent here so inline and child-RPC adapters share one authority core.
  • src/adapters: filesystem store, project context discovery, local tmux/git worktree worker substrate, worker closeout, and optional child-process adapters such as the Deep Interview readiness RPC reviewer.
  • src/presentation: Pi command/tool registration, parsers, messages, status UI, hooks, and hidden active-workflow context injection.

Layout

  • README.md — human-facing overview and operating model.
  • GOAL.md — completeness objective and P0-P5 gates.
  • docs/chedex-completeness.md — Chedex comparison boundary and intentional Pi-native differences.
  • docs/runcontract-harness-evaluator.md — evaluator and anti-Goodhart checklist for RunContract scoring, presets, and harness-governance changes.
  • docs/ralph-live-qa.md — operator live QA checklist for proving /kapi-ralph start, planning, approval, build, evidence, closeout, and resume behavior in a real Pi/Kapi runtime.
  • skills/kapi-workflow/SKILL.md — active-workflow behavior reminders for agents.
  • prompts/ — Kapi prompt resources exposed to Pi.
  • src/domain — pure workflow definitions, lifecycle transitions, artifact naming, validation, and worker strategy. No Pi, filesystem, tmux, git, or process dependencies.
  • src/application — Kapi use cases, state artifact builders, and ports.
  • src/adapters — filesystem store, project context discovery, local tmux/git worktree worker substrate, and worker closeout.
  • src/presentation — Pi command/tool registration, parsers, messages, status UI, hooks, and hidden active-workflow context injection.
  • test/ — node:test coverage for workflow contracts, state, hooks, presentation, workers, quality checks, and validation.
  • scripts/ — quality reporting and verification helpers.
  • references/chedex/ — local Chedex reference snapshot; not active Kapi runtime guidance.

How To Evolve Kapi

  1. Keep ordinary Pi behavior transparent by default.
  2. Add or refine workflow contracts in src/domain/workflows.ts.
  3. Keep lifecycle and validation rules in src/domain pure.
  4. Add use-case behavior through src/application ports before touching Pi presentation code.
  5. Expose human commands in src/presentation/commands.ts and agent tools in src/presentation/tools.ts.
  6. Keep prompt/skill guidance aligned with skills/kapi-workflow/SKILL.md.
  7. Update tests and run verification before review.

Presentation code should stay split by Pi-facing responsibility:

  • src/presentation/commands.ts for human slash command registration;
  • src/presentation/tools.ts for agent tool registration;
  • src/presentation/hooks.ts for Pi lifecycle/tool hooks;
  • src/presentation/parsers.ts for human command parsing;
  • src/presentation/schemas.ts for tool parameter schemas;
  • src/presentation/messages.ts for formatting;
  • src/presentation/pi-extension.ts as the composition root.

Refactors should preserve command/tool behavior, artifact formats, state compatibility, and thin-default semantics. They should reduce coupling and duplication without introducing a heavy runtime manager or command framework.

Chedex Completeness Boundary

Kapi tracks Chedex-inspired completeness without copying Chedex's Codex-specific install/runtime machinery. See docs/chedex-completeness.md for the comparison boundary, intentional differences, and P0-P5 completeness gates used by autoresearch.

Chedex-like concepts that Kapi keeps:

  • explicit workflows;
  • durable context/spec/plan/progress/handoff/verify artifacts where useful;
  • governed closeout with verifier evidence;
  • architect/verifier approval provenance for governed handoffs;
  • stop/switch visibility for broad governed work.

Chedex concepts that Kapi intentionally does not copy:

  • Codex global install/mirror machinery;
  • external tmux team runtime ownership;
  • global HUD/mailbox/linked-mode overlays;
  • Kapi-managed subagent orchestration.

Guarded Workflow Policy

Guarded workflows enforce only correctness-critical boundaries:

  • /kapi-ralph planning and review phases block direct edits plus unsafe shell mutations while allowing read-only inspection commands.
  • /kapi-ralph RED phase allows test-file mutation needed to create the failing test, but blocks production or unknown mutations until RED evidence is recorded.
  • after critic, architect, and human approval, /kapi-ralph may perform the single scoped build iteration selected from the approved plan; live QA can use a tracked docs fixture for that bounded mutation.
  • governed closeout remains evidence-gated instead of treating narrative claims as proof.
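
The phase gating above can be approximated with a small predicate. The phase labels and the test-file heuristic (a test/ directory or a .test.* suffix) are assumptions for illustration; Kapi's tool_call hook may classify mutations differently.

```typescript
// Assumed phase labels, derived from the wording of the policy above.
type RalphPhase = "planning" | "review" | "red" | "build";

// Assumed heuristic for what counts as a test file.
function isTestFile(path: string): boolean {
  return /(^|\/)test\//.test(path) || /\.test\.[cm]?[jt]s$/.test(path);
}

// path is undefined when the mutation target is unknown (e.g. a shell command).
function allowMutation(
  phase: RalphPhase,
  path: string | undefined,
  redEvidenceRecorded: boolean,
): boolean {
  // Planning and review phases are read-only.
  if (phase === "planning" || phase === "review") return false;
  if (phase === "red") {
    // Once RED evidence exists, the gate opens.
    if (redEvidenceRecorded) return true;
    // Before that, only test files may change; unknown targets are blocked.
    return path !== undefined && isTestFile(path);
  }
  // build: the single approved, scoped iteration.
  return true;
}
```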

Verification

npm run verify

npm run verify runs the standard local gate: npm test, npm run check, and npm run quality:budgets.

Completeness autoresearch uses the stricter reproducible scorer:

bash autoresearch.sh

Autoresearch verification also tracks maintainability quality metrics as secondary monitors:

  • Code coverage (code_coverage_pct) — coverage percentage when a coverage summary is available.
  • Cyclomatic complexity (max_cyclomatic_complexity) — proxy for highest function-level control-flow complexity.
  • Duplicated code (duplicated_code_pct) — proxy for repeated three-line source blocks across the codebase, so common single-line TypeScript syntax does not dominate the budget.
  • Code smells (code_smells) — proxy count for long files, long functions, and oversized parameter lists.
  • Dependency and coupling (coupling_max_imports, module_dependency_edges, facade_dependency_files) — proxies for the largest import or re-export fanout in a single source file, total module edges, and multi-edge facade count, so facade/barrel modules cannot hide true dependency breadth. The architecture score gives small headroom credit only when total edges and facade-file counts stay under the documented Kapi thresholds, with additional edge-count credit below stricter tiers at 170, 160, 158, 157, 155, 149, 148, 147, 145, 144, 142, 140, 139, 138, 135, 133, and 129 edges and at every count from 128 down to 100 edges, plus stricter facade-file credit at every count from 47 down to 26 files.
  • Budget warning count (budget_warn_count), budget pass count (budget_pass_count), and budget not-configured count (budget_not_configured_count) — summary counts for optional maintainability budget status.
  • Semantic Autoresearch consistency (semantic_consistency_score, bridge_term_misuse_count, root_autoresearch_dependency_count, autoresearch_artifact_mismatch_count, source_of_truth_conflict_count) — diagnostic checks that Kapi Autoresearch is described and implemented as an embedded durable engine, not an ambiguous bridge to root-level autoresearch.* files such as autoresearch.md, autoresearch.sh, autoresearch.checks.sh, autoresearch.jsonl, autoresearch.ideas.md, or autoresearch.config.json.
  • pi-autoresearch reference coverage (pi_autoresearch_reference_score, expected_pi_autoresearch_role_coverage, pi_autoresearch_metric_parsing_role, pi_autoresearch_resume_reconstruction_role) — diagnostic checks that Kapi maps the reference loop roles into durable-mode artifacts and behavior: contract, benchmark, checks, ledger, ideas, keep/discard/crash/checks_failed, resume/reconstruction, and metric parsing.
  • Runtime Autoresearch start diagnostics (runtime_autoresearch_probe_executed, runtime_autoresearch_start_pass, runtime_autoresearch_start_contract_pass) — executable probes from bash autoresearch.sh that start /kapi-autoresearch in a temporary workspace and verify the Kapi-owned durable artifact contract is actually created on disk.
  • Cross-mode runtime readiness (runtime_deep_interview_start_contract_pass, runtime_ralph_start_contract_pass, runtime_integrate_start_contract_pass, mode_runtime_probe_coverage) — temp-workspace probes that prove non-Autoresearch durable modes start cleanly and create their declared artifacts.
  • Event/snapshot semantics (event_log_jsonl_parse_pass, snapshot_json_parse_pass, state_json_parse_pass) — runtime probes that parse created events.jsonl, snapshot.json, and state.json instead of accepting filename presence.
  • Human command-surface diagnostics (command_surface_probe_executed, exact_command_surface_pass, extra_human_command_count, missing_mode_subcommand_count, mode_subcommand_behavior_pass) — source-of-truth command inventory plus static subcommand checks for the durable mode command surface and required status|resume|approve mode subcommands.
  • Readiness/blocker diagnostics (kapi_readiness_score, ship_blocker_count, runtime_blocker_count, semantic_blocker_count) — separate ship-readiness indicators that aggregate runtime, event/snapshot, command-surface, semantic ownership, artifact, and source-of-truth blockers without hiding them behind the maintainability-inflated architecture score.
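
To make the "repeated three-line source blocks" proxy concrete, here is one possible definition: the percentage of three-line windows whose trimmed text also occurs elsewhere. This definition, including the choice to skip blank lines, is an assumption; the real metric lives in scripts/code-quality-report.mjs and may be computed differently.

```typescript
// Assumed definition of a three-line duplication proxy.
// Each entry in files is the list of source lines for one file.
function duplicatedCodePct(files: string[][]): number {
  const counts: Record<string, number> = {};
  const windows: string[] = [];
  for (const lines of files) {
    // Ignore blank lines so whitespace alone does not create matches.
    const trimmed = lines.map((line) => line.trim()).filter((line) => line !== "");
    for (let i = 0; i + 3 <= trimmed.length; i++) {
      const key = trimmed.slice(i, i + 3).join("\n");
      windows.push(key);
      counts[key] = (counts[key] || 0) + 1;
    }
  }
  if (windows.length === 0) return 0;
  const duplicated = windows.filter((key) => counts[key] > 1).length;
  return (duplicated / windows.length) * 100;
}
```

Using three-line windows instead of single lines is what keeps common one-line syntax such as closing braces or import statements from dominating the score.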

Use node scripts/code-quality-report.mjs --help to inspect quality-report options. Convenience scripts are available as npm run quality:json, npm run quality:markdown, npm run quality:budgets, and npm run quality:strict. Use node scripts/code-quality-report.mjs --json when CI or another tool needs the same metrics in machine-readable form. Use node scripts/code-quality-report.mjs --markdown for a human-readable table in issue comments or release notes. Add --budgets to include pass/warn budget status against Kapi's lightweight maintainability targets; add --fail-on-warn or run npm run quality:strict when a CI job should fail on budget warnings. Budget thresholds can be tuned with KAPI_QUALITY_COVERAGE_MIN, KAPI_QUALITY_COMPLEXITY_MAX, KAPI_QUALITY_DUPLICATION_MAX, KAPI_QUALITY_SMELLS_MAX, KAPI_QUALITY_COUPLING_MAX, KAPI_QUALITY_MODULE_EDGES_MAX, and KAPI_QUALITY_FACADE_FILES_MAX.
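
A rough sketch of how environment overrides could feed budget evaluation. The environment variable and metric names come from this README; the fallback numbers, the Budget shape, and the reading of "not-configured" as "metric unavailable" are placeholders, not Kapi's actual targets or semantics.

```typescript
interface Budget {
  env: string;
  metric: string;
  direction: "min" | "max";
  fallback: number; // placeholder default, not Kapi's real threshold
}

const BUDGETS: Budget[] = [
  { env: "KAPI_QUALITY_COVERAGE_MIN", metric: "code_coverage_pct", direction: "min", fallback: 80 },
  { env: "KAPI_QUALITY_COMPLEXITY_MAX", metric: "max_cyclomatic_complexity", direction: "max", fallback: 15 },
  { env: "KAPI_QUALITY_DUPLICATION_MAX", metric: "duplicated_code_pct", direction: "max", fallback: 5 },
];

type BudgetStatus = "pass" | "warn" | "not-configured";

function evaluateBudgets(
  metrics: Record<string, number | undefined>,
  env: Record<string, string | undefined>,
): Record<string, BudgetStatus> {
  const result: Record<string, BudgetStatus> = {};
  for (const budget of BUDGETS) {
    const value = metrics[budget.metric];
    if (value === undefined) {
      result[budget.metric] = "not-configured"; // assumed meaning: metric unavailable
      continue;
    }
    // Use the override only when it parses as a finite number.
    const raw = env[budget.env];
    const parsed = raw === undefined || raw.trim() === "" ? NaN : Number(raw);
    const limit = isFinite(parsed) ? parsed : budget.fallback;
    const ok = budget.direction === "min" ? value >= limit : value <= limit;
    result[budget.metric] = ok ? "pass" : "warn";
  }
  return result;
}
```

In a Node script the second argument would usually be process.env; a strict mode like --fail-on-warn could then exit non-zero when any status is "warn".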

Current Gaps

  • kapi_write_artifacts prevalidates artifact names and paths before writing, but it is not a full filesystem transaction if a mid-write I/O failure occurs.
  • Kapi quality metrics are lightweight proxy checks, not full coverage or complexity analysis.
  • docs/chedex-completeness.md defines the Chedex comparison boundary, but not every Chedex feature is a Kapi goal.
  • Removed subagent-specific Kapi surfaces should be called out in release notes for users who previously invoked /kapi-subagent or kapi_plan_subagent_strategy.
