Skip to content

Central-package /implement skill as agent-neutral (Claude Code + Codex) #791

@Brad-Edwards

Description

@Brad-Edwards

Summary

The /implement skill is currently duplicated across four repos (Ground-Control, shifter, pulsar, aptl). ~80% of the content is byte-identical; the rest is repo-specific path conventions, UID example strings, and one behavioral fork (pulsar's plan-approval mode). Edits to the workflow don't propagate; bugs get fixed N times.

Beyond the dedup, the skill is currently only invokable from Claude Code. The same workflow needs to be drivable from Codex as well — same content, same MCP tool calls, same review pipeline — so the team isn't locked into a single agent runtime.

This issue captures the interim work to centrally package the workflow as an agent-neutral artifact, parameterized by per-repo .ground-control.yaml configuration, until GC-O009 lands the durable Temporal-workflow rewrite.

Requirement UIDs

  • GC-O007 — Gated Agentic Development Loop (the workflow contract this skill embodies)
  • GC-O009 — Workflow Orchestration via Temporal (the durable end state this work feeds into)

ADR Impact

  • ADR-027 added — Central packaging of the implement workflow + dual-driver (Claude Code + Codex) compatibility + codex-as-reviewer-of-record invariant. Documents the .ground-control.yaml schema additions (docs, example_paths, plan, preflight, cross_cutting_concerns, requirements.uid_examples), the agent-neutral content layout, and the constraint that all review steps continue to route through gc_codex_* MCP tools regardless of driver.

ADR-021 (Gated Agentic Development Loop) is not superseded — that ADR governs the workflow shape; ADR-027 governs how it's packaged and which agents can drive it.

What changed (proposed)

.ground-control.yaml schema extension

docs:
  adr_dir: architecture/adrs/
  architecture_overview: docs/architecture/ARCHITECTURE.md
  coding_standards: docs/CODING_STANDARDS.md
  workflow_reference: docs/DEVELOPMENT_WORKFLOW.md
  knowledge_base: docs/knowledge/
example_paths:
  source: backend/src/main/java/com/keplerops/groundcontrol/
  test:   backend/src/test/java/com/keplerops/groundcontrol/
requirements:
  uid_examples: ["GC-X001", "OBS-042"]
plan:
  approval_gate: user-mode      # default; uses EnterPlanMode (Claude) or equivalent in Codex
  # approval_gate: issue-comment  # pulsar's variant
preflight:
  required_before_planning: false
cross_cutting_concerns:
  description: |
    Logger: SLF4J via @Slf4j; structured logging via logstash-encoder
    Validation: Bean Validation + @Validated; Zod on the frontend
    Errors: ErrorResponse envelope via GlobalExceptionHandler
    Tests: @WebMvcTest, @DataJpaTest, ArchUnit

gc_get_repo_ground_control_context extension

Return the new blocks alongside existing fields. Backwards compatible — repos without the new keys get sensible defaults (matching today's hardcoded values).

Skill content layout

  • Single canonical workflow file lives in one place per host (~/.claude/skills/implement/SKILL.md for the interim local approach; promoted to a packaged distribution if the Claude Code plugin format stabilizes — separately scoped, not blocking this work).
  • Per-repo .claude/skills/implement/ becomes optional override only if the repo legitimately needs to tweak something the schema doesn't capture.
  • Codex-side: equivalent pointer via Codex's prompt/AGENTS conventions so both runtimes load the same canonical content.
  • Skill prose renders against cfg.docs.*, cfg.example_paths.*, cfg.cross_cutting_concerns.description instead of hardcoding paths.

Behavioral knobs as config

  • plan.approval_gate=user-mode (default) → skill uses EnterPlanMode (Claude) or Codex equivalent and waits for approval.
  • plan.approval_gate=issue-comment (pulsar) → skill posts plan as gh issue comment and proceeds directly to TDD.
  • preflight.required_before_planning=true (pulsar) → skill must complete codex preflight before producing the plan.

Codex-as-reviewer-of-record (invariant)

Regardless of which agent runtime drives the workflow, every review step continues to route through gc_codex_* MCP tools (gc_codex_review, gc_codex_verify_finding, gc_codex_architecture_preflight). Reviews don't switch to whichever agent runtime is driving — codex remains the reviewer of record. This must hold through ADR-027's central packaging AND through GC-O009's eventual Temporal migration.

Acceptance criteria

  • .ground-control.yaml schema extended with docs, example_paths, plan, preflight, cross_cutting_concerns, requirements.uid_examples blocks; backend deserializer updated; existing repos continue to load with sensible defaults for missing blocks.
  • gc_get_repo_ground_control_context MCP tool returns the new blocks; backwards compatible.
  • Canonical implement skill lives in one location (initially ~/.claude/skills/); per-repo .claude/skills/implement/ either deleted (when no override needed) or reduced to a minimal override.
  • Skill prose renders repo-specific values from config rather than hardcoding paths.
  • plan.approval_gate and preflight.required_before_planning config knobs work end-to-end (verified by toggling pulsar's config and seeing the appropriate behavior).
  • Workflow is invokable from BOTH Claude Code AND Codex against the same canonical file.
  • Every review step still routes through gc_codex_* MCP tools — verified by running a full /implement cycle from Codex and confirming codex (not the driver agent) produces the review.
  • All four current repos (Ground-Control, shifter, pulsar, aptl) migrated to the new model — their per-repo SKILL.md is either gone or a minimal override.
  • ADR-027 written and ACTIVE.

Ground Control Checks

  • make policy passes
  • gc_evaluate_quality_gates passes or is unchanged by this repo-only change
  • gc_run_sweep reviewed or intentionally deferred with reason

This is a tooling/workflow change. Sweep is unaffected by it; deferred with reason.

  • Hard cap codex review cycles at 2. SKILL.md currently has a soft cap at 3 (Step 12); change it to a hard 2 with no escape. Past cycle 2 codex hits diminishing returns and starts repeating out-of-scope findings. The cap is in addition to Step 6.5's pre-push iteration which uses the same 2-cycle limit.

Test plan

  • Unit tests for the new MCP-tool fields (gc_get_repo_ground_control_context returns the new blocks; defaults applied when blocks are missing).
  • Run /implement on a small dummy issue from Claude Code against Ground-Control; verify it works end-to-end with the new config.
  • Run /implement on the same dummy issue from Codex; verify it works end-to-end and codex review still fires through gc_codex_review.
  • Toggle plan.approval_gate and verify both branches (user-mode triggers EnterPlanMode/Codex equivalent; issue-comment posts to the issue).
  • Migrate one of {shifter, pulsar, aptl} as a pilot and verify nothing regresses; then migrate the others.

Traceability

  • IMPLEMENTS: GC-O007 → canonical SKILL.md + the implement workflow in ~/.claude/skills/ (or chosen central location)
  • IMPLEMENTS: GC-O009 → contributes the configuration model that the future Temporal workflow will consume; preserves codex-as-reviewer-of-record invariant
  • IMPLEMENTS: GC-O007 → .ground-control.yaml schema extensions (backend GroundControlYamlConfig-equivalent)
  • IMPLEMENTS: GC-O009 → architecture/adrs/027-implement-workflow-packaging.md
  • TESTS: GC-O007 → unit tests for gc_get_repo_ground_control_context schema returns
  • TESTS: GC-O007 → end-to-end smoke run of /implement from Claude Code AND Codex against the dummy issue

Related

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions