diff --git a/CHANGELOG.md b/CHANGELOG.md
index 8f99f4a..1baf594 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -4,6 +4,28 @@ All notable changes to forge will be documented in this file. Format follows [Ke
 
 ## [Unreleased]
 
+## [1.5.0] — 2026-05-28
+
+### Added
+
+- **P1: Model tier routing** — `LLMPipe` now uses `tierrouter` for complexity-driven model selection. `nano`/`micro` → T0 (cheap), `standard` → T1 (balanced), `complex` → T2 (powerful). Controlled via `SetComplexityTier`.
+- **P1: 3-layer knowledge base** (`internal/knowledge/layered.go`) — KB loader merges embedded global KB, user-scoped `~/.forge/kb/`, and project-scoped `.forge/kb/` in priority order. Project KB entries override global ones.
+- **P2: OpenTelemetry checkpoint spans** — `telemetry.StartPipelineSpan` and `telemetry.EmitCheckpointSpan` emit per-checkpoint OTEL spans to `.forge/telemetry.jsonl`. Wired into the `forge ship` pipeline.
+- **P2: Prometheus metrics export** — `forge metrics` command reads `.forge/token-ledger.jsonl` and outputs `forge_tokens_total` and `forge_cost_usd_total` counter series in Prometheus text format, labelled by model.
+- **P2: forge undo integration** — `writeShipTrashManifest` records every ship run to `.forge/trash//manifest.json`; `snapOnFail` takes a best-effort snapshot on checkpoint failure. Both wired into `forge ship`.
+- **`forge companion`** — zero-setup AI pairing command with four subcommands: `install` (writes expert persona files for VS Code Copilot / Claude / Cursor / Windsurf), `update` (force-refreshes skill files), `status` (shows per-platform install state), `guide` (prints the vibe-coding quick-start cheatsheet with the top-10 daily prompts).
+- **`forge init` companion hint** — after scaffolding completes, `forge init` prints a `forge companion install` hint so new projects are prompted to set up AI pairing immediately.
+- **Command groups in `forge --help`** — 50 commands now displayed in 7 named groups: Core, Build & Ship, Analysis & Quality, Operations, AI & Automation, Config & Tools, Advanced.
+- **Vibe-coding workflows in skill templates** — all platform templates (Copilot, Claude, Cursor, Windsurf) now include a "Daily Vibe-Coding Patterns" section with feature/bugfix/security/standup/review workflow examples.
+- **Error code ranges** — `FORGE-6600..6649` reserved for `cli/metrics`; `FORGE-6650..6699` reserved for `cli/companion`.
+
+### Fixed
+
+- `companion.go` raw-string backtick syntax error — guide string switched to concatenation to allow embedded backtick characters.
+- Duplicate `FORGE-6800` registration — `cmdmetrics` moved from 6800 to 6600; `cmdcompanion` moved from 6900 (unregistered) to 6650.
+- `captureProvider` in `llmpipe_tier_test.go` — added missing `Capabilities()` method to satisfy `llmprovider.Provider` interface.
+- `TestInvoke_FallbackToDirectWhenRouterNil` — test now uses `newLLMPipeWithProvider` (initialises `rewriter`) and then nils the router, avoiding the nil-pointer dereference.
+
 ## [1.3.0] — 2026-05-26
 
 ### Added
diff --git a/docs/ERROR_CODES.md b/docs/ERROR_CODES.md
index a80a256..54784b2 100644
--- a/docs/ERROR_CODES.md
+++ b/docs/ERROR_CODES.md
@@ -103,4 +103,8 @@ You can also look up any code with `forge ask error `.
 | `FORGE-6550` | bugfix failed | [forge.dev/errors/6550](https://forge.dev/errors/6550) |
 | `FORGE-6551` | no bug source specified — provide --bug, --finding, or --test | [forge.dev/errors/6551](https://forge.dev/errors/6551) |
 | `FORGE-6552` | finding not found in review results | [forge.dev/errors/6552](https://forge.dev/errors/6552) |
+| `FORGE-6553` | LLM call failed — forge bugfix cannot proceed without a working LLM | [forge.dev/errors/6553](https://forge.dev/errors/6553) |
+| `FORGE-6554` | no LLM provider configured | [forge.dev/errors/6554](https://forge.dev/errors/6554) |
 | `FORGE-6700` | skill operation failed (forge skill install/list/remove) | [forge.dev/errors/6700](https://forge.dev/errors/6700) |
+| `FORGE-6600` | metrics export failed | [forge.dev/errors/6600](https://forge.dev/errors/6600) |
+| `FORGE-6650` | companion setup failed | [forge.dev/errors/6650](https://forge.dev/errors/6650) |
diff --git a/docs/RELEASE_GUIDE.md b/docs/RELEASE_GUIDE.md
index 369d143..483343d 100644
--- a/docs/RELEASE_GUIDE.md
+++ b/docs/RELEASE_GUIDE.md
@@ -49,6 +49,29 @@ git commit -m "chore: prepare release vX.Y.Z"
 git push
 ```
 
+### 1.1 — Version scope gate (required before tagging)
+
+Do not bump to the next minor/major by default. Pick the **smallest valid** semver scope:
+
+| If the release contains | Bump |
+|---|---|
+| Bug fixes, hardening, docs, internal refactors, non-breaking behavior corrections | `PATCH` |
+| Backward-compatible additive capability (new verb/flag/output field that does not break existing usage) | `MINOR` |
+| Intentional breaking contract change | `MAJOR` |
+
+Required evidence before tagging:
+
+1. Write a short "Version Scope Decision" in the release PR/body:
+   - `Chosen bump`: PATCH/MINOR/MAJOR
+   - `Why not smaller`: one line
+   - `Breaking impact`: none / described
+2. Confirm the relevant feature spec(s) have a top `Status Summary` block with:
+   - `Lifecycle`
+   - `Version Scope` (+ rationale)
+   - `Last Updated`
+   - `Checkpoint Progress`
+3. If uncertain between `PATCH` and `MINOR`, ship `PATCH` first (or cut `-rc.N`).
+
 ### 2 — Tag and push
 
 ```sh
@@ -175,6 +198,8 @@ Forge follows **Semantic Versioning** (`MAJOR.MINOR.PATCH`):
 | Bug fix, security patch | PATCH |
 | Pre-release candidate | `x.y.z-rc.N` |
 
+Default release behavior: **prefer PATCH unless a larger scope is clearly justified**.
+
 All six npm packages (`@forge/cli` + 5 platform packages) are always published at the same version and kept in lockstep.
 
 ---
diff --git a/forge.exe~ b/forge.exe~
new file mode 100644
index 0000000..ef9a760
Binary files /dev/null and b/forge.exe~ differ
diff --git a/internal/cli/cmdbugfix/bugfix.go b/internal/cli/cmdbugfix/bugfix.go
index 49b02b1..e189833 100644
--- a/internal/cli/cmdbugfix/bugfix.go
+++ b/internal/cli/cmdbugfix/bugfix.go
@@ -53,6 +53,8 @@ var (
 	ErrBugfixFailed      = errcode.Register(errcode.Code(6550), "bugfix failed")
 	ErrNoSourceSpecified = errcode.Register(errcode.Code(6551), "no bug source specified — provide --bug, --finding, or --test")
 	ErrFindingNotFound   = errcode.Register(errcode.Code(6552), "finding not found in review results")
+	ErrLLMCallFailed     = errcode.Register(errcode.Code(6553), "LLM call failed — forge bugfix cannot proceed without a working LLM")
+	ErrNoLLMProvider     = errcode.Register(errcode.Code(6554), "no LLM provider configured")
 )
 
 // Source constants identify where the bug came from.
@@ -83,8 +85,16 @@ type RunContext struct {
 	Files    []string // source file paths to include in the LLM context
 	ExtraCtx string   // free-form additional context supplied by the caller
 	Model    string   // LLM model override (e.g. "gpt-4o", "claude-sonnet-4-5")
+	// testProvider may be set by tests in this package to inject a fake LLM
+	// provider without real network calls. It is always nil for callers outside
+	// package cmdbugfix (unexported field — zero value is nil).
+	testProvider llmprovider.Provider
 }
 
+// testProviderHook may be set by package-internal tests to inject a fake
+// provider into the cobra command handler. It must remain nil in production.
+var testProviderHook llmprovider.Provider
+
 // BugfixResult is the full output of one bugfix run.
 type BugfixResult struct {
 	Root string `json:"root"`
@@ -122,7 +132,7 @@ func init() {
 			"with --apply: writes patch to source files + regression test; appends to .forge/audit.log",
 		},
 		GatesTouched: []string{"§4 bugfix", "DEV-M1-48"},
-		ErrorCodes:   []errcode.Code{ErrBugfixFailed, ErrNoSourceSpecified, ErrFindingNotFound},
+		ErrorCodes:   []errcode.Code{ErrBugfixFailed, ErrNoSourceSpecified, ErrFindingNotFound, ErrLLMCallFailed, ErrNoLLMProvider},
 	})
 }
 
@@ -193,10 +203,11 @@ func New() *cobra.Command {
 			}
 
 			rc := RunContext{
-				Stack:    stack,
-				Files:    files,
-				ExtraCtx: extraCtx,
-				Model:    model,
+				Stack:        stack,
+				Files:        files,
+				ExtraCtx:     extraCtx,
+				Model:        model,
+				testProvider: testProviderHook,
 			}
 			result, err := Run(root, mode, bug, finding, test, rc)
 			if err != nil {
@@ -258,15 +269,18 @@ func Run(root, mode, bug, finding, test string, rcs ...RunContext) (BugfixResult
 	ctx := loadContext(root)
 
 	// Try LLM-backed diagnosis.
-	provider, err := llmprovider.Detect()
-	if err != nil {
-		// No LLM — return a structured placeholder so callers get a valid result.
-		result.RootCause = "LLM provider not configured. Options: set ANTHROPIC_API_KEY, OPENAI_API_KEY, GEMINI_API_KEY, or GH_TOKEN (GitHub Copilot — if you have a Copilot subscription, run: gh auth login)."
-		result.Summary = fmt.Sprintf("no LLM provider detected — cannot diagnose %s %q", result.Source, result.Input)
-		return result, nil
+	p := rc.testProvider
+	if p == nil {
+		var detectErr error
+		p, detectErr = llmprovider.Detect()
+		if detectErr != nil {
+			return result, errcode.New(ErrNoLLMProvider,
+				"set ANTHROPIC_API_KEY, OPENAI_API_KEY, GEMINI_API_KEY, or GH_TOKEN (GitHub Copilot — run: gh auth login)",
+				detectErr)
+		}
 	}
 
-	return llmBugfix(result, provider, ctx, rc)
+	return llmBugfix(result, p, ctx, rc)
 }
 
 // llmBugfix calls the LLM to diagnose root cause, produce a patch, and write
@@ -346,9 +360,9 @@ Respond with a JSON object:
 
 	resp, err := provider.Complete(context.Background(), req)
 	if err != nil {
-		result.RootCause = fmt.Sprintf("LLM call failed: %v", err)
-		result.Summary = "LLM call failed — cannot produce fix"
-		return result, nil
+		return result, errcode.New(ErrLLMCallFailed,
+			fmt.Sprintf("provider error: %v — check model name, quota, and credentials", err),
+			err)
 	}
 
 	// Parse LLM JSON response, stripping any markdown fences.
@@ -372,10 +386,9 @@ Respond with a JSON object:
 	}
 
 	if err := json.Unmarshal([]byte(cleaned), &parsed); err != nil {
-		// LLM returned non-JSON — surface the raw content as root cause.
-		result.RootCause = cleaned
-		result.Summary = "LLM response could not be parsed as JSON; raw output shown above"
-		return result, nil
+		return result, errcode.New(ErrLLMCallFailed,
+			"LLM response is not valid JSON — cannot produce a structured fix (raw response logged below)",
+			fmt.Errorf("parse error: %w; raw: %.300s", err, cleaned))
 	}
 
 	result.RootCause = parsed.RootCause
diff --git a/internal/cli/cmdbugfix/bugfix_test.go b/internal/cli/cmdbugfix/bugfix_test.go
index b8a5914..28117ce 100644
--- a/internal/cli/cmdbugfix/bugfix_test.go
+++ b/internal/cli/cmdbugfix/bugfix_test.go
@@ -16,13 +16,51 @@ package cmdbugfix
 
 import (
 	"bytes"
+	"context"
 	"encoding/json"
+	"fmt"
 	"os"
 	"path/filepath"
 	"strings"
 	"testing"
+
+	"github.com/teragrid/forge/internal/llmprovider"
 )
 
+// ── Test helpers ──────────────────────────────────────────────────────────────
+
+// mockProvider is a fake LLM provider for unit tests. It captures the request
+// so tests can assert on what was sent to the LLM.
+type mockProvider struct {
+	capturedReq *llmprovider.Request
+	resp        *llmprovider.Response
+	err         error
+}
+
+func (m *mockProvider) Name() string { return "mock" }
+
+func (m *mockProvider) Complete(_ context.Context, req *llmprovider.Request) (*llmprovider.Response, error) {
+	m.capturedReq = req
+	if m.err != nil {
+		return nil, m.err
+	}
+	return m.resp, nil
+}
+
+func (m *mockProvider) Capabilities() llmprovider.Capabilities {
+	return llmprovider.Capabilities{}
+}
+
+// goodLLMResponse returns a well-formed LLM JSON response for use in tests.
+func goodLLMResponse() *llmprovider.Response {
+	return &llmprovider.Response{Content: `{
+		"root_cause": "nil pointer dereference in the payment handler",
+		"fix": {"file": "payment.go", "patch": "- if p == nil {\n+ if p == nil { return }", "confidence": "high"},
+		"regression_test": {"file": "payment_test.go", "code": "func TestPaymentNilGuard(t *testing.T) {}"},
+		"summary": "added nil guard to payment handler"
+	}`}
+}
+
 // ── Source constants ──────────────────────────────────────────────────────────
 
 func TestSourceConstants(t *testing.T) {
@@ -40,20 +78,28 @@ func TestSourceConstants(t *testing.T) {
 
 // ── Run: no-LLM fallback ──────────────────────────────────────────────────────
 
-// TestRun_Bug_NoLLM verifies that --bug succeeds without an LLM provider,
-// returning a structured result with a helpful placeholder message.
-func TestRun_Bug_NoLLM(t *testing.T) {
+// TestRun_NoLLMProvider_ReturnsError verifies that Run returns a non-nil error
+// when no LLM provider is configured. The partial result still has Source,
+// Input, Mode, and Root set (they are resolved before the LLM call).
+func TestRun_NoLLMProvider_ReturnsError(t *testing.T) {
 	t.Parallel()
 	root := t.TempDir()
+	// No testProvider set — llmprovider.Detect() will fail in test environments
+	// that have no real API keys configured (the normal case).
 	result, err := Run(root, "dry-run", "login fails when email has a +", "", "")
-	if err != nil {
-		t.Fatalf("Run returned unexpected error: %v", err)
+	if err == nil {
+		// A real LLM is present in this environment; verify the happy path.
+		if result.Source != SourceBug {
+			t.Errorf("Source: got %q want %q", result.Source, SourceBug)
+		}
+		return
 	}
+	// No LLM — partial result still has Source, Input, Mode and Root set.
 	if result.Source != SourceBug {
 		t.Errorf("Source: got %q want %q", result.Source, SourceBug)
 	}
-	if result.Input == "" {
-		t.Error("Input must not be empty")
+	if result.Input != "login fails when email has a +" {
+		t.Errorf("Input: got %q", result.Input)
 	}
 	if result.Mode != "dry-run" {
 		t.Errorf("Mode: got %q want %q", result.Mode, "dry-run")
@@ -61,22 +107,16 @@ func TestRun_Bug_NoLLM(t *testing.T) {
 	if result.Root != root {
 		t.Errorf("Root: got %q want %q", result.Root, root)
 	}
-	// Without an LLM, we expect the RootCause to explain the missing provider.
-	if !strings.Contains(result.RootCause, "LLM provider not configured") &&
-		!strings.Contains(result.RootCause, "LLM call failed") &&
-		result.RootCause == "" {
-		t.Errorf("unexpected empty RootCause without LLM")
-	}
 }
 
-// TestRun_Test_NoLLM verifies that --test also succeeds without an LLM provider.
-func TestRun_Test_NoLLM(t *testing.T) {
+// TestRun_Test_NoLLMProvider verifies that Source and Input are populated even
+// when Run fails with ErrNoLLMProvider (the partial result is always set before
+// the LLM call).
+func TestRun_Test_NoLLMProvider(t *testing.T) {
 	t.Parallel()
 	root := t.TempDir()
-	result, err := Run(root, "dry-run", "", "", "TestLoginHandler_PlusSign")
-	if err != nil {
-		t.Fatalf("Run returned unexpected error: %v", err)
-	}
+	result, _ := Run(root, "dry-run", "", "", "TestLoginHandler_PlusSign")
+	// Source and Input are resolved before the LLM call.
 	if result.Source != SourceTest {
 		t.Errorf("Source: got %q want %q", result.Source, SourceTest)
 	}
@@ -116,7 +156,9 @@ func TestRun_Finding_Found(t *testing.T) {
 		t.Fatal(err)
 	}
 
-	result, err := Run(root, "dry-run", "", "SEC-001", "")
+	result, err := Run(root, "dry-run", "", "SEC-001", "", RunContext{
+		testProvider: &mockProvider{resp: goodLLMResponse()},
+	})
 	if err != nil {
 		t.Fatalf("Run returned unexpected error for existing finding: %v", err)
 	}
@@ -149,15 +191,21 @@ func TestNew_NoFlags_ReturnsError(t *testing.T) {
 
 // TestNew_JSONOutput_Bug verifies that --json emits valid JSON with the correct
 // Source and Mode fields when --bug is provided.
+// Uses testProviderHook to inject a mock LLM so no real API calls are made.
 func TestNew_JSONOutput_Bug(t *testing.T) {
-	t.Parallel()
+	// Not t.Parallel() — uses package-level testProviderHook.
+	testProviderHook = &mockProvider{resp: goodLLMResponse()}
+	defer func() { testProviderHook = nil }()
+
 	root := t.TempDir()
 	cmd := New()
 	var out bytes.Buffer
 	cmd.SetOut(&out)
 	cmd.SetErr(&out)
 	cmd.SetArgs([]string{"--root", root, "--bug", "button click does nothing", "--json"})
-	_ = cmd.Execute()
+	if err := cmd.Execute(); err != nil {
+		t.Fatalf("Execute: %v", err)
+	}
 
 	var result BugfixResult
 	if err := json.NewDecoder(&out).Decode(&result); err != nil {
@@ -175,15 +223,21 @@ func TestNew_JSONOutput_Bug(t *testing.T) {
 }
 
 // TestNew_JSONOutput_Test verifies JSON output when --test is provided.
+// Uses testProviderHook to inject a mock LLM so no real API calls are made.
 func TestNew_JSONOutput_Test(t *testing.T) {
-	t.Parallel()
+	// Not t.Parallel() — uses package-level testProviderHook.
+	testProviderHook = &mockProvider{resp: goodLLMResponse()}
+	defer func() { testProviderHook = nil }()
+
 	root := t.TempDir()
 	cmd := New()
 	var out bytes.Buffer
 	cmd.SetOut(&out)
 	cmd.SetErr(&out)
 	cmd.SetArgs([]string{"--root", root, "--test", "TestCheckout_NilCart", "--json"})
-	_ = cmd.Execute()
+	if err := cmd.Execute(); err != nil {
+		t.Fatalf("Execute: %v", err)
+	}
 
 	var result BugfixResult
 	if err := json.NewDecoder(&out).Decode(&result); err != nil {
@@ -202,11 +256,13 @@ func TestRun_Idempotent(t *testing.T) {
 	t.Parallel()
 	root := t.TempDir()
 	bugDesc := "panic on nil pointer in payment handler"
-	r1, err := Run(root, "dry-run", bugDesc, "", "")
+	mock := &mockProvider{resp: goodLLMResponse()}
+	rc := RunContext{testProvider: mock}
+	r1, err := Run(root, "dry-run", bugDesc, "", "", rc)
 	if err != nil {
 		t.Fatalf("first run: %v", err)
 	}
-	r2, err := Run(root, "dry-run", bugDesc, "", "")
+	r2, err := Run(root, "dry-run", bugDesc, "", "", rc)
 	if err != nil {
 		t.Fatalf("second run: %v", err)
 	}
@@ -221,16 +277,21 @@ func TestRun_Idempotent(t *testing.T) {
 
 // ── Apply mode: dry-run guard ─────────────────────────────────────────────────
 
-// TestNew_ApplyFlag_Sets mode verifies that --apply changes the mode to "apply".
+// TestNew_ApplyFlag_SetsMode verifies that --apply changes the mode to "apply".
 func TestNew_ApplyFlag_SetsMode(t *testing.T) {
-	t.Parallel()
+	// Not t.Parallel() — uses package-level testProviderHook.
+	testProviderHook = &mockProvider{resp: goodLLMResponse()}
+	defer func() { testProviderHook = nil }()
+
 	root := t.TempDir()
 	cmd := New()
 	var out bytes.Buffer
 	cmd.SetOut(&out)
 	cmd.SetErr(&out)
 	cmd.SetArgs([]string{"--root", root, "--bug", "search returns wrong results", "--apply", "--json"})
-	_ = cmd.Execute()
+	if err := cmd.Execute(); err != nil {
+		t.Fatalf("Execute with --apply: %v", err)
+	}
 
 	var result BugfixResult
 	if err := json.NewDecoder(&out).Decode(&result); err != nil {
@@ -261,28 +322,28 @@ func TestRun_Bug_NoBugTextNoInput(t *testing.T) {
 // ── RunContext / new real-world flags ─────────────────────────────────────────
 
 // TestRun_BackwardCompat verifies the 5-arg Run signature still compiles and
-// works (no RunContext supplied). This is the regression guard for the variadic
-// change.
+// that Source is set in the partial result even when Run fails with
+// ErrNoLLMProvider (regression guard for the variadic-arg change).
 func TestRun_BackwardCompat(t *testing.T) {
 	t.Parallel()
 	root := t.TempDir()
-	// Must compile and not panic; no LLM → placeholder result.
-	result, err := Run(root, "dry-run", "crash on startup", "", "")
-	if err != nil {
-		t.Fatalf("5-arg Run: %v", err)
-	}
+	// 5-arg form: RunContext is optional. Source is resolved before the LLM
+	// call, so result.Source is populated regardless of whether err is nil.
+	result, _ := Run(root, "dry-run", "crash on startup", "", "")
 	if result.Source != SourceBug {
 		t.Errorf("Source: got %q want %q", result.Source, SourceBug)
 	}
 }
 
 // TestRun_RunContext_Stack passes a RunContext with a stack trace and verifies
-// the call does not error out (no LLM in test env).
+// the stack trace is included in the LLM prompt.
 func TestRun_RunContext_Stack(t *testing.T) {
 	t.Parallel()
 	root := t.TempDir()
+	mock := &mockProvider{resp: goodLLMResponse()}
 	rc := RunContext{
-		Stack: "goroutine 1 [running]:\nmain.main()\n\t/main.go:42 +0x80",
+		Stack:        "goroutine 1 [running]:\nmain.main()\n\t/main.go:42 +0x80",
+		testProvider: mock,
 	}
 	result, err := Run(root, "dry-run", "nil pointer dereference", "", "", rc)
 	if err != nil {
@@ -291,6 +352,13 @@ func TestRun_RunContext_Stack(t *testing.T) {
 	if result.Input != "nil pointer dereference" {
 		t.Errorf("Input: got %q", result.Input)
 	}
+	// The stack trace must be included in the LLM prompt.
+	if mock.capturedReq == nil {
+		t.Fatal("LLM was not called")
+	}
+	if !strings.Contains(mock.capturedReq.UserPrompt, "goroutine 1 [running]") {
+		t.Errorf("stack trace not found in LLM prompt:\n%s", mock.capturedReq.UserPrompt)
+	}
 }
 
 // TestRun_RunContext_Files includes a source file that doesn't exist — Run must
@@ -298,8 +366,10 @@ func TestRun_RunContext_Stack(t *testing.T) {
 func TestRun_RunContext_Files(t *testing.T) {
 	t.Parallel()
 	root := t.TempDir()
+	mock := &mockProvider{resp: goodLLMResponse()}
 	rc := RunContext{
-		Files: []string{filepath.Join(root, "nonexistent.go")},
+		Files:        []string{filepath.Join(root, "nonexistent.go")},
+		testProvider: mock,
 	}
 	result, err := Run(root, "dry-run", "timeout on checkout", "", "", rc)
 	if err != nil {
@@ -308,13 +378,22 @@ func TestRun_RunContext_Files(t *testing.T) {
 	if result.Source != SourceBug {
 		t.Errorf("Source: got %q want %q", result.Source, SourceBug)
 	}
+	// The (missing) file must still be mentioned in the LLM prompt (graceful fallback).
+	if mock.capturedReq != nil && !strings.Contains(mock.capturedReq.UserPrompt, "nonexistent.go") {
+		t.Errorf("missing file not mentioned in LLM prompt:\n%s", mock.capturedReq.UserPrompt)
+	}
 }
 
-// TestRun_RunContext_ExtraCtx passes free-form context; verifies no error.
+// TestRun_RunContext_ExtraCtx passes free-form context; verifies no error and
+// that the extra context is included in the LLM prompt.
 func TestRun_RunContext_ExtraCtx(t *testing.T) {
 	t.Parallel()
 	root := t.TempDir()
-	rc := RunContext{ExtraCtx: "This only happens on the EU cluster, not US."}
+	mock := &mockProvider{resp: goodLLMResponse()}
+	rc := RunContext{
+		ExtraCtx:     "This only happens on the EU cluster, not US.",
+		testProvider: mock,
+	}
 	result, err := Run(root, "dry-run", "payment gateway timeout", "", "", rc)
 	if err != nil {
 		t.Fatalf("Run with extra context: %v", err)
@@ -322,11 +401,20 @@ func TestRun_RunContext_ExtraCtx(t *testing.T) {
 	if result.Mode != "dry-run" {
 		t.Errorf("Mode: got %q want dry-run", result.Mode)
 	}
+	// Extra context must appear in the LLM prompt.
+	if mock.capturedReq != nil && !strings.Contains(mock.capturedReq.UserPrompt, "EU cluster") {
+		t.Errorf("extra context not found in LLM prompt:\n%s", mock.capturedReq.UserPrompt)
+	}
 }
 
-// TestNew_StackFlag verifies the --stack CLI flag is accepted.
+// TestNew_StackFlag verifies the --stack CLI flag is accepted and forwarded
+// to the LLM request.
 func TestNew_StackFlag(t *testing.T) {
-	t.Parallel()
+	// Not t.Parallel() — uses package-level testProviderHook.
+	mock := &mockProvider{resp: goodLLMResponse()}
+	testProviderHook = mock
+	defer func() { testProviderHook = nil }()
+
 	root := t.TempDir()
 	cmd := New()
 	var out bytes.Buffer
@@ -348,11 +436,23 @@ func TestNew_StackFlag(t *testing.T) {
 	if result.Source != SourceBug {
 		t.Errorf("Source: got %q", result.Source)
 	}
+	// Verify the stack trace was forwarded to the LLM.
+	if mock.capturedReq == nil {
+		t.Fatal("LLM was not called")
+	}
+	if !strings.Contains(mock.capturedReq.UserPrompt, "goroutine 1 [running]") {
+		t.Errorf("--stack value not found in LLM prompt")
+	}
 }
 
-// TestNew_FileFlag verifies the --file CLI flag is accepted (repeatable).
+// TestNew_FileFlag verifies the --file CLI flag is accepted and the file
+// content is included in the LLM prompt.
 func TestNew_FileFlag(t *testing.T) {
-	t.Parallel()
+	// Not t.Parallel() — uses package-level testProviderHook.
+	mock := &mockProvider{resp: goodLLMResponse()}
+	testProviderHook = mock
+	defer func() { testProviderHook = nil }()
+
 	root := t.TempDir()
 	// Write a real file to include.
 	srcFile := filepath.Join(root, "auth.go")
@@ -372,11 +472,23 @@ func TestNew_FileFlag(t *testing.T) {
 	if err := cmd.Execute(); err != nil {
 		t.Fatalf("Execute with --file: %v", err)
 	}
+	// Verify the file content was forwarded to the LLM.
+	if mock.capturedReq == nil {
+		t.Fatal("LLM was not called")
+	}
+	if !strings.Contains(mock.capturedReq.UserPrompt, "auth.go") {
+		t.Errorf("--file path not found in LLM prompt")
+	}
 }
 
-// TestNew_ModelFlag verifies the --model CLI flag is accepted.
+// TestNew_ModelFlag verifies the --model CLI flag is accepted and forwarded
+// to the LLM request.
 func TestNew_ModelFlag(t *testing.T) {
-	t.Parallel()
+	// Not t.Parallel() — uses package-level testProviderHook.
+	mock := &mockProvider{resp: goodLLMResponse()}
+	testProviderHook = mock
+	defer func() { testProviderHook = nil }()
+
 	root := t.TempDir()
 	cmd := New()
 	var out bytes.Buffer
@@ -391,6 +503,78 @@ func TestNew_ModelFlag(t *testing.T) {
 	if err := cmd.Execute(); err != nil {
 		t.Fatalf("Execute with --model: %v", err)
 	}
+	// Verify the model override was forwarded to the LLM request.
+	if mock.capturedReq == nil {
+		t.Fatal("LLM was not called")
+	}
+	if mock.capturedReq.Model != "gpt-4o" {
+		t.Errorf("Model in LLM request: got %q want %q", mock.capturedReq.Model, "gpt-4o")
+	}
+}
+
+// ── Regression tests for issue #18 ───────────────────────────────────────────
+
+// TestRun_LLMCallFailed_ExitsNonZero is the primary regression guard for
+// issue #18. Before the fix, a failing provider.Complete returned (result, nil),
+// giving forge bugfix a silent exit-0 and bypassing all quality gates.
+// The fix must return a non-nil error on ANY LLM failure.
+func TestRun_LLMCallFailed_ExitsNonZero(t *testing.T) {
+	t.Parallel()
+	root := t.TempDir()
+	mock := &mockProvider{err: fmt.Errorf("429 Too Many Requests — retry after 30s")}
+	_, err := Run(root, "dry-run", "login panic", "", "", RunContext{testProvider: mock})
+	if err == nil {
+		t.Fatal("LLM call failure must return a non-nil error (regression guard for issue #18): " +
+			"before the fix, Run returned (result, nil) silently, causing exit-0 and bypassed quality gates")
+	}
+}
+
+// TestRun_LLMResponse_ValidJSON_HappyPath verifies the happy path: when the LLM
+// returns valid JSON, Run succeeds and the result contains RootCause, Fix, Summary.
+func TestRun_LLMResponse_ValidJSON_HappyPath(t *testing.T) {
+	t.Parallel()
+	root := t.TempDir()
+	mock := &mockProvider{resp: goodLLMResponse()}
+	result, err := Run(root, "dry-run", "nil pointer dereference in payment handler", "", "",
+		RunContext{testProvider: mock})
+	if err != nil {
+		t.Fatalf("Run: %v", err)
+	}
+	if result.RootCause == "" {
+		t.Error("RootCause must be populated on success")
+	}
+	if result.Fix == nil {
+		t.Error("Fix must be populated on success")
+	}
+	if result.Summary == "" {
+		t.Error("Summary must be populated on success")
+	}
+}
+
+// TestRun_LLMResponse_InvalidJSON_ExitsNonZero verifies that a non-JSON LLM
+// response returns a non-nil error (boundary: corrupted or non-JSON LLM output).
+func TestRun_LLMResponse_InvalidJSON_ExitsNonZero(t *testing.T) {
+	t.Parallel()
+	root := t.TempDir()
+	mock := &mockProvider{resp: &llmprovider.Response{
+		Content: "I cannot help with that request.",
+	}}
+	_, err := Run(root, "dry-run", "login panic", "", "", RunContext{testProvider: mock})
+	if err == nil {
+		t.Fatal("non-JSON LLM response must return a non-nil error")
+	}
+}
+
+// TestRun_LLMCallFailed_FalsePositiveGuard is a false-positive guard: when the
+// LLM succeeds with valid JSON, Run must NOT return an error.
+func TestRun_LLMCallFailed_FalsePositiveGuard(t *testing.T) {
+	t.Parallel()
+	root := t.TempDir()
+	mock := &mockProvider{resp: goodLLMResponse()}
+	_, err := Run(root, "dry-run", "button does nothing", "", "", RunContext{testProvider: mock})
+	if err != nil {
+		t.Errorf("valid LLM response must not produce an error, got: %v", err)
+	}
 }
 
 // ── applyPatch ────────────────────────────────────────────────────────────────
diff --git a/internal/cli/cmdcompanion/companion.go b/internal/cli/cmdcompanion/companion.go
new file mode 100644
index 0000000..4c3d983
--- /dev/null
+++ b/internal/cli/cmdcompanion/companion.go
@@ -0,0 +1,383 @@
+// Copyright 2024 The Forge Authors
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Package cmdcompanion implements `forge companion` — zero-setup AI pairing for
+// the Forge framework.
+//
+// forge companion is the recommended first command to run after `forge init`.
+// It detects which AI tools are installed (VS Code Copilot, Claude, Cursor,
+// Windsurf), generates the matching skill / agent files, and prints a
+// copy-paste vibe-coding quick-start guide.
+//
+// Subcommands:
+//
+//	forge companion — detect + install missing skill files (interactive)
+//	forge companion install — alias for forge skill install --for all
+//	forge companion update — regenerate all skill files (picks up KB changes)
+//	forge companion status — show which platforms are configured
+//	forge companion guide — print the vibe-coding quick-start guide
+package cmdcompanion
+
+import (
+	"fmt"
+	"os"
+	"path/filepath"
+
+	"github.com/spf13/cobra"
+
+	"github.com/teragrid/forge/internal/cli/cmdskill"
+	"github.com/teragrid/forge/internal/errcode"
+	"github.com/teragrid/forge/internal/verbmeta"
+)
+
+// Reserved error codes (range 6650..6699).
+var (
+	ErrCompanionFailed = errcode.Register(errcode.Code(6650), "companion setup failed")
+)
+
+func init() {
+	verbmeta.Register(verbmeta.Manifest{
+		Verb: "companion",
+		Summary: "Zero-setup AI pairing: install the Forge expert persona in every AI tool " +
+			"(VS Code Copilot, Claude, Cursor, Windsurf) and get a vibe-coding quick-start.",
+		Inputs: []string{
+			"[install] — install skill files for all detected AI platforms",
+			"[update] — regenerate all skill files (refresh KB + templates)",
+			"[status] — show which platforms already have skill files",
+			"[guide] — print the vibe-coding quick-start cheatsheet",
+			"--root — project root (default: current directory)",
+			"--for — target only: copilot, claude, cursor, windsurf, all",
+			"--force / -f — overwrite existing skill files",
+			"--yes / -y — non-interactive; skip confirmation prompt",
+		},
+		Outputs: []string{
+			"stdout: installation summary and quick-start guide",
+			".github/chatmodes/forge-expert.chatmode.md (Copilot)",
+			".github/instructions/forge-expert.instructions.md (Copilot)",
+			".github/prompts/forge-*.prompt.md (Copilot)",
+			"CLAUDE.md + .claude/commands/ (Claude)",
+			".cursor/rules/forge-expert.mdc (Cursor)",
+			".windsurfrules (Windsurf)",
+		},
+		SideEffects: []string{
+			"Creates AI configuration files under .github/, CLAUDE.md, .cursor/, .windsurfrules",
+		},
+		GatesTouched: []string{},
+		ErrorCodes:   []errcode.Code{ErrCompanionFailed},
+	})
+}
+
+// New returns the top-level `forge companion` cobra command.
+func New() *cobra.Command {
+	cmd := &cobra.Command{
+		Use:   "companion",
+		Short: "Zero-setup AI pairing: configure every AI tool as a Forge expert.",
+		Long: `forge companion wires the Forge expert persona into every AI coding tool
+you have installed — VS Code Copilot, Claude, Cursor, and Windsurf.
+
+After running forge companion, open your AI chat tool and describe what you
+want to build in plain English. The assistant will guide you through the full
+Forge ship workflow: spec → scaffold → implement → test → review → ship.
+
+  forge companion — detect + install (all platforms, safe defaults)
+  forge companion update — regenerate to pick up latest KB changes
+  forge companion status — show installed platforms
+  forge companion guide — print the vibe-coding quick-start cheatsheet`,
+		RunE: func(cmd *cobra.Command, _ []string) error {
+			// Default action: install all platforms (safe — skips existing files).
+ return runCompanionInstall(cmd, "", "all", false, false) + }, + } + cmd.AddCommand( + newInstallSubCmd(), + newUpdateSubCmd(), + newStatusSubCmd(), + newGuideSubCmd(), + ) + return cmd +} + +// ── install ─────────────────────────────────────────────────────────────────── + +func newInstallSubCmd() *cobra.Command { + var ( + root string + platform string + force bool + yes bool + ) + cmd := &cobra.Command{ + Use: "install", + Short: "Install the Forge expert persona for all AI tools.", + Example: ` forge companion install + forge companion install --for copilot + forge companion install --for claude --force`, + RunE: func(cmd *cobra.Command, _ []string) error { + return runCompanionInstall(cmd, root, platform, force, yes) + }, + } + cmd.Flags().StringVarP(&root, "root", "r", "", "project root (default: cwd)") + cmd.Flags().StringVar(&platform, "for", "all", "target platform: copilot, claude, cursor, windsurf, all") + cmd.Flags().BoolVarP(&force, "force", "f", false, "overwrite existing skill files") + cmd.Flags().BoolVarP(&yes, "yes", "y", false, "non-interactive (skip confirmation)") + return cmd +} + +// ── update ──────────────────────────────────────────────────────────────────── + +func newUpdateSubCmd() *cobra.Command { + var ( + root string + platform string + ) + cmd := &cobra.Command{ + Use: "update", + Short: "Regenerate skill files to pick up new KB entries and template changes.", + RunE: func(cmd *cobra.Command, _ []string) error { + return runCompanionInstall(cmd, root, platform, true /*force*/, true /*yes*/) + }, + } + cmd.Flags().StringVarP(&root, "root", "r", "", "project root (default: cwd)") + cmd.Flags().StringVar(&platform, "for", "all", "target platform: copilot, claude, cursor, windsurf, all") + return cmd +} + +// ── status ──────────────────────────────────────────────────────────────────── + +func newStatusSubCmd() *cobra.Command { + var root string + cmd := &cobra.Command{ + Use: "status", + Short: "Show which AI platforms have the Forge 
expert persona installed.", + RunE: func(cmd *cobra.Command, _ []string) error { + if root == "" { + var err error + root, err = os.Getwd() + if err != nil { + return errcode.New(ErrCompanionFailed, "cannot determine working directory", err) + } + } + printCompanionStatus(cmd, root) + return nil + }, + } + cmd.Flags().StringVarP(&root, "root", "r", "", "project root (default: cwd)") + return cmd +} + +// ── guide ───────────────────────────────────────────────────────────────────── + +func newGuideSubCmd() *cobra.Command { + cmd := &cobra.Command{ + Use: "guide", + Short: "Print the vibe-coding quick-start guide.", + RunE: func(cmd *cobra.Command, _ []string) error { + fmt.Fprint(cmd.OutOrStdout(), vibeCodeGuide()) + return nil + }, + } + return cmd +} + +// ── implementation ───────────────────────────────────────────────────────────── + +func runCompanionInstall(cmd *cobra.Command, root, platform string, force, _ bool) error { + if root == "" { + var err error + root, err = os.Getwd() + if err != nil { + return errcode.New(ErrCompanionFailed, "cannot determine working directory", err) + } + } + + fmt.Fprintln(cmd.OutOrStdout(), "") + fmt.Fprintln(cmd.OutOrStdout(), " Forge Companion — AI Pairing Setup") + fmt.Fprintln(cmd.OutOrStdout(), " ───────────────────────────────────") + fmt.Fprintf(cmd.OutOrStdout(), " Project root : %s\n", root) + fmt.Fprintf(cmd.OutOrStdout(), " Platform : %s\n", platform) + fmt.Fprintln(cmd.OutOrStdout(), "") + + res, err := cmdskill.RunInstall(root, "forge-expert", platform, force, false) + if err != nil { + return errcode.New(ErrCompanionFailed, "skill installation failed", err) + } + + for _, f := range res.Written { + fmt.Fprintf(cmd.OutOrStdout(), " ✓ %s\n", f.RelPath) + } + for _, f := range res.Skipped { + fmt.Fprintf(cmd.OutOrStdout(), " – %s (already installed)\n", f.RelPath) + } + + if len(res.Written) == 0 && len(res.Skipped) > 0 { + fmt.Fprintln(cmd.OutOrStdout(), "") + fmt.Fprintln(cmd.OutOrStdout(), " Already up to date. 
Run `forge companion update` to refresh.") + return nil + } + + fmt.Fprintln(cmd.OutOrStdout(), "") + fmt.Fprintln(cmd.OutOrStdout(), " ✨ Forge AI companion installed!") + fmt.Fprintln(cmd.OutOrStdout(), "") + fmt.Fprintln(cmd.OutOrStdout(), " How to use it:") + fmt.Fprintln(cmd.OutOrStdout(), " 1. Open VS Code Chat → switch to \"forge-expert\" mode") + fmt.Fprintln(cmd.OutOrStdout(), " (or Claude/Cursor/Windsurf — all configured)") + fmt.Fprintln(cmd.OutOrStdout(), " 2. Describe what you want to build in plain English") + fmt.Fprintln(cmd.OutOrStdout(), " 3. The assistant runs the full Forge workflow for you") + fmt.Fprintln(cmd.OutOrStdout(), "") + fmt.Fprintln(cmd.OutOrStdout(), " Run `forge companion guide` for vibe-coding examples.") + fmt.Fprintln(cmd.OutOrStdout(), "") + return nil +} + +// platformFiles lists the indicator file per platform, in fixed display order (a map would print in random order). +var platformFiles = []struct{ name, rel string }{ + {"VS Code Copilot", ".github/chatmodes/forge-expert.chatmode.md"}, + {"Claude", "CLAUDE.md"}, + {"Cursor", ".cursor/rules/forge-expert.mdc"}, + {"Windsurf", ".windsurfrules"}, +} + +func printCompanionStatus(cmd *cobra.Command, root string) { + fmt.Fprintln(cmd.OutOrStdout(), "") + fmt.Fprintln(cmd.OutOrStdout(), " Forge Companion — Platform Status") + fmt.Fprintln(cmd.OutOrStdout(), " ──────────────────────────────────") + for _, p := range platformFiles { + path := filepath.Join(root, p.rel) + if _, err := os.Stat(path); err == nil { + fmt.Fprintf(cmd.OutOrStdout(), " ✓ %-20s %s\n", p.name, p.rel) + } else { + fmt.Fprintf(cmd.OutOrStdout(), " – %-20s not installed\n", p.name) + } + } + fmt.Fprintln(cmd.OutOrStdout(), "") + fmt.Fprintln(cmd.OutOrStdout(), " Run `forge companion install` to configure missing platforms.") + fmt.Fprintln(cmd.OutOrStdout(), "") +} + +// vibeCodeGuide returns the vibe-coding quick-start cheatsheet. +func vibeCodeGuide() string { + // Cannot use a raw string literal here because the content contains backtick + // characters (e.g. 
`forge skill list`). We build the string via concatenation + // so the Go parser is not confused by embedded backticks. + bt := "`" // backtick character + return "\n" + + " ╔══════════════════════════════════════════════════════════════════════╗\n" + + " ║ Forge Vibe-Coding Quick-Start Guide ║\n" + + " ╚══════════════════════════════════════════════════════════════════════╝\n" + + "\n" + + " WHAT IS VIBE-CODING WITH FORGE?\n" + + " ─────────────────────────────────────────────────────────────────────\n" + + " You describe what you want in plain English.\n" + + " The Forge AI companion handles spec → scaffold → implement → test →\n" + + " security-check → commit — the complete production-quality workflow.\n" + + "\n" + + " ══════════════════════════════════════════════════════════════════════\n" + + " DAILY WORKFLOW PATTERNS\n" + + " ══════════════════════════════════════════════════════════════════════\n" + + "\n" + + " ┌─ Feature Workflow ─────────────────────────────────────────────────┐\n" + + " │ │\n" + + " │ You: \"Build a rate-limiter middleware for the API gateway. 
│\n" + + " │ It should allow 100 req/min per client IP, return 429 │\n" + + " │ with Retry-After header, and store counters in Redis.\" │\n" + + " │ │\n" + + " │ AI: Creates spec → scaffolds middleware → implements with │\n" + + " │ Redis sliding window → writes 9-point test suite → │\n" + + " │ runs security check → commits on feature branch │\n" + + " └─────────────────────────────────────────────────────────────────────┘\n" + + "\n" + + " ┌─ Bugfix Workflow ───────────────────────────────────────────────────┐\n" + + " │ │\n" + + " │ You: \"Fix this panic: goroutine 1 [running]: │\n" + + " │ runtime error: index out of range [3] with length 3 │\n" + + " │ cmdship/pipeline.go:147 +0x2a4\" │\n" + + " │ │\n" + + " │ AI: Reads file:147 → writes failing test → traces root cause → │\n" + + " │ applies minimal fix → verifies test passes → regression guard │\n" + + " └─────────────────────────────────────────────────────────────────────┘\n" + + "\n" + + " ┌─ Security Scan Workflow ────────────────────────────────────────────┐\n" + + " │ │\n" + + " │ You: \"Scan the last 3 commits for secrets, OWASP Top 10 │\n" + + " │ violations, and CVEs in the dependency tree.\" │\n" + + " │ │\n" + + " │ AI: Checks git diff → scans for hardcoded creds → audits deps → │\n" + + " │ reports: severity / file / line / remediation │\n" + + " └─────────────────────────────────────────────────────────────────────┘\n" + + "\n" + + " ┌─ Morning Standup Workflow ──────────────────────────────────────────┐\n" + + " │ │\n" + + " │ You: \"Summarise what was shipped yesterday, what's in-flight, │\n" + + " │ and any blockers from the current forge ship status.\" │\n" + + " │ │\n" + + " │ AI: Reads .forge/specs/ + git log + open PRs → formats standup │\n" + + " └─────────────────────────────────────────────────────────────────────┘\n" + + "\n" + + " ┌─ Code Review Workflow ──────────────────────────────────────────────┐\n" + + " │ │\n" + + " │ You: \"Review this PR. 
Focus on: correctness of the rate-limiter │\n" + + " │ algorithm, error handling, and test coverage gaps.\" │\n" + + " │ │\n" + + " │ AI: Reads diff → checks spec alignment → identifies gaps → │\n" + + " │ produces inline comments with severity + suggested fixes │\n" + + " └─────────────────────────────────────────────────────────────────────┘\n" + + "\n" + + " ══════════════════════════════════════════════════════════════════════\n" + + " TOP 10 DAILY COMMANDS (paste directly into your AI chat)\n" + + " ══════════════════════════════════════════════════════════════════════\n" + + "\n" + + " 1. \"Ship \"\n" + + " → full 6-stage pipeline: spec, scaffold, implement, test, scan, commit\n" + + "\n" + + " 2. \"forge bugfix: \"\n" + + " → reproduce → root cause → fix → verify → regression guard\n" + + "\n" + + " 3. \"forge scan for secrets and OWASP issues in the current branch\"\n" + + " → comprehensive security review before PR\n" + + "\n" + + " 4. \"forge review focusing on \"\n" + + " → AI code review with actionable inline suggestions\n" + + "\n" + + " 5. \"forge test using the 9-point framework\"\n" + + " → test design → happy/boundary/negative/race/authz/regression\n" + + "\n" + + " 6. \"Upgrade dependencies and fix breaking changes in \"\n" + + " → safe upgrade with change-log analysis and migration\n" + + "\n" + + " 7. \"Add a to this project following Forge conventions\"\n" + + " → type-safe scaffold + wired into existing patterns\n" + + "\n" + + " 8. \"Explain what forge does with an example\"\n" + + " → learn any Forge verb with real usage context\n" + + "\n" + + " 9. \"What's the forge error code for ? Pick the next available.\"\n" + + " → error-code assignment from docs/ERROR_CODES.md\n" + + "\n" + + " 10. 
\"Prepare a postmortem for the incident: \"\n" + + " → structured postmortem with root cause, timeline, action items\n" + + "\n" + + " ══════════════════════════════════════════════════════════════════════\n" + + " TIPS\n" + + " ══════════════════════════════════════════════════════════════════════\n" + + "\n" + + " • Be specific about constraints: \"Redis\", \"no CGO\", \"must be idempotent\"\n" + + " • Mention the target file when fixing bugs: use forge bugfix with the filename\n" + + " • Ask for the test design BEFORE the implementation: \"Design the tests first\"\n" + + " • Use \"dry-run\" when exploring: \"What would forge scan find here?\"\n" + + " • Keep the AI in Forge mode: switch to \"forge-expert\" chat mode in VS Code\n" + + "\n" + + " Run " + bt + "forge skill list" + bt + " to see all installed skill files.\n" + + " Run " + bt + "forge companion update" + bt + " to refresh after a forge upgrade.\n" + + "\n" +} diff --git a/internal/cli/cmdcompanion/companion_test.go b/internal/cli/cmdcompanion/companion_test.go new file mode 100644 index 0000000..394494d --- /dev/null +++ b/internal/cli/cmdcompanion/companion_test.go @@ -0,0 +1,257 @@ +// Copyright 2024 The Forge Authors +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +// companion_test.go — tests for `forge companion`. 
+ +package cmdcompanion + +import ( + "bytes" + "os" + "path/filepath" + "strings" + "testing" + + "github.com/spf13/cobra" +) + +// execCmd runs cmd with args and captures stdout+stderr. +func execCmd(t *testing.T, cmd *cobra.Command, args ...string) (string, error) { + t.Helper() + var buf bytes.Buffer + cmd.SetOut(&buf) + cmd.SetErr(&buf) + cmd.SetArgs(args) + err := cmd.Execute() + return buf.String(), err +} + +// TestNew_ReturnsCommand verifies the command is constructed correctly. +func TestNew_ReturnsCommand(t *testing.T) { + t.Parallel() + cmd := New() + if cmd == nil { + t.Fatal("New() returned nil") + } + if cmd.Use != "companion" { + t.Errorf("Use = %q, want %q", cmd.Use, "companion") + } +} + +// TestNew_HasExpectedSubcommands verifies install/update/status/guide exist. +func TestNew_HasExpectedSubcommands(t *testing.T) { + t.Parallel() + cmd := New() + want := map[string]bool{"install": false, "update": false, "status": false, "guide": false} + for _, sub := range cmd.Commands() { + want[sub.Name()] = true + } + for name, found := range want { + if !found { + t.Errorf("subcommand %q not registered", name) + } + } +} + +// TestGuide_PrintsVibeCodeContent verifies guide contains key vibe-coding headings. +func TestGuide_PrintsVibeCodeContent(t *testing.T) { + t.Parallel() + cmd := New() + out, err := execCmd(t, cmd, "guide") + if err != nil { + t.Fatalf("guide returned error: %v", err) + } + checks := []string{ + "Vibe-Coding", + "Feature Workflow", + "Bugfix Workflow", + "Morning Standup", + "forge ship", + "forge bugfix", + } + for _, want := range checks { + if !strings.Contains(out, want) { + t.Errorf("guide output missing %q", want) + } + } +} + +// TestStatus_ShowsPlatforms verifies status output mentions all 4 platforms. 
+func TestStatus_ShowsPlatforms(t *testing.T) { + t.Parallel() + root := t.TempDir() + cmd := New() + out, err := execCmd(t, cmd, "status", "--root", root) + if err != nil { + t.Fatalf("status returned error: %v", err) + } + platforms := []string{"VS Code Copilot", "Claude", "Cursor", "Windsurf"} + for _, p := range platforms { + if !strings.Contains(out, p) { + t.Errorf("status output missing platform %q", p) + } + } +} + +// TestStatus_MarksMissingPlatforms verifies '–' marker for uninstalled platforms. +func TestStatus_MarksMissingPlatforms(t *testing.T) { + t.Parallel() + root := t.TempDir() + cmd := New() + out, err := execCmd(t, cmd, "status", "--root", root) + if err != nil { + t.Fatalf("status returned error: %v", err) + } + if !strings.Contains(out, "not installed") { + t.Error("expected 'not installed' for unconfigured platforms") + } +} + +// TestStatus_MarksInstalledPlatform verifies '✓' marker when file exists. +func TestStatus_MarksInstalledPlatform(t *testing.T) { + t.Parallel() + root := t.TempDir() + // Simulate the Copilot chatmode file existing. + chatmodeDir := filepath.Join(root, ".github", "chatmodes") + if err := os.MkdirAll(chatmodeDir, 0o755); err != nil { + t.Fatal(err) + } + if err := os.WriteFile(filepath.Join(chatmodeDir, "forge-expert.chatmode.md"), []byte("# forge"), 0o600); err != nil { + t.Fatal(err) + } + + cmd := New() + out, err := execCmd(t, cmd, "status", "--root", root) + if err != nil { + t.Fatalf("status returned error: %v", err) + } + if !strings.Contains(out, "✓") { + t.Error("expected ✓ check for installed platform") + } +} + +// TestInstall_WritesFiles verifies install subcommand actually writes files. +func TestInstall_WritesFiles(t *testing.T) { + t.Parallel() + root := t.TempDir() + cmd := New() + _, err := execCmd(t, cmd, "install", "--root", root, "--for", "copilot") + if err != nil { + t.Fatalf("install returned error: %v", err) + } + // Copilot chatmode file must exist after install. 
+ chatmode := filepath.Join(root, ".github", "chatmodes", "forge-expert.chatmode.md") + if _, err := os.Stat(chatmode); os.IsNotExist(err) { + t.Errorf("chatmode file not created: %s", chatmode) + } +} + +// TestInstall_SkipsExistingWithoutForce verifies idempotency without --force. +func TestInstall_SkipsExistingWithoutForce(t *testing.T) { + t.Parallel() + root := t.TempDir() + // First install. + cmd1 := New() + if _, err := execCmd(t, cmd1, "install", "--root", root, "--for", "copilot"); err != nil { + t.Fatalf("first install: %v", err) + } + // Second install without --force → should skip. + cmd2 := New() + out, err := execCmd(t, cmd2, "install", "--root", root, "--for", "copilot") + if err != nil { + t.Fatalf("second install: %v", err) + } + if !strings.Contains(out, "up to date") { + t.Errorf("expected 'up to date' message on second install, got: %s", out) + } +} + +// TestInstall_ForceOverwrites verifies --force rewrites existing files. +func TestInstall_ForceOverwrites(t *testing.T) { + t.Parallel() + root := t.TempDir() + // Write a sentinel file. + chatmodeDir := filepath.Join(root, ".github", "chatmodes") + if err := os.MkdirAll(chatmodeDir, 0o755); err != nil { + t.Fatal(err) + } + path := filepath.Join(chatmodeDir, "forge-expert.chatmode.md") + if err := os.WriteFile(path, []byte("old-content"), 0o600); err != nil { + t.Fatal(err) + } + + cmd := New() + if _, err := execCmd(t, cmd, "install", "--root", root, "--for", "copilot", "--force"); err != nil { + t.Fatalf("install --force: %v", err) + } + data, err := os.ReadFile(path) + if err != nil { + t.Fatal(err) + } + if string(data) == "old-content" { + t.Error("--force should have overwritten the existing file") + } +} + +// TestUpdate_RegeneratesFiles verifies update overwrites without --force flag. 
+func TestUpdate_RegeneratesFiles(t *testing.T) { + t.Parallel() + root := t.TempDir() + chatmodeDir := filepath.Join(root, ".github", "chatmodes") + if err := os.MkdirAll(chatmodeDir, 0o755); err != nil { + t.Fatal(err) + } + path := filepath.Join(chatmodeDir, "forge-expert.chatmode.md") + if err := os.WriteFile(path, []byte("stale"), 0o600); err != nil { + t.Fatal(err) + } + + cmd := New() + if _, err := execCmd(t, cmd, "update", "--root", root, "--for", "copilot"); err != nil { + t.Fatalf("update: %v", err) + } + data, err := os.ReadFile(path) + if err != nil { + t.Fatal(err) + } + if string(data) == "stale" { + t.Error("update should have regenerated the stale skill file") + } +} + +// TestVibeCodeGuide_ContainsTopTenCommands verifies the 10 daily commands are listed. +func TestVibeCodeGuide_ContainsTopTenCommands(t *testing.T) { + t.Parallel() + guide := vibeCodeGuide() + commands := []string{"forge ship", "forge bugfix", "forge scan", "forge review", "forge test"} + for _, c := range commands { + if !strings.Contains(guide, c) { + t.Errorf("guide missing command %q", c) + } + } +} + +// TestVibeCodeGuide_FalsePositiveGuard verifies the guide does NOT contain +// internal/implementation details that would confuse end users. +func TestVibeCodeGuide_FalsePositiveGuard(t *testing.T) { + t.Parallel() + guide := vibeCodeGuide() + // The guide is user-facing, not developer docs — should not expose internal paths. + forbidden := []string{"internal/cli", "errcode.Register", "verbmeta.Register"} + for _, f := range forbidden { + if strings.Contains(guide, f) { + t.Errorf("guide should not contain internal detail %q", f) + } + } +} diff --git a/internal/cli/cmdinit/init.go b/internal/cli/cmdinit/init.go index dc51ce6..9aa7be1 100644 --- a/internal/cli/cmdinit/init.go +++ b/internal/cli/cmdinit/init.go @@ -197,6 +197,9 @@ func New(forgeVersion string) *cobra.Command { default: fmt.Fprintln(cmd.OutOrStdout(), " forge doctor") } + // Nudge: auto-pairing with AI tools. 
+ fmt.Fprintln(cmd.OutOrStdout(), "\n→ Pair AI with this project (VS Code, Claude, Cursor, Windsurf):") + fmt.Fprintln(cmd.OutOrStdout(), " forge companion") return nil }, } @@ -264,6 +267,9 @@ func runMinimal(cmd *cobra.Command, target, name, forgeVersion string, asJSON, f fmt.Fprintln(cmd.OutOrStdout(), " forge doctor # validate forge.config.yml") fmt.Fprintln(cmd.OutOrStdout(), " forge ship \"\" # start the 6-checkpoint pipeline") fmt.Fprintln(cmd.OutOrStdout(), " forge lint # check conventions") + // Nudge: auto-pairing with AI tools. + fmt.Fprintln(cmd.OutOrStdout(), "\n→ Pair AI with this project (VS Code, Claude, Cursor, Windsurf):") + fmt.Fprintln(cmd.OutOrStdout(), " forge companion") return nil } diff --git a/internal/cli/cmdmetrics/metrics.go b/internal/cli/cmdmetrics/metrics.go new file mode 100644 index 0000000..d41706c --- /dev/null +++ b/internal/cli/cmdmetrics/metrics.go @@ -0,0 +1,111 @@ +// Copyright 2024 The Forge Authors +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +// Package cmdmetrics implements `forge metrics` (P2: Prometheus token-ledger export). +// +// forge metrics reads .forge/token-ledger.jsonl and prints cumulative +// token and cost counters in Prometheus text format. The output can be +// piped to a Prometheus Pushgateway, for example: +// +// forge metrics | curl --data-binary @- \ +// http://pushgateway:9091/metrics/job/forge +// +// Error codes reserved in range 6600–6649. 
+package cmdmetrics + +import ( + "fmt" + "os" + "path/filepath" + + "github.com/spf13/cobra" + + "github.com/teragrid/forge/internal/errcode" + "github.com/teragrid/forge/internal/tokenledger" + "github.com/teragrid/forge/internal/verbmeta" +) + +// Reserved error codes (range 6600..6649). +var ( + ErrMetricsFailed = errcode.Register(errcode.Code(6600), "metrics export failed") +) + +func init() { + verbmeta.Register(verbmeta.Manifest{ + Verb: "metrics", + Summary: "Export token-ledger data as Prometheus text format (P2, ADR-026).", + Inputs: []string{ + "--root — project root (default: .)", + "--ledger — override ledger file path", + }, + Outputs: []string{ + "stdout: Prometheus text format with forge_tokens_total and forge_cost_usd_total", + }, + SideEffects: []string{"none (read-only)"}, + GatesTouched: []string{}, + ErrorCodes: []errcode.Code{ErrMetricsFailed}, + }) +} + +// New returns the cobra command for `forge metrics`. +func New() *cobra.Command { + var ( + root string + ledgerPath string + ) + + cmd := &cobra.Command{ + Use: "metrics", + Short: "Export token-ledger data as Prometheus text format", + Long: `forge metrics reads the token ledger (.forge/token-ledger.jsonl) and +prints cumulative token and cost counters in Prometheus text format. 
+ +Metrics exported: + forge_tokens_total{model,operation,type} — cumulative token counter + forge_cost_usd_total{model,operation} — cumulative cost in USD + +Pipe the output to a Prometheus Push Gateway: + forge metrics | curl --data-binary @- http://pushgateway:9091/metrics/job/forge`, + SilenceUsage: true, + RunE: func(cmd *cobra.Command, _ []string) error { + if root == "" { + cwd, err := os.Getwd() + if err != nil { + return errcode.New(ErrMetricsFailed, "cannot determine working directory", err) + } + root = cwd + } + if ledgerPath == "" { + ledgerPath = filepath.Join(root, tokenledger.DefaultPath) + } + + ledger := tokenledger.New(ledgerPath) + output, err := ledger.ExportPrometheus() + if err != nil { + return errcode.New(ErrMetricsFailed, + fmt.Sprintf("failed to export metrics from %s", ledgerPath), err) + } + if output == "" { + fmt.Fprintln(cmd.OutOrStdout(), "# No token-ledger entries found.") + return nil + } + fmt.Fprint(cmd.OutOrStdout(), output) + return nil + }, + } + + cmd.Flags().StringVar(&root, "root", "", "project root directory (default: current directory)") + cmd.Flags().StringVar(&ledgerPath, "ledger", "", "override ledger file path") + return cmd +} diff --git a/internal/cli/cmdship/.forge/trash/ship--1779975130486/manifest.json b/internal/cli/cmdship/.forge/trash/ship--1779975130486/manifest.json new file mode 100644 index 0000000..92a5a57 --- /dev/null +++ b/internal/cli/cmdship/.forge/trash/ship--1779975130486/manifest.json @@ -0,0 +1 @@ +{"run_id":"ship--1779975130486","ts":"2026-05-28T13:32:10Z","verb":"ship","files":[{"orig_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\specs","save_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\.snapshots","mode":493}]} \ No newline at end of file diff --git a/internal/cli/cmdship/.forge/trash/ship--1779975137984/manifest.json b/internal/cli/cmdship/.forge/trash/ship--1779975137984/manifest.json new file mode 100644 index 0000000..4e1ea79 --- /dev/null +++ 
b/internal/cli/cmdship/.forge/trash/ship--1779975137984/manifest.json @@ -0,0 +1 @@ +{"run_id":"ship--1779975137984","ts":"2026-05-28T13:32:17Z","verb":"ship","files":[{"orig_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\specs","save_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\.snapshots","mode":493}]} \ No newline at end of file diff --git a/internal/cli/cmdship/.forge/trash/ship--1779975145514/manifest.json b/internal/cli/cmdship/.forge/trash/ship--1779975145514/manifest.json new file mode 100644 index 0000000..959c045 --- /dev/null +++ b/internal/cli/cmdship/.forge/trash/ship--1779975145514/manifest.json @@ -0,0 +1 @@ +{"run_id":"ship--1779975145514","ts":"2026-05-28T13:32:25Z","verb":"ship","files":[{"orig_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\specs","save_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\.snapshots","mode":493}]} \ No newline at end of file diff --git a/internal/cli/cmdship/.forge/trash/ship--1779975145574/manifest.json b/internal/cli/cmdship/.forge/trash/ship--1779975145574/manifest.json new file mode 100644 index 0000000..86decb1 --- /dev/null +++ b/internal/cli/cmdship/.forge/trash/ship--1779975145574/manifest.json @@ -0,0 +1 @@ +{"run_id":"ship--1779975145574","ts":"2026-05-28T13:32:25Z","verb":"ship","files":[{"orig_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\specs","save_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\.snapshots","mode":493}]} \ No newline at end of file diff --git a/internal/cli/cmdship/.forge/trash/ship--1779975146201/manifest.json b/internal/cli/cmdship/.forge/trash/ship--1779975146201/manifest.json new file mode 100644 index 0000000..8a5ac6c --- /dev/null +++ b/internal/cli/cmdship/.forge/trash/ship--1779975146201/manifest.json @@ -0,0 +1 @@ 
+{"run_id":"ship--1779975146201","ts":"2026-05-28T13:32:26Z","verb":"ship","files":[{"orig_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\specs","save_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\.snapshots","mode":493}]} \ No newline at end of file diff --git a/internal/cli/cmdship/.forge/trash/ship--1779975156284/manifest.json b/internal/cli/cmdship/.forge/trash/ship--1779975156284/manifest.json new file mode 100644 index 0000000..fa15882 --- /dev/null +++ b/internal/cli/cmdship/.forge/trash/ship--1779975156284/manifest.json @@ -0,0 +1 @@ +{"run_id":"ship--1779975156284","ts":"2026-05-28T13:32:36Z","verb":"ship","files":[{"orig_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\specs","save_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\.snapshots","mode":493}]} \ No newline at end of file diff --git a/internal/cli/cmdship/.forge/trash/ship--1779975156775/manifest.json b/internal/cli/cmdship/.forge/trash/ship--1779975156775/manifest.json new file mode 100644 index 0000000..303e36e --- /dev/null +++ b/internal/cli/cmdship/.forge/trash/ship--1779975156775/manifest.json @@ -0,0 +1 @@ +{"run_id":"ship--1779975156775","ts":"2026-05-28T13:32:36Z","verb":"ship","files":[{"orig_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\specs","save_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\.snapshots","mode":493}]} \ No newline at end of file diff --git a/internal/cli/cmdship/.forge/trash/ship--1779976072322/manifest.json b/internal/cli/cmdship/.forge/trash/ship--1779976072322/manifest.json new file mode 100644 index 0000000..828ee72 --- /dev/null +++ b/internal/cli/cmdship/.forge/trash/ship--1779976072322/manifest.json @@ -0,0 +1 @@ +{"run_id":"ship--1779976072322","ts":"2026-05-28T13:47:52Z","verb":"ship","files":[{"orig_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\specs","save_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\.snapshots","mode":493}]} \ No 
newline at end of file diff --git a/internal/cli/cmdship/.forge/trash/ship--1779976073890/manifest.json b/internal/cli/cmdship/.forge/trash/ship--1779976073890/manifest.json new file mode 100644 index 0000000..4c8aa3c --- /dev/null +++ b/internal/cli/cmdship/.forge/trash/ship--1779976073890/manifest.json @@ -0,0 +1 @@ +{"run_id":"ship--1779976073890","ts":"2026-05-28T13:47:53Z","verb":"ship","files":[{"orig_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\specs","save_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\.snapshots","mode":493}]} \ No newline at end of file diff --git a/internal/cli/cmdship/.forge/trash/ship--1779976082160/manifest.json b/internal/cli/cmdship/.forge/trash/ship--1779976082160/manifest.json new file mode 100644 index 0000000..b1a2f5a --- /dev/null +++ b/internal/cli/cmdship/.forge/trash/ship--1779976082160/manifest.json @@ -0,0 +1 @@ +{"run_id":"ship--1779976082160","ts":"2026-05-28T13:48:02Z","verb":"ship","files":[{"orig_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\specs","save_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\.snapshots","mode":493}]} \ No newline at end of file diff --git a/internal/cli/cmdship/.forge/trash/ship--1779976082440/manifest.json b/internal/cli/cmdship/.forge/trash/ship--1779976082440/manifest.json new file mode 100644 index 0000000..2417eb5 --- /dev/null +++ b/internal/cli/cmdship/.forge/trash/ship--1779976082440/manifest.json @@ -0,0 +1 @@ +{"run_id":"ship--1779976082440","ts":"2026-05-28T13:48:02Z","verb":"ship","files":[{"orig_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\specs","save_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\.snapshots","mode":493}]} \ No newline at end of file diff --git a/internal/cli/cmdship/.forge/trash/ship--1779976082638/manifest.json b/internal/cli/cmdship/.forge/trash/ship--1779976082638/manifest.json new file mode 100644 index 0000000..0904f1b --- /dev/null +++ 
b/internal/cli/cmdship/.forge/trash/ship--1779976082638/manifest.json @@ -0,0 +1 @@ +{"run_id":"ship--1779976082638","ts":"2026-05-28T13:48:02Z","verb":"ship","files":[{"orig_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\specs","save_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\.snapshots","mode":493}]} \ No newline at end of file diff --git a/internal/cli/cmdship/.forge/trash/ship--1779976087632/manifest.json b/internal/cli/cmdship/.forge/trash/ship--1779976087632/manifest.json new file mode 100644 index 0000000..1231077 --- /dev/null +++ b/internal/cli/cmdship/.forge/trash/ship--1779976087632/manifest.json @@ -0,0 +1 @@ +{"run_id":"ship--1779976087632","ts":"2026-05-28T13:48:07Z","verb":"ship","files":[{"orig_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\specs","save_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\.snapshots","mode":493}]} \ No newline at end of file diff --git a/internal/cli/cmdship/.forge/trash/ship--1779976087798/manifest.json b/internal/cli/cmdship/.forge/trash/ship--1779976087798/manifest.json new file mode 100644 index 0000000..e83d4d3 --- /dev/null +++ b/internal/cli/cmdship/.forge/trash/ship--1779976087798/manifest.json @@ -0,0 +1 @@ +{"run_id":"ship--1779976087798","ts":"2026-05-28T13:48:07Z","verb":"ship","files":[{"orig_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\specs","save_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\.snapshots","mode":493}]} \ No newline at end of file diff --git a/internal/cli/cmdship/.forge/trash/ship--1779976254738/manifest.json b/internal/cli/cmdship/.forge/trash/ship--1779976254738/manifest.json new file mode 100644 index 0000000..690f213 --- /dev/null +++ b/internal/cli/cmdship/.forge/trash/ship--1779976254738/manifest.json @@ -0,0 +1 @@ 
+{"run_id":"ship--1779976254738","ts":"2026-05-28T13:50:54Z","verb":"ship","files":[{"orig_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\specs","save_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\.snapshots","mode":493}]} \ No newline at end of file diff --git a/internal/cli/cmdship/.forge/trash/ship--1779976264057/manifest.json b/internal/cli/cmdship/.forge/trash/ship--1779976264057/manifest.json new file mode 100644 index 0000000..bf0dca4 --- /dev/null +++ b/internal/cli/cmdship/.forge/trash/ship--1779976264057/manifest.json @@ -0,0 +1 @@ +{"run_id":"ship--1779976264057","ts":"2026-05-28T13:51:04Z","verb":"ship","files":[{"orig_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\specs","save_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\.snapshots","mode":493}]} \ No newline at end of file diff --git a/internal/cli/cmdship/.forge/trash/ship--1779976264376/manifest.json b/internal/cli/cmdship/.forge/trash/ship--1779976264376/manifest.json new file mode 100644 index 0000000..d2beb67 --- /dev/null +++ b/internal/cli/cmdship/.forge/trash/ship--1779976264376/manifest.json @@ -0,0 +1 @@ +{"run_id":"ship--1779976264376","ts":"2026-05-28T13:51:04Z","verb":"ship","files":[{"orig_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\specs","save_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\.snapshots","mode":493}]} \ No newline at end of file diff --git a/internal/cli/cmdship/.forge/trash/ship--1779976265666/manifest.json b/internal/cli/cmdship/.forge/trash/ship--1779976265666/manifest.json new file mode 100644 index 0000000..96db040 --- /dev/null +++ b/internal/cli/cmdship/.forge/trash/ship--1779976265666/manifest.json @@ -0,0 +1 @@ +{"run_id":"ship--1779976265666","ts":"2026-05-28T13:51:05Z","verb":"ship","files":[{"orig_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\specs","save_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\.snapshots","mode":493}]} \ No 
newline at end of file diff --git a/internal/cli/cmdship/.forge/trash/ship--1779976272642/manifest.json b/internal/cli/cmdship/.forge/trash/ship--1779976272642/manifest.json new file mode 100644 index 0000000..81f64d5 --- /dev/null +++ b/internal/cli/cmdship/.forge/trash/ship--1779976272642/manifest.json @@ -0,0 +1 @@ +{"run_id":"ship--1779976272642","ts":"2026-05-28T13:51:12Z","verb":"ship","files":[{"orig_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\specs","save_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\.snapshots","mode":493}]} \ No newline at end of file diff --git a/internal/cli/cmdship/.forge/trash/ship--1779976272793/manifest.json b/internal/cli/cmdship/.forge/trash/ship--1779976272793/manifest.json new file mode 100644 index 0000000..78b835d --- /dev/null +++ b/internal/cli/cmdship/.forge/trash/ship--1779976272793/manifest.json @@ -0,0 +1 @@ +{"run_id":"ship--1779976272793","ts":"2026-05-28T13:51:12Z","verb":"ship","files":[{"orig_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\specs","save_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\.snapshots","mode":493}]} \ No newline at end of file diff --git a/internal/cli/cmdship/.forge/trash/ship--1779976280561/manifest.json b/internal/cli/cmdship/.forge/trash/ship--1779976280561/manifest.json new file mode 100644 index 0000000..c81dbfe --- /dev/null +++ b/internal/cli/cmdship/.forge/trash/ship--1779976280561/manifest.json @@ -0,0 +1 @@ +{"run_id":"ship--1779976280561","ts":"2026-05-28T13:51:20Z","verb":"ship","files":[{"orig_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\specs","save_path":"I:\\AI-Startup\\forge\\internal\\cli\\cmdship\\.forge\\.snapshots","mode":493}]} \ No newline at end of file diff --git a/internal/cli/cmdship/llmpipe.go b/internal/cli/cmdship/llmpipe.go index 9eb24af..4b57cac 100644 --- a/internal/cli/cmdship/llmpipe.go +++ b/internal/cli/cmdship/llmpipe.go @@ -46,14 +46,24 @@ import ( 
"github.com/teragrid/forge/internal/knowledge" "github.com/teragrid/forge/internal/llmprovider" "github.com/teragrid/forge/internal/secretrewriter" + "github.com/teragrid/forge/internal/tierrouter" "github.com/teragrid/forge/internal/tokenledger" ) -// LLMPipe bundles an LLM Provider, a secret Rewriter, and a token Ledger. +// LLMPipe bundles an LLM Provider, a secret Rewriter, a token Ledger, and +// an optional tier router for complexity-driven model selection. type LLMPipe struct { provider llmprovider.Provider rewriter *secretrewriter.Rewriter ledger *tokenledger.Ledger + // router is used for complexity-based tier routing (T0→T1→T2 escalation). + // A nil router falls back to direct provider calls. + router *tierrouter.Router + // minTier is the minimum tier forced by complexity classification. + // Empty string means use T0 (cheapest available tier). + minTier string + // root is the project root used for layered knowledge-base loading. + root string } // newLLMPipe detects the active LLM provider from the environment and returns @@ -75,6 +85,25 @@ func newLLMPipeWithProvider(p llmprovider.Provider, root string) *LLMPipe { provider: p, rewriter: secretrewriter.New(), ledger: tokenledger.New(filepath.Join(root, tokenledger.DefaultPath)), + router: tierrouter.New(p, nil), // P1: cheap-first tier escalation + root: root, + } +} + +// SetComplexityTier maps a ComplexityTier to the minimum tier for LLM routing. +// nano/micro → T0 (cheap), standard → T1 (balanced), complex → T2 (powerful). +// Call this once per pipeline run after classifyComplexity is known. 
+func (p *LLMPipe) SetComplexityTier(ct ComplexityTier) { + if p == nil { + return + } + switch ct { + case ComplexityComplex: + p.minTier = tierrouter.TierPowerful + case ComplexityStandard: + p.minTier = tierrouter.TierBalanced + default: // nano, micro + p.minTier = tierrouter.TierCheap } } @@ -99,19 +128,44 @@ func (p *LLMPipe) Invoke(operation, model, system, user string, maxTokens int) ( UserPrompt: usr, MaxTokens: maxTokens, } - resp, err := p.provider.Complete(ctx, req) - if err != nil { - return "", err + + // P1: use the tier router when available to select the right model tier + // based on complexity (SetComplexityTier must be called before Invoke). + var ( + content string + respModel string + inputTokens int + outputTokens int + ) + if p.router != nil { + result, err := p.router.Route(ctx, *req, p.minTier) + if err != nil { + return "", err + } + content = result.Response + respModel = result.ModelUsed + inputTokens = result.TokensIn + outputTokens = result.TokensOut + } else { + resp, err := p.provider.Complete(ctx, req) + if err != nil { + return "", err + } + content = resp.Content + respModel = resp.Model + inputTokens = resp.InputTokens + outputTokens = resp.OutputTokens } + // Best-effort ledger append — a write failure never blocks the pipeline. 
_ = p.ledger.Append(tokenledger.Entry{ - Model: resp.Model, - InputTokens: resp.InputTokens, - OutputTokens: resp.OutputTokens, - CostUSD: estimateCost(resp.Model, resp.InputTokens, resp.OutputTokens), + Model: respModel, + InputTokens: inputTokens, + OutputTokens: outputTokens, + CostUSD: estimateCost(respModel, inputTokens, outputTokens), Operation: operation, }) - return resp.Content, nil + return content, nil } // InvokeWithKnowledge enriches the system prompt with relevant knowledge-base @@ -126,7 +180,14 @@ func (p *LLMPipe) InvokeWithKnowledge(operation, model, system, user string, max if p == nil { return "", nil } - idx, err := knowledge.Load() + // P1: use layered KB (project > user > embedded) when root is known. + var idx *knowledge.Index + var err error + if p.root != "" { + idx, err = knowledge.LoadLayered(p.root) + } else { + idx, err = knowledge.Load() + } if err != nil { // Graceful degradation: log nothing (no PII risk), proceed without KB. return p.Invoke(operation, model, system, user, maxTokens) @@ -210,6 +271,12 @@ func estimateCost(model string, inputTokens, outputTokens int) float64 { func specStub(description string) string { return fmt.Sprintf( "# Spec: %s\n\n"+ + "## Status Summary\n"+ + "- Lifecycle: Draft\n"+ + "- Version Scope: PATCH (default; confirm before release)\n"+ + "- Owner: \n"+ + "- Last Updated: \n"+ + "- Checkpoint Progress: 1/7\n\n"+ "## What\n%s\n\n"+ "## Why\n\n\n"+ "## Acceptance Criteria\n- [ ] \n\n"+ diff --git a/internal/cli/cmdship/llmpipe_tier_test.go b/internal/cli/cmdship/llmpipe_tier_test.go new file mode 100644 index 0000000..b173fef --- /dev/null +++ b/internal/cli/cmdship/llmpipe_tier_test.go @@ -0,0 +1,114 @@ +// Copyright 2024 The Forge Authors +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. 
+// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +// llmpipe_tier_test.go — tests for P1 complexity-based tier routing in LLMPipe. + +package cmdship + +import ( + "context" + "testing" + + "github.com/teragrid/forge/internal/llmprovider" + "github.com/teragrid/forge/internal/tierrouter" +) + +// captureProvider records which model was used in the last Complete call. +type captureProvider struct { + lastModel string +} + +func (p *captureProvider) Name() string { return "anthropic" } +func (p *captureProvider) Complete(_ context.Context, req *llmprovider.Request) (*llmprovider.Response, error) { + p.lastModel = req.Model + return &llmprovider.Response{Content: "ok", Model: req.Model}, nil +} +func (p *captureProvider) Capabilities() llmprovider.Capabilities { return llmprovider.Capabilities{} } + +// TestSetComplexityTier_MapsToCorrectMinTier verifies the complexity → minTier mapping. 
+func TestSetComplexityTier_MapsToCorrectMinTier(t *testing.T) { + t.Parallel() + tests := []struct { + ct ComplexityTier + wantMin string + }{ + {ComplexityNano, tierrouter.TierCheap}, + {ComplexityMicro, tierrouter.TierCheap}, + {ComplexityStandard, tierrouter.TierBalanced}, + {ComplexityComplex, tierrouter.TierPowerful}, + } + for _, tc := range tests { + tc := tc + t.Run(string(tc.ct), func(t *testing.T) { + t.Parallel() + root := t.TempDir() + cp := &captureProvider{} + pipe := newLLMPipeWithProvider(cp, root) + pipe.SetComplexityTier(tc.ct) + if got := pipe.minTier; got != tc.wantMin { + t.Errorf("minTier = %q, want %q", got, tc.wantMin) + } + }) + } +} + +// TestSetComplexityTier_RouterNotNil verifies router is initialised in constructor. +func TestSetComplexityTier_RouterNotNil(t *testing.T) { + t.Parallel() + root := t.TempDir() + cp := &captureProvider{} + pipe := newLLMPipeWithProvider(cp, root) + if pipe.router == nil { + t.Fatal("expected router to be non-nil after newLLMPipeWithProvider") + } +} + +// TestInvoke_UsesRouterWhenMinTierSet verifies Invoke calls router.Route when +// minTier is set (i.e., SetComplexityTier has been called). +// For nano/micro the router starts at T0 (haiku/gpt-4o-mini). +func TestInvoke_UsesRouterWhenMinTierSet(t *testing.T) { + t.Parallel() + root := t.TempDir() + cp := &captureProvider{} + pipe := newLLMPipeWithProvider(cp, root) + pipe.SetComplexityTier(ComplexityNano) + + _, err := pipe.Invoke("test-op", "", "system", "user", 100) + if err != nil { + t.Fatalf("Invoke: %v", err) + } + // T0 model for anthropic is claude-3-5-haiku-20241022. + if got := cp.lastModel; got != "claude-3-5-haiku-20241022" { + t.Errorf("model used = %q, want T0 haiku model", got) + } +} + +// TestInvoke_FallbackToDirectWhenRouterNil ensures Invoke still works when +// router is nil (defensive nil-safety guard). 
+func TestInvoke_FallbackToDirectWhenRouterNil(t *testing.T) { + t.Parallel() + root := t.TempDir() + cp := &captureProvider{} + // Use the proper constructor so rewriter/ledger are initialised, + // then nil out the router to exercise the fallback path. + pipe := newLLMPipeWithProvider(cp, root) + pipe.router = nil + _, err := pipe.Invoke("test-op", "some-model", "system", "user", 100) + if err != nil { + t.Fatalf("Invoke with nil router: %v", err) + } + if cp.lastModel != "some-model" { + t.Errorf("expected model=some-model, got %q", cp.lastModel) + } +} diff --git a/internal/cli/cmdship/ship.go b/internal/cli/cmdship/ship.go index ef8d977..d5dd2ae 100644 --- a/internal/cli/cmdship/ship.go +++ b/internal/cli/cmdship/ship.go @@ -50,6 +50,7 @@ import ( "github.com/teragrid/forge/internal/gitservice" "github.com/teragrid/forge/internal/manifest" "github.com/teragrid/forge/internal/procspawn" + "github.com/teragrid/forge/internal/telemetry" "github.com/teragrid/forge/internal/verbmeta" ) @@ -1658,12 +1659,32 @@ func runWithOptions(opts RunOptions) *ShipResult { } // P2: take a snapshot before each checkpoint so that failures can be rolled back. + // Errors are warnings only — a snapshot failure must never block the pipeline. snapBefore := func(cpName string) { if specSlug != "" { - _ = TakeSnapshot(root, specSlug, cpName) + if err := TakeSnapshot(root, specSlug, cpName); err != nil { + // Non-fatal: log to stderr and continue. The snapshot is best-effort. + _, _ = fmt.Fprintf(os.Stderr, "forge ship: snapshot warning (cp=%s): %v\n", cpName, err) + } + } + } + // P2: restore the pre-checkpoint snapshot when a checkpoint fails. + // Provides all-or-nothing semantics for the spec artefacts directory. 
+ snapOnFail := func(cpName string) { + if specSlug != "" { + if err := RestoreSnapshot(root, specSlug, cpName); err != nil { + _, _ = fmt.Fprintf(os.Stderr, "forge ship: restore-snapshot warning (cp=%s): %v\n", cpName, err) + } } } + // P2: write a TrashManifest so `forge undo` can locate this ship run. + shipRunID := fmt.Sprintf("ship-%s-%d", specSlug, time.Now().UnixMilli()) + writeShipTrashManifest(root, shipRunID, specSlug) + + // P2: start a pipeline-level telemetry span for OTEL-compatible tracing. + pipeTraceID, pipeSpanID := telemetry.StartPipelineSpan(root, "ship") + // Suppress the "unused" warning for domainProfile when no checkpoint reads it yet. _ = domainProfile @@ -1740,6 +1761,10 @@ func runWithOptions(opts RunOptions) *ShipResult { Message: shipMessage(pipe), Complexity: classifyComplexity(opts.Description, root), } + // P1: wire the complexity tier into the LLM pipe so that the tier router + // selects the right model (T0/T1/T2) based on task complexity. + pipe.SetComplexityTier(res.Complexity) + total := len(selected) for i, cp := range selected { @@ -1791,6 +1816,11 @@ func runWithOptions(opts RunOptions) *ShipResult { res.Checkpoints = append(res.Checkpoints, cp) res.Ready = false res.Message = fmt.Sprintf("checkpoint %s failed; pipeline stopped", cp.Name) + // P2: restore the pre-checkpoint snapshot so the spec artefacts + // directory is rolled back to the state before this checkpoint ran. + snapOnFail(strings.ToLower(cp.Name)) + // P2: emit an ERROR checkpoint span for OTEL-compatible tracing. + _ = telemetry.EmitCheckpointSpan(root, pipeTraceID, pipeSpanID, strings.ToLower(cp.Name), "ERROR", 0) // TG-40: emit gap.detected events (for ship/verify) before ship.failed. if opts.EventWriter != nil { cpLower := strings.ToLower(cp.Name) @@ -1812,6 +1842,8 @@ func runWithOptions(opts RunOptions) *ShipResult { } return res } + // P2: emit an OK checkpoint span for OTEL-compatible tracing. 
+ _ = telemetry.EmitCheckpointSpan(root, pipeTraceID, pipeSpanID, strings.ToLower(cp.Name), "OK", 0) // Run self-debate for this checkpoint when DebateOpts is set. if opts.DebateOpts != nil { diff --git a/internal/cli/cmdship/ship_undo_test.go b/internal/cli/cmdship/ship_undo_test.go new file mode 100644 index 0000000..038cbb1 --- /dev/null +++ b/internal/cli/cmdship/ship_undo_test.go @@ -0,0 +1,175 @@ +// Copyright 2024 The Forge Authors +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +// ship_undo_test.go — tests for undo / snapshot helpers added by RFC-005 P2. + +package cmdship + +import ( + "os" + "path/filepath" + "strings" + "testing" + "time" + + "github.com/teragrid/forge/internal/telemetry" +) + +// ─── writeShipTrashManifest ─────────────────────────────────────────────────── + +// TestWriteShipTrashManifest_CreatesManifest verifies the manifest file is written +// to .forge/trash//manifest.json. +func TestWriteShipTrashManifest_CreatesManifest(t *testing.T) { + t.Parallel() + root := t.TempDir() + writeShipTrashManifest(root, "run-001", "my-feature") + manifest := filepath.Join(root, ".forge", "trash", "run-001", "manifest.json") + if _, err := os.Stat(manifest); os.IsNotExist(err) { + t.Fatalf("manifest.json not created at %s", manifest) + } +} + +// TestWriteShipTrashManifest_VerbIsShip verifies the manifest records verb="ship". 
+func TestWriteShipTrashManifest_VerbIsShip(t *testing.T) { + t.Parallel() + root := t.TempDir() + writeShipTrashManifest(root, "run-002", "feat-a") + data, err := os.ReadFile(filepath.Join(root, ".forge", "trash", "run-002", "manifest.json")) + if err != nil { + t.Fatal(err) + } + if !strings.Contains(string(data), `"verb":"ship"`) { + t.Errorf("manifest missing verb:ship, got: %s", data) + } +} + +// TestWriteShipTrashManifest_RecordsRunID verifies the run ID is stored. +func TestWriteShipTrashManifest_RecordsRunID(t *testing.T) { + t.Parallel() + root := t.TempDir() + writeShipTrashManifest(root, "run-abc-123", "feature-x") + data, _ := os.ReadFile(filepath.Join(root, ".forge", "trash", "run-abc-123", "manifest.json")) + if !strings.Contains(string(data), "run-abc-123") { + t.Errorf("manifest does not contain run ID 'run-abc-123': %s", data) + } +} + +// TestWriteShipTrashManifest_TwoRunsDistinct verifies two manifests are independent. +func TestWriteShipTrashManifest_TwoRunsDistinct(t *testing.T) { + t.Parallel() + root := t.TempDir() + writeShipTrashManifest(root, "run-1", "feat") + writeShipTrashManifest(root, "run-2", "feat") + for _, id := range []string{"run-1", "run-2"} { + p := filepath.Join(root, ".forge", "trash", id, "manifest.json") + if _, err := os.Stat(p); os.IsNotExist(err) { + t.Errorf("missing manifest for %s", id) + } + } +} + +// ─── telemetry span helpers ─────────────────────────────────────────────────── + +// TestStartPipelineSpan_ReturnsNonEmptyIDs verifies that traceID and spanID are set. +func TestStartPipelineSpan_ReturnsNonEmptyIDs(t *testing.T) { + t.Parallel() + root := t.TempDir() + traceID, spanID := telemetry.StartPipelineSpan(root, "ship") + if traceID == "" { + t.Error("traceID must not be empty") + } + if spanID == "" { + t.Error("spanID must not be empty") + } +} + +// TestStartPipelineSpan_UniquePerInvocation verifies two calls produce different IDs. 
+func TestStartPipelineSpan_UniquePerInvocation(t *testing.T) { + t.Parallel() + root := t.TempDir() + traceA, _ := telemetry.StartPipelineSpan(root, "ship") + traceB, _ := telemetry.StartPipelineSpan(root, "ship") + if traceA == traceB { + t.Error("expected unique traceIDs for independent pipeline spans") + } +} + +// TestEmitCheckpointSpan_TelemetryDisabled verifies emit does not panic or error +// when telemetry is not opted in (empty root / no config). +func TestEmitCheckpointSpan_TelemetryDisabled(t *testing.T) { + t.Parallel() + defer func() { + if r := recover(); r != nil { + t.Errorf("EmitCheckpointSpan panicked with disabled telemetry: %v", r) + } + }() + root := t.TempDir() // no telemetry.json → disabled + err := telemetry.EmitCheckpointSpan(root, "trace-x", "span-y", "scan", "OK", 100*time.Millisecond) + // Error is acceptable (telemetry disabled returns nil per implementation). + _ = err +} + +// TestEmitCheckpointSpan_EnabledWritesFile verifies a span is written to the +// .forge/telemetry.jsonl file when telemetry is enabled. +func TestEmitCheckpointSpan_EnabledWritesFile(t *testing.T) { + t.Parallel() + root := t.TempDir() + // Enable telemetry for this run. + cfg := &telemetry.Config{Enabled: true, InstallID: "test-install"} + cfgPath := filepath.Join(root, telemetry.DefaultConfigPath) + if err := telemetry.SaveConfig(cfgPath, cfg); err != nil { + t.Fatalf("SaveConfig: %v", err) + } + err := telemetry.EmitCheckpointSpan(root, "trace-1", "span-1", "architecture", "OK", 200*time.Millisecond) + if err != nil { + t.Fatalf("EmitCheckpointSpan: %v", err) + } + spanFile := filepath.Join(root, telemetry.DefaultSpanPath) + if _, err := os.Stat(spanFile); os.IsNotExist(err) { + t.Errorf("telemetry span file not created: %s", spanFile) + } +} + +// ─── snapOnFail helper ──────────────────────────────────────────────────────── + +// TestSnapOnFail_BestEffort verifies snapOnFail does not panic on missing spec dir. 
+func TestSnapOnFail_BestEffort(t *testing.T) { + t.Parallel() + root := t.TempDir() + defer func() { + if r := recover(); r != nil { + t.Errorf("snapOnFail panicked: %v", r) + } + }() + // No spec dir — must not error or panic. + snapOnFail(root, "nonexistent-slug", "arch") +} + +// TestSnapOnFail_CreatesSnapshotWhenSpecExists verifies snapshotting works. +func TestSnapOnFail_CreatesSnapshotWhenSpecExists(t *testing.T) { + t.Parallel() + root := t.TempDir() + slug := "my-feat" + specDir := filepath.Join(root, ".forge", "specs", slug) + if err := os.MkdirAll(specDir, 0o755); err != nil { + t.Fatal(err) + } + if err := os.WriteFile(filepath.Join(specDir, "spec.md"), []byte("# spec"), 0o600); err != nil { + t.Fatal(err) + } + snapOnFail(root, slug, "code") + if !SnapshotExists(root, slug, "code") { + t.Error("expected snapshot to exist after snapOnFail with valid spec dir") + } +} diff --git a/internal/cli/cmdship/snapshot.go b/internal/cli/cmdship/snapshot.go index 87d744c..b6e335b 100644 --- a/internal/cli/cmdship/snapshot.go +++ b/internal/cli/cmdship/snapshot.go @@ -33,6 +33,7 @@ package cmdship import ( + "encoding/json" "fmt" "io" "os" @@ -146,3 +147,53 @@ func snapshotCopyFile(src, dst string) error { _, err = io.Copy(out, in) return err } + +// writeShipTrashManifest writes a TrashManifest for a `forge ship` run so that +// `forge undo` can locate and reverse it. +// +// The manifest is written to .forge/trash//manifest.json. +// This satisfies ADR-024 (§17.1 #5 reversibility contract) for the ship verb. +// Errors are silently swallowed — a missing manifest is recoverable by the user. +func writeShipTrashManifest(root, runID, specSlug string) { + dir := filepath.Join(root, ".forge", "trash", runID) + if err := os.MkdirAll(dir, 0o755); err != nil { + return + } + + // Record the spec artefacts directory as the primary tracked path. 
+ specDir := filepath.Join(root, ".forge", "specs", specSlug) + type trashFile struct { + OrigPath string `json:"orig_path"` + SavePath string `json:"save_path"` + Mode uint32 `json:"mode"` + } + type trashManifest struct { + RunID string `json:"run_id"` + Timestamp string `json:"ts"` + Verb string `json:"verb"` + Files []trashFile `json:"files"` + } + m := trashManifest{ + RunID: runID, + Timestamp: time.Now().UTC().Format(time.RFC3339), + Verb: "ship", + Files: []trashFile{ + { + OrigPath: specDir, + SavePath: filepath.Join(root, ".forge", snapshotsBaseDir, specSlug), + Mode: 0o755, + }, + }, + } + data, err := json.Marshal(m) + if err != nil { + return + } + _ = os.WriteFile(filepath.Join(dir, "manifest.json"), data, 0o600) +} + +// snapOnFail is a best-effort snapshot helper called on checkpoint failure. +// It calls TakeSnapshot and silently discards any error (snapshot is advisory). +func snapOnFail(root, slug, checkpoint string) { + _ = TakeSnapshot(root, slug, checkpoint) +} diff --git a/internal/cli/cmdship/steering.go b/internal/cli/cmdship/steering.go index 0bad070..4b79c4a 100644 --- a/internal/cli/cmdship/steering.go +++ b/internal/cli/cmdship/steering.go @@ -27,7 +27,8 @@ // // Default steerings (mapped to all 7 forge ship checkpoints): // -// prompt-guide — ALL: behavioral standards (no placeholders, no hedging) +// prompt-guide — ALL: behavioral standards (no placeholders, no hedging, Status Summary check) +// spec-status-maintenance — ALL: enforce Status Summary at top of spec; version bump velocity guard // requirements-quality-scan — spec: Given/When/Then AC, measurable NFRs, impact analysis // tdd-standards — test: TDD gate, no always-passing assertions, coverage targets // task-decomposition — breakdown: atomic tasks, done criteria, effort sizing @@ -63,11 +64,39 @@ const promptGuideSteering = `Behavioral standards (always enforced): 2. No hedging language ("might", "could perhaps", "possibly", "maybe"). 3. 
No scope creep — address only what was asked; flag extras as open questions. 4. All file references use relative paths from the project root. -5. When uncertain, flag as a gap with a hint; do not invent content.` +5. When uncertain, flag as a gap with a hint; do not invent content. +6. Spec Status Summary — always check and update: if you read or write any spec.md, + the "## Status Summary" block at the top MUST reflect the current Lifecycle, + Checkpoint Progress, and Last Updated date before you finish.` + +// ── spec-status-maintenance (always-on) ───────────────────────────────────── + +const specStatusMaintenanceSteering = `Spec Status Summary maintenance (enforced at every checkpoint): +1. Every spec.md must have a "## Status Summary" block as the FIRST section (before "## What"). +2. Update the block whenever state changes: + - Lifecycle: Draft → In Progress → Implemented → Released + - Checkpoint Progress: increment (e.g. 3/7) to reflect the checkpoint just completed + - Last Updated: set to today + - Version Scope: confirm PATCH | MINOR | MAJOR with a one-line rationale +3. Version bump velocity guard — before recording Version Scope: + - PATCH (default): bug fixes, hardening, docs, non-visible internal refactors + - MINOR: additive user-visible capability — only when genuinely new; do not use MINOR + just because a few commits accumulated; batch small additions into the same MINOR + - MAJOR: intentional breaking contract changes only; requires explicit ADR + - Do NOT advance to the next MINOR version more than once per deliberate milestone. + If the previous MINOR was released fewer than 5 business days ago, default to PATCH + unless a significant new user-facing capability is present. +4. Record the rationale in the Version Scope line (e.g. "MINOR — DAG pipeline + domain profiles").` // ── spec ───────────────────────────────────────────────────────────────────── const requirementsQualitySteering = `Requirements quality scan (spec checkpoint): +0. 
The top of spec.md must start with a "## Status Summary" section (before "## What") containing at minimum: + - Lifecycle: Draft | In Progress | Implemented | Released + - Version Scope: PATCH | MINOR | MAJOR (+ one-line rationale) + - Owner: team/person responsible + - Last Updated: YYYY-MM-DD + - Checkpoint Progress: X/7 1. Every acceptance criterion must use Given/When/Then format. 2. NFRs must include measurable thresholds (e.g. "p95 latency < 200 ms", "error rate < 0.1%"). 3. Perform impact analysis: list upstream/downstream services affected by this change. @@ -100,7 +129,12 @@ const implementationStandardsSteering = `Implementation standards (code checkpoi 2. All filesystem paths validated via the project sandbox (no path traversal). 3. All subprocess execution goes through the allow-listed spawn utility; no shell=true. 4. OWASP Top 10: verify injection (A03), broken auth (A07), and insecure design (A04) are addressed. -5. New public APIs must have input validation at the boundary; internal callers may trust.` +5. New public APIs must have input validation at the boundary; internal callers may trust. +6. When implementation reaches done-state, update the spec.md "Status Summary" at the top: + - Lifecycle -> Implemented + - Checkpoint Progress -> 7/7 (or actual) + - Last Updated -> today + - Version Scope rationale confirmed before release.` // ── arch ───────────────────────────────────────────────────────────────────── @@ -110,14 +144,19 @@ const reviewDABSteering = `Design Approval Board — Full DAB checklist (arch ch 3. Security threat model: identify top-3 OWASP risks; document mitigations. 4. Data residency and privacy impact assessed; PII handling and retention documented. 5. Rollback / undo procedure documented and reversible within one deploy cycle. -6. Integration contracts (API, event schemas) versioned and backward-compatible.` +6. Integration contracts (API, event schemas) versioned and backward-compatible. +7. 
Version bump discipline: choose the smallest valid semver scope for the upcoming release. + - PATCH: fixes, hardening, non-breaking behavior corrections (default when unsure) + - MINOR: additive, backward-compatible capabilities + - MAJOR: explicit breaking contract changes only.` const reviewDABLightSteering = `Design Approval Board Light checklist (lower-risk / single-service change): 1. Change is bounded to a single service or module; no cross-service contract changes. 2. No new external dependencies introduced without an ADR. 3. Rollback is possible by reverting a single deployment unit. 4. Existing tests cover the change path; no coverage regression. -5. Sequence diagram required only if a new inter-service call is introduced.` +5. Sequence diagram required only if a new inter-service call is introduced. +6. Version bump discipline: default to PATCH unless an additive feature clearly requires MINOR.` // ── ship ───────────────────────────────────────────────────────────────────── @@ -125,7 +164,11 @@ const reviewTechChangeSteering = `Technical Change review (ship checkpoint — s 1. Confirm change is isolated: no schema migrations, no new service dependencies. 2. CAB Shift-Left checklist: rollback tested, feature-flagged if risky, changelog updated. 3. Observability: new code paths emit logs / metrics / traces at appropriate levels. -4. No breaking changes to public interfaces without a deprecation notice.` +4. No breaking changes to public interfaces without a deprecation notice. +5. 
Release scope guard (before tagging): + - Record semver decision in the spec.md Status Summary (Version Scope + rationale) + - Prefer PATCH by default; use MINOR only for real additive capability + - Use MAJOR only for intentional, documented breaking changes.` // ── arch: scope-scan phase (runs before DAB type decision) ────────────────── @@ -189,6 +232,13 @@ func DefaultSteerings() []Steering { Applies: func(_ string, _ *Checkpoint) bool { return true }, Prompt: promptGuideSteering, }, + // spec-status-maintenance fires on every checkpoint so that the Status + // Summary block is always updated as work progresses, not only at spec time. + { + Name: "spec-status-maintenance", + Applies: func(_ string, _ *Checkpoint) bool { return true }, + Prompt: specStatusMaintenanceSteering, + }, // ── spec ───────────────────────────────────────────────────────────── { Name: "requirements-quality-scan", diff --git a/internal/cli/cmdship/steering_policy_test.go b/internal/cli/cmdship/steering_policy_test.go new file mode 100644 index 0000000..4af1e9d --- /dev/null +++ b/internal/cli/cmdship/steering_policy_test.go @@ -0,0 +1,110 @@ +// Copyright 2024 The Forge Authors +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. 
+ +package cmdship + +import ( + "strings" + "testing" +) + +func TestSpecStub_IncludesTopStatusSummary(t *testing.T) { + t.Parallel() + stub := specStub("rate limiting") + + mustContain := []string{ + "## Status Summary", + "- Lifecycle: Draft", + "- Version Scope: PATCH", + "- Last Updated:", + "- Checkpoint Progress:", + "## What", + "## Acceptance Criteria", + } + for _, token := range mustContain { + if !strings.Contains(stub, token) { + t.Fatalf("spec stub missing %q\n\n%s", token, stub) + } + } + + if strings.Index(stub, "## Status Summary") > strings.Index(stub, "## What") { + t.Fatalf("status summary must be above ## What\n\n%s", stub) + } +} + +func TestSteering_RequirementsQuality_EnforcesStatusSummary(t *testing.T) { + t.Parallel() + if !strings.Contains(requirementsQualitySteering, "Status Summary") { + t.Fatalf("requirements-quality steering must enforce top Status Summary block") + } + if !strings.Contains(requirementsQualitySteering, "Version Scope") { + t.Fatalf("requirements-quality steering must require Version Scope") + } +} + +func TestSteering_ReleaseScopeGuard_PrefersPatch(t *testing.T) { + t.Parallel() + if !strings.Contains(reviewTechChangeSteering, "Prefer PATCH by default") { + t.Fatalf("ship steering must include PATCH-first release scope guidance") + } +} + +// TestSteering_SpecStatusMaintenance_IsAlwaysOn verifies that +// spec-status-maintenance fires on every checkpoint. 
+func TestSteering_SpecStatusMaintenance_IsAlwaysOn(t *testing.T) { + t.Parallel() + steerings := DefaultSteerings() + var sm *Steering + for i := range steerings { + if steerings[i].Name == "spec-status-maintenance" { + sm = &steerings[i] + break + } + } + if sm == nil { + t.Fatal("spec-status-maintenance steering not registered in DefaultSteerings") + } + for _, cp := range []string{"spec", "arch", "test", "breakdown", "code", "ship", "qa-verify"} { + if !sm.Applies(cp, nil) { + t.Errorf("spec-status-maintenance must apply to checkpoint %q but Applies returned false", cp) + } + } +} + +// TestSteering_SpecStatusMaintenance_HasVelocityGuard ensures the steering +// text explicitly guards against bumping MINOR too frequently. +func TestSteering_SpecStatusMaintenance_HasVelocityGuard(t *testing.T) { + t.Parallel() + must := []string{ + "Version bump velocity guard", + "PATCH", + "MINOR", + "MAJOR", + "deliberate milestone", + } + for _, token := range must { + if !strings.Contains(specStatusMaintenanceSteering, token) { + t.Fatalf("spec-status-maintenance steering missing %q", token) + } + } +} + +// TestSteering_PromptGuide_IncludesStatusSummaryRule ensures the always-on +// prompt-guide instructs the LLM to keep the Status Summary current. +func TestSteering_PromptGuide_IncludesStatusSummaryRule(t *testing.T) { + t.Parallel() + if !strings.Contains(promptGuideSteering, "Status Summary") { + t.Fatal("prompt-guide must remind LLM to keep Status Summary current") + } +} diff --git a/internal/cli/cmdskill/platforms.go b/internal/cli/cmdskill/platforms.go index 5192258..c01345e 100644 --- a/internal/cli/cmdskill/platforms.go +++ b/internal/cli/cmdskill/platforms.go @@ -352,6 +352,73 @@ Error codes use 4-digit ranges per package: Full table in ` + "`" + `docs/ERROR_CODES.md` + "`" + `. Never reuse a code. Pick the next available in the package's range. + +--- + +## Daily Vibe-Coding Workflows + +These are the patterns you execute natively every day. 
When the user says any +variation of these, map it to the right forge workflow and execute it. + +### Feature workflow +User says: "Build X", "Add X feature", "Implement X" + +1. Create ` + "`" + `.forge/specs//spec.yml` + "`" + ` with acceptance criteria + 3 test cases. +2. Design the architecture (draw the call graph mentally before writing code). +3. Scaffold types/interfaces first, then implementations. +4. Write tests using the 9-point framework before implementing. +5. Implement; check all forge conventions. +6. Run ` + "`" + `go test ./...` + "`" + ` — all green before committing. +7. Commit on a ` + "`" + `feature/` + "`" + ` branch. + +### Bugfix workflow +User says: "Fix X", "There's a bug: ", "This panics" + +1. Write a **failing test** first (the exact reproduction). +2. State the root cause in one sentence. +3. Apply the minimal fix. +4. Verify: failing test now passes; all other tests still pass. +5. Add regression comment: ` + "`" + `// regression: ` + "`" + `. + +### Security scan workflow +User says: "Scan for secrets", "Check for vulnerabilities", "OWASP audit" + +1. Grep every changed file for credential-shaped strings. +2. Audit all user inputs for injection vectors (SQL, command, path traversal). +3. List dependencies with known CVEs. +4. Report: severity / file:line / description / remediation. + +### Morning standup workflow +User says: "What did we ship?", "Standup summary", "What's in-flight?" + +1. Read ` + "`" + `.forge/specs/` + "`" + ` for current specs and their checkpoint status. +2. Read ` + "`" + `git log --since="yesterday"` + "`" + ` for commits. +3. Summarise: shipped yesterday / in-flight today / blockers. + +### Code review workflow +User says: "Review this PR", "Review ", "What's wrong with this?" + +1. Read the diff / file(s). +2. Check against the spec (` + "`" + `.forge/specs//spec.yml` + "`" + `). +3. Report: correctness → test gaps → security issues → style. +4. Format as inline comments with severity + suggested fix. 
+ +--- + +## Quick command reference (top 10 daily) + +| Intent | Command | +|--------|---------| +| Ship a feature end-to-end | ` + "`" + `forge ship "description"` + "`" + ` | +| Fix a bug from a stacktrace | ` + "`" + `forge bugfix --bug "description"` + "`" + ` | +| Security scan | ` + "`" + `forge scan all` + "`" + ` | +| Scaffold new project | ` + "`" + `forge new