
docs(0.2): Public-facing documentation refresh for 0.2.0 tag#129

Merged
pmclSF merged 5 commits into main from docs/0.2-public-facing-refresh
May 2, 2026
Conversation

@pmclSF
Owner

@pmclSF pmclSF commented May 2, 2026

Brings user-facing docs into sync with what 0.2.0 actually shipped after the ship-blocker + final-polish review cycles.

Files updated

  • CHANGELOG.md: 0.2.0 release date, per-detector descriptions reflecting shipped behaviour, new "Polish" section bulleting cross-cutting review fixes.
  • docs/release/feature-status.md: AI detector rows updated to describe actual shipped behaviour; cosign-install row reflects mandatory-by-default posture.
  • docs/cli-spec.md: New canonical-vs-legacy surface section; terrain version schemaVersion field documented; --read-only description matches shipped enforcement.
  • docs/telemetry.md: Example version updated.

Why a separate PR?

The polish PR (#128) landed the code; this one lands the docs that describe what landed. Separating keeps the doc-only change reviewable on its own — no test runs needed beyond make docs-verify.

Test plan

  • make docs-verify zero-diff
  • go test ./... clean
  • CI green on this branch

🤖 Generated with Claude Code

Updates user-facing docs to reflect everything that actually
shipped in 0.2.0 — including the post-initial-publish polish
work that landed during the ship-blocker + final-polish reviews.

## CHANGELOG.md
- 0.2.0 header gets the release date (2026-05-02).
- Per-detector entries replace stale "Known limitation" callouts
  with the actual shipped behaviour (per-provider scoping for
  aiNonDeterministicEval, structural key-name + benign-object
  whitelist for aiToolWithoutSandbox, implicit path-based coverage
  for aiSafetyEvalMissing + aiFewShotContamination, multi-line
  concatenation + expanded user-input shapes for
  aiPromptInjectionRisk, severity-by-category for
  aiModelDeprecationRisk, paired-count confidence scaling for
  aiCostRegression + aiRetrievalRegression, placeholder-token
  rejection for aiPromptVersioning, env-var constructor patterns
  for aiEmbeddingModelChange).
- Calibration corpus section reframed: gate is recall-only,
  precision floor slipped to 0.3 (corpus v2). Match-key precision
  improvement (Symbol added) and empty-corpus-bypass closure
  documented.
- New "Polish (release-prep adversarial review fixes)" section
  bullets the cross-cutting fixes from the two adversarial-review
  passes: release infra, engine self-diagnostic (detectorPanic in
  catalog, RequiresGraph hard-fail), eval adapters (Promptfoo
  errors-bucket + per-case cost + time magnitude, DeepEval runId +
  metric-name normalisation, Ragas evaluation_results + scores
  shapes, envelope SourcePath rel), CLI (deprecation hints,
  --read-only enforcement, version JSON schemaVersion, exit 5 for
  not-found), determinism (sortSignals Symbol tiebreak), supply
  chain (concurrency, timeouts, CodeQL Python drop,
  COSIGN_EXPERIMENTAL cleanup), documentation (CODE_OF_CONDUCT,
  issue templates, glossary/versioning/compatibility, integration
  guides).

## docs/release/feature-status.md
- 12 AI detector rows updated to describe the actual shipped
  behaviour, not the pre-fix state. Specifically: per-provider
  scoping, severity-by-category, structural-with-benign-objects,
  implicit-coverage, paired-count confidence, placeholder
  rejection, env-var constructor patterns.
- Cosign npm-install row now reflects the mandatory-by-default
  posture (was "degrades to checksum-only"). Documents the two
  escapes (TERRAIN_INSTALLER_ALLOW_MISSING_COSIGN=1,
  TERRAIN_INSTALLER_SKIP_VERIFY=1) and the redirect cap.

## docs/cli-spec.md
- New "Surface — canonical 11 + legacy aliases" section at the
  top documents the 0.2 namespace dispatchers and the
  TERRAIN_LEGACY_HINT=1 deprecation flow. Detailed per-command
  entries below stay valid since legacy aliases still work
  through 0.2.x.
- `terrain version` entry mentions the new schemaVersion field
  in --json output (CI tools can pin the snapshot contract).
- `terrain serve --read-only` description updated: was
  "no-op in 0.1.2", now reflects actual HTTP 405 method
  enforcement that shipped in 0.2.0.
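The commit message documents only that a `schemaVersion` field exists in `terrain version --json` output; the surrounding shape and the field's value below are assumptions, sketched to show what a CI tool pinning the snapshot contract would read:

```json
{
  "version": "0.2.0",
  "schemaVersion": 1
}
```

A CI job can then fail fast when `schemaVersion` changes, rather than diffing the whole snapshot.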

## docs/telemetry.md
- Example version field updated 0.1.0 → 0.2.0.

`make docs-verify` passes; full `go test ./...` clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

github-actions Bot commented May 2, 2026

[RISK] Terrain — Merge with caution

High-severity gaps found in changed code.

| Metric | Value |
|---|---|
| Changed files | 71 (30 source · 15 test) |
| Impacted units | 119 |
| Protection gaps | 48 |
| Tests selected | 117 of 772 (15% of suite) |

Coverage gaps in changed code

  • bin/terrain-installer.js [MED] — Exported function ensureTerrainBinary has no observed test coverage.
    → Add unit tests for exported function ensureTerrainBinary — this is public API surface.
  • bin/terrain-installer.js [MED] — Exported function runTerrainCli has no observed test coverage.
    → Add unit tests for exported function runTerrainCli — this is public API surface.
  • cmd/terrain/cmd_ai.go [LOW] — cmd_ai.go has no observed test coverage.
    → Add unit tests for cmd_ai.go.
  • cmd/terrain/cmd_config_namespace.go [LOW] — cmd_config_namespace.go has no observed test coverage.
    → Add unit tests for cmd_config_namespace.go.
  • cmd/terrain/cmd_serve.go [LOW] — cmd_serve.go has no observed test coverage.
    → Add unit tests for cmd_serve.go.
  • cmd/terrain/main.go [LOW] — main.go has no observed test coverage.
    → Add unit tests for main.go.
  • internal/aidetect/cost_regression.go [MED] — Exported function Detect has no observed test coverage.
    → Add unit tests for exported function Detect — this is public API surface.
  • internal/aidetect/hardcoded_api_key.go [MED] — Exported function Detect has no observed test coverage.
    → Add unit tests for exported function Detect — this is public API surface.
  • internal/aidetect/model_deprecation.go [MED] — Exported function Detect has no observed test coverage.
    → Add unit tests for exported function Detect — this is public API surface.
  • internal/aidetect/non_deterministic_eval.go [MED] — Exported function Detect has no observed test coverage.
    → Add unit tests for exported function Detect — this is public API surface.
  • ...and 38 more (37 medium, 1 low)
63 pre-existing issues on changed files
  • internal/aidetect/model_deprecation.go [MED] — [aiModelDeprecationRisk] model tag gpt-4 resolves to whatever the provider currently maps it to; pin a dated variant (e.g. gpt-4-0613)
  • internal/aidetect/model_deprecation.go [MED] — [aiModelDeprecationRisk] model tag gpt-3.5-turbo is a moving alias; pin a dated variant
  • internal/aidetect/model_deprecation.go [MED] — [aiModelDeprecationRisk] model tag claude-3-opus floats across provider releases; pin claude-3-opus-YYYYMMDD
  • internal/aidetect/model_deprecation.go [MED] — [aiModelDeprecationRisk] model tag claude-3-sonnet floats; pin a dated variant
  • internal/aidetect/model_deprecation.go [MED] — [aiModelDeprecationRisk] model tag claude-3-haiku floats; pin a dated variant
  • ...and 58 more

Recommended tests

117 test(s) with exact coverage of 72 impacted unit(s). 47 impacted unit(s) have no covering tests in the selected set.

| Package | Tests | Sample |
|---|---|---|
| internal/aidetect | 12 | internal/aidetect/cost_regression_test.go ... |
| cmd/terrain | 9 | cmd/terrain/ai_workflow_test.go ... |
| internal/engine | 8 | internal/engine/adversarial_test.go ... |
| internal/testdata | 8 | internal/testdata/adversarial_test.go ... |
| internal/analysis | 7 | internal/analysis/ai_extra_surfaces_test.go ... |
| internal/quality | 7 | internal/quality/coverage_blind_spot_test.go ... |
| internal/convert | 6 | internal/convert/catalog_test.go ... |
| internal/depgraph | 6 | internal/depgraph/bench_test.go ... |
| internal/airun | 5 | internal/airun/artifact_test.go ... |
| internal/models | 5 | internal/models/migrate_test.go ... |
| internal/reporting | 4 | internal/reporting/analyze_report_test.go ... |
| internal/impact | 3 | internal/impact/changeset_builder_test.go ... |
| internal/migration | 3 | internal/migration/detectors_test.go ... |
| internal/analyze | 2 | internal/analyze/analyze_golden_test.go ... |
| internal/changescope | 2 | internal/changescope/changescope_test.go ... |
| internal/explain | 2 | internal/explain/explain_golden_test.go ... |
| internal/insights | 2 | internal/insights/insights_golden_test.go ... |
| internal/measurement | 2 | internal/measurement/measurement_test.go ... |
| internal/ownership | 2 | internal/ownership/aggregate_test.go ... |
| internal/scoring | 2 | internal/scoring/risk_engine_benchmark_test.go ... |
| internal/benchmark | 1 | internal/benchmark/export_test.go |
| internal/calibration | 1 | internal/calibration/runner_test.go |
| internal/comparison | 1 | internal/comparison/compare_test.go |
| internal/gauntlet | 1 | internal/gauntlet/ingest_test.go |
| internal/governance | 1 | internal/governance/evaluate_test.go |
| internal/graph | 1 | internal/graph/graph_test.go |
| internal/heatmap | 1 | internal/heatmap/heatmap_test.go |
| internal/lifecycle | 1 | internal/lifecycle/lifecycle_test.go |
| internal/matrix | 1 | internal/matrix/matrix_test.go |
| internal/metrics | 1 | internal/metrics/metrics_test.go |
| internal/portfolio | 1 | internal/portfolio/portfolio_test.go |
| internal/sarif | 1 | internal/sarif/convert_test.go |
| internal/server | 1 | internal/server/server_test.go |
| internal/severity | 1 | internal/severity/rubric_test.go |
| internal/signals | 1 | internal/signals/detector_registry_test.go |
| internal/skipstats | 1 | internal/skipstats/summary_test.go |
| internal/stability | 1 | internal/stability/stability_test.go |
| internal/structural | 1 | internal/structural/structural_test.go |
| internal/summary | 1 | internal/summary/executive_test.go |
| internal/truthcheck | 1 | internal/truthcheck/calibration_test.go |

AI Validation

Scenarios: 0 of 16 selected

5 new findings introduced by this PR:

  • internal/aidetect/model_deprecation.go:34, 35, 36, 37, 38, 39, 41, 42, 43, 44, 45, 46, 47, 72 — Model tag is sunset or floats — the next API call could break or silently re-resolve.
    → Pin to a dated model variant (e.g. gpt-4-0613) or upgrade to a current tier.
  • internal/aidetect/model_deprecation_test.go:42, 105, 106, 107, 116, 119, 121, 122, 134, 181 — Model tag is sunset or floats — the next API call could break or silently re-resolve.
    → Pin to a dated model variant (e.g. gpt-4-0613) or upgrade to a current tier.
  • internal/analysis/ai_extra_surfaces.go (es_knn_search, faiss_in_memory_index, mcp_tool_ai_extra_surfaces.go, pgvector_query, weaviate_rest) — AI surface (prompt / tool / retriever / model) has zero test or scenario coverage.
    → Add an eval scenario or test that exercises this surface.
  • internal/changescope/changescope_test.go:600 — Model tag is sunset or floats — the next API call could break or silently re-resolve.
    → Pin to a dated model variant (e.g. gpt-4-0613) or upgrade to a current tier.
  • internal/severity/rubric.go:129 — User input flows into a prompt without visible escaping or boundary tokens.
    → Wrap user input through a sanitizer, or use a prompt template with explicit user-content boundaries.
4 advisory findings
  • internal/aidetect/model_deprecation.go:27, 28, 29, 30, 31, 41, 42, 43, 45, 62, 63, 72, 118 — Model tag is sunset or floats — the next API call could break or silently re-resolve.
    → Pin to a dated model variant (e.g. gpt-4-0613) or upgrade to a current tier.
  • internal/aidetect/model_deprecation_test.go:16, 95, 154, 155, 156, 157, 158, 159, 160, 161, 181 — Model tag is sunset or floats — the next API call could break or silently re-resolve.
    → Pin to a dated model variant (e.g. gpt-4-0613) or upgrade to a current tier.
  • internal/airun/promptfoo.go:278, 279 — Model tag is sunset or floats — the next API call could break or silently re-resolve.
    → Pin to a dated model variant (e.g. gpt-4-0613) or upgrade to a current tier.
  • internal/severity/rubric.go:225, 227, 228 — Model tag is sunset or floats — the next API call could break or silently re-resolve.
    → Pin to a dated model variant (e.g. gpt-4-0613) or upgrade to a current tier.

Changed AI surfaces without eval coverage (5):

  • retrieval: es_knn_search (internal/analysis/ai_extra_surfaces.go)
  • retrieval: faiss_in_memory_index (internal/analysis/ai_extra_surfaces.go)
  • retrieval: pgvector_query (internal/analysis/ai_extra_surfaces.go)
  • retrieval: weaviate_rest (internal/analysis/ai_extra_surfaces.go)
  • tool_definition: mcp_tool_ai_extra_surfaces.go (internal/analysis/ai_extra_surfaces.go)

Owners: PMCLSF

Limitations
  • No coverage artifacts provided; protection gaps reflect missing data, not measured absence. Provide --coverage to improve accuracy.
  • Mixed test cultures reduce cross-framework optimization confidence. Consider standardizing on fewer frameworks.

Generated by Terrain · terrain pr --json for machine-readable output

Targeted Test Results

Terrain selected 117 test(s) instead of the full suite.

  • Go tests: passed

@github-actions

github-actions Bot commented May 2, 2026

Terrain AI Validation

| Metric | Value |
|---|---|
| AI surfaces | 13 |
| Eval scenarios | 16 |
| Impacted scenarios | 0 |
| Uncovered surfaces | 13 |
| Blocking signals | 31 |

Decision: PASS — AI surfaces are covered.

pmclSF and others added 4 commits May 1, 2026 20:49
docs(0.2): Repo-wide American English sweep
The repo's existing convention is American English (240 AmE markers
vs 21 BrE pre-sweep). Recent docs and Go-source comments drifted to
British spellings during the polish review cycles; this commit
brings everything back to convention.

Mechanical replacements applied uniformly across `*.go`, `*.md`,
`*.js`, `*.yml`, `*.yaml` (excluding `node_modules/`,
`benchmarks/repos/`, `.git/`, `.claude/`, `tests/calibration/`,
`vendor/`):

- behaviour → behavior (incl. plurals + capitalization)
- normalise / normalised / normalising / normalisation →
  normalize / normalized / normalizing / normalization
- categorise / recognise / synthesise / summarise / prioritise /
  utilise / optimise / favour / finalise → -ize / -or / -ze
  variants
- labelled / labelling / modelled / modelling → labeled / labeling /
  modeled / modeling
- cancelled → canceled
- centre → center
- analyse → analyze (verb form; the -yze variant is the AmE
  convention and was already used as such elsewhere)
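Substitutions of this shape can be approximated with `find` + `sed` (illustrative only — the actual sweep may have used a different tool; the demo uses a scratch directory so the substitution and the directory exclusion are both visible, and shows just one of the replacement pairs):

```shell
# Build a scratch tree with a doc file and a "vendored" file.
dir=$(mktemp -d)
mkdir -p "$dir/docs" "$dir/vendor"
printf 'shipped behaviour\n' > "$dir/docs/notes.md"
printf 'vendored behaviour\n' > "$dir/vendor/dep.md"

# Prune excluded directories, then apply the word-level substitution
# to matching file types only. The real sweep chains more pairs
# (normalise→normalize, cancelled→canceled, ...) the same way.
find "$dir" -path "$dir/vendor" -prune -o -type f -name '*.md' -print0 |
  xargs -0 sed -i 's/behaviour/behavior/g'

cat "$dir/docs/notes.md"    # rewritten
cat "$dir/vendor/dep.md"    # untouched, because vendor/ was pruned
```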

64 files changed, 132 insertions / 132 deletions. Pure word-level
substitutions; no semantics affected. `make docs-gen` regenerated
manifest.json + severity-rubric.md + rule doc stubs to track the
Go-source comment changes.

Verified: `go build ./...` clean, `go test ./...` all pass,
`make docs-verify` zero-diff.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fix(0.2): AI validation gate is now impact-scoped
The terrain-ai.yml workflow was reporting whole-repo AI signals as
"blocking" for every PR, including doc-only ones. Both PR #128
(code-changing) and PR #129 (docs-only) generated 29 / 68 blocking
signals respectively — most referencing files the PR never touched
(calibration-corpus fixtures, the detector's own source, etc.).

Root cause in `internal/changescope/analyze.go`: the AI signals
collection loop iterated EVERY signal of category AI in the
snapshot, regardless of whether `sig.Location.File` was in the PR's
changed-files set. Compare to the impacted-scenarios loop right
above it (line 250), which IS impact-scoped via
`result.ImpactedScenarios`. The asymmetry meant the gate was
useless: the calibration corpus contains intentionally-bad
fixtures designed to trigger the detectors (that's their job —
they're regression tests), and those fixtures showed up as
"merge blockers" on every PR.

Fix: hoist the `changedPaths` set construction above the signals
loop and filter by `sig.Location.File`. Signals without a
Location.File (whole-repo emergent signals) are also dropped —
they belong in `terrain analyze`, not `terrain pr`.
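A minimal sketch of the scoping fix. The shipped types differ; `Signal`, `Location`, and `filterToChangedFiles` are simplified stand-ins, not the actual `changescope` API:

```go
package main

import "fmt"

type Location struct{ File string }

type Signal struct {
	Category string
	Location Location
}

func filterToChangedFiles(signals []Signal, changedFiles []string) []Signal {
	// Build the changed-paths set once, outside the loop (the "hoist"
	// described above).
	changedPaths := make(map[string]bool, len(changedFiles))
	for _, f := range changedFiles {
		changedPaths[f] = true
	}
	var kept []Signal
	for _, sig := range signals {
		if sig.Category != "AI" {
			continue // category filter, unchanged from before the fix
		}
		// Drop signals with no file (whole-repo emergent signals) and
		// signals on files the PR never touched.
		if sig.Location.File == "" || !changedPaths[sig.Location.File] {
			continue
		}
		kept = append(kept, sig)
	}
	return kept
}

func main() {
	signals := []Signal{
		{Category: "AI", Location: Location{File: "docs/cli-spec.md"}},
		{Category: "AI", Location: Location{File: "tests/calibration/fixture.yaml"}},
		{Category: "AI"}, // no Location.File
	}
	kept := filterToChangedFiles(signals, []string{"docs/cli-spec.md", "CHANGELOG.md"})
	fmt.Println(len(kept)) // only the signal on a changed file survives
}
```

With this shape, a calibration-corpus fixture elsewhere in the tree can never surface as a merge blocker on a docs-only PR.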

The pre-existing
`TestBuildAIValidationSummary_WithSignals` test was constructed
around the old (broken) behavior with no Location.File set on the
fixture signals; rewritten to assert the new contract:

  - Critical AI signal on a changed file → BlockingSignals
  - Medium AI signal on a changed file → WarningSignals
  - High AI signal on an UNCHANGED file → dropped
  - Quality signal → dropped (category filter, unchanged)

New regression test `TestBuildAIValidationSummary_DropsSignalsOnUnchangedFiles`
locks in the doc-only-PR case: a PR that only touches docs/* and
CHANGELOG.md should produce zero blocking signals from
calibration-corpus fixtures elsewhere in the tree.

Verified: `go test ./...` green; the new asymmetry between
"whole-repo signals via terrain analyze" and "PR-introduced
signals via terrain pr" matches the design contract for the two
commands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fix(0.2): Rewrite AI Validation PR comment for actual humans
The PR-comment format for `terrain pr --format markdown` had three
problems that made the AI Validation section unhelpful:

1. Detector taxonomy as the headline. Bullets started with bold
   `**aiPromptInjectionRisk**:` followed by the raw detector
   explanation. PR authors had to know the rubric to interpret it.
2. No file paths or line numbers. The Recommended Tests section
   right above already shows file paths (and they're clickable on
   GitHub); the AI Validation section dropped them.
3. No deduplication. 12 prompt-injection hits across 4 files
   produced 12 identical-looking bullets. The signal was lost in
   the noise.

Fix:

- AISignalSummary gains File/Line/Symbol fields (model.go);
  populated in analyze.go from sig.Location.

- New `internal/changescope/ai_signal_humanize.go` carries:
  - `humanSummary[type]` — one-sentence plain-language description
    per detector type (no taxonomy)
  - `humanAction[type]`  — one-sentence concrete next step
  - `groupSignalsByFileAndType()` — collapses N (file, type)
    duplicates into one entry whose Lines slice carries every
    distinct line number
  - `renderGroupedSignal()` — outputs `**\`path:42, 47, 51\`** —
    <plain summary>` followed by `→ <action>`

- render.go's markdown path now calls the grouped renderer for
  Blocking + Warning sections. Section header becomes
  "**N new finding(s) introduced by this PR**" instead of
  "Blocking signals (N)" so the framing matches what `terrain pr`
  is actually about (PR-introduced findings, not whole-repo state).

- The text renderer (used when --format isn't markdown) gets the
  same grouped output.

Test updates:
- TestRenderPRSummaryMarkdown_AISection rewritten for the new
  contract: locator presence, plain-language summary presence,
  action arrow, line-grouping for duplicates, symbol-keyed
  locator for tool findings, NO bare detector taxonomy in the
  bold headline.

Sample new output (12-line repetition collapses to 4 bullets):

  ### AI Validation
  Scenarios: 2 of 12 selected

  **4 new finding(s) introduced by this PR:**
  - **`evals/promptfoo.yaml:7`** — API key embedded in source or
    config — should be in env / secret store.
    → Move the secret to an env var (or your secrets manager).
  - **`agents/tools.yaml (delete_user)`** — Destructive tool can
    run without an approval gate, sandbox, or dry-run mode.
    → Add `requires_approval: true`, route through a sandbox.
  - **`src/auth/login.ts:42, 47, 51`** — User input flows into a
    prompt without visible escaping or boundary tokens.
    → Wrap user input through a sanitizer, or use a prompt
      template with explicit user-content boundaries.
  - **`src/chat/handler.ts:18`** — User input flows into a prompt
    ... [same summary, separate file]

Combined with the impact-scope filter from the previous commit,
the AI Validation section now does what it should:
PR-introduced findings only, grouped, with file paths and
plain-language explanations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fix(0.2): Aesthetic polish for terrain pr markdown comment
The PR comment now reads as a sequence of discrete cards rather than
a wall of bullets. Concrete changes:

## Header
- H2 + blockquote subtitle ("> High-severity gaps found in changed
  code.") instead of `## ... — verdict` + `*italic line*`. Blockquote
  renders as a callout on GitHub (left rule + soft background) which
  visually separates the "why" from the "what."

## Metrics table
- Compact 2-column layout with bold left-column labels:

  | Metric | Value |
  |---|---|
  | **Changed files** | 7 (5 source · 2 test) |
  | **Impacted units** | 12 |

  Empty rows skipped (`Impacted units: 0` doesn't render). Middle-dot
  separator (·) replaces commas in compound stats so cells read
  cleanly.

## Sentence-case headings + horizontal rules
- "New Risks (directly changed)" → "Coverage gaps in changed code"
- "Recommended Tests" → "Recommended tests"
- "Impacted capabilities:" + "Scenarios: 2 of 12 selected" inline →
  blockquote with both: `> **Capabilities:** ... · **Scenarios:** ...`
- Every major section is preceded by a `---` horizontal rule, giving
  the comment visual rhythm. Each section reads as its own card.

## Finding card shape (parallel across all sections)
- Coverage gaps and AI signals now share the same bullet shape:

  - **`path/to/file.ts`** [HIGH] — <plain-language description>
    → <suggested action>

  Pre-fix coverage gaps and AI signals were rendered by two different
  paths producing different shapes (one used severity prefix, one
  used taxonomy prefix); the inconsistency was visible in the same
  comment. Now both go through `renderFindingCard` /
  `renderGroupedSignal` with parallel structure.

## Pluralization
- New `pluralize(n, singular, plural)` helper replaces the awkward
  "finding(s)" / "issue(s)" / "gap(s)" notation in user-visible
  headers. "1 advisory finding", "2 advisory findings",
  "1 indirectly impacted protection gap", "3 pre-existing issues."
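The helper's signature comes from the description above; the body below is an assumed minimal implementation:

```go
package main

import "fmt"

// pluralize picks the singular or plural noun and prefixes the count,
// replacing "finding(s)"-style notation in user-visible headers.
func pluralize(n int, singular, plural string) string {
	if n == 1 {
		return fmt.Sprintf("%d %s", n, singular)
	}
	return fmt.Sprintf("%d %s", n, plural)
}

func main() {
	fmt.Println(pluralize(1, "advisory finding", "advisory findings"))
	fmt.Println(pluralize(3, "pre-existing issue", "pre-existing issues"))
}
```

Passing both noun forms explicitly (rather than appending "s") keeps irregular phrases like "1 indirectly impacted protection gap" correct without special cases.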

## Footer
- Owners + branding + limitations consolidated into a small-text
  footer using `<sub>` tags, separated from main content by a final
  `---`. Pre-fix these were scattered mid-comment.

## Test updates
- TestRenderPRSummaryMarkdown_DirectVsIndirectSections updated for
  new heading + card shape + pluralization.
- TestRenderPRSummaryMarkdown_FindingTruncation: italicized
  "_...and N more_" overflow message.
- TestRenderPRSummaryMarkdown_Deterministic + AISection +
  MixedTraditionalAndAI: matched against new sentence-case headings.

Sample new comment shape:

  ## [WARN] Terrain — Merge with caution

  > High-severity gaps found in changed code.

  | Metric | Value |
  |---|---|
  | **Changed files** | 7 (5 source · 2 test) |
  | **Impacted units** | 12 |

  ---

  ### Coverage gaps in changed code

  - **`src/auth/login.ts`** [HIGH] — Exported function authenticate
    has no observed test coverage.

  ---

  ### AI Validation

  > **Capabilities:** customer-support-bot · **Scenarios:** 2 of 12
  > selected

  **2 new findings introduced by this PR:**

  - **`src/auth/login.ts:42, 47, 51`** — User input flows into a
    prompt without visible escaping or boundary tokens.
    → Wrap user input through a sanitizer.

  ---

  <sub>Generated by [Terrain](...) · `terrain pr --json` for
  machine-readable output</sub>

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@pmclSF pmclSF merged commit 60b9838 into main May 2, 2026
10 checks passed
@pmclSF pmclSF deleted the docs/0.2-public-facing-refresh branch May 2, 2026 04:28
pmclSF added a commit that referenced this pull request May 9, 2026
  separator (·) replaces commas in compound stats so cells read
  cleanly.

## Sentence-case headings + horizontal rules
- "New Risks (directly changed)" → "Coverage gaps in changed code"
- "Recommended Tests" → "Recommended tests"
- "Impacted capabilities:" + "Scenarios: 2 of 12 selected" inline →
  blockquote with both: `> **Capabilities:** ... · **Scenarios:** ...`
- Every major section is preceded by a `---` horizontal rule, giving
  the comment visual rhythm. Each section reads as its own card.

## Finding card shape (parallel across all sections)
- Coverage gaps and AI signals now share the same bullet shape:

  - **`path/to/file.ts`** [HIGH] — <plain-language description>
    → <suggested action>

  Before this fix, coverage gaps and AI signals were rendered by two
  different code paths producing different bullet shapes (one used a
  severity prefix, the other a taxonomy prefix); the inconsistency was
  visible within a single comment. Now both go through
  `renderFindingCard` / `renderGroupedSignal` with parallel structure.
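A minimal sketch of the shared card shape. `renderFindingCard` is named in the commit, but the signature here is an assumption; the only fixed contract is the two-line bullet shape shown above.

```go
package main

import "fmt"

// renderFindingCard formats one finding in the two-line card shape
// shared by coverage gaps and AI signals:
//   - **`path`** [SEV] — <plain-language description>
//     → <suggested action>
// The action line is omitted when there is no suggested action.
func renderFindingCard(path, severity, summary, action string) string {
	card := fmt.Sprintf("- **`%s`** [%s] — %s", path, severity, summary)
	if action != "" {
		card += fmt.Sprintf("\n  → %s", action)
	}
	return card
}

func main() {
	fmt.Println(renderFindingCard(
		"src/auth/login.ts", "HIGH",
		"Exported function authenticate has no observed test coverage.",
		""))
}
```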

## Pluralization
- New `pluralize(n, singular, plural)` helper replaces the awkward
  "finding(s)" / "issue(s)" / "gap(s)" notation in user-visible
  headers. "1 advisory finding", "2 advisory findings",
  "1 indirectly impacted protection gap", "3 pre-existing issues."
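The helper itself is small enough to sketch in full; this is a plausible implementation given the signature named above, not necessarily the shipped one.

```go
package main

import "fmt"

// pluralize prefixes the count and picks the singular or plural form:
// "1 finding", "2 findings". Replaces "finding(s)"-style notation in
// user-visible headers.
func pluralize(n int, singular, plural string) string {
	if n == 1 {
		return fmt.Sprintf("%d %s", n, singular)
	}
	return fmt.Sprintf("%d %s", n, plural)
}

func main() {
	fmt.Println(pluralize(1, "advisory finding", "advisory findings"))
	fmt.Println(pluralize(3, "pre-existing issue", "pre-existing issues"))
	// → 1 advisory finding
	// → 3 pre-existing issues
}
```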

## Footer
- Owners + branding + limitations consolidated into a small-text
  footer using `<sub>` tags, separated from main content by a final
  `---`. Before this fix, these lines were scattered mid-comment.

## Test updates
- TestRenderPRSummaryMarkdown_DirectVsIndirectSections updated for
  new heading + card shape + pluralization.
- TestRenderPRSummaryMarkdown_FindingTruncation: italicized
  "_...and N more_" overflow message.
- TestRenderPRSummaryMarkdown_Deterministic + AISection +
  MixedTraditionalAndAI: matched against new sentence-case headings.

Sample new comment shape:

  ## [WARN] Terrain — Merge with caution

  > High-severity gaps found in changed code.

  | Metric | Value |
  |---|---|
  | **Changed files** | 7 (5 source · 2 test) |
  | **Impacted units** | 12 |

  ---

  ### Coverage gaps in changed code

  - **`src/auth/login.ts`** [HIGH] — Exported function authenticate
    has no observed test coverage.

  ---

  ### AI Validation

  > **Capabilities:** customer-support-bot · **Scenarios:** 2 of 12
  > selected

  **2 new findings introduced by this PR:**

  - **`src/auth/login.ts:42, 47, 51`** — User input flows into a
    prompt without visible escaping or boundary tokens.
    → Wrap user input through a sanitizer.

  ---

  <sub>Generated by [Terrain](...) · `terrain pr --json` for
  machine-readable output</sub>

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
