diff --git a/docs/src/content/docs/reference/effective-tokens-specification.md b/docs/src/content/docs/reference/effective-tokens-specification.md
index fde89238f78..4027eb2102b 100644
--- a/docs/src/content/docs/reference/effective-tokens-specification.md
+++ b/docs/src/content/docs/reference/effective-tokens-specification.md
@@ -40,8 +40,9 @@ This document is governed by the GitHub Agentic Workflows project specifications
 10. [Compliance Testing](#10-compliance-testing)
 11. [Appendices](#appendices)
 12. [Model Multiplier Registry](#model-multiplier-registry)
-13. [References](#references)
-14. [Change Log](#change-log)
+13. [Sync Notes](#sync-notes)
+14. [References](#references)
+15. [Change Log](#change-log)
 
 ---
 
@@ -473,12 +474,31 @@ This file is embedded at compile time into the `gh-aw` binary using a Go `//go:e
 
 **R-REG-006**: Custom multipliers supplied by the caller (e.g., via API or configuration) MUST be merged with registry multipliers. Custom values take precedence and MUST be disclosed in any report that uses them.
 
+**R-REG-007**: The registry MUST NOT contain placeholder values such as `TBD`, `null`, or empty strings for any model multiplier entry. Each declared model key MUST map to a numeric multiplier value.
+
+**R-REG-008**: When adding support for a new model, maintainers MUST register the model in `pkg/cli/data/model_multipliers.json` with a concrete numeric multiplier before release. If calibration is incomplete, the model MUST be omitted from the registry and the implementation fallback behavior in R-REG-005 applies.
+
 ### Registry Versioning
 
 The `version` field in `model_multipliers.json` corresponds to the registry schema version, not the gh-aw binary version. Implementations SHOULD include the registry version in all ET summary reports to enable historical reconstruction.
 
 ---
 
+## Sync Notes
+
+The Effective Tokens registry is maintained in `pkg/cli/data/model_multipliers.json` and loaded by `pkg/cli/effective_tokens.go`.
+
+To keep specification and implementation synchronized:
+
+1. Update this specification's registry requirements when adding, removing, or re-scaling model multipliers.
+2. Update `pkg/cli/data/model_multipliers.json` in the same change.
+3. Verify loading and fallback behavior in `pkg/cli/effective_tokens_test.go` (`TestModelMultipliersJSONEmbedded`, `TestResolveEffectiveWeightsDefault`, and inventory checks).
+4. Run `make build` so the embedded registry is rebuilt into the `gh-aw` binary.
+
+Conforming releases SHOULD include a test assertion for newly added model multipliers to ensure implementation-registry parity.
+
+---
+
 ## References
 
 ### Normative References
diff --git a/docs/src/content/docs/reference/experiments-specification.md b/docs/src/content/docs/reference/experiments-specification.md
index 36483d4b612..eda7770ca27 100644
--- a/docs/src/content/docs/reference/experiments-specification.md
+++ b/docs/src/content/docs/reference/experiments-specification.md
@@ -691,6 +691,13 @@ implemented within a single workflow file. Engine-switching experiments **MUST**
 compiled workflow files (one per variant), which can then be compared via their respective
 GitHub Actions run metrics.
 
+**R-MULTI-005**: When two or more experiments are simultaneously active in the same analysis
+window, reporting tools **MUST** detect and bound interaction risk by preserving the full
+assignment vector per run and evaluating whether each observed combination cell has sufficient
+sample coverage. If interaction effects cannot be bounded (for example, sparse cells below
+`min_samples`), the report **MUST** emit an explicit interaction-risk status and **MUST NOT**
+recommend PROMOTE for affected variants.
+
 ### 12.1 Conflict Resolution Norms
 
 A **conflict** occurs when two or more simultaneously active experiments would assign
@@ -1115,10 +1122,26 @@ approximate minimum runs per variant are:
 5. **State branch growth**: The experiments git branch grows monotonically.
    Operators **MAY** prune old commits from the experiments branch without affecting the current state.
 
+### Sync Follow-ups (May 2026 Expert Review)
+
+This appendix itemizes corrective follow-ups referenced in the abstract.
+
+- **FR-001 (implemented via R-SELECT-006)**: Weighted selection increments invocation counters after each selection.
+- **FR-002 (implemented via R-STAT-001/R-STAT-002)**: Reporting uses `state.runs` assignment records instead of count-delta inference.
+- **FR-003 (implemented via R-STAT-011/R-STAT-012)**: Reporting workflows that write issues/discussions declare explicit write permissions.
+- **FR-004 (implemented via R-MULTI-005)**: Concurrent-experiment interaction effects are explicitly detected and bounded before promotion decisions.
+- **TODO(experiments, owner: @gh-aw-maintainers, target: v1.1.0)**: Add factorial-interaction analysis helpers to reporting workflows for K₁×K₂ cell significance output.
+- **TODO(experiments, owner: @gh-aw-maintainers, target: v1.1.0)**: Add compiler diagnostics for sparse interaction cells when >1 experiment is active and weighted traffic is configured.
+
 ---
 
 ## Change Log
 
+### Version 1.0.1 (Draft) — 2026-05-07
+
+- **Added**: R-MULTI-005 requiring interaction-risk detection/bounding for simultaneous experiments.
+- **Added**: Sync Follow-ups appendix with itemized May 2026 expert-review corrective items and owned TODOs.
+
 ### Version 1.0.0 (Draft) — 2026-05-03
 
 - **Initial publication** consolidating ADR-29534, ADR-29618, ADR-29628, ADR-29985, and ADR-29996.
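Review note on R-MULTI-005: the sparse-cell check it describes could be sketched in Go roughly as below. This is a minimal illustration under stated assumptions, not the project's reporting code: the `Run` type, the `SparseCells` and `InteractionStatus` helpers, and the `INTERACTION_RISK`/`OK` status strings are all hypothetical names introduced here. Only the idea of counting runs per observed assignment-vector cell and comparing against `min_samples` comes from the requirement text.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Run maps experiment name -> assigned variant for one workflow run.
// Hypothetical shape for the "full assignment vector per run".
type Run map[string]string

// cellKey renders an assignment vector as a stable, order-independent key.
func cellKey(r Run) string {
	names := make([]string, 0, len(r))
	for name := range r {
		names = append(names, name)
	}
	sort.Strings(names)
	parts := make([]string, 0, len(names))
	for _, name := range names {
		parts = append(parts, name+"="+r[name])
	}
	return strings.Join(parts, ",")
}

// SparseCells returns, in sorted order, every observed combination cell
// that has fewer than minSamples runs.
func SparseCells(runs []Run, minSamples int) []string {
	counts := make(map[string]int)
	for _, r := range runs {
		counts[cellKey(r)]++
	}
	sparse := []string{}
	for cell, n := range counts {
		if n < minSamples {
			sparse = append(sparse, cell)
		}
	}
	sort.Strings(sparse)
	return sparse
}

// InteractionStatus reports a hypothetical status string: a report would
// withhold PROMOTE recommendations whenever it is not "OK".
func InteractionStatus(runs []Run, minSamples int) string {
	if len(SparseCells(runs, minSamples)) > 0 {
		return "INTERACTION_RISK"
	}
	return "OK"
}

func main() {
	runs := []Run{
		{"prompt": "A", "model": "X"},
		{"prompt": "A", "model": "X"},
		{"prompt": "B", "model": "X"},
	}
	fmt.Println(SparseCells(runs, 2))
	fmt.Println(InteractionStatus(runs, 2))
}
```

The sketch only inspects cells that were actually observed, matching the wording "each observed combination cell". A fuller implementation might also flag combinations that never occurred at all.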
diff --git a/docs/src/content/docs/reference/frontmatter-hash-specification.md b/docs/src/content/docs/reference/frontmatter-hash-specification.md
index 4f04674b8f7..80034b3d313 100644
--- a/docs/src/content/docs/reference/frontmatter-hash-specification.md
+++ b/docs/src/content/docs/reference/frontmatter-hash-specification.md
@@ -1,10 +1,21 @@
 ---
 title: Frontmatter Hash Specification
 description: Specification for computing deterministic hashes of agentic workflow frontmatter
+version: 1.0.0
+status: Draft
+publication_date: 2026-05-07
 ---
 
 # Frontmatter Hash Specification
 
+**Version**: 1.0.0
+**Status**: Draft
+**Publication Date**: 2026-05-07
+**Latest Version**: [frontmatter-hash-specification](/gh-aw/reference/frontmatter-hash-specification/)
+**Editor**: GitHub Agentic Workflows Team
+
+---
+
 This document specifies the algorithm for computing a deterministic hash of agentic workflow frontmatter, including contributions from imported workflows.
 
 ## Purpose
@@ -14,6 +25,17 @@ The frontmatter hash provides:
 2. **Reproducibility**: Ensure identical configurations produce identical hashes across languages (Go and JavaScript)
 3. **Change detection**: Verify that workflow configuration has not changed between compilation and execution
 
+## Conformance
+
+### Conformance Classes
+
+- **Basic Conformance**: An implementation MUST compute a deterministic SHA-256 hash from canonicalized frontmatter input and MUST produce the same output for identical input.
+- **Full Conformance**: An implementation MUST satisfy Basic Conformance and MUST implement cross-language consistency checks between Go and JavaScript implementations.
+
+### Requirements Notation
+
+The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHALL**, **SHALL NOT**, **SHOULD**, **SHOULD NOT**, **RECOMMENDED**, **MAY**, and **OPTIONAL** in this document are to be interpreted as described in [RFC 2119](https://www.rfc-editor.org/rfc/rfc2119).
+
 ## Hash Algorithm
 
 ### 1. Input Collection
@@ -262,6 +284,12 @@ The Go and JavaScript implementations must produce byte-for-byte identical canon
 
 **Mitigation**: Maintain a shared test-vector file (at minimum: empty frontmatter, single-field workflow, multi-level imports, all field types). Run cross-language hash tests in CI. Any change to the serialization algorithm in either language MUST be accompanied by updated test vectors verified against both implementations.
 
+### S-6: Maximum Frontmatter Input Size
+
+Very large frontmatter payloads can cause excessive memory use and hash-computation latency during compilation and runtime verification. This can degrade CI reliability and increase stale-lock false positives due to timeout or resource pressure.
+
+**Mitigation**: Implementations SHOULD enforce a maximum cumulative frontmatter input size and MUST fail deterministically with a descriptive error when the limit is exceeded. A limit of 1 MiB for the combined normalized frontmatter input is RECOMMENDED unless repository-specific requirements justify a higher bound.
+
 ---
 
 ## Security Considerations
diff --git a/docs/src/content/docs/reference/fuzzy-schedule-specification.md b/docs/src/content/docs/reference/fuzzy-schedule-specification.md
index 8e59fedc000..bd7ce974a18 100644
--- a/docs/src/content/docs/reference/fuzzy-schedule-specification.md
+++ b/docs/src/content/docs/reference/fuzzy-schedule-specification.md
@@ -524,6 +524,15 @@ The scattering algorithm MUST provide:
 3. **Stability**: Scattered times remain constant across recompilations
 4. **Uniqueness**: Different workflow identifiers produce different scattered times
 
+The scattering algorithm uses the following formal input entities:
+
+| Entity | Type | Constraints | Description |
+|---|---|---|---|
+| `workflow_identifier` | string | MUST be non-empty; SHOULD use `owner/repo/path/to/workflow.md` format | Canonical identifier hashed for deterministic scatter selection |
+| `schedule_string` | string | MUST match a supported fuzzy placeholder form (`FUZZY:*`) | Parsed schedule expression that determines algorithm branch |
+| `seed` | unsigned 32-bit integer | MUST be derived deterministically from `workflow_identifier` using the configured hash function | Hash-derived seed used for modulo operations |
+| `window_minutes` | integer | MUST be positive; MUST NOT exceed 1440 | Candidate-minute search window for around/between scattering |
+
 ### 6.2 Hash Function Requirements
 
 #### 6.2.1 Hash Algorithm Selection
diff --git a/docs/src/content/docs/reference/mcp-scripts-specification.md b/docs/src/content/docs/reference/mcp-scripts-specification.md
index bc2ed21c4b4..37678cfe985 100644
--- a/docs/src/content/docs/reference/mcp-scripts-specification.md
+++ b/docs/src/content/docs/reference/mcp-scripts-specification.md
@@ -186,7 +186,7 @@ mcp-scripts:
       // JavaScript implementation
     env:
       SECRET_NAME: "${{ secrets.SECRET_NAME }}"
-    timeout: 60
+    timeout: 30
 ```
 
 **JSON Schema**: [mcp-scripts-config.schema.json](/gh-aw/schemas/mcp-scripts-config.schema.json)
@@ -209,7 +209,7 @@ Each tool configuration MAY contain:
 | `py` | string | Conditional* | Python script implementation |
 | `go` | string | Conditional* | Go code implementation |
 | `env` | object | No | Environment variables (typically secrets) |
-| `timeout` | integer | No | Execution timeout in seconds (default: 60, applies to run/py/go only) |
+| `timeout` | integer | No | Execution timeout in seconds (default: 30, applies to run/py/go only) |
 | `dependencies` | array[string] | No | Package dependencies to install in execution environment (runtime-specific) |
 
 *Exactly ONE of `script`, `run`, `py`, or `go` MUST be provided per tool.
@@ -417,6 +417,14 @@ For JavaScript tools:
 - Thrown errors indicate failure
 - Async functions are awaited
 
+### 5.6 Runtime Timeout Requirements
+
+Each runtime handler (`script`, `run`, `py`, and `go`) **MUST** enforce a configurable execution timeout and **MUST** terminate tool execution when the timeout is reached.
+
+Implementations **SHOULD** default this timeout to 30 seconds or less unless the workflow author explicitly configures a different value.
+
+When a timeout occurs, the server **MUST** return a JSON-RPC execution error (`-32603`) that explicitly identifies timeout termination.
+
 ---
 
 ## 6. Language Support