24 changes: 22 additions & 2 deletions docs/src/content/docs/reference/effective-tokens-specification.md
@@ -40,8 +40,9 @@ This document is governed by the GitHub Agentic Workflows project specifications
10. [Compliance Testing](#10-compliance-testing)
11. [Appendices](#appendices)
12. [Model Multiplier Registry](#model-multiplier-registry)
- 13. [References](#references)
- 14. [Change Log](#change-log)
+ 13. [Sync Notes](#sync-notes)
+ 14. [References](#references)
+ 15. [Change Log](#change-log)

---

@@ -473,12 +474,31 @@ This file is embedded at compile time into the `gh-aw` binary using a Go `//go:e

**R-REG-006**: Custom multipliers supplied by the caller (e.g., via API or configuration) MUST be merged with registry multipliers. Custom values take precedence and MUST be disclosed in any report that uses them.

**R-REG-007**: The registry MUST NOT contain placeholder values such as `TBD`, `null`, or empty strings for any model multiplier entry. Each declared model key MUST map to a numeric multiplier value.

**R-REG-008**: When adding support for a new model, maintainers MUST register the model in `pkg/cli/data/model_multipliers.json` with a concrete numeric multiplier before release. If calibration is incomplete, the model MUST be omitted from the registry and the implementation fallback behavior in R-REG-005 applies.
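
The registry rules above (R-REG-006 and R-REG-007) can be sketched as a small validation and merge step. This is a minimal illustration, not the `gh-aw` implementation: the JSON shape (`version` plus a `multipliers` object), the model names, and the helpers `parseRegistry` and `mergeMultipliers` are all assumptions made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// registryFile is an assumed shape for model_multipliers.json; the real
// schema may differ. Values decode as `any` so placeholders stay detectable.
type registryFile struct {
	Version     string         `json:"version"`
	Multipliers map[string]any `json:"multipliers"`
}

// parseRegistry rejects any entry that is not a JSON number (R-REG-007).
// Placeholders such as "TBD", "" and null all fail the float64 assertion.
func parseRegistry(data []byte) (map[string]float64, error) {
	var file registryFile
	if err := json.Unmarshal(data, &file); err != nil {
		return nil, fmt.Errorf("parse registry: %w", err)
	}
	out := make(map[string]float64, len(file.Multipliers))
	for model, raw := range file.Multipliers {
		value, ok := raw.(float64) // JSON numbers decode to float64
		if !ok {
			return nil, fmt.Errorf("model %q: multiplier must be numeric, got %v", model, raw)
		}
		out[model] = value
	}
	return out, nil
}

// mergeMultipliers overlays custom values on registry values (R-REG-006)
// and returns the sorted custom keys so a report can disclose them.
func mergeMultipliers(registry, custom map[string]float64) (map[string]float64, []string) {
	merged := make(map[string]float64, len(registry)+len(custom))
	for model, value := range registry {
		merged[model] = value
	}
	disclosed := make([]string, 0, len(custom))
	for model, value := range custom {
		merged[model] = value // custom values take precedence
		disclosed = append(disclosed, model)
	}
	sort.Strings(disclosed)
	return merged, disclosed
}

func main() {
	// Hypothetical registry content, used only for this example.
	data := []byte(`{"version": "1", "multipliers": {"model-a": 1.0, "model-b": 2.5}}`)
	registry, err := parseRegistry(data)
	if err != nil {
		fmt.Println("invalid registry:", err)
		return
	}
	merged, disclosed := mergeMultipliers(registry, map[string]float64{"model-a": 3.0})
	fmt.Println(merged["model-a"], merged["model-b"], disclosed)
}
```

Decoding values as `any` rather than `float64` is deliberate in this sketch: Go's `encoding/json` leaves a `float64` untouched when it sees `null`, so a typed map would silently accept a `null` placeholder as `0`.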

### Registry Versioning

The `version` field in `model_multipliers.json` corresponds to the registry schema version, not the gh-aw binary version. Implementations SHOULD include the registry version in all ET summary reports to enable historical reconstruction.

---

## Sync Notes

The Effective Tokens registry is maintained in `pkg/cli/data/model_multipliers.json` and loaded by `pkg/cli/effective_tokens.go`.

To keep specification and implementation synchronized:

1. Update this specification's registry requirements when adding, removing, or re-scaling model multipliers.
2. Update `pkg/cli/data/model_multipliers.json` in the same change.
3. Verify loading and fallback behavior in `pkg/cli/effective_tokens_test.go` (`TestModelMultipliersJSONEmbedded`, `TestResolveEffectiveWeightsDefault`, and inventory checks).
4. Run `make build` so the embedded registry is rebuilt into the `gh-aw` binary.

Conforming releases SHOULD include a test assertion for newly added model multipliers to ensure implementation-registry parity.

---

## References

### Normative References
23 changes: 23 additions & 0 deletions docs/src/content/docs/reference/experiments-specification.md
@@ -691,6 +691,13 @@ implemented within a single workflow file. Engine-switching experiments **MUST**
compiled workflow files (one per variant), which can then be compared via their respective
GitHub Actions run metrics.

**R-MULTI-005**: When two or more experiments are simultaneously active in the same analysis
window, reporting tools **MUST** detect and bound interaction risk by preserving the full
assignment vector per run and evaluating whether each observed combination cell has sufficient
sample coverage. If interaction effects cannot be bounded (for example, sparse cells below
`min_samples`), the report **MUST** emit an explicit interaction-risk status and **MUST NOT**
recommend PROMOTE for affected variants.
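
One way to read R-MULTI-005 is as a per-cell sample count over the full assignment vector. The sketch below illustrates that reading only; the `run` type, the `cellKey` encoding, and the experiment and variant names are hypothetical and do not come from the reporting tools.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// run holds the full assignment vector for one workflow run:
// experiment name -> assigned variant. The type is illustrative.
type run struct {
	Assignments map[string]string
}

// cellKey builds a stable key such as "exp_a=control|exp_b=treatment"
// by sorting experiment names before joining them.
func cellKey(assignments map[string]string) string {
	names := make([]string, 0, len(assignments))
	for name := range assignments {
		names = append(names, name)
	}
	sort.Strings(names)
	parts := make([]string, 0, len(names))
	for _, name := range names {
		parts = append(parts, name+"="+assignments[name])
	}
	return strings.Join(parts, "|")
}

// sparseCells counts runs per observed combination cell and returns the
// cells whose count is below minSamples, sorted for stable output.
func sparseCells(runs []run, minSamples int) []string {
	counts := map[string]int{}
	for _, r := range runs {
		counts[cellKey(r.Assignments)]++
	}
	var sparse []string
	for key, n := range counts {
		if n < minSamples {
			sparse = append(sparse, key)
		}
	}
	sort.Strings(sparse)
	return sparse
}

func main() {
	runs := []run{
		{Assignments: map[string]string{"exp_a": "control", "exp_b": "fast"}},
		{Assignments: map[string]string{"exp_a": "control", "exp_b": "fast"}},
		{Assignments: map[string]string{"exp_a": "control", "exp_b": "slow"}},
	}
	if sparse := sparseCells(runs, 2); len(sparse) > 0 {
		// Interaction effects cannot be bounded: emit a status, withhold PROMOTE.
		fmt.Println("interaction-risk: unbounded for cells", sparse)
	}
}
```

A report built on this check would withhold a PROMOTE recommendation for any variant that appears in a sparse cell. Only observed cells are counted here, matching the requirement's wording about "each observed combination cell".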

### 12.1 Conflict Resolution Norms

A **conflict** occurs when two or more simultaneously active experiments would assign
@@ -1115,10 +1122,26 @@ approximate minimum runs per variant are:
5. **State branch growth**: The experiments git branch grows monotonically. Operators
**MAY** prune old commits from the experiments branch without affecting the current state.

### Sync Follow-ups (May 2026 Expert Review)

This appendix itemizes corrective follow-ups referenced in the abstract.

- **FR-001 (implemented via R-SELECT-006)**: Weighted selection increments invocation counters after each selection.
- **FR-002 (implemented via R-STAT-001/R-STAT-002)**: Reporting uses `state.runs` assignment records instead of count-delta inference.
- **FR-003 (implemented via R-STAT-011/R-STAT-012)**: Reporting workflows that write issues/discussions declare explicit write permissions.
- **FR-004 (implemented via R-MULTI-005)**: Concurrent-experiment interaction effects are explicitly detected and bounded before promotion decisions.
- **TODO(experiments, owner: @gh-aw-maintainers, target: v1.1.0)**: Add factorial-interaction analysis helpers to reporting workflows for K₁×K₂ cell significance output.
- **TODO(experiments, owner: @gh-aw-maintainers, target: v1.1.0)**: Add compiler diagnostics for sparse interaction cells when more than one experiment is active and weighted traffic is configured.

---

## Change Log

### Version 1.0.1 (Draft) — 2026-05-07

- **Added**: R-MULTI-005 requiring interaction-risk detection/bounding for simultaneous experiments.
- **Added**: Sync Follow-ups appendix with itemized May 2026 expert-review corrective items and owned TODOs.

### Version 1.0.0 (Draft) — 2026-05-03

- **Initial publication** consolidating ADR-29534, ADR-29618, ADR-29628, ADR-29985, and ADR-29996.
28 changes: 28 additions & 0 deletions docs/src/content/docs/reference/frontmatter-hash-specification.md
@@ -1,10 +1,21 @@
---
title: Frontmatter Hash Specification
description: Specification for computing deterministic hashes of agentic workflow frontmatter
version: 1.0.0
status: Draft
publication_date: 2026-05-07
---

# Frontmatter Hash Specification

**Version**: 1.0.0
**Status**: Draft
**Publication Date**: 2026-05-07
**Latest Version**: [frontmatter-hash-specification](/gh-aw/reference/frontmatter-hash-specification/)
**Editor**: GitHub Agentic Workflows Team

---

This document specifies the algorithm for computing a deterministic hash of agentic workflow frontmatter, including contributions from imported workflows.

## Purpose
@@ -14,6 +25,17 @@ The frontmatter hash provides:
2. **Reproducibility**: Ensure identical configurations produce identical hashes across languages (Go and JavaScript)
3. **Change detection**: Verify that workflow configuration has not changed between compilation and execution

## Conformance

### Conformance Classes

- **Basic Conformance**: An implementation MUST compute a deterministic SHA-256 hash from canonicalized frontmatter input and MUST produce the same output for identical input.
- **Full Conformance**: An implementation MUST satisfy Basic Conformance and MUST implement cross-language consistency checks between Go and JavaScript implementations.

### Requirements Notation

The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHALL**, **SHALL NOT**, **SHOULD**, **SHOULD NOT**, **RECOMMENDED**, **MAY**, and **OPTIONAL** in this document are to be interpreted as described in [RFC 2119](https://www.rfc-editor.org/rfc/rfc2119).

## Hash Algorithm

### 1. Input Collection
@@ -262,6 +284,12 @@ The Go and JavaScript implementations must produce byte-for-byte identical canon

**Mitigation**: Maintain a shared test-vector file (at minimum: empty frontmatter, single-field workflow, multi-level imports, all field types). Run cross-language hash tests in CI. Any change to the serialization algorithm in either language MUST be accompanied by updated test vectors verified against both implementations.

### S-6: Maximum Frontmatter Input Size

Very large frontmatter payloads can cause excessive memory use and hash-computation latency during compilation and runtime verification. This can degrade CI reliability and increase stale-lock false positives due to timeout or resource pressure.

**Mitigation**: Implementations SHOULD enforce a maximum cumulative frontmatter input size and MUST fail deterministically with a descriptive error when the limit is exceeded. A limit of 1 MiB for the combined normalized frontmatter input is RECOMMENDED unless repository-specific requirements justify a higher bound.
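
The size guard in S-6 can sit directly in front of the SHA-256 step. The sketch below assumes the canonicalized frontmatter is already available as a byte slice; `hashFrontmatter`, `maxFrontmatterBytes`, and `ErrFrontmatterTooLarge` are names invented for this example, and canonicalization itself is out of scope here.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

// maxFrontmatterBytes is the RECOMMENDED 1 MiB limit from S-6.
const maxFrontmatterBytes = 1 << 20

// ErrFrontmatterTooLarge is a hypothetical sentinel error for this sketch.
var ErrFrontmatterTooLarge = errors.New("frontmatter input exceeds size limit")

// hashFrontmatter fails deterministically when the input exceeds limit and
// otherwise returns the lowercase hex SHA-256 digest of the canonical bytes.
func hashFrontmatter(canonical []byte, limit int) (string, error) {
	if len(canonical) > limit {
		return "", fmt.Errorf("%w: %d bytes, limit is %d bytes",
			ErrFrontmatterTooLarge, len(canonical), limit)
	}
	sum := sha256.Sum256(canonical)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	// Hypothetical canonical input, used only for this example.
	digest, err := hashFrontmatter([]byte(`{"engine":"copilot"}`), maxFrontmatterBytes)
	if err != nil {
		fmt.Println("hash failed:", err)
		return
	}
	fmt.Println(digest)
}
```

Checking the length before hashing keeps the failure independent of timing or memory pressure: the same oversized input always produces the same error, which is the deterministic behavior the mitigation asks for.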

---

## Security Considerations
@@ -524,6 +524,15 @@ The scattering algorithm MUST provide:
3. **Stability**: Scattered times remain constant across recompilations
4. **Uniqueness**: Different workflow identifiers produce different scattered times

The scattering algorithm uses the following formal input entities:

| Entity | Type | Constraints | Description |
|---|---|---|---|
| `workflow_identifier` | string | MUST be non-empty; SHOULD use `owner/repo/path/to/workflow.md` format | Canonical identifier hashed for deterministic scatter selection |
| `schedule_string` | string | MUST match a supported fuzzy placeholder form (`FUZZY:*`) | Parsed schedule expression that determines algorithm branch |
| `seed` | unsigned 32-bit integer | MUST be derived deterministically from `workflow_identifier` using the configured hash function | Hash-derived seed used for modulo operations |
| `window_minutes` | integer | MUST be positive; MUST NOT exceed 1440 | Candidate-minute search window for around/between scattering |
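
The constraints in the table can be sketched as a validation step followed by seed derivation. This is an illustration under stated assumptions: FNV-1a (32-bit) stands in for "the configured hash function", which this table does not name, and the `scatterInput` type and `offsetMinutes` helper are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"hash/fnv"
	"strings"
)

// scatterInput mirrors the input entities in the table; field names are illustrative.
type scatterInput struct {
	WorkflowIdentifier string // e.g. "owner/repo/path/to/workflow.md"
	ScheduleString     string // must use a FUZZY:* placeholder form
	WindowMinutes      int    // 1..1440
}

// validate enforces the MUST-level constraints from the table.
func (in scatterInput) validate() error {
	if in.WorkflowIdentifier == "" {
		return errors.New("workflow_identifier must be non-empty")
	}
	if !strings.HasPrefix(in.ScheduleString, "FUZZY:") {
		return errors.New("schedule_string must be a FUZZY:* placeholder")
	}
	if in.WindowMinutes <= 0 || in.WindowMinutes > 1440 {
		return errors.New("window_minutes must be in the range 1..1440")
	}
	return nil
}

// seed derives an unsigned 32-bit value from the identifier.
// FNV-1a is an assumption made for this sketch.
func seed(workflowIdentifier string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(workflowIdentifier))
	return h.Sum32()
}

// offsetMinutes maps the seed into the window with a modulo operation.
func offsetMinutes(in scatterInput) (int, error) {
	if err := in.validate(); err != nil {
		return 0, err
	}
	return int(seed(in.WorkflowIdentifier) % uint32(in.WindowMinutes)), nil
}

func main() {
	in := scatterInput{
		WorkflowIdentifier: "owner/repo/path/to/workflow.md",
		ScheduleString:     "FUZZY:DAILY", // hypothetical placeholder value
		WindowMinutes:      60,
	}
	offset, err := offsetMinutes(in)
	if err != nil {
		fmt.Println("invalid input:", err)
		return
	}
	fmt.Println("scatter offset (minutes):", offset)
}
```

Because the seed depends only on `workflow_identifier`, the offset stays the same across recompilations, which is the stability property listed above.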

### 6.2 Hash Function Requirements

#### 6.2.1 Hash Algorithm Selection
12 changes: 10 additions & 2 deletions docs/src/content/docs/reference/mcp-scripts-specification.md
@@ -186,7 +186,7 @@ mcp-scripts:
// JavaScript implementation
env:
SECRET_NAME: "${{ secrets.SECRET_NAME }}"
- timeout: 60
+ timeout: 30
```

**JSON Schema**: [mcp-scripts-config.schema.json](/gh-aw/schemas/mcp-scripts-config.schema.json)
@@ -209,7 +209,7 @@ Each tool configuration MAY contain:
| `py` | string | Conditional* | Python script implementation |
| `go` | string | Conditional* | Go code implementation |
| `env` | object | No | Environment variables (typically secrets) |
- | `timeout` | integer | No | Execution timeout in seconds (default: 60, applies to run/py/go only) |
+ | `timeout` | integer | No | Execution timeout in seconds (default: 30, applies to run/py/go only) |
| `dependencies` | array[string] | No | Package dependencies to install in execution environment (runtime-specific) |

*Exactly ONE of `script`, `run`, `py`, or `go` MUST be provided per tool.
@@ -417,6 +417,14 @@ For JavaScript tools:
- Thrown errors indicate failure
- Async functions are awaited

### 5.6 Runtime Timeout Requirements

Each runtime handler (`script`, `run`, `py`, and `go`) **MUST** enforce a configurable execution timeout and **MUST** terminate tool execution when the timeout is reached.

Implementations **SHOULD** default this timeout to 30 seconds or less unless the workflow author explicitly configures a different value.

When a timeout occurs, the server **MUST** return a JSON-RPC execution error (`-32603`) that explicitly identifies timeout termination.

---

## 6. Language Support