SPDD: close safeguards/consistency gaps across 5 specs and add targeted compliance vectors #33365

Merged
pelikhan merged 6 commits into main from copilot/spdd-daily-spec-work-plan-2026-05-19
May 19, 2026

Contributor

Copilot AI commented May 19, 2026

This PR addresses the daily SPDD findings across the Effective Tokens, Forecast, Frontmatter Hash, Fuzzy Schedule, and MCP Scripts specifications. It closes gaps in safeguards and threat-model coverage, makes the retry, error, and lower-bound semantics consistent, and adds the compliance vectors and tests that the specs reference but were missing.

  • Effective Tokens spec: deep-graph ordering + partial observability

    • Added explicit implementation ordering constraints for multi-invocation aggregation in deep graphs.
    • Added concrete partial-observability examples (including deterministic subtotal sequencing).
    • Added new compliance requirement T-ET-032 and aligned checklist/counts.
    • Added test coverage for T-ET-032 in pkg/cli/effective_tokens_compliance_test.go.
  • Forecast spec: safeguards format + discovery race handling + lower-bound disclosure

    • Added remote discovery race-condition requirement (R-DISC-014) and mitigation guidance.
    • Extended episode-analysis JSON schema with episode_count_is_lower_bound (normative disclosure).
    • Reframed output-ordering norms so R-OUT-010 is explicit lower-bound disclosure and shifted ordering requirements to R-OUT-011..013.
    • Reworked appendices to include:
      • Appendix E: workflow discovery race conditions
      • Appendix F: safeguards (threat model, mitigations, residual risk)
  • Frontmatter Hash spec: sync note authority + diamond-import tie-breaking + new vector

    • Updated sync notes to explicitly mark the 2026-05-08 text-based resolution as authoritative, with 2026-05-12 as verification.
    • Added deterministic BFS diamond-import tie-breaking requirement to the sync checklist.
    • Added FH-TV-004 (agent import / diamond path scenario) and expected hash.
    • Added matching test vector in pkg/parser/frontmatter_hash_cross_language_test.go.
  • Fuzzy Schedule spec: retry/backoff norms + DST behavior + minor-version hash stability

    • Added DST-transition behavior guidance for abbreviation-based schedules.
    • Added R-HASH-001 requiring hash-derived scatter stability across minor version upgrades.
    • Added Section 9.6 with bounded retry/backoff norms for collision/contention paths.
  • MCP Scripts spec: safeguards completion + JS global side-effect constraints + dependency failure semantics

    • Clarified dependency failure behavior: fail-fast for deterministic failures; bounded retry only for transient failures.
    • Added explicit side-effect constraint for JS globals (no workflow-control mutators like core.setFailed() from tool code).
    • Added safeguards appendix with threat model/mitigations/residual risk.
    • Added compliance ID T-MCP-050 (Go sandbox network isolation) and checklist entry.
    • Added corresponding targeted test in pkg/workflow/mcp_scripts_firewall_test.go.
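The retry rules summarized in the list above (bounded retry/backoff for contention paths, fail-fast for deterministic failures, bounded retry only for transient ones) can be illustrated with a minimal Go sketch. The names `RetryBounded`, `ErrTransient`, and `ErrMissingDependency`, and the doubling backoff with a cap, are assumptions made for illustration; they are not taken from the specs or from the repository.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Hypothetical sentinels: ErrTransient marks failures worth retrying,
// ErrMissingDependency stands in for a deterministic failure.
var (
	ErrTransient         = errors.New("transient failure")
	ErrMissingDependency = errors.New("dependency not installed")
)

// RetryBounded calls op at most maxAttempts times. It returns at once on
// success or on any error that does not wrap ErrTransient (fail-fast).
// After a transient failure it sleeps, then doubles the delay up to maxDelay.
// It reports how many attempts were made.
func RetryBounded(maxAttempts int, baseDelay, maxDelay time.Duration, op func() error) (int, error) {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	delay := baseDelay
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err = op()
		if err == nil {
			return attempt, nil
		}
		if !errors.Is(err, ErrTransient) {
			return attempt, err // deterministic failure: retrying cannot help
		}
		if attempt < maxAttempts {
			time.Sleep(delay)
			delay *= 2
			if delay > maxDelay {
				delay = maxDelay
			}
		}
	}
	return maxAttempts, fmt.Errorf("gave up after %d attempts: %w", maxAttempts, err)
}

func main() {
	calls := 0
	attempts, err := RetryBounded(4, time.Millisecond, 4*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return fmt.Errorf("lock contention: %w", ErrTransient)
		}
		return nil
	})
	fmt.Println(attempts, err) // 3 <nil>

	attempts, err = RetryBounded(4, time.Millisecond, 4*time.Millisecond, func() error {
		return ErrMissingDependency
	})
	fmt.Println(attempts, err) // 1 dependency not installed
}
```

Classifying errors with `errors.Is` keeps the retry loop unaware of what each failure means; callers decide which errors wrap the transient sentinel.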

Example of newly normalized JSON disclosure in Forecast episode analysis:

"episode_analysis": {
  "sampled_episodes": 40,
  "episode_count_is_lower_bound": true,
  "runs_per_episode": 1.05,
  "avg_effective_tokens_per_episode": 13100,
  "observed_episodes_per_period": 36.7
}
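A consumer of this JSON might surface the lower-bound flag as follows. This is a sketch only: the `EpisodeAnalysis` type and `EpisodeNote` function are hypothetical names, the input is assumed to be the inner `episode_analysis` object, and the pointer field is one possible way to tolerate output that omits the flag.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// EpisodeAnalysis mirrors the fields of the JSON example. The Go type name
// is hypothetical. The flag is a pointer so an absent field can be told
// apart from an explicit false.
type EpisodeAnalysis struct {
	SampledEpisodes              int     `json:"sampled_episodes"`
	EpisodeCountIsLowerBound     *bool   `json:"episode_count_is_lower_bound,omitempty"`
	RunsPerEpisode               float64 `json:"runs_per_episode"`
	AvgEffectiveTokensPerEpisode float64 `json:"avg_effective_tokens_per_episode"`
	ObservedEpisodesPerPeriod    float64 `json:"observed_episodes_per_period"`
}

// EpisodeNote decodes one episode_analysis object and returns a
// console-style note that discloses lower-bound semantics when flagged.
func EpisodeNote(raw []byte) (string, error) {
	var ea EpisodeAnalysis
	if err := json.Unmarshal(raw, &ea); err != nil {
		return "", err
	}
	switch {
	case ea.EpisodeCountIsLowerBound == nil:
		return fmt.Sprintf("%d episodes (lower-bound status not reported)", ea.SampledEpisodes), nil
	case *ea.EpisodeCountIsLowerBound:
		return fmt.Sprintf("at least %d episodes (lower bound)", ea.SampledEpisodes), nil
	default:
		return fmt.Sprintf("%d episodes", ea.SampledEpisodes), nil
	}
}

func main() {
	raw := []byte(`{
	  "sampled_episodes": 40,
	  "episode_count_is_lower_bound": true,
	  "runs_per_episode": 1.05,
	  "avg_effective_tokens_per_episode": 13100,
	  "observed_episodes_per_period": 36.7
	}`)
	note, err := EpisodeNote(raw)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(note) // at least 40 episodes (lower bound)
}
```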

Copilot AI linked an issue May 19, 2026 that may be closed by this pull request
17 tasks
Copilot AI changed the title from "[WIP] Fix gaps in specifications for SPDD completeness" to "SPDD: close safeguards/consistency gaps across 5 specs and add targeted compliance vectors" May 19, 2026
Copilot AI requested a review from gh-aw-bot May 19, 2026 17:14
@pelikhan pelikhan marked this pull request as ready for review May 19, 2026 18:33
Copilot AI review requested due to automatic review settings May 19, 2026 18:33
Contributor

Copilot AI left a comment

Pull request overview

This PR closes daily SPDD findings across five specifications (Effective Tokens, Forecast, Frontmatter Hash, Fuzzy Schedule, MCP Scripts) by adding missing safeguards/threat-model sections, normalizing retry/error/lower-bound semantics, and adding three targeted compliance tests (T-ET-032, FH-TV-004, T-MCP-050).

Changes:

  • Spec edits add new normative requirements (R-DISC-014, R-OUT-010 lower-bound disclosure, R-HASH-001, R-ERR-050..052) and safeguards/threat-model appendices for Forecast and MCP Scripts.
  • Adds compliance test IDs T-ET-032 (deep-graph post-order), FH-TV-004 (diamond-import hash vector), and T-MCP-050 (Go sandbox network isolation), plus checklist/count updates.
  • Renumbers Forecast R-OUT-010..012 → R-OUT-011..013 and reuses R-OUT-010 for a different (lower-bound disclosure) requirement.
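For readers unfamiliar with the diamond-import scenario behind FH-TV-004, the sketch below shows one way a breadth-first import walk can be made deterministic. The spec's actual tie-breaking rule is not quoted on this page, so visiting children in sorted order and keeping the first visit are assumptions, as are the function name `ImportOrder` and the file names.

```go
package main

import (
	"fmt"
	"sort"
)

// ImportOrder walks an import graph breadth-first from root. Children are
// visited in sorted order (an assumed tie-break), and a file reachable by
// more than one path is recorded only on its first visit.
func ImportOrder(graph map[string][]string, root string) []string {
	visited := map[string]bool{root: true}
	queue := []string{root}
	var order []string
	for len(queue) > 0 {
		node := queue[0]
		queue = queue[1:]
		order = append(order, node)

		// Copy before sorting so the caller's slices are left untouched.
		children := append([]string(nil), graph[node]...)
		sort.Strings(children)
		for _, child := range children {
			if !visited[child] {
				visited[child] = true
				queue = append(queue, child)
			}
		}
	}
	return order
}

func main() {
	// Diamond: workflow.md imports a.md and b.md, and both import shared.md.
	graph := map[string][]string{
		"workflow.md": {"b.md", "a.md"},
		"a.md":        {"shared.md"},
		"b.md":        {"shared.md"},
	}
	fmt.Println(ImportOrder(graph, "workflow.md"))
	// [workflow.md a.md b.md shared.md]
}
```

Because the result does not depend on the declaration order of imports or on map iteration order, any hash computed over the visited sequence would be reproducible across implementations that follow the same rule.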
Show a summary per file

| File | Description |
|---|---|
| docs/src/content/docs/reference/effective-tokens-specification.md | Adds deep-graph ordering constraints, partial-observability example (A.5), and T-ET-032 compliance entry. |
| docs/src/content/docs/reference/forecast-specification.md | Adds R-DISC-014 race-condition norm, lower-bound JSON field, renumbers R-OUT-*, and adds Appendix E/F (race conditions, safeguards). |
| docs/src/content/docs/reference/frontmatter-hash-specification.md | Marks 2026-05-08 text-based resolution as authoritative, adds BFS diamond-import tie-breaking note, adds FH-TV-004 vector. |
| docs/src/content/docs/reference/fuzzy-schedule-specification.md | Adds DST behavior guidance, R-HASH-001 hash stability across minor versions, and §9.6 bounded retry/backoff norms. |
| docs/src/content/docs/reference/mcp-scripts-specification.md | Clarifies dependency failure semantics, adds JS global side-effect constraints, adds Appendix D safeguards and T-MCP-050. |
| pkg/cli/effective_tokens_compliance_test.go | Adds T-ET-032 test (currently exercises a test-local walker, not production aggregation code). |
| pkg/parser/frontmatter_hash_cross_language_test.go | Adds FH-TV-004 test with hardcoded expected hash for the agent-import diamond scenario. |
| pkg/workflow/mcp_scripts_firewall_test.go | Adds T-MCP-050 test asserting no wildcard/0.0.0.0/0 in computed allowed-domain string. |
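The kind of check described for T-MCP-050 could look roughly like the sketch below. It is not the repository's test: `HasOpenEgress` is a hypothetical helper, the comma-separated format of the allowed-domain string is an assumption, and only a bare `*` is treated as a wildcard here (a stricter check might reject any entry containing `*`).

```go
package main

import (
	"fmt"
	"strings"
)

// HasOpenEgress reports whether a comma-separated allowed-domain string
// contains an entry that would allow unrestricted egress: a bare "*" or
// the catch-all CIDR "0.0.0.0/0".
func HasOpenEgress(allowed string) bool {
	for _, entry := range strings.Split(allowed, ",") {
		switch strings.TrimSpace(entry) {
		case "*", "0.0.0.0/0":
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(HasOpenEgress("proxy.golang.org,sum.golang.org")) // false
	fmt.Println(HasOpenEgress("proxy.golang.org, 0.0.0.0/0"))     // true
}
```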

Copilot's findings

  • Files reviewed: 8/8 changed files
  • Comments generated: 3

Comment on lines +258 to +309
type testNode struct {
	id       string
	localET  float64
	observed bool
	children []*testNode
	fallback float64
}

// root
// ├─ planner
// │ ├─ retrieval
// │ │ └─ shard-1
// │ └─ shard-2 (unobservable fallback)
// └─ synthesis
shard1 := &testNode{id: "shard-1", localET: 60, observed: true}
retrieval := &testNode{id: "retrieval", localET: 120, observed: true, children: []*testNode{shard1}}
shard2 := &testNode{id: "shard-2", observed: false, fallback: 25}
planner := &testNode{id: "planner", localET: 30, observed: true, children: []*testNode{retrieval, shard2}}
synthesis := &testNode{id: "synthesis", localET: 40, observed: true}
root := &testNode{id: "root", localET: 10, observed: true, children: []*testNode{planner, synthesis}}

var order []string
var subtotals []float64
running := 0.0

var walk func(n *testNode)
walk = func(n *testNode) {
	for _, child := range n.children {
		walk(child)
	}

	value := n.localET
	if !n.observed {
		value = n.fallback
	}

	running += value
	order = append(order, n.id)
	subtotals = append(subtotals, running)
}

walk(root)

assert.Equal(t,
	[]string{"shard-1", "retrieval", "shard-2", "planner", "synthesis", "root"},
	order,
	"T-ET-032: aggregation order must be stable post-order for deep graphs")

assert.Equal(t,
	[]float64{60, 180, 205, 235, 275, 285},
	subtotals,
	"T-ET-032: partial subtotals must remain deterministic under partial observability")

Comment on lines +737 to +743
- **R-OUT-010**: JSON output MUST disclose episode lower-bound semantics by including
`episode_analysis.episode_count_is_lower_bound` for every workflow. Console output SHOULD include
a note when this field is `true`.
- **R-OUT-011**: In both console and JSON output, workflows MUST be ordered by
`projected_effective_tokens` (P50 value) in descending order.
- **R-OUT-012**: Workflows with zero projected tokens MUST appear after all workflows with non-zero projections.
- **R-OUT-013**: Among workflows with equal projected tokens, the ordering SHOULD be deterministic (e.g., alphabetical by workflow ID).
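
The ordering rules quoted above can be sketched as a single comparator. The `Forecast` type and `OrderForecasts` function are hypothetical names, and the sketch assumes projections are never negative, so that sorting in descending order already places zero-projection workflows last.

```go
package main

import (
	"fmt"
	"sort"
)

// Forecast is a hypothetical record holding the two fields the ordering
// rules refer to.
type Forecast struct {
	WorkflowID               string
	ProjectedEffectiveTokens float64 // P50 value
}

// OrderForecasts sorts in place: descending by projected tokens, with ties
// broken alphabetically by workflow ID so the output is deterministic.
func OrderForecasts(fs []Forecast) {
	sort.SliceStable(fs, func(i, j int) bool {
		a, b := fs[i], fs[j]
		if a.ProjectedEffectiveTokens != b.ProjectedEffectiveTokens {
			return a.ProjectedEffectiveTokens > b.ProjectedEffectiveTokens
		}
		return a.WorkflowID < b.WorkflowID
	})
}

func main() {
	fs := []Forecast{
		{"lint", 0},
		{"triage", 5200},
		{"docs", 13100},
		{"audit", 5200},
	}
	OrderForecasts(fs)
	for _, f := range fs {
		fmt.Println(f.WorkflowID)
	}
	// Prints docs, audit, triage, lint (one per line).
}
```
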
| Field | Type | Required | Description |
|---|---|---|---|
| `sampled_episodes` | integer | MUST | Distinct episode count. Lower-bound estimate when artifact linkage is unavailable. |
| `episode_count_is_lower_bound` | boolean | MUST | `true` when episode linkage data is incomplete (for example, remote mode without artifacts); otherwise `false`. |
@pelikhan
Collaborator

@copilot review all comments and reviews

Copilot AI and others added 3 commits May 19, 2026 19:32
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
…rement IDs

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor Author

Copilot AI commented May 19, 2026

@copilot review all comments and reviews

Done. I reviewed the feedback and pushed follow-up fixes:

  • 9019554: updated T-ET-032 to exercise production ET aggregation code, restored Forecast requirement ID compatibility (R-OUT-010..012 unchanged), and made episode_count_is_lower_bound migration-safe (SHOULD) with changelog clarification.
  • 28e64f1: added clarification in T-ET-032 about why multipliers are fixed at 1.0 for deterministic-order isolation.

Targeted validation run: go test -run TestETCompliance_T_ET_032_DeepGraphPostOrderAggregation ./pkg/cli/ passed.

Copilot AI requested a review from pelikhan May 19, 2026 19:38
@pelikhan pelikhan merged commit 2e27970 into main May 19, 2026
24 of 25 checks passed
@pelikhan pelikhan deleted the copilot/spdd-daily-spec-work-plan-2026-05-19 branch May 19, 2026 19:47

Development

Successfully merging this pull request may close these issues.

[spdd] Daily spec work plan - 2026-05-19

4 participants