fix(aw): front-load propositions PR call before safeoutputs idle timeout #1961

Merged
pethers merged 5 commits into main from copilot/aw-fix-government-propositions-workflow
Apr 23, 2026

Copilot AI commented Apr 23, 2026

News: Government Propositions (run 24820030825) produced no safe outputs: the agent completed 37 analysis files and committed locally at min 31, then all safeoutputs___create_pull_request calls at min 32 failed with Streamable HTTP error: session not found.

Root cause

The safeoutputs MCP uses a Streamable-HTTP session with a ~30–35 min idle timeout (prior PR #1895). sandbox.mcp.keepalive-interval: 300 keeps the TCP connection alive but does not refresh server-side session state. The workflow's Time budget directed the single PR call to min 47–50 — always past the cutoff.

✗ create_pull_request (MCP: safeoutputs) · title: "📊 Analysis Only — Propositions — 2026-04-23"
  └ MCP server 'safeoutputs': Error: Streamable HTTP error: Error POSTing to endpoint: session not found
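
The failure mode described above can be illustrated with a toy model. This is a minimal sketch, not the safeoutputs implementation: the `SessionModel` type, the `sessionAlive` function, and the 30-minute threshold are assumptions chosen for illustration (the PR only reports an observed ~30–35 min cutoff), and it assumes the session was last touched at minute 0.

```typescript
// Toy model of a server-side idle timeout. Hypothetical: this is NOT the
// safeoutputs implementation, only an illustration of the failure mode.
interface SessionModel {
  idleTimeoutMin: number;        // assumed threshold; the PR observed ~30-35 min
  lastSessionRequestMin: number; // minute of the last request that touched the session
}

// A session request succeeds only if the idle gap is below the threshold.
function sessionAlive(session: SessionModel, nowMin: number): boolean {
  return nowMin - session.lastSessionRequestMin < session.idleTimeoutMin;
}

// TCP keepalives never update lastSessionRequestMin, so they do not appear here.
const session: SessionModel = { idleTimeoutMin: 30, lastSessionRequestMin: 0 };

console.log(sessionAlive(session, 28)); // true
console.log(sessionAlive(session, 32)); // false
```

Under these assumptions a call at minute 28 still finds the session, while the minute-32 call from the failed run does not, regardless of how often the transport-level keepalive fires.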

Changes (.github/workflows/news-propositions.md body only)

  • Run 1 (Analysis) budget rewritten: PR call moved from min 47–50 → 25–28; Pass 1 6–18, Pass 2 18–24, Gate 24–25.
  • Run 2 (Articles) budget tightened for parity: the staging/PR row now spans min 22–28, consistent with the "HARD DEADLINE minute 28" label (~28 min total).
  • Prominent 🔴 callout citing run 24820030825, documenting the ~30 min idle-timeout, stating the first and only safeoutputs___* call MUST happen by minute 28 at the latest, and explicitly overriding the 48-min deadline in 07-commit-and-pr.md §Deadline enforcement for this workflow.
  • Hard early-exit rule: at min 25 without staging, commit with [early-pr] tag and call the fully-qualified safeoutputs___create_pull_request immediately.
  • AI-FIRST guidance inside the compressed budget: prefer download/manifest scope compression (fewer dok_id entries in data-download-manifest.md) over iteration skipping, while maintaining 1:1 per-document coverage for every dok_id that remains in the manifest so 05-analysis-gate.md check 2 still passes; Pass 2 remains mandatory; scheduled runs behave as analysis_depth: standard in practice; reserve deep/comprehensive for manual workflow_dispatch backfills.
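
The rewritten Run 1 budget can be expressed as data and checked mechanically. The following is a rough sketch under stated assumptions: the `Phase` type, the `validateBudget` helper, and the phase names are hypothetical, and only the windows named in this description are included (the real table has earlier rows not listed here).

```typescript
// Hypothetical representation of a time-budget table; names are illustrative.
interface Phase {
  name: string;
  startMin: number;
  endMin: number;
}

const HARD_DEADLINE_MIN = 28;

// Run 1 windows as listed in the PR description.
const run1: Phase[] = [
  { name: "Pass 1", startMin: 6, endMin: 18 },
  { name: "Pass 2", startMin: 18, endMin: 24 },
  { name: "Gate", startMin: 24, endMin: 25 },
  { name: "Stage + PR call", startMin: 25, endMin: 28 },
];

// Returns a list of problems; an empty list means the budget is consistent.
function validateBudget(phases: Phase[], deadlineMin: number): string[] {
  const problems: string[] = [];
  let prev: Phase | undefined;
  for (const p of phases) {
    if (p.endMin <= p.startMin) {
      problems.push(`${p.name}: empty or inverted window`);
    }
    if (prev !== undefined && p.startMin !== prev.endMin) {
      problems.push(`${p.name}: gap or overlap with ${prev.name}`);
    }
    prev = p;
  }
  if (prev !== undefined && prev.endMin > deadlineMin) {
    problems.push(`${prev.name}: ends at min ${prev.endMin}, past deadline ${deadlineMin}`);
  }
  return problems;
}

console.log(validateBudget(run1, HARD_DEADLINE_MIN)); // []
```

Feeding the same check a final phase of 47–50 (the prior budget's PR window) would report it as past the deadline, which is the inconsistency this PR removes.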

Notes for reviewers

  • Lock file unchanged by design: body is runtime-imported via {{#runtime-import .github/workflows/news-propositions.md}}; the lock tracks frontmatter_hash only.
  • Not a heartbeat PR: 07-commit-and-pr.md §Core rule forbids heartbeat/keep-alive PRs post-#1768 ("fix: call safeoutputs___create_pull_request ASAP, never noop when work exists"; create-pull-request.max: 1). This aligns the per-workflow budget with 00-base-contract.md §Session keepalive requirement ("well before the session approaches expiry") rather than reviving the #1895 pattern ("fix(aw): prevent safeoutputs session timeout in news-committee-reports").
  • Scoped to propositions. Sibling workflows (committee-reports, motions, interpellations) carry the same 47–50 min budget and likely the same latent bug; interpellations has a separate WIP in #1960 ("fix(prompts): tighten safe-outputs MCP deadline after news workflow timed out at 33 min"). Consider a follow-up to propagate once this pattern is validated in production.
  • Tradeoff: the compressed budget reduces time per iteration vs. the prior 50-min budget — but the prior budget produced zero output, losing all 37 files to session expiry. Delivered-with-tighter-iteration strictly dominates not-delivered.
  • tests/workflow-architecture.test.ts (80 tests) passes; no tests assert on time-budget text.

Copilot AI linked an issue Apr 23, 2026 that may be closed by this pull request
@github-actions github-actions Bot added the size-xs (Extra small change, < 10 lines) label Apr 23, 2026
@github-actions

🏷️ Automatic Labeling Summary

This PR has been automatically labeled based on the files changed and PR metadata.

Applied Labels: size-xs

Label Categories

  • 🗳️ Content: news, dashboard, visualization, intelligence
  • 💻 Technology: html-css, javascript, workflow, security
  • 📊 Data: cia-data, riksdag-data, data-pipeline, schema
  • 🌍 I18n: i18n, translation, rtl
  • 🔒 ISMS: isms, iso-27001, nist-csf, cis-controls
  • 🏗️ Infrastructure: ci-cd, deployment, performance, monitoring
  • 🔄 Quality: testing, accessibility, documentation, refactor
  • 🤖 AI: agent, skill, agentic-workflow

For more information, see .github/labeler.yml.

@github-actions

🔍 Lighthouse Performance Audit

| Category | Score | Status |
|---|---|---|
| Performance | 85/100 | 🟡 |
| Accessibility | 95/100 | 🟢 |
| Best Practices | 90/100 | 🟢 |
| SEO | 95/100 | 🟢 |

📥 Download full Lighthouse report

Budget Compliance: Performance budgets enforced via budget.json

…tputs session idle timeout

Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/f6b0a915-3add-46af-85c2-dd74cd7f0051

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
@github-actions github-actions Bot added the documentation (Documentation updates), workflow (GitHub Actions workflows), ci-cd (CI/CD pipeline changes), news (News articles and content generation), agentic-workflow (Agentic workflow changes), and size-s (Small change, 10-50 lines) labels Apr 23, 2026
Copilot AI changed the title from "[WIP] Fix workflow failure for government propositions" to "fix(aw): front-load propositions PR call before safeoutputs idle timeout" Apr 23, 2026
Copilot AI requested a review from pethers April 23, 2026 08:41
@pethers pethers marked this pull request as ready for review April 23, 2026 08:46
Copilot AI review requested due to automatic review settings April 23, 2026 08:46

Copilot AI left a comment

Pull request overview

Updates the News: Government Propositions agentic workflow prompt to avoid Safe Outputs “session not found” failures by moving the safeoutputs___create_pull_request intent-recording call earlier in the run, aligning the time budget with the observed Streamable-HTTP idle timeout.

Changes:

  • Added a prominent safeoutputs idle-timeout callout and clarified that the PR intent must be recorded early.
  • Compressed Run 1 (analysis) and Run 2 (articles) time budgets so the PR call happens well before the observed cutoff.
  • Added an early-exit rule to stage/commit and open an [early-pr] PR if staging hasn’t started by minute 25.
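
The early-exit rule in the last bullet amounts to a small decision function. The sketch below is an assumption-laden illustration, not code from the workflow (which is a natural-language prompt): `nextAction`, `commitMessage`, and the `Action` type are hypothetical names; only the minute-25 threshold and the `[early-pr]` tag come from the PR.

```typescript
// Hypothetical encoding of the early-exit rule; the real rule lives in prompt text.
type Action = "continue" | "early-exit";

const EARLY_EXIT_MIN = 25; // threshold stated in the PR

// Early exit applies only when staging has not started by the threshold.
function nextAction(elapsedMin: number, stagingStarted: boolean): Action {
  return !stagingStarted && elapsedMin >= EARLY_EXIT_MIN ? "early-exit" : "continue";
}

// Early-exit commits carry the [early-pr] tag in the commit message.
function commitMessage(base: string, action: Action): string {
  return action === "early-exit" ? `${base} [early-pr]` : base;
}

console.log(nextAction(24, false)); // "continue"
console.log(nextAction(25, false)); // "early-exit"
console.log(nextAction(26, true));  // "continue"
console.log(commitMessage("analysis: propositions", nextAction(25, false)));
// "analysis: propositions [early-pr]"
```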

Comment thread .github/workflows/news-propositions.md Outdated
**Run 1 — Analysis mode** (no prior analysis found, ~50 min — produces all 23 artifacts):
> 🔴 **CRITICAL — safeoutputs MCP idle timeout (~30 min)**: The `safeoutputs` MCP server runs a Streamable-HTTP session that expires after **~30–35 minutes of idle time** (confirmed in run [24820030825](https://github.com/Hack23/riksdagsmonitor/actions/runs/24820030825) where all `create_pull_request` calls at minute ~32 failed with `session not found`, losing 37 files of completed analysis work). `sandbox.mcp.keepalive-interval: 300` keeps TCP connections alive but does **NOT** refresh the server-side session. **Your first and only `safeoutputs___*` call MUST happen before minute 28 of agent time.** This is a harder deadline than the ~60-minute Copilot-API token window described in `00-base-contract.md §Session keepalive requirement` and overrides the generic 48-min deadline in `07-commit-and-pr.md §Deadline enforcement` for this workflow.
>
> **AI-FIRST within the compressed budget**: Pass 2 is still mandatory (AI FIRST principle, §"5. 🔴 AI FIRST Quality Principle" in repo custom instructions). Under the tightened ~28-min Run 1 budget, prefer **scope compression over iteration skipping** — reduce the number of per-document analyses if needed, but always perform a full read-back-and-improve Pass 2 on whatever artifacts exist. For scheduled runs treat `analysis_depth` as `standard` in practice even if the input defaults to `deep`; reserve `deep`/`comprehensive` for manual `workflow_dispatch` backfills where a human can monitor for a longer-session environment. A partial analysis with genuine iteration is always strictly better than a broader analysis that never reaches GitHub.

Copilot AI Apr 23, 2026

The guidance to “reduce the number of per-document analyses if needed” conflicts with the Analysis Gate’s per-document coverage requirement (one document analysis per dok_id in data-download-manifest.md, see 05-analysis-gate.md check 2). As written, following this advice will predictably fail the gate. Consider reframing scope compression as reducing the download/manifest scope (e.g., fewer dok_id entries) while still maintaining 1:1 per-document coverage for whatever remains in the manifest.

Suggested change
> **AI-FIRST within the compressed budget**: Pass 2 is still mandatory (AI FIRST principle, §"5. 🔴 AI FIRST Quality Principle" in repo custom instructions). Under the tightened ~28-min Run 1 budget, prefer **scope compression over iteration skipping** — reduce the number of per-document analyses if needed, but always perform a full read-back-and-improve Pass 2 on whatever artifacts exist. For scheduled runs treat `analysis_depth` as `standard` in practice even if the input defaults to `deep`; reserve `deep`/`comprehensive` for manual `workflow_dispatch` backfills where a human can monitor for a longer-session environment. A partial analysis with genuine iteration is always strictly better than a broader analysis that never reaches GitHub.
> **AI-FIRST within the compressed budget**: Pass 2 is still mandatory (AI FIRST principle, §"5. 🔴 AI FIRST Quality Principle" in repo custom instructions). Under the tightened ~28-min Run 1 budget, prefer **scope compression over iteration skipping** — reduce the **download/manifest scope** if needed (for example, fewer `dok_id` entries in `data-download-manifest.md`), but maintain **1:1 per-document coverage** for every `dok_id` that remains in the manifest and always perform a full read-back-and-improve Pass 2 on whatever artifacts exist. For scheduled runs treat `analysis_depth` as `standard` in practice even if the input defaults to `deep`; reserve `deep`/`comprehensive` for manual `workflow_dispatch` backfills where a human can monitor for a longer-session environment. A smaller manifest with complete per-document coverage and genuine iteration is always strictly better than a broader manifest that cannot clear the Analysis Gate or never reaches GitHub.
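
The 1:1 coverage requirement in this suggestion can be pictured as a set comparison between the manifest and the analyses on disk. This is a minimal sketch, not the actual gate in 05-analysis-gate.md: `coverageGaps` is a hypothetical helper and the `dok_id` values are placeholders.

```typescript
// Hypothetical coverage check; the real gate's implementation is not shown in this PR.
interface CoverageReport {
  missing: string[]; // dok_ids in the manifest with no per-document analysis
  extra: string[];   // analyses whose dok_id is not in the manifest
}

function coverageGaps(manifestDokIds: string[], analysedDokIds: string[]): CoverageReport {
  const manifest = new Set(manifestDokIds);
  const analysed = new Set(analysedDokIds);
  return {
    missing: manifestDokIds.filter((id) => !analysed.has(id)),
    extra: analysedDokIds.filter((id) => !manifest.has(id)),
  };
}

// Skipping an analysis while keeping the full manifest leaves a gap.
console.log(coverageGaps(["DOC-A", "DOC-B", "DOC-C"], ["DOC-A", "DOC-B"]));
// { missing: ["DOC-C"], extra: [] }

// Shrinking the manifest instead keeps coverage at 1:1.
console.log(coverageGaps(["DOC-A", "DOC-B"], ["DOC-A", "DOC-B"]));
// { missing: [], extra: [] }
```

The two calls mirror the reviewer's point: dropping analyses fails the check, while dropping manifest entries does not.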

Comment thread .github/workflows/news-propositions.md Outdated
| 2–6 | Read all 23 analysis artifacts into context (Families A+B+C+D) | 06 |
| 6–18 | Article Pass 1 + Pass 2 (EN, SV) | 06 |
| 18–22 | Visual + link validation | 06 |
| 22–26 | Stage articles, commit, **ONE** `safeoutputs___create_pull_request` — **HARD DEADLINE minute 28** | 07 |

Copilot AI Apr 23, 2026

Run 2’s staging/PR row is 22–26 but labels a “HARD DEADLINE minute 28”. If minute 28 is the real cutoff (per the safeoutputs session constraint), consider extending the row to 22–28 or changing the hard-deadline label to minute 26 so the table is internally consistent.

Suggested change
| 22–26 | Stage articles, commit, **ONE** `safeoutputs___create_pull_request` — **HARD DEADLINE minute 28** | 07 |
| 22–28 | Stage articles, commit, **ONE** `safeoutputs___create_pull_request` — **HARD DEADLINE minute 28** | 07 |
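
The inconsistency flagged here (row window versus deadline label) is the kind of thing a text-level check could catch. The sketch below is hypothetical: `checkRow` is not part of tests/workflow-architecture.test.ts, the sample rows are abbreviated, and "consistent" is taken to mean that the row's end minute equals the labelled deadline, matching the reviewer's two proposed fixes.

```typescript
// Hypothetical consistency check for one time-budget table row.
interface RowCheck {
  endMin: number;
  deadlineMin: number;
  consistent: boolean;
}

function checkRow(row: string): RowCheck | null {
  // First cell: a minute window such as "22–26" (en dash or hyphen).
  const windowMatch = /^\|\s*(\d+)\s*[–-]\s*(\d+)\s*\|/.exec(row);
  const deadlineMatch = /HARD DEADLINE minute (\d+)/.exec(row);
  if (windowMatch === null || deadlineMatch === null) {
    return null; // row carries no window or no deadline label
  }
  const endMin = Number(windowMatch[2]);
  const deadlineMin = Number(deadlineMatch[1]);
  return { endMin, deadlineMin, consistent: endMin === deadlineMin };
}

// Abbreviated sample rows, before and after the suggested change.
const before = "| 22–26 | Stage articles, commit, PR call, **HARD DEADLINE minute 28** | 07 |";
const after = "| 22–28 | Stage articles, commit, PR call, **HARD DEADLINE minute 28** | 07 |";

console.log(checkRow(before)); // { endMin: 26, deadlineMin: 28, consistent: false }
console.log(checkRow(after));  // { endMin: 28, deadlineMin: 28, consistent: true }
```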

Comment thread .github/workflows/news-propositions.md Outdated
| 22–26 | Stage articles, commit, **ONE** `safeoutputs___create_pull_request` — **HARD DEADLINE minute 28** | 07 |

Trim scope before quality. Never open a second PR within a run — there is no second PR.
Trim scope before quality. Never open a second PR within a run — there is no second PR. **If you reach minute 25 without staging, stop all remaining analysis/article work, commit whatever exists on disk (include `[early-pr]` in the commit message), and call `create_pull_request` immediately** — a partial-but-delivered PR is infinitely better than losing all work to a `session not found` error.

Copilot AI Apr 23, 2026

This paragraph instructs to call create_pull_request immediately, but elsewhere the workflow is explicit about safeoutputs___create_pull_request. Using the unqualified name risks the agent calling a non-existent/incorrect tool name. Consider changing this to safeoutputs___create_pull_request for consistency and to prevent tool-selection mistakes under time pressure.

Suggested change
Trim scope before quality. Never open a second PR within a run — there is no second PR. **If you reach minute 25 without staging, stop all remaining analysis/article work, commit whatever exists on disk (include `[early-pr]` in the commit message), and call `create_pull_request` immediately** — a partial-but-delivered PR is infinitely better than losing all work to a `session not found` error.
Trim scope before quality. Never open a second PR within a run — there is no second PR. **If you reach minute 25 without staging, stop all remaining analysis/article work, commit whatever exists on disk (include `[early-pr]` in the commit message), and call `safeoutputs___create_pull_request` immediately** — a partial-but-delivered PR is infinitely better than losing all work to a `session not found` error.
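
Unqualified tool names like the one flagged here can be found with a simple scan. This is a sketch under assumptions: `findUnqualifiedCalls` is a hypothetical helper, not an existing repository check. It relies on `_` being a word character in JavaScript regular expressions, so `\b` does not match between `safeoutputs___` and `create_pull_request`, and only the bare name is reported.

```typescript
// Hypothetical lint: report 1-based line numbers that mention the bare tool name.
// "\b" cannot match inside "safeoutputs___create_pull_request" because "_" and "c"
// are both word characters, so qualified names are skipped automatically.
const UNQUALIFIED = /\bcreate_pull_request\b/;

function findUnqualifiedCalls(markdown: string): number[] {
  const hits: number[] = [];
  markdown.split("\n").forEach((line, index) => {
    if (UNQUALIFIED.test(line)) {
      hits.push(index + 1);
    }
  });
  return hits;
}

const sample = [
  "call `safeoutputs___create_pull_request` by minute 28",
  "commit whatever exists and call `create_pull_request` immediately",
  "create-pull-request.max: 1",
].join("\n");

console.log(findUnqualifiedCalls(sample)); // [2]
```

Such a scan would also flag purely descriptive mentions of the bare name, so its hits would need a human read rather than an automatic rewrite.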

Comment thread .github/workflows/news-propositions.md Outdated
## Time budget

**Run 1 — Analysis mode** (no prior analysis found, ~50 min — produces all 23 artifacts):
> 🔴 **CRITICAL — safeoutputs MCP idle timeout (~30 min)**: The `safeoutputs` MCP server runs a Streamable-HTTP session that expires after **~30–35 minutes of idle time** (confirmed in run [24820030825](https://github.com/Hack23/riksdagsmonitor/actions/runs/24820030825) where all `create_pull_request` calls at minute ~32 failed with `session not found`, losing 37 files of completed analysis work). `sandbox.mcp.keepalive-interval: 300` keeps TCP connections alive but does **NOT** refresh the server-side session. **Your first and only `safeoutputs___*` call MUST happen before minute 28 of agent time.** This is a harder deadline than the ~60-minute Copilot-API token window described in `00-base-contract.md §Session keepalive requirement` and overrides the generic 48-min deadline in `07-commit-and-pr.md §Deadline enforcement` for this workflow.

Copilot AI Apr 23, 2026

The callout says the first/only safeoutputs___* call MUST happen before minute 28, but the Run 1 schedule allocates 25–28 for staging + PR, which can imply calling at minute 28. To avoid conflicting guidance, either tighten the window (e.g. end at 27) or reword the callout to “by minute 28 at the latest / no later than minute 28”.

Suggested change
> 🔴 **CRITICAL — safeoutputs MCP idle timeout (~30 min)**: The `safeoutputs` MCP server runs a Streamable-HTTP session that expires after **~30–35 minutes of idle time** (confirmed in run [24820030825](https://github.com/Hack23/riksdagsmonitor/actions/runs/24820030825) where all `create_pull_request` calls at minute ~32 failed with `session not found`, losing 37 files of completed analysis work). `sandbox.mcp.keepalive-interval: 300` keeps TCP connections alive but does **NOT** refresh the server-side session. **Your first and only `safeoutputs___*` call MUST happen before minute 28 of agent time.** This is a harder deadline than the ~60-minute Copilot-API token window described in `00-base-contract.md §Session keepalive requirement` and overrides the generic 48-min deadline in `07-commit-and-pr.md §Deadline enforcement` for this workflow.
> 🔴 **CRITICAL — safeoutputs MCP idle timeout (~30 min)**: The `safeoutputs` MCP server runs a Streamable-HTTP session that expires after **~30–35 minutes of idle time** (confirmed in run [24820030825](https://github.com/Hack23/riksdagsmonitor/actions/runs/24820030825) where all `create_pull_request` calls at minute ~32 failed with `session not found`, losing 37 files of completed analysis work). `sandbox.mcp.keepalive-interval: 300` keeps TCP connections alive but does **NOT** refresh the server-side session. **Your first and only `safeoutputs___*` call MUST happen by minute 28 at the latest.** This is a harder deadline than the ~60-minute Copilot-API token window described in `00-base-contract.md §Session keepalive requirement` and overrides the generic 48-min deadline in `07-commit-and-pr.md §Deadline enforcement` for this workflow.

pethers commented Apr 23, 2026

@copilot apply changes based on the comments in this thread

Copilot AI commented Apr 23, 2026

> @copilot apply changes based on the comments in this thread

Applied all four reviewer suggestions in 5e49385:

  • Callout reworded to "by minute 28 at the latest" (resolves ambiguity with the 25–28 staging window)
  • AI-FIRST guidance now says to compress download/manifest scope (fewer dok_id entries) while maintaining 1:1 per-document coverage — keeps 05-analysis-gate.md check 2 satisfied
  • Run 2 staging row extended to 22–28 so the table matches the "HARD DEADLINE minute 28" label (header updated from ~26 min to ~28 min for consistency)
  • Early-exit paragraph now uses the fully-qualified safeoutputs___create_pull_request tool name

tests/workflow-architecture.test.ts — 80/80 passing.

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.

@pethers pethers merged commit 8ac00a8 into main Apr 23, 2026
17 checks passed
@pethers pethers deleted the copilot/aw-fix-government-propositions-workflow branch April 23, 2026 09:30

Labels

  • agentic-workflow: Agentic workflow changes
  • ci-cd: CI/CD pipeline changes
  • documentation: Documentation updates
  • news: News articles and content generation
  • size-s: Small change (10-50 lines)
  • size-xs: Extra small change (< 10 lines)
  • workflow: GitHub Actions workflows

Development

Successfully merging this pull request may close these issues.

[aw] News: Government Propositions failed

3 participants