
fix: improve news-evening-analysis workflow reliability with stronger safe output guardrails #1179

Merged
pethers merged 3 commits into main from copilot/aw-fix-news-evening-analysis
Mar 14, 2026

Copilot AI commented Mar 14, 2026

The News Evening Analysis agentic workflow has been failing consistently since March 12 (runs #23064228552, #23017036455) — the agent completes execution without ever calling a safe output tool, resulting in empty OUTPUT_TYPES and automatic failure.

Comparison with the reliably-working news-committee-reports workflow revealed several missing guardrails in the prompt.

Missing safe output tool search prohibition

The working workflows include an explicit instruction preventing agents from wasting time searching the filesystem for tools. Added:

**🚨 NEVER search for safe output tools via bash.** `safeoutputs___create_pull_request`, 
`safeoutputs___noop`, `safeoutputs___missing_tool`, and `safeoutputs___missing_data` are 
**always available as direct tool calls**. NEVER run `ls /tmp/gh-aw/` or any bash command to "find" them.

Weekday mode hardcoded 14 languages instead of respecting input

The prompt instructed the agent to manually generate HTML articles for all 14 languages ([en, sv, da, no, fi, de, fr, es, nl, ar, he, ja, ko, zh]), contradicting both the workflow description ("core languages EN, SV") and the default input en,sv. Translations are handled by the separate news-translate dispatch workflow. The prompt now resolves LANG_ARG from the languages input parameter.
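
As an illustration of the language fix, here is a minimal sketch of how LANG_ARG could be derived from the languages input. The function name `resolve_lang_arg` and the variable `INPUT_LANGUAGES` are hypothetical and do not come from the PR; only the default `en,sv` is taken from the description above.

```shell
# Hypothetical helper: normalise the `languages` input into LANG_ARG.
# Assumption: the input arrives as a comma-separated string such as "en,sv".
resolve_lang_arg() {
  # Lowercase, drop whitespace, squeeze repeated commas, trim edge commas.
  cleaned="$(printf '%s' "${1:-}" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -d '[:space:]' \
    | tr -s ',' \
    | sed -e 's/^,//' -e 's/,$//')"
  # Fall back to the documented default when nothing usable remains.
  if [ -z "$cleaned" ]; then
    cleaned="en,sv"
  fi
  printf '%s\n' "$cleaned"
}

# INPUT_LANGUAGES is an assumed variable name for the workflow input.
LANG_ARG="$(resolve_lang_arg "${INPUT_LANGUAGES:-}")"
echo "LANG_ARG=$LANG_ARG"
```

With this shape, an empty or missing input yields the two core languages, and the full 14-language list is only used when a caller passes it explicitly.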

Other improvements

  • MANDATORY PR Creation section — Structured do/don't list matching working workflow pattern, clarifying when to use create_pull_request vs noop
  • Time-check bash helper — Explicit elapsed-time guard before data gathering with START_TIME validation
  • Explicit noop on empty results — Agent now has clear instruction to call safeoutputs___noop immediately when all MCP queries return empty
  • CRITICAL FINAL REMINDER — Bottom-of-prompt enumeration of all safe output scenarios (the last thing the agent reads before executing)
  • Trimmed verbose examples — Removed JS code samples and detailed cross-referencing examples that added token overhead without improving agent behavior
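
The time-check helper in the list above might look roughly like the following sketch. The function name `elapsed_minutes`, the `TIME_BUDGET_MIN` variable, the 45-minute budget, and the choice to treat an invalid START_TIME as zero elapsed time are all assumptions for illustration; the PR does not show the actual helper.

```shell
# Hypothetical elapsed-time guard, meant to run before data gathering.
# Usage: elapsed_minutes START_EPOCH [NOW_EPOCH]
elapsed_minutes() {
  start="${1:-}"
  now="${2:-$(date +%s)}"
  # START_TIME validation: must be a non-empty string of digits.
  case "$start" in
    ''|*[!0-9]*) echo 0; return 0 ;;
  esac
  # A start time in the future is treated as invalid as well.
  if [ "$start" -gt "$now" ]; then
    echo 0
    return 0
  fi
  echo $(( ($now - $start) / 60 ))
}

ELAPSED_MIN="$(elapsed_minutes "${START_TIME:-}")"
BUDGET_MIN="${TIME_BUDGET_MIN:-45}"  # assumed budget, not from the PR
if [ "$ELAPSED_MIN" -ge "$BUDGET_MIN" ]; then
  echo "Elapsed ${ELAPSED_MIN}m of ${BUDGET_MIN}m: skip data gathering, call a safe output tool now"
else
  echo "Elapsed ${ELAPSED_MIN}m of ${BUDGET_MIN}m: continue with data gathering"
fi
```

Treating a missing or malformed START_TIME as zero elapsed keeps the guard from aborting the run on bad input; a stricter variant could instead fail fast and call noop.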

Note: The .lock.yml needs recompilation from the updated .md source via gh aw compile or the auto-compile workflow.

Original prompt

This section details the original issue you should resolve

<issue_title>[aw] News Evening Analysis failed</issue_title>
<issue_description>### Workflow Failure

Workflow: News Evening Analysis
Branch: main
Run: https://github.com/Hack23/riksdagsmonitor/actions/runs/23064228552
Pull Request: #1136

Action Required

Option 1: Debug using any coding agent

Use this prompt with any coding agent (GitHub Copilot, Claude, Gemini, etc.):

Debug the agentic workflow failure using https://raw.githubusercontent.com/github/gh-aw/main/debug.md

The failed workflow run is at https://github.com/Hack23/riksdagsmonitor/actions/runs/23064228552

Option 2: Assign this issue to Copilot

Assign this issue to Copilot using the agentic-workflows sub-agent to automatically debug and fix the workflow failure.

Option 3: Manually invoke the agent

Debug this workflow failure using your favorite Agent CLI and the agentic-workflows prompt.

[!TIP]
To stop a workflow from creating failure issues, set report-failure-as-issue: false in its frontmatter:

safe-outputs:
  report-failure-as-issue: false

Generated from News Evening Analysis

  • expires on Mar 20, 2026, 6:55 PM UTC

Comments on the Issue (you are @copilot in this section)

Custom agent used: agentic-workflows
GitHub Agentic Workflows (gh-aw) - Create, debug, and upgrade AI-powered workflows with intelligent prompt routing


Copilot AI and others added 2 commits March 14, 2026 11:14
… safe output instructions

- Add explicit "NEVER search for safe output tools via bash" instruction (matching working workflows)
- Add MANDATORY PR Creation section with clear do/don't list
- Fix weekday languages to respect input parameter (default: en,sv) instead of hardcoding all 14
- Add time-check bash helper before data gathering phase
- Add explicit noop fallback when all queries return empty
- Add CRITICAL FINAL REMINDER section at bottom with all safe output scenarios
- Simplify verbose cross-referencing examples and data freshness JS code
- Improve error handling table with concrete safe output actions

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
…r, fix time-check fallback

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Copilot AI changed the title from "[WIP] [aw] Fix News Evening Analysis workflow failure" to "fix: improve news-evening-analysis workflow reliability with stronger safe output guardrails" Mar 14, 2026
Copilot AI requested a review from pethers March 14, 2026 11:17
@github-actions github-actions bot added labels Mar 14, 2026: documentation (Documentation updates); workflow (GitHub Actions workflows); ci-cd (CI/CD pipeline changes); news (News articles and content generation); agentic-workflow (Agentic workflow changes); size-m (Medium change, 50-250 lines)
@github-actions

🔍 Lighthouse Performance Audit

| Category | Score | Status |
|---|---|---|
| Performance | 85/100 | 🟡 |
| Accessibility | 95/100 | 🟢 |
| Best Practices | 90/100 | 🟢 |
| SEO | 95/100 | 🟢 |

📥 Download full Lighthouse report

Budget Compliance: Performance budgets enforced via budget.json

@pethers pethers marked this pull request as ready for review March 14, 2026 13:16
@pethers pethers merged commit 80ab949 into main Mar 14, 2026
14 checks passed
@pethers pethers deleted the copilot/aw-fix-news-evening-analysis branch March 14, 2026 13:16
Copilot AI added a commit that referenced this pull request Mar 14, 2026
1. Rename module from "AI pipeline" to "heuristic-based analysis pipeline"
   to match the actual deterministic implementation (no LLM integration)
2. Gate Passes 2-3 on iterations >= 2 with stub fallbacks for iterations=1
3. Fix class docstring to match actual conditional behavior
4. Fix safety comment in generators.ts to reference AIAnalysisPipeline
5. Tighten hasCoalitionStress() — remove generic "motion"/"opposition"
   keywords; require stronger conflict markers (avslag/reject/amendment/etc.)
6. Fix TAKEAWAY_PROP: remove broken %ss placeholder in French/Spanish;
   use full plural word forms in all 14 language templates
7. Fix TAKEAWAY_BET: replace %sv combined placeholder with %verb key;
   add betVerbForm() for language-correct verb conjugation
8. Fix TAKEAWAY_MOT: replace %sv combined placeholder with %verb key;
   add motVerbForm() for language-correct verb conjugation
9. Fix TAKEAWAY_EU: remove suffix placeholders, use full plural forms
10. Fix CI: restore cross-referencing examples, filter patterns, and date
    calculations in news-evening-analysis.md (regression from PR #1179 merge)

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Copilot AI added a commit that referenced this pull request Mar 14, 2026
Review comments addressed:
1. hasDomain() now works with all 14 languages via DOMAIN_NAME_TO_KEY reverse mapping
2. buildStakeholderSwot() localizes SWOT subject with requested lang
3. cacheKey() includes fullContent and notis lengths for collision resistance
4. PolicyDomain.key uses canonical domain keys (e.g. 'healthcare') not localized names
5. Added non-EN PESTLE and SWOT localization test assertions
6. buildPestleAnalysis() uses 'en' for internal trigger checks
7. Removed 'riksdag' from state-agencies signals (too generic)
8. Updated selectRelevantStakeholders docstring to list all 3 always-included groups
9. generateExecutiveSummary() uses fullText ?? fullContent for key passage
10. Replaced DOMAIN_EN_NAMES with canonical DOMAIN_NAME_TO_KEY from policy-analysis.ts

CI fix: Updated agentic-workflow-mcp-queries.test.ts to match current workflow
content after PR #1179 merged (removed Example 1-3 assertions, updated date
calculation and cross-referencing pattern checks).

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
