
Refactor article aggregation into narrative-first journalism structure #2806

Merged
pethers merged 3 commits into main from copilot/improve-article-narrative-flow
May 28, 2026

Conversation

Contributor

Copilot AI commented May 28, 2026

This PR shifts generated articles from artifact-centric intelligence framing to a reader-first journalistic narrative. It removes BLUF-facing terminology in rendered output, improves section semantics, and adds first-use context for confidence codes and Riksdag document IDs.

  • Narrative section model (aggregation titles)

    • Renamed top-level story sections to reader-facing journalism headings:
      • executive-brief.md → What Happened
      • synthesis-summary.md → Why It Matters
      • intelligence-assessment.md → Key Findings
    • Reframed audit/technical sections under explicit Deep Dive headings to separate editorial narrative from methodology/audit content.
  • Terminology normalization in rendered article bodies

    • Added post-cleaning narrative normalization in aggregation output:
      • BLUF/Bottom Line Up Front headings → Lede
      • Decisions This Brief Supports → Decisions and confidence context
    • On first confidence-code occurrence, injects inline reader explanation (e.g. HIGH (B2, …)).
    • On first HDxxxxx occurrence, adds explicit context as a Riksdag document identifier.
    • First-use annotations (confidence-code gloss, document-ID contextualization) are tracked once per rendered article via caller-owned state threaded across all artifact bodies, rather than re-emitting in every artifact.
    • Narrative rewrites are gated to English source bodies (lang === 'en'); non-English bodies pass through untouched so English copy is never injected into localized prose.
    • Document-ID contextualization is scoped to the HD prefix by design — the only bare Riksdag document-identifier token in these artifacts — and the rationale is documented in code (other prefixes appear as session-scoped YYYY/NN:NNN references whose trailing number is not a global document id).
  • Reader Intelligence Guide wording across locales

    • Removed BLUF-centric wording from guide entry labels and aligned with lede/editorial language across language bundles (with localized phrasing per locale).
    • Localized the lede label to native journalistic terms instead of the English "Lede": de → Aufmacher, fr → Chapeau, es → Entradilla, nl → Intro, no → Ingress, fi → Ingressi (sv → Ingress; ja/ko/zh already localized).
  • Prompt contract alignment

    • Updated article-generation prompt copy to instruct and describe journalistic lede framing instead of BLUF framing.
  • Targeted test updates

    • Updated aggregation/reader-guide/title expectation tests to the new narrative/deep-dive headings.
    • Added focused assertions for terminology normalization (lede rename, confidence explanation, first document-ID contextualization).
    • Added tests asserting first-use state is threaded once per article across artifact bodies and that non-English bodies are returned unchanged.
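
The heading renames listed under terminology normalization can be sketched roughly as follows. This is a minimal illustration, not the merged implementation: the function name `renameNarrativeHeadings` is hypothetical, and the patterns are a variation on the ones quoted in the review, with whitespace limited to spaces and tabs so that a blank line after the heading is preserved.

```typescript
// Hypothetical helper: rewrites BLUF-style and decision headings only.
// Heading level (## to ######) is captured and reused in the replacement.
function renameNarrativeHeadings(body: string): string {
  return body
    .replace(
      /^(#{2,6})[ \t]*(?:🎯[ \t]*)?(?:BLUF(?:[ \t]*\(Bottom Line Up Front\))?|Bottom Line Up Front)[ \t]*$/gim,
      '$1 Lede',
    )
    .replace(
      /^(#{2,6})[ \t]*Decisions This Brief Supports[ \t]*$/gim,
      '$1 Decisions and confidence context',
    );
}

const renamedSample = renameNarrativeHeadings(
  '## 🎯 BLUF\n\nText\n\n### Decisions This Brief Supports',
);
console.log(renamedSample);
```

Only whole-line headings are rewritten, so a mention of "BLUF" inside running prose is left alone.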
// Example normalization in aggregated body output
"## 🎯 BLUF"                        -> "## Lede"
"### Decisions This Brief Supports" -> "### Decisions and confidence context"
"HIGH (B2)"                         -> "HIGH (B2, high confidence, corroborated by multiple sources)"
"HD03271"                           -> "Riksdag document #03271 (HD03271)" // first occurrence

github-actions bot added the size-xs (Extra small change, < 10 lines) label May 28, 2026
@github-actions
Contributor

🏷️ Automatic Labeling Summary

This PR has been automatically labeled based on the files changed and PR metadata.

Applied Labels: size-xs

Label Categories

  • 🗳️ Content: news, dashboard, visualization, intelligence
  • 💻 Technology: html-css, javascript, workflow, security
  • 📊 Data: cia-data, riksdag-data, data-pipeline, schema
  • 🌍 I18n: i18n, translation, rtl
  • 🔒 ISMS: isms, iso-27001, nist-csf, cis-controls
  • 🏗️ Infrastructure: ci-cd, deployment, performance, monitoring
  • 🔄 Quality: testing, accessibility, documentation, refactor
  • 🤖 AI: agent, skill, agentic-workflow

For more information, see .github/labeler.yml.

@github-actions
Contributor

🔍 Lighthouse Performance Audit

| Category | Score | Status |
| --- | --- | --- |
| Performance | 85/100 | 🟡 |
| Accessibility | 95/100 | 🟢 |
| Best Practices | 90/100 | 🟢 |
| SEO | 95/100 | 🟢 |

📥 Download full Lighthouse report

Budget Compliance: Performance budgets enforced via budget.json

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
github-actions bot added the documentation (Documentation updates), testing (Test coverage), refactor (Code refactoring), and size-m (Medium change, 50-250 lines) labels May 28, 2026

Copilot AI changed the title from "[WIP] Transform artifact-collage structure into reader-friendly journalism" to "Refactor article aggregation into narrative-first journalism structure" May 28, 2026
Copilot AI requested a review from pethers May 28, 2026 16:34
@pethers pethers marked this pull request as ready for review May 28, 2026 16:49
Copilot AI review requested due to automatic review settings May 28, 2026 16:49
Contributor

Copilot AI left a comment

Pull request overview

Refactors aggregated article output from intelligence/BLUF framing to reader-first journalism: renames the top-level section titles (Executive Brief → What Happened, Synthesis Summary → Why It Matters, Intelligence Assessment → Key Findings), prefixes audit/technical artifacts with "Deep Dive:", introduces a normalizeNarrativeTerminology post-cleaning step that rewrites BLUF headings to "Lede", renames "Decisions This Brief Supports", expands the first confidence-code mention with a plain-language gloss, and contextualizes the first HDxxxxx document id. Reader-guide labels across all 14 locales are updated and the article-generation prompt is reworded to match.

Changes:

  • New normalizeNarrativeTerminology cleaning helper wired into aggregateAnalysis, with regex-based rewrites for BLUF/decision headings, confidence codes, and HD document ids.
  • Updated SECTION_TITLES in aggregator/order.ts (narrative titles + "Deep Dive:" prefixes) and corresponding test expectations including audit-anchor slug changes.
  • Reader-guide entry labels swapped from "BLUF…" to "Lede/Ingress/…" across every reader-guide-i18n/*.ts bundle and updated prompt copy.
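
As a rough illustration of the title and anchor changes, the sketch below maps artifact filenames to narrative titles and derives an anchor slug from each title. It is an assumption-laden example rather than the contents of aggregator/order.ts: the constant `SECTION_TITLES_SKETCH`, the `slugify` helper, and the `methodology-reflection.md` entry are hypothetical; only the three narrative titles and the "Deep Dive:" prefix come from this PR.

```typescript
// Hypothetical excerpt: three narrative titles from the PR plus one
// made-up audit artifact to show the "Deep Dive:" prefix.
const SECTION_TITLES_SKETCH: Record<string, string> = {
  'executive-brief.md': 'What Happened',
  'synthesis-summary.md': 'Why It Matters',
  'intelligence-assessment.md': 'Key Findings',
  'methodology-reflection.md': 'Deep Dive: Methodology Reflection', // hypothetical artifact name
};

// Hypothetical slug helper: lowercase, collapse non-alphanumeric runs to "-",
// then trim leading and trailing dashes.
function slugify(title: string): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
}

for (const file of Object.keys(SECTION_TITLES_SKETCH)) {
  console.log(file, '->', slugify(SECTION_TITLES_SKETCH[file]));
}
```

Under this scheme a "Deep Dive:" title yields an anchor beginning with `deep-dive-`, which is consistent with the updated audit-anchor expectations in the tests.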

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 4 comments.

Show a summary per file
| File | Description |
| --- | --- |
| scripts/render-lib/aggregator/cleaning/structural.ts | Adds normalizeNarrativeTerminology with per-call first-use tracking for confidence codes and HD doc ids. |
| scripts/render-lib/aggregator/aggregate.ts | Pipes each cleaned artifact body through normalizeNarrativeTerminology. |
| scripts/render-lib/aggregator/order.ts | Renames section titles to narrative wording and prefixes audit artifacts with "Deep Dive:". |
| scripts/render-lib/aggregator/reader-guide.ts | Updates the canonical executive-brief reader-guide label to "Lede and editorial decisions". |
| scripts/render-lib/aggregator/reader-guide-i18n/{en,sv,da,de,es,fi,fr,ja,ko,nl,no,zh}.ts | Localizes the executive-brief reader-guide label away from "BLUF" wording (some locales reuse English "Lede"). |
| tests/render-lib.test.ts | Updates section/title expectations and adds an integration test for the new normalization. |
| tests/render-lib-leaf-modules.test.ts | Adds a unit test for normalizeNarrativeTerminology and updates audit-anchor expectations to the "deep-dive-" prefix. |
| .github/prompts/06-article-generation.md | Reframes prompt language from "BLUF" to "journalistic lede". |

Comment on lines +104 to +122
  let confidenceExplained = false;
  out = out.replace(/\b(HIGH|MEDIUM|LOW)\s*\(([A-C]\d)\)/g, (match, band: string, code: string) => {
    if (confidenceExplained) return match;
    confidenceExplained = true;
    const explanation =
      band === 'HIGH'
        ? 'high confidence, corroborated by multiple sources'
        : band === 'MEDIUM'
          ? 'medium confidence, partial corroboration'
          : 'low confidence, limited corroboration';
    return `${band} (${code}, ${explanation})`;
  });

  let firstDocContextualized = false;
  out = out.replace(/\b(HD(\d{5,}))\b/g, (match, fullId: string, numericId: string) => {
    if (firstDocContextualized) return match;
    firstDocContextualized = true;
    return `Riksdag document #${numericId} (${fullId})`;
  });
Member

Fixed in 1833ffd. Hoisted the first-use flags into a caller-owned NarrativeNormalizationState; aggregateAnalysis now creates one state object and threads it through every normalizeNarrativeTerminology call, so the confidence-code gloss and Riksdag document #… contextualization fire once per article instead of once per artifact.
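
A minimal sketch of the caller-owned state described in this reply, assuming a two-flag shape for `NarrativeNormalizationState` (the PR names the type but its fields are not shown here). The helpers `createNarrativeState`, `annotateFirstUse`, and `annotateArticle` are hypothetical names used for illustration only.

```typescript
// Assumed shape: one flag per first-use annotation.
interface NarrativeNormalizationState {
  confidenceExplained: boolean;
  firstDocContextualized: boolean;
}

function createNarrativeState(): NarrativeNormalizationState {
  return { confidenceExplained: false, firstDocContextualized: false };
}

const CONFIDENCE_GLOSS: Record<string, string> = {
  HIGH: 'high confidence, corroborated by multiple sources',
  MEDIUM: 'medium confidence, partial corroboration',
  LOW: 'low confidence, limited corroboration',
};

// Annotates only the first confidence code and first HD id seen through `state`.
function annotateFirstUse(body: string, state: NarrativeNormalizationState): string {
  let out = body.replace(
    /\b(HIGH|MEDIUM|LOW)\s*\(([A-C]\d)\)/g,
    (match: string, band: string, code: string) => {
      if (state.confidenceExplained) return match;
      state.confidenceExplained = true;
      return `${band} (${code}, ${CONFIDENCE_GLOSS[band]})`;
    },
  );
  out = out.replace(/\b(HD(\d{5,}))\b/g, (match: string, fullId: string, numericId: string) => {
    if (state.firstDocContextualized) return match;
    state.firstDocContextualized = true;
    return `Riksdag document #${numericId} (${fullId})`;
  });
  return out;
}

// One state object per article, shared across every artifact body.
function annotateArticle(bodies: string[]): string[] {
  const state = createNarrativeState();
  return bodies.map((body) => annotateFirstUse(body, state));
}

console.log(annotateArticle(['Assessed HIGH (B2) on HD03271.', 'Also MEDIUM (C3) for HD03272.']));
```

Because the flags live outside the function, the second body passes through unannotated, which is the once-per-article behavior the reply describes.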

Comment on lines +94 to +125
export function normalizeNarrativeTerminology(body: string): string {
  let out = body.replace(
    /^(#{2,6})\s*(?:🎯\s*)?(?:BLUF(?:\s*\(Bottom Line Up Front\))?|Bottom Line Up Front)\s*$/gim,
    '$1 Lede',
  );
  out = out.replace(
    /^(#{2,6})\s*Decisions This Brief Supports\s*$/gim,
    '$1 Decisions and confidence context',
  );

  let confidenceExplained = false;
  out = out.replace(/\b(HIGH|MEDIUM|LOW)\s*\(([A-C]\d)\)/g, (match, band: string, code: string) => {
    if (confidenceExplained) return match;
    confidenceExplained = true;
    const explanation =
      band === 'HIGH'
        ? 'high confidence, corroborated by multiple sources'
        : band === 'MEDIUM'
          ? 'medium confidence, partial corroboration'
          : 'low confidence, limited corroboration';
    return `${band} (${code}, ${explanation})`;
  });

  let firstDocContextualized = false;
  out = out.replace(/\b(HD(\d{5,}))\b/g, (match, fullId: string, numericId: string) => {
    if (firstDocContextualized) return match;
    firstDocContextualized = true;
    return `Riksdag document #${numericId} (${fullId})`;
  });

  return out;
}
Member

Fixed in 1833ffd. Added a lang parameter (default 'en'); the English-only rewrites now early-return for any non-English body, so localized prose is left untouched. Added a regression test asserting a Swedish body passes through unchanged.
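
The language gate described in this reply might look roughly like the sketch below. It assumes a simplified, hypothetical function `normalizeForLang` with a single heading rewrite standing in for the full set of English-only rules; the default value `'en'` and the early return follow the reply.

```typescript
// Hypothetical stand-in for the real normalizer: one English-only rewrite,
// skipped entirely for any non-English body.
function normalizeForLang(body: string, lang: string = 'en'): string {
  if (lang !== 'en') return body; // localized prose passes through untouched
  return body.replace(/^(#{2,6})[ \t]*BLUF[ \t]*$/gim, '$1 Lede');
}

const swedishBody = '## BLUF\nRegeringen föreslår en ny lag.';
console.log(normalizeForLang(swedishBody, 'sv') === swedishBody);
console.log(normalizeForLang('## BLUF\nThe government proposes a new law.'));
```

Returning the input unchanged keeps the gate cheap and makes the regression case easy to assert: the Swedish body must compare equal to itself after the call.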

Comment on lines +117 to +122
  let firstDocContextualized = false;
  out = out.replace(/\b(HD(\d{5,}))\b/g, (match, fullId: string, numericId: string) => {
    if (firstDocContextualized) return match;
    firstDocContextualized = true;
    return `Riksdag document #${numericId} (${fullId})`;
  });
Member

Documented the scope in 1833ffd. HD is the only bare Riksdag document-identifier token in these artifacts (matching BILL_ID_RE in seo/brief-extractor.ts, e.g. HD03271). The other prefixes (propositions, motions, interpellations, etc.) appear as session-scoped YYYY/NN:NNN references like prop. 2025/26:267 / MOT 2023/24:1234, where the trailing number is not a global document id — framing those as "Riksdag document #…" would be incorrect, so they're intentionally out of scope.
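
To make the scoping concrete, the sketch below runs the HD pattern from the diff over a sentence that mixes a bare HD identifier with session-scoped references. The helper name `findBareDocIds` is hypothetical; the regular expression is the one quoted in the review.

```typescript
// Pattern from the reviewed diff: "HD" followed by five or more digits,
// bounded on both sides by word boundaries.
const HD_DOC_ID = /\b(HD(\d{5,}))\b/g;

// Hypothetical helper: returns every bare HD identifier found in the text.
function findBareDocIds(text: string): string[] {
  return text.match(HD_DOC_ID) || [];
}

console.log(findBareDocIds('See HD03271, prop. 2025/26:267 and MOT 2023/24:1234.'));
```

The session-scoped references contain no `HD` token, so they never reach the "Riksdag document #…" rewrite, which matches the intent stated above.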

  export const ENTRIES: Record<string, ReaderGuideEntryI18n> = {
    'executive-brief.md': {
-     label: 'BLUF und redaktionelle Entscheidungen',
+     label: 'Lede und redaktionelle Entscheidungen',
Member

Fixed in 1833ffd. Replaced the verbatim "Lede" with native journalistic terms: de → Aufmacher, fr → Chapeau, es → Entradilla, nl → Intro, no → Ingress, fi → Ingressi.
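
For reference, the locale terms named in this thread can be collected in a small lookup, sketched below. The map `LEDE_TERM` and the function `ledeTerm` are hypothetical and do not reflect how the reader-guide-i18n bundles are actually organized; the English fallback for unlisted locales is an assumption of this sketch, not something the PR states.

```typescript
// Terms taken from the PR description and this reply; other locales are omitted.
const LEDE_TERM: Record<string, string> = {
  de: 'Aufmacher',
  fr: 'Chapeau',
  es: 'Entradilla',
  nl: 'Intro',
  no: 'Ingress',
  fi: 'Ingressi',
  sv: 'Ingress',
};

// Assumption: fall back to English "Lede" when a locale is not in the map.
function ledeTerm(locale: string): string {
  return Object.prototype.hasOwnProperty.call(LEDE_TERM, locale) ? LEDE_TERM[locale] : 'Lede';
}

console.log(ledeTerm('de'), ledeTerm('fi'), ledeTerm('en'));
```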

…ope doc-id + localize lede labels

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
github-actions bot added the size-l (Large change, 250-1000 lines) label May 28, 2026

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated no new comments.

@pethers pethers merged commit 12417e3 into main May 28, 2026
19 checks passed
@pethers pethers deleted the copilot/improve-article-narrative-flow branch May 28, 2026 20:33

Labels

  • documentation: Documentation updates
  • refactor: Code refactoring
  • size-l: Large change (250-1000 lines)
  • size-m: Medium change (50-250 lines)
  • size-xs: Extra small change (< 10 lines)
  • testing: Test coverage

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve article narrative flow: transform artifact-collage structure into reader-friendly journalism

3 participants