news-translate: pivot to executive-brief markdown across 14 languages, 3 runs/day#2519
…SLATION_GUIDE section added, lock regenerated Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/c214aa5b-d447-43ed-94f7-74cd922495f1 Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
🏷️ Automatic Labeling Summary
This PR has been automatically labeled based on the files changed and PR metadata.
Applied Labels: documentation, workflow, i18n, translation, ci-cd, size-l, news, agentic-workflow
Label Categories
For more information, see |
🔍 Lighthouse Performance Audit
📥 Download full Lighthouse report Budget Compliance: Performance budgets enforced via |
…dules and per-type workflow descriptions Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/3680553b-1502-4360-bff0-d59ef3ff4dc3 Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
…very PR Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/c7b1a7ec-dd0b-4875-9a26-5fa9b0131fd6 Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Pull request overview
This pull request repurposes the news-translate workflow from HTML translation to executive-brief Markdown translation across 13 non-English target languages, and adds a defense-in-depth PR gate to enforce ownership and structural parity for those executive brief translations.
Changes:
- Extend translation documentation and contracts for `analysis/daily/**/executive-brief.md` → `executive-brief_<lang>.md`.
- Add/extend validators + unit tests for executive-brief translation parity and file-ownership boundaries.
- Add a PR-check workflow (`exec-brief-translation-checks.yml`) to re-run ownership + parity validation on relevant PRs.
Reviewed changes
Copilot reviewed 40 out of 40 changed files in this pull request and generated 23 comments.
Show a summary per file
| File | Description |
|---|---|
| TRANSLATION_GUIDE.md | Adds executive-brief markdown translation contract and acceptance checklist. |
| scripts/validate-file-ownership.ts | Extends ownership validation to executive-brief markdown and adds file-list/category auto-detection. |
| scripts/validate-executive-brief-translations.ts | Adds structural-parity validator for executive-brief translations. |
| tests/validate-file-ownership-exec-brief.test.ts | Unit tests for new executive-brief ownership rules and category detection. |
| tests/validate-executive-brief-translations.test.ts | Unit tests for executive-brief structural counters + validation checks. |
| .github/workflows/exec-brief-translation-checks.yml | New PR gate that runs ownership + parity validation on PRs touching translation surfaces. |
| .github/workflows/news-*.lock.yml | Recompiled agentic workflow lockfiles (safe-outputs + PR creation flow). |
| - `news/$DATE-$SUB-sv.html` (Swedish master) | ||
|
|
||
| The remaining 12 language variants (`da`, `nb`, `fi`, `de`, `fr`, `es`, `nl`, `ar`, `he`, `ja`, `ko`, `zh`) are produced **exclusively** by the standalone [`news-translate`](.github/workflows/news-translate.md) workflow, which: | ||
| For HTML article variants, the remaining 12 language renderings (`da`, `no`, `fi`, `de`, `fr`, `es`, `nl`, `ar`, `he`, `ja`, `ko`, `zh`) are emitted inline by the per-type workflows themselves via the per-language `article.<lang>.md` step inside `06-article-generation.md`. The standalone [`news-translate`](.github/workflows/news-translate.md) workflow no longer owns HTML translation; its current mission is **executive-brief Markdown translation** — see §"Executive Brief Markdown Translations" below. |
| - [ ] Every `dok_id`, intressent ID, vote ID, and external URL from the source appears in the translation. | ||
| - [ ] No verbatim English BLUF / "Decisions" / "Confidence" labels remain in non-English files (validator scans a banned-phrase list). | ||
| - [ ] `ar` / `he` files start with `<!-- dir: rtl -->`. |
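The RTL item in the checklist above could be checked roughly as follows. This is a minimal sketch, not the validator's actual code: `hasRequiredRtlMarker`, `RTL_LANGS`, and `RTL_MARKER` are hypothetical names, and tolerating leading whitespace before the marker is an assumption.

```typescript
// Hypothetical helper; the real validator's function names are not shown in this thread.
const RTL_LANGS: ReadonlySet<string> = new Set(['ar', 'he']);
const RTL_MARKER = '<!-- dir: rtl -->';

/** Returns true when the file satisfies the RTL-marker rule for its language. */
function hasRequiredRtlMarker(lang: string, md: string): boolean {
  // The rule only applies to Arabic and Hebrew files.
  if (!RTL_LANGS.has(lang)) return true;
  // Assumption: leading whitespace (including a BOM) before the marker is tolerated.
  return md.trimStart().startsWith(RTL_MARKER);
}
```

A stricter reading of "start with" would drop the `trimStart()` call and require the marker at byte zero.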
| | Evidence-anchor table column headers present in expected order | exact | | ||
| | `dok_id` references preserved (set equality) | exact | | ||
| | Image / external-URL set preserved | exact | | ||
| | Top-level word count | source ±25 % (translations expand or contract by language; outside ±25 % flags drift) | |
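The ±25 % word-count row can be illustrated with a small sketch. The names `countWords` and `withinWordTolerance` are hypothetical, and whitespace tokenization is an assumption; the thread does not show how the validator counts words or what "top-level" excludes.

```typescript
/**
 * Count whitespace-separated tokens. This is a rough proxy and is weak for
 * ja / zh, where words are not delimited by spaces.
 */
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === '' ? 0 : trimmed.split(/\s+/).length;
}

/** True when the translation's word count is within `tolerance` of the source's. */
function withinWordTolerance(
  sourceWords: number,
  translatedWords: number,
  tolerance = 0.25,
): boolean {
  // Avoid dividing by zero: an empty source only matches an empty translation.
  if (sourceWords === 0) return translatedWords === 0;
  return Math.abs(translatedWords - sourceWords) / sourceWords <= tolerance;
}
```

With a 100-word source, 75 and 125 words sit exactly on the boundary and pass; 74 and 126 would be flagged as drift.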
| * Enforces a strict file-ownership contract between content and translation workflows: | ||
| * - Content workflows (news-committee-reports, news-propositions, etc.) own EN/SV files | ||
| * - Translation workflow (news-translate) owns all other language files (DA/NO/FI/DE/FR/ES/NL/AR/HE/JA/KO/ZH) | ||
| * - Content workflows (news-committee-reports, news-propositions, etc.) own | ||
| * EN/SV `news/*.html` files **and** the English-master executive brief | ||
| * `analysis/daily/$DATE/$SUB/executive-brief.md`. | ||
| * - Translation workflow (news-translate) owns the 13 non-English `news/*.html` | ||
| * files **and** all `analysis/daily/$DATE/$SUB/executive-brief_<lang>.md` | ||
| * files (for the 13 non-English target languages). |
| /** | ||
| * Auto-detect the workflow category from a list of changed file paths. | ||
| * | ||
| * Detection rules: | ||
| * - If any file is an `executive-brief_<lang>.md` translation, OR | ||
| * any `news/*.html` for a non-English/non-Swedish language → `translation`. | ||
| * - Otherwise → `content` (English/Swedish HTML + English executive-brief source). | ||
| * | ||
| * Files outside the ownership surface (\`news/*.html\` and | ||
| * \`analysis/daily/(any)/executive-brief(_lang).md\`) are ignored for detection. | ||
| * | ||
| * Returns `null` if no ownership-surface files are present (caller should treat | ||
| * the check as a no-op pass). | ||
| * | ||
| * @param files - Array of repo-relative file paths to inspect | ||
| * @returns The inferred workflow category, or `null` if no surface files | ||
| */ | ||
| export function detectCategoryFromFiles( | ||
| files: readonly string[], | ||
| ): WorkflowCategory | null { | ||
| let sawSurfaceFile = false; | ||
| for (const f of files) { | ||
| if (isExecutiveBriefTranslation(f)) return 'translation'; | ||
| if (isEnglishExecutiveBriefSource(f)) { | ||
| sawSurfaceFile = true; | ||
| continue; | ||
| } | ||
| if (f.startsWith('news/') && f.endsWith('.html')) { | ||
| sawSurfaceFile = true; | ||
| const lang = extractLangFromPath(f); | ||
| if (lang && (TRANSLATION_LANGS as readonly string[]).includes(lang)) { | ||
| return 'translation'; | ||
| } | ||
| } | ||
| } | ||
| return sawSurfaceFile ? 'content' : null; |
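The snippet above relies on path predicates that are not shown in this thread. As a rough illustration only, the executive-brief half of the ownership surface could be classified like this; `classifyBriefPath` and both regexes are hypothetical stand-ins, and the two-letter language suffix is an assumption.

```typescript
// Hypothetical stand-ins for isExecutiveBriefTranslation / isEnglishExecutiveBriefSource.
const BRIEF_TRANSLATION = /^analysis\/daily\/.+\/executive-brief_([a-z]{2})\.md$/;
const BRIEF_SOURCE = /^analysis\/daily\/.+\/executive-brief\.md$/;

type BriefKind = 'translation' | 'source' | 'other';

/** Classify a repo-relative path against the executive-brief naming scheme. */
function classifyBriefPath(path: string): BriefKind {
  if (BRIEF_TRANSLATION.test(path)) return 'translation';
  if (BRIEF_SOURCE.test(path)) return 'source';
  return 'other'; // outside the executive-brief surface
}
```

Keeping the classification in one pure function makes the early-return logic in the detector easy to unit-test without touching git.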
| token: ${{ secrets.GH_AW_GITHUB_TOKEN || secrets.GITHUB_TOKEN }} | ||
| persist-credentials: false | ||
| fetch-depth: 0 | ||
| fetch-depth: 1 |
| ref: ${{ steps.extract-base-branch.outputs.base-branch || github.base_ref || github.event.pull_request.base.ref || github.ref_name || github.event.repository.default_branch }} | ||
| token: ${{ secrets.GH_AW_GITHUB_TOKEN || secrets.GITHUB_TOKEN }} | ||
| persist-credentials: false | ||
| fetch-depth: 0 | ||
| fetch-depth: 1 |
|
@copilot apply changes based on the comments in this thread. Do not change lock files; they are generated when compiling.
…ases, set equality
- Fix CodeQL shell injection: use execFileSync instead of execSync for git commands
- Fix CodeQL incomplete sanitization: loop stripFencesAndComments until stable
- Update file-ownership: per-type workflows own ALL news/*.html (all 14 langs)
- news-translate no longer owns any news/*.html, only executive-brief_<lang>.md
- Fix detectCategoryFromFiles: news/*.html no longer implies translation category
- Remove news/*.html from exec-brief-translation-checks.yml path trigger
- Add banned phrases: Decisions, Confidence, Key Takeaways, What Happened, What It Means
- Tighten dok_id/URL checks to set equality (flag extras, not just missing)
- Fix TRANSLATION_GUIDE.md: remove conflicting "exactly two" vs "all 14" statement
- Update tests to match new ownership model (all 38 ownership tests pass)
- Full test suite: 5845 passed, 0 failures
Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/1e32bcff-0fbe-41ef-8740-736f3a78c44e
Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
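The "loop until stable" fix mentioned in this commit addresses a common sanitization pitfall: removing a pattern once can splice the surrounding text into a new instance of the same pattern. A minimal sketch of the idea follows; `stripOnce` and `stripUntilStable` are hypothetical names and the regexes are assumptions, not the body of the repository's `stripFencesAndComments`.

```typescript
/** One pass: drop HTML comments and triple-backtick fenced blocks (assumed patterns). */
function stripOnce(md: string): string {
  return md
    .replace(/<!--[\s\S]*?-->/g, '')
    .replace(/^```[^\n]*\n[\s\S]*?^```[^\n]*$/gm, '');
}

/** Re-apply the pass until the text stops changing (a fixpoint). */
function stripUntilStable(md: string): string {
  let prev: string;
  let cur = md;
  do {
    prev = cur;
    cur = stripOnce(cur);
  } while (cur !== prev); // terminates: each pass either shortens the text or leaves it unchanged
  return cur;
}
```

For example, a single pass over `<!-<!-- x -->- hidden -->` leaves `<!-- hidden -->` behind, while the looped version reduces it to an empty string.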
Applied all review feedback in
Full test suite: 5845 passed, 0 failures. |
| if ! npx tsx scripts/validate-executive-brief-translations.ts --source "$src" --soft; then | ||
| FAIL=1 | ||
| fi | ||
| done | ||
|
|
||
| # `--soft` keeps the per-source command exit code green so we can | ||
| # collect every failure in one report. Convert to a hard failure now. |
| // Parse flags: --files <comma-list>, --files-from <path|->, --category <c|t>, plain positional category. | ||
| let filesArg: string | undefined; | ||
| let filesFromArg: string | undefined; | ||
| let categoryArg: WorkflowCategory | undefined; | ||
| for (let i = 0; i < args.length; i++) { | ||
| const a = args[i]; | ||
| if (a === '--files' && args[i + 1]) { | ||
| filesArg = args[++i]; | ||
| } else if (a === '--files-from' && args[i + 1]) { | ||
| filesFromArg = args[++i]; | ||
| } else if (a === '--category' && args[i + 1]) { | ||
| categoryArg = args[++i] as WorkflowCategory; | ||
| } else if (a === 'content' || a === 'translation') { | ||
| categoryArg = a; | ||
| } else if (a === '--help' || a === '-h') { | ||
| console.log( | ||
| 'Usage:\n' + | ||
| ' validate-file-ownership.ts <content|translation> # validate git working tree\n' + | ||
| ' validate-file-ownership.ts --files <a.md,b.md> [--category <c|t>]\n' + | ||
| ' validate-file-ownership.ts --files-from <path|-> [--category <c|t>]\n' + | ||
| '\nIf --category is omitted with --files / --files-from, it is auto-detected:\n' + | ||
| ' any executive-brief_<lang>.md OR non-EN/SV news/*.html -> translation\n' + | ||
| ' otherwise -> content', | ||
| ); |
| /** Count fenced code blocks (any info string). */ | ||
| export function countCodeFences(md: string): number { | ||
| const matches = md.match(/^```/gm); | ||
| // Each fence is one of opening/closing; divide by 2. | ||
| return matches ? Math.floor(matches.length / 2) : 0; | ||
| } |
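Because `Math.floor` silently absorbs an odd number of fence lines, an unclosed fence in a translation would be counted as if it were well-formed. One possible stricter variant is sketched below; `countCodeFencesStrict` is a hypothetical name, and like the original it only recognizes triple-backtick fences at the start of a line.

```typescript
/**
 * Returns the number of fenced blocks, or null when the fence lines do not
 * pair up (an odd count means at least one fence is left unclosed).
 */
function countCodeFencesStrict(md: string): number | null {
  const fenceLines = md.match(/^```/gm) ?? [];
  if (fenceLines.length % 2 !== 0) return null;
  return fenceLines.length / 2;
}
```

A caller can then treat `null` as a validation error instead of comparing a rounded-down count against the source.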
| /** Extract the trailing `<!-- source-sha: <40-hex> -->` marker, or null if missing/malformed. */ | ||
| export function extractSourceShaMarker(md: string): string | null { | ||
| const match = md.match(/<!--\s*source-sha:\s*([0-9a-f]{40})\s*-->/i); | ||
| return match?.[1] ?? null; | ||
| } |
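The regex above matches the marker anywhere in the file, although the doc comment describes it as trailing. A sketch of an end-anchored variant is shown here; `extractTrailingSourceSha` is a hypothetical name, and ignoring trailing whitespace after the marker is an assumption.

```typescript
/**
 * Extract the source-sha marker only when it is the last non-whitespace
 * content of the file; returns null otherwise.
 */
function extractTrailingSourceSha(md: string): string | null {
  // Without the `m` flag, `$` matches only at the very end of the string.
  const match = md
    .trimEnd()
    .match(/<!--\s*source-sha:\s*([0-9a-f]{40})\s*-->$/i);
  return match?.[1] ?? null;
}
```

With this variant, a marker followed by further prose no longer counts as present.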
| * - fenced-code-block count parity (±0), | ||
| * - Mermaid fenced-block count parity (±0), | ||
| * - dok_id reference preservation (set equality, source ⊆ translation), | ||
| * - external URL preservation (set equality, source ⊆ translation), | ||
| * - RTL marker present for `ar` / `he`, | ||
| * - trailing `<!-- source-sha: <40-hex> -->` marker present and well-formed, | ||
| * - no banned English BLUF phrases in non-English files, | ||
| * - top-level word count within ±25% of the source. |
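The list above describes the `dok_id` and URL checks as "set equality" and as "source ⊆ translation" at once; these are different conditions, since a subset check accepts extra references in the translation while equality does not. A small sketch of a two-sided comparison, assuming a hypothetical helper `diffSets`:

```typescript
/** Report both directions of mismatch between source and translation references. */
function diffSets<T>(
  source: Iterable<T>,
  translation: Iterable<T>,
): { missing: T[]; extra: T[] } {
  const src = new Set(source);
  const dst = new Set(translation);
  return {
    // In the source but absent from the translation (a subset check catches only these).
    missing: Array.from(src).filter((x) => !dst.has(x)),
    // In the translation but absent from the source (only set equality catches these).
    extra: Array.from(dst).filter((x) => !src.has(x)),
  };
}
```

Set equality holds exactly when both `missing` and `extra` are empty, and reporting the two lists separately tells a translator which references to restore and which to remove.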
| token: ${{ secrets.GH_AW_GITHUB_TOKEN || secrets.GITHUB_TOKEN }} | ||
| persist-credentials: false | ||
| fetch-depth: 0 | ||
| fetch-depth: 1 |
|
@copilot apply changes based on the comments in this thread |
Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/519ed5e3-4717-4b52-b218-131c911733d8 Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Addressed in 8d15dd3: the PR gate now hard-fails validation drift, file-ownership CLI help and aliases match behavior, malformed fences and non-trailing source-sha markers are rejected, worklist wording is repo-relative, and generated lock-file deltas are reverted. Targeted validator tests, ESLint, and CodeQL review passed. |
*.lock.yml