Conversation
…banned pattern subtitles, add script-generated metadata, meaningful document classification

- Fix cross-party voting anomaly detection to use oppositionAlignment instead of raw vote counts (fixes 8853% display bug)
- Improve subtitle generation to avoid banned "Analysis of N documents covering" pattern per ai-driven-analysis-guide.md
- Add "Produced By: script" metadata and warning banner to all script-generated analysis files
- Replace zero-significance stub in analyzeDocuments() with heuristic-based classification using detectPolicyDomainsWithConfidence
- Improve SWOT and threat serializers to produce meaningful "no data" messages when AI analysis is needed
- Wire synthesis key findings and risk summary into AnalysisEnrichment for article generation
- Add domain extraction from keyInsights in classification serializer

Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/e3fec942-345b-4828-aa02-11b0e5849838
Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
…article subtitles in generators.ts

Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/e3fec942-345b-4828-aa02-11b0e5849838
Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
…document Swedish-only detection

Address code review feedback: extract POLICY_DOMAIN_INSIGHT_PREFIX constant shared between pre-article-analysis.ts and markdown-serializer.ts, document why Swedish phrase detection is correct for Riksdag API data, add MOTION_PROPOSITION_RESPONSE_PREFIX constant.

Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/e3fec942-345b-4828-aa02-11b0e5849838
Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
🏷️ Automatic Labeling Summary

This PR has been automatically labeled based on the files changed and PR metadata.

Applied Labels: refactor, size-m
🔍 Lighthouse Performance Audit

📥 Download full Lighthouse report

Budget Compliance: Performance budgets enforced via
Pull request overview
This PR improves the analysis and article-generation pipeline by fixing a cross-party voting anomaly metric bug, removing/avoiding banned subtitle boilerplate, clearly tagging script-generated analysis output, and exposing additional pre-computed analysis fields for article enrichment.
Changes:
- Fix cross-party voting anomaly detection by using `oppositionAlignment` (percentage) instead of `crossPartyVotes` (raw count).
- Add script-generated provenance metadata/banners and improve “no data” messaging for SWOT/threat serializers.
- Expand `AnalysisEnrichment` and populate it from `readLatestAnalysis()`; update dynamic/en subtitles to avoid banned patterns.
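The percentage-vs-raw-count bug fixed here can be illustrated with a minimal sketch. The field names, threshold, and types below are illustrative assumptions, not the project's actual `risk-analysis.ts` code:

```typescript
// Hypothetical shape of a voting pattern record. In the buggy version the
// raw crossPartyVotes count was formatted as if it were a percentage, so a
// count like 8853 rendered as "8853%". The fix compares the bounded
// oppositionAlignment percentage instead.
interface VotingPattern {
  oppositionAlignment: number; // percentage in [0, 100]
  crossPartyVotes: number; // raw count, unbounded; kept only as context
}

// Illustrative threshold; the real value lives in the project's code.
const ANOMALY_THRESHOLD_PERCENT = 30;

function isCrossPartyAnomaly(p: VotingPattern): boolean {
  return p.oppositionAlignment > ANOMALY_THRESHOLD_PERCENT;
}
```

Comparing a bounded percentage against a percentage threshold keeps both the detection logic and the rendered value in the same unit.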
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| scripts/pre-article-analysis/markdown-serializer.ts | Adds “Produced By” + warning banner; improves empty SWOT/threat output; extracts policy domains from keyInsights when perspectives are missing. |
| scripts/pre-article-analysis.ts | Replaces stub analysis with heuristic scoring + policy-domain tagging; introduces shared prefixes/constants for downstream consumers. |
| scripts/generate-news-enhanced/helpers.ts | Extends AnalysisEnrichment and wires in synthesis/risk/date fields from the latest analysis snapshot. |
| scripts/generate-news-enhanced/generators.ts | Updates English subtitles to avoid templated “Analysis of …” phrasing. |
| scripts/data-transformers/risk-analysis.ts | Corrects anomaly detection to use oppositionAlignment (%), keeping crossPartyVotes as raw-count context. |
```typescript
/**
 * Prefix used by pre-article-analysis.ts to tag policy domain classifications in keyInsights.
 * Must match {@link POLICY_DOMAIN_INSIGHT_PREFIX} in the parent module.
 */
const POLICY_DOMAIN_INSIGHT_PREFIX = 'Policy domain:';
```

POLICY_DOMAIN_INSIGHT_PREFIX is duplicated here and in scripts/pre-article-analysis.ts, which creates a drift risk (the JSDoc even notes it must match). Additionally, the regex used for parsing (/^Policy domain:/...) hard-codes the prefix instead of reusing the constant, so even a small change to the prefix would silently break domain extraction. Consider moving the prefix to a small shared module imported by both producer and serializer, and building the regex from that constant (or parsing based on the constant rather than a literal).

Suggested change:

```typescript
/**
 * Prefix used to tag policy domain classifications in keyInsights.
 *
 * Keep all parsing in this file derived from this constant so a future prefix
 * change does not silently drift from regex-based extraction logic.
 */
const POLICY_DOMAIN_INSIGHT_PREFIX = 'Policy domain:';

function escapeRegExp(value: string): string {
  return value.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const POLICY_DOMAIN_INSIGHT_REGEX = new RegExp(
  `^${escapeRegExp(POLICY_DOMAIN_INSIGHT_PREFIX)}\\s*(.+?)\\s*$`,
  'i',
);

function extractPolicyDomainFromInsight(insight: string): string | null {
  const match = POLICY_DOMAIN_INSIGHT_REGEX.exec(insight.trim());
  return match?.[1]?.trim() || null;
}
```
```typescript
if (domainInsight) {
  const match = domainInsight.match(/^Policy domain:\s*(.+?)\s*\(/);
  if (match) {
    domains = match[1].split(',').map(d => d.trim());
```

When domains are extracted from keyInsights, the resulting array isn’t capped (unlike the perspectives-derived domains which are .slice(0, 3)). Since detectPolicyDomainsWithConfidence() can return multiple matches from keyword heuristics, this can produce very long “Domains” lines in the markdown. Consider applying the same .slice(0, 3) limit (or a dedicated max) after splitting to keep output consistent and readable.

Suggested change:

```typescript
domains = match[1]
  .split(',')
  .map(d => d.trim())
  .slice(0, 3);
```
@copilot apply changes based on the comments in this thread
…REFIX constant, cap domains to 3

Address PR review feedback:
- Build POLICY_DOMAIN_INSIGHT_REGEX from POLICY_DOMAIN_INSIGHT_PREFIX constant via escapeRegExp() to prevent silent drift if the prefix changes
- Add extractPolicyDomainFromInsight() helper to centralize parsing logic
- Cap keyInsights-derived domains with .slice(0, 3) for consistency with perspectives-derived domains

Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/218540b7-ad9a-43e0-ac94-01f8de79c0c3
Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Applied both review comments in
```typescript
 * Keep all parsing in this file derived from this constant so a future prefix
 * change does not silently drift from regex-based extraction logic.
 */
const POLICY_DOMAIN_INSIGHT_PREFIX = 'Policy domain:';
```

POLICY_DOMAIN_INSIGHT_PREFIX is defined here but also separately exported in scripts/pre-article-analysis.ts. Keeping two independent copies undermines the stated goal of avoiding silent drift between the producer and the regex-based parser. Consider defining this prefix in a single shared place (e.g., export it from this serializer module and import it in pre-article-analysis.ts, or move it to a small shared constants module) so both sides always stay in sync.

Suggested change:

```typescript
export const POLICY_DOMAIN_INSIGHT_PREFIX = 'Policy domain:';
```
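A single-source variant of this suggestion could look like the sketch below; the module boundary and export names are hypothetical, not the repo's actual layout:

```typescript
// Hypothetical shared constants module imported by both the producer
// (pre-article-analysis.ts) and the serializer (markdown-serializer.ts),
// so the prefix and the regex derived from it can never drift apart.
export const POLICY_DOMAIN_INSIGHT_PREFIX = 'Policy domain:';

// Escape regex metacharacters so the prefix can be embedded literally.
function escapeRegExp(value: string): string {
  return value.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Regex built from the constant: case-insensitive and whitespace-tolerant,
// so a prefix change updates producer and parser in one place.
export const POLICY_DOMAIN_INSIGHT_REGEX = new RegExp(
  `^${escapeRegExp(POLICY_DOMAIN_INSIGHT_PREFIX)}\\s*(.+?)\\s*$`,
  'i',
);
```

Any consumer then imports both names from this one module rather than redeclaring them.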
```typescript
  .filter((v, i, arr) => arr.indexOf(v) === i)
  .slice(0, 3);
if (domains.length === 0 && result.keyInsights.length > 0) {
  const domainInsight = result.keyInsights.find(i => i.startsWith(POLICY_DOMAIN_INSIGHT_PREFIX));
```

Domain fallback extraction looks for a matching key insight with startsWith(POLICY_DOMAIN_INSIGHT_PREFIX), which is case-sensitive and doesn't trim leading whitespace. This is inconsistent with the case-insensitive POLICY_DOMAIN_INSIGHT_REGEX used for parsing and can cause domains to be missed if the insight string casing/spacing changes. Prefer matching via the regex (e.g., test POLICY_DOMAIN_INSIGHT_REGEX against insight.trim()) and then parsing, rather than relying on startsWith.

Suggested change:

```typescript
const domainInsight = result.keyInsights
  .map(insight => insight.trim())
  .find(insight => POLICY_DOMAIN_INSIGHT_REGEX.test(insight));
```
@copilot apply changes based on the comments in this thread and fix failures in TypeScript & JavaScript Testing / Unit Tests (Vitest) (pull_request)
…fix CI test failures

Address second round of review feedback:
- Export POLICY_DOMAIN_INSIGHT_PREFIX from markdown-serializer.ts and import in pre-article-analysis.ts (eliminates duplication drift risk)
- Use POLICY_DOMAIN_INSIGHT_REGEX.test() instead of case-sensitive startsWith() for domain insight matching (consistent with regex-based extraction)

Fix CI test failures in analysis-quality-validation:
- Add per-file isStrictV2Format() checks in Mermaid, evidence density, and scoring tests; script-generated (v1) files in mixed-format directories are no longer subject to strict-v2 quality requirements
- Broaden LXI_SCORING_PATTERN to match Likelihood/Impact table headers and inline L:N × I:N patterns used in Mermaid diagrams
- Add COMMITTEE_CODE_PATTERN to countDokIds() so committee report abbreviations (FöU12, JuU15) count as valid document references alongside full dok_ids

All 4060 tests pass across 99 test files.

Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/96fe0c5f-1545-45ed-9a9f-082ba8fff464
Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Applied both review comments and fixed CI failures in

Review feedback:

CI fixes:

All 4060 tests pass, TypeScript clean, CodeQL 0 alerts.
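The committee-code matching described in the CI fixes could be sketched roughly as below. The constant name follows the commit message, but the regex itself and the helper are assumptions, not the repo's actual `countDokIds()` implementation:

```typescript
// Hypothetical approximation of COMMITTEE_CODE_PATTERN: a capital letter,
// optional lowercase letters (including Swedish å/ä/ö), a literal "U"
// (for "utskott", committee), then the report number, matching
// abbreviations such as "FöU12" or "JuU15".
const COMMITTEE_CODE_PATTERN = /[A-Z][a-zåäö]*U\d+/gu;

// Count committee report references in a text, the way countDokIds() might
// count them as valid document references alongside full dok_ids.
function countCommitteeRefs(text: string): number {
  return text.match(COMMITTEE_CODE_PATTERN)?.length ?? 0;
}
```

Counting these abbreviations alongside full dok_ids keeps the evidence-density checks from penalising files that cite committee reports in their customary short form.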
scripts/pre-article-analysis.ts
Outdated
```typescript
const organ = d.organ || d.committee || '';

// Domain classification using committee→domain mapping
const domainResult = detectPolicyDomainsWithConfidence(d, 'en');
```

analyzeDocuments() takes a _lang parameter but hardcodes 'en' when calling detectPolicyDomainsWithConfidence(). This makes the function signature misleading and prevents policy-domain labels from being localised if the script is ever run for non-English output. Use the _lang argument (or remove it entirely) so the language selection is consistent end-to-end.

Suggested change:

```typescript
const domainResult = detectPolicyDomainsWithConfidence(d, _lang);
```
@copilot apply changes based on the comments in this thread
Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/a8775231-40ce-4641-b44b-4cd1e9a036f8
Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Fixed in
```typescript
/** Key findings from pre-computed synthesis analysis (for article enrichment) */
synthesisKeyFindings?: string[];
/** Aggregate risk summary from pre-computed risk assessment */
riskSummary?: string;
/** Date of the analysis data (may differ from article date due to lookback) */
analysisDate?: string;
```

synthesisKeyFindings is populated from analysis.synthesis.keyThemes, but the name/JSDoc imply these are “key findings”. This is misleading for consumers and log output. Consider renaming the field to something like synthesisKeyThemes (and update the log label), or alternatively populate it with actual “findings” if that exists in the parsed analysis model.
@copilot apply changes based on the comments in this thread
Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/69e90da8-c2c2-4782-9b18-9368ad97e842
Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Renamed