Conversation
…t leaf modules Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/ccf21783-581f-4545-9c59-a0b0d523b8c3 Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
…ntext split Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/ccf21783-581f-4545-9c59-a0b0d523b8c3 Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
🏷️ Automatic Labeling SummaryThis PR has been automatically labeled based on the files changed and PR metadata. Applied Labels: documentation,testing,refactor,size-xl Label Categories
For more information, see |
🔍 Lighthouse Performance Audit
📥 Download full Lighthouse report Budget Compliance: Performance budgets enforced via |
…d-context leaf modules Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/6302e73d-bc77-4fbf-b173-8b7a60325d74 Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
🔍 Lighthouse Performance Audit
📥 Download full Lighthouse report Budget Compliance: Performance budgets enforced via |
…leaf modules + add 103 leaf tests Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/6302e73d-bc77-4fbf-b173-8b7a60325d74 Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
🔍 Lighthouse Performance Audit
📥 Download full Lighthouse report Budget Compliance: Performance budgets enforced via |
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR refactors the generator “monoliths” into bounded-context leaf modules with thin CLI shims, adds focused unit tests for the new leaves, and preserves byte-identical outputs (timestamps aside).
Changes:
- Split sitemap XML, sitemap HTML, RSS, political-intelligence, and render-lib markdown/aggregator logic into leaf modules with public barrels.
- Added new unit test suites to pin leaf-module invariants and barrel parity.
- Introduced lightweight structural validators for sitemap XML and RSS feeds.
Reviewed changes
Copilot reviewed 74 out of 80 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| tests/sitemap-xml-leaf-modules.test.ts | Adds unit tests for sitemap-xml leaf helpers (hreflang, url entry, validator). |
| tests/sitemap-html-leaf-modules.test.ts | Adds unit tests for sitemap-html escaping and LANGUAGE_META invariants. |
| tests/rss-leaf-modules.test.ts | Adds unit tests for RSS escape/hreflang/pubDate/article meta/validator helpers. |
| tests/render-lib-leaf-modules.test.ts | Adds extensive leaf-module tests for render-lib aggregator + markdown pipeline and barrel parity. |
| tests/political-intelligence-leaf-modules.test.ts | Adds unit tests for political-intelligence i18n/helpers and render helpers. |
| scripts/sitemap-xml/validator.ts | New sitemap.xml structural validator leaf. |
| scripts/sitemap-xml/scanners/news.ts | New recursive news scanner grouping per base slug and unified lastmod. |
| scripts/sitemap-xml/scanners/docs.ts | New docs/ HTML scanner leaf. |
| scripts/sitemap-xml/scanners/api.ts | New api/ (TypeDoc) HTML scanner leaf. |
| scripts/sitemap-xml/render/url-entry.ts | New pure <url> XML block builder with hreflang alternates. |
| scripts/sitemap-xml/render/sitemap.ts | New sitemap.xml builder composing entries and scanner outputs. |
| scripts/sitemap-xml/index.ts | Public barrel for sitemap-xml bounded context. |
| scripts/sitemap-xml/hreflang.ts | New hreflang mapping helper (no → nb). |
| scripts/sitemap-xml/git-timestamps.ts | New git timestamp preload cache with filesystem fallback. |
| scripts/sitemap-html/render/page.ts | New sitemap HTML page renderer using chrome, i18n, scanners, and inline styles. |
| scripts/sitemap-html/index.ts | Public barrel for sitemap-html bounded context. |
| scripts/sitemap-html/escape.ts | New HTML escaper leaf used by sitemap-html rendering. |
| scripts/sitemap-html/articles/scanner.ts | New recursive news scanner + HTML metadata extractor for sitemap HTML pages. |
| scripts/sitemap-html/articles/docs-sections.ts | New docs section probe for conditional sitemap HTML links. |
| scripts/rss/validator.ts | New rss.xml structural validator leaf. |
| scripts/rss/scanner.ts | New RSS news scanner (English-primary) with alternates. |
| scripts/rss/render/feed.ts | New RSS feed renderer composing items and channel metadata. |
| scripts/rss/pub-date.ts | New deterministic pubDate resolver (filename → mtime → sentinel). |
| scripts/rss/index.ts | Public barrel for RSS bounded context. |
| scripts/rss/hreflang.ts | New hreflang helper for RSS context (no → nb). |
| scripts/rss/escape.ts | New XML escaper leaf used by RSS rendering. |
| scripts/rss/article-meta.ts | New extractor for title/description/pubDate/author/category from article HTML. |
| scripts/render-lib/markdown/sanitize-schema.ts | Extracted rehype-sanitize schema and heading ID prefix constant. |
| scripts/render-lib/markdown/rehype-wrap-tables.ts | Extracted rehype plugin to wrap tables for horizontal scrolling. |
| scripts/render-lib/markdown/rehype-slug-prefixed.ts | Extracted prefixed rehype-slug variant to ensure stable IDs. |
| scripts/render-lib/markdown/pipeline.ts | Extracted unified/remark/rehype pipeline orchestrator. |
| scripts/render-lib/markdown/mermaid-preprocess.ts | Extracted pre-remark mermaid fence → <pre class="mermaid"> transform. |
| scripts/render-lib/markdown/index.ts | New markdown pipeline barrel (leaf modules + orchestrator). |
| scripts/render-lib/markdown.ts | Reduced to compat shim re-exporting the new markdown barrel. |
| scripts/render-lib/aggregator/sources-appendix.ts | Extracted appendix builder for GitHub-linked artifact list. |
| scripts/render-lib/aggregator/seo/title.ts | Extracted SEO title parsing/cleaning + BLUF fallback. |
| scripts/render-lib/aggregator/seo/description.ts | Extracted SEO description readers and sentence-aware truncation. |
| scripts/render-lib/aggregator/reader-guide.ts | Extracted reader guide table builder + anchor parity logic. |
| scripts/render-lib/aggregator/per-document.ts | Extracted documents/* expansion logic and existence probe. |
| scripts/render-lib/aggregator/order.ts | Extracted canonical aggregation order + curated titles + fallback prettifier. |
| scripts/render-lib/aggregator/index.ts | New aggregator public barrel + frozen test API assembled from leaves. |
| scripts/render-lib/aggregator/frontmatter.ts | Extracted YAML and inline-markdown escaping + frontmatter builder. |
| scripts/render-lib/aggregator/cleaning/structural.ts | Extracted structural body cleaning, heading demotion, and link rewriting. |
| scripts/render-lib/aggregator/cleaning/process-meta.ts | Extracted line-level process metadata stripper and regex constants. |
| scripts/render-lib/aggregator/cleaning/pass-two.ts | Extracted pass-2 audit section stripper and regex constant. |
| scripts/render-lib/aggregator/cleaning/admin-bylines.ts | Extracted paragraph-level admin byline stripper and regex constants. |
| scripts/render-lib/aggregator/aggregate.ts | Extracted slim top-level aggregation orchestrator (file reads + composition). |
| scripts/political-intelligence/render/style.ts | Extracted inline CSS for political-intelligence page. |
| scripts/political-intelligence/render/page.ts | Extracted political-intelligence page builder (chrome, JSON-LD, sections). |
| scripts/political-intelligence/render/grid.ts | Extracted catalog grid renderer. |
| scripts/political-intelligence/render/daily-day.ts | Extracted daily-day renderer + artifact basename/icon helpers. |
| scripts/political-intelligence/index.ts | Public barrel for political-intelligence bounded context. |
| scripts/political-intelligence/i18n/artifact-i18n.ts | Extracted artifact title i18n + generic description fallback helpers. |
| scripts/political-intelligence/daily-streams.ts | Extracted daily stream/day discovery + artifact collection. |
| scripts/political-intelligence/catalog.ts | Extracted catalog discovery + GitHub URL builder. |
| scripts/generate-rss.ts | Converted to thin CLI shim + re-export shim for RSS generator. |
| analysis/daily/2026-04-27/propositions/article.md | Re-generated artifact showing Reader Guide placement shift and timestamp update. |
Comments suppressed due to low confidence (7)
scripts/render-lib/markdown/mermaid-preprocess.ts:1
- The markdown pipeline leaf module depends on
generate-sitemap-html.jsjust to getescapeHtml. This creates an undesirable cross-generator dependency (and can easily introduce circular imports or inflate import cost) inside what’s intended to be a small, reusable render-lib building block. Move the escaper into a shared, dependency-light utility (e.g., underscripts/render-lib/orscripts/shared/) or import from thescripts/sitemap-html/escape.jsleaf/barrel directly (not the CLI shim), so render-lib doesn’t depend on another generator’s entrypoint.
scripts/rss/render/feed.ts:1 - This makes
generateRss()non-deterministic when there are zero articles, because it falls back tonew Date().toUTCString(). The PR description stresses byte-identical output and reproducibility; this is a concrete case where output will vary run-to-run. Use a fixed sentinel (similar tostablePubDate’s sentinel) or derive a stable value from the repo (e.g., latest commit time) for the empty-feed case.
scripts/sitemap-xml/render/sitemap.ts:1 - The module-level description claims the function is pure and only composes scanner output, but
generateSitemap()in this file directly performs filesystem existence checks (e.g., RSS/dashboard variants) and callsgetFileModTime()on concrete paths. Please either update the docstring to reflect the current behavior, or move these filesystem checks into scanners/orchestrator inputs so the render module matches the stated ‘pure composition’ contract.
scripts/sitemap-xml/validator.ts:1 - The validator is used as a leaf-module unit under tests, but it logs to stdout on every call. This makes the helper impure and adds noise to test output/CI logs. Consider removing
console.logfromvalidateSitemap()(and letting the CLI shim handle logging), or accept an optional logger callback so library consumers can stay quiet by default.
scripts/sitemap-xml/validator.ts:1 - The validator is used as a leaf-module unit under tests, but it logs to stdout on every call. This makes the helper impure and adds noise to test output/CI logs. Consider removing
console.logfromvalidateSitemap()(and letting the CLI shim handle logging), or accept an optional logger callback so library consumers can stay quiet by default.
scripts/sitemap-xml/validator.ts:1 - The validator is used as a leaf-module unit under tests, but it logs to stdout on every call. This makes the helper impure and adds noise to test output/CI logs. Consider removing
console.logfromvalidateSitemap()(and letting the CLI shim handle logging), or accept an optional logger callback so library consumers can stay quiet by default.
scripts/sitemap-html/escape.ts:1 - Preserving arbitrary named/numeric entities means inputs like
<script>...will not have their&escaped and can decode into real tags when inserted into HTML, defeating the XSS protection goal stated in the docstring. If this function is used for any untrusted content, it should not preserve general entities; instead, escape&unconditionally or only allowlist a small set of safe entities (e.g.,&,<,>,",'). Align the tests with the safer contract if needed.
| import { fileURLToPath } from 'url'; | ||
|
|
||
| import type { Language } from '../../types/language.js'; | ||
| import { LANGUAGE_META, escapeHtml } from '../../generate-sitemap-html.js'; |
There was a problem hiding this comment.
Political-intelligence leaf modules import LANGUAGE_META / escapeHtml from generate-sitemap-html.js (a generator entrypoint). This undermines the bounded-context split by coupling PI rendering to another generator’s CLI shim API surface. Prefer importing from the sitemap-html bounded-context barrel (scripts/sitemap-html/index.js) or extracting LANGUAGE_META + escaping into a small shared module used by both contexts.
| import { LANGUAGE_META, escapeHtml } from '../../generate-sitemap-html.js'; | |
| import { LANGUAGE_META, escapeHtml } from '../../sitemap-html/index.js'; |
|
@copilot apply changes based on the comments in this thread and fix issues in TypeScript & JavaScript Testing / Unit Tests (Vitest) (pull_request) |
…sitemap-html CLI shim Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/abefd963-1d55-4fc9-87a3-af54322599d6 Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Both addressed in
|
🔍 Lighthouse Performance Audit
📥 Download full Lighthouse report Budget Compliance: Performance budgets enforced via |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 74 out of 80 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (6)
scripts/render-lib/markdown/mermaid-preprocess.ts:1
- This leaf module depends on
../../generate-sitemap-html.js(a CLI shim), reintroducing cross-context coupling and making the markdown pipeline sensitive to unrelated generator entrypoints. Move the escaping dependency into render-lib (e.g., a local minimal escaper inside this module, or a shared render-lib utility) so the markdown bounded context can load in isolation without importing another generator’s CLI.
scripts/render-lib/markdown/mermaid-preprocess.ts:1 - This leaf module depends on
../../generate-sitemap-html.js(a CLI shim), reintroducing cross-context coupling and making the markdown pipeline sensitive to unrelated generator entrypoints. Move the escaping dependency into render-lib (e.g., a local minimal escaper inside this module, or a shared render-lib utility) so the markdown bounded context can load in isolation without importing another generator’s CLI.
scripts/rss/render/feed.ts:1 - This breaks the stated determinism contract for RSS output when there are zero articles:
lastBuildDatebecomes the current time, making builds non-reproducible. Use a fixed sentinel (similar tostablePubDate’s sentinel) or derive a stable value from the repo (e.g., a git timestamp) so empty-feed renders are deterministic as well.
scripts/sitemap-xml/scanners/api.ts:1 - Unlike the docs scanner, this scanner never sorts
resultsbefore returning. That can make sitemap generation output order non-deterministic across filesystems/platforms. Sort byfile(and consider sortingentriesinsidescanDiras well) before returning so the sitemap XML stays stable.
scripts/sitemap-xml/render/url-entry.ts:1 - Values interpolated into XML (
loc,alt.href, and potentiallylastmod) are not XML-escaped or URL-encoded. If any scanned path contains characters like&,<,\", or non-URL-safe characters, the generated sitemap can become invalid XML or contain invalid URLs. Add escaping/encoding for attribute values and element text (at minimum&,<,>,\",') and considerencodeURI/encodeURIComponentas appropriate for URL paths.
scripts/sitemap-xml/validator.ts:1 - This validator is exported from the bounded-context barrel and currently always logs to stdout. That can add noise in unit tests and in library-style consumption. Consider injecting a logger (or a
quietoption) and/or moving logging responsibilities to the CLI shim sovalidateSitemapremains a pure validation utility.
| const chrome = buildChrome({ | ||
| lang, | ||
| title: t.title, | ||
| description: t.metaDescription, | ||
| keywords: t.metaKeywords, | ||
| canonicalPath: selfFile, | ||
| hreflangAlternates, | ||
| defaultAlternateBase: 'political-intelligence.html', | ||
| ogType: 'website', | ||
| section: t.title, | ||
| publishedIso: `${latestDate}T00:00:00Z`, | ||
| modifiedIso: buildIso, | ||
| rssHref: lang === 'en' ? '/rss/news.xml' : `/rss/news_${lang}.xml`, | ||
| breadcrumb: [ | ||
| { label: t.home, href: indexFile }, | ||
| { label: t.title }, | ||
| ], | ||
| jsonLd, | ||
| extraStyle: PI_EXTRA_STYLE, | ||
| }); |
There was a problem hiding this comment.
The RSS link path here (/rss/news.xml and /rss/news_${lang}.xml) is inconsistent with the RSS generator in this PR, which writes rss.xml at the site root (/rss.xml). This will publish a broken RSS <link>/header reference on the political-intelligence page. Align rssHref with the actual generated feed path (e.g. /rss.xml, and only add language variants if they are truly produced elsewhere).
|
@copilot apply changes based on the comments in this thread |
…n /rss/news.xml link) Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/827708e4-f2cf-4ce5-b745-11af9f98b679 Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Fixed in |
🔍 Lighthouse Performance Audit
📥 Download full Lighthouse report Budget Compliance: Performance budgets enforced via |
Refactor: bounded-context decomposition for sitemap/RSS/news-index/political-intelligence generators
Latest commit — fix broken RSS link in political-intelligence pages
Reviewer flagged that
political-intelligence/render/page.tssetrssHrefto/rss/news.xmland/rss/news_${lang}.xml, but the RSS generator only writesrss.xmlat the site root. This produced a broken<link rel="alternate" type="application/rss+xml">header on all 14 PI pages (a pre-existing bug carried over verbatim from the 2289-LOC monolith onmain).Changed to a single
rssHref: '/rss.xml'for all languages — verified the 14 regenerated PI pages now reference the actual feed file:Verification
npx tsc --project tsconfig.scripts.json --noEmit— cleangenerate-political-intelligence.test.tsand the new leaf-module suite<link rel="alternate" rss>and footer RSS link both point at/rss.xmlrss/newsreferences remainFinal sizes (LOC) — unchanged
generate-political-intelligence.tsgenerate-sitemap-html.tsgenerate-sitemap.tsgenerate-rss.tsgenerate-news-indexes.tsPlan
generate-political-intelligence.tsintoscripts/political-intelligence/generate-sitemap-html.tsintoscripts/sitemap-html/generate-sitemap.tsintoscripts/sitemap-xml/generate-rss.tsintoscripts/rss/generate-news-indexes.tsalready splitnpx tsx scripts/generate-X.tsnpm run lintgreenmainscripts/sitemap-html/i18n.tsimport path)<link>in PI pages — point at actual/rss.xml