
Harden MCP coverage-state reporting, deferred retries, and voteringar lag diagnostics #2498

Merged
pethers merged 10 commits into main from copilot/mitigate-mcp-indexing-lag
May 15, 2026

Conversation

Contributor

Copilot AI commented May 14, 2026

Plan — round 2 review feedback

@github-actions github-actions Bot added the size-xs Extra small change (< 10 lines) label May 14, 2026
@github-actions
Contributor

🏷️ Automatic Labeling Summary

This PR has been automatically labeled based on the files changed and PR metadata.

Applied Labels: size-xs

Label Categories

  • 🗳️ Content: news, dashboard, visualization, intelligence
  • 💻 Technology: html-css, javascript, workflow, security
  • 📊 Data: cia-data, riksdag-data, data-pipeline, schema
  • 🌍 I18n: i18n, translation, rtl
  • 🔒 ISMS: isms, iso-27001, nist-csf, cis-controls
  • 🏗️ Infrastructure: ci-cd, deployment, performance, monitoring
  • 🔄 Quality: testing, accessibility, documentation, refactor
  • 🤖 AI: agent, skill, agentic-workflow

For more information, see .github/labeler.yml.

@github-actions
Contributor

🔍 Lighthouse Performance Audit

Category Score Status
Performance 85/100 🟡
Accessibility 95/100 🟢
Best Practices 90/100 🟢
SEO 95/100 🟢

📥 Download full Lighthouse report

Budget Compliance: Performance budgets enforced via budget.json

@github-actions github-actions Bot added documentation Documentation updates testing Test coverage refactor Code refactoring size-xl Extra large change (> 1000 lines) labels May 15, 2026

Copilot AI and others added 2 commits May 15, 2026 00:47
Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/db0617e3-4e82-40c9-b3ad-61ad861020d0

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>

Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/db0617e3-4e82-40c9-b3ad-61ad861020d0

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>

Copilot AI changed the title from "[WIP] Improve MCP client to mitigate indexing lag and coverage gaps" to "Harden MCP coverage-state reporting, deferred retries, and voteringar lag diagnostics" May 15, 2026

Copilot AI requested a review from pethers May 15, 2026 00:52
@pethers pethers marked this pull request as ready for review May 15, 2026 06:12
Copilot AI review requested due to automatic review settings May 15, 2026 06:12

Contributor

Copilot AI left a comment

Pull request overview

This PR adds explicit MCP coverage/provenance reporting to the parliamentary data pipeline so analysis artifacts can distinguish full text, metadata-only responses, not-yet-indexed documents, and empty searches.

Changes:

  • Adds MCP coverage/provenance types, client diagnostics wrappers, and coverage inference helpers.
  • Introduces a file-backed deferred MCP retry queue and integrates it into the download pipeline/manifest.
  • Updates docs and tests for manifest diagnostics, retry queue behavior, smoke testing, and workflow coverage counts.
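
As a rough illustration of the coverage states summarized above, the sketch below classifies a document payload and a search response. It is not the PR's implementation: the type and function names, the `metadata_only` spelling, and the threshold value are all assumptions. Only the `full_text`, `not_indexed`, `search_empty`, and `fetch_error` state names appear in the review excerpts in this thread.

```typescript
// Hypothetical sketch: names and the threshold value are assumptions, not the PR's code.
type CoverageState =
  | 'full_text'
  | 'metadata_only' // assumed spelling for the "metadata-only" state
  | 'not_indexed'
  | 'search_empty'
  | 'fetch_error';

// Assumed value; the PR extracts a shared threshold, but this page does not show it.
const ASSUMED_FULL_TEXT_MIN_LENGTH = 500;

interface DocumentLike {
  fullText?: string;
  title?: string;
}

// Classify one document payload (null means the server returned nothing for the id).
function inferDocumentCoverage(doc: DocumentLike | null): CoverageState {
  if (doc === null) return 'not_indexed';
  const text = (doc.fullText ?? '').trim();
  return text.length > ASSUMED_FULL_TEXT_MIN_LENGTH ? 'full_text' : 'metadata_only';
}

// Classify a search response: zero hits is its own state, distinct from a missing document.
function inferSearchCoverage(hits: DocumentLike[]): CoverageState {
  if (hits.length === 0) return 'search_empty';
  return hits.every((hit) => inferDocumentCoverage(hit) === 'full_text')
    ? 'full_text'
    : 'metadata_only';
}
```

Keeping `search_empty` separate from `not_indexed` lets a manifest reader tell "the query matched nothing" apart from "a known document has no content yet".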

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 9 comments.

File Description
tests/network-diagnostics.test.ts Updates workflow count/list expectations to 14 news workflows.
tests/mcp-retry-queue.test.ts Adds unit tests for retry queue dedupe and resolved document drain behavior.
tests/mcp-client-smoke.test.ts Adds gated live MCP smoke test for structured provenance.
tests/mcp-client-core-part2.test.ts Adds coverage/diagnostic tests for empty searches, voting lag, and document details.
tests/data-downloader-enrichment.test.ts Updates mock client with MCP base URL for provenance generation.
tests/auto-full-text-top-n.test.ts Updates mocks and adds manifest serialization assertions for coverage/queue sections.
scripts/types/mcp.ts Defines coverage state, provenance, signal, and diagnostic interfaces.
scripts/parliamentary-data/mcp-retry-queue.ts Adds file-backed retry queue creation, persistence, enqueue, and drain logic.
scripts/parliamentary-data/full-text-threshold.ts Extracts shared full-text length threshold.
scripts/parliamentary-data/data-downloader.ts Propagates MCP coverage/provenance through downloads and full-text enrichment.
scripts/mcp-client/index.ts Exposes new diagnostic wrapper convenience functions.
scripts/mcp-client/coverage.ts Adds coverage inference and provenance helper utilities.
scripts/mcp-client/client.ts Adds diagnostic wrappers for document search, voting search, and document details.
scripts/mcp-client.ts Re-exports the new MCP diagnostic APIs.
scripts/download-parliamentary-data.ts Integrates retry draining/enqueueing and extends manifest serialization.
scripts/data-transformers/types.ts Adds MCP coverage/provenance fields to raw document shape.
data/mcp-retry-queue.json Adds initial empty retry queue file.
analysis/templates/data-download-manifest.md Documents the new MCP coverage-state manifest contract.
analysis/methodologies/ai-driven-analysis-guide.md Updates analysis workflow guidance for coverage/provenance and deferred retries.

Comment on lines +893 to +906
const provenance = buildMcpProvenance({
endpoint: client.baseURL,
tool: 'get_dokument_innehall',
query: { dok_id: dokId, include_full_text: true },
resultCount: 0,
coverageState: 'not_indexed',
});
outcome = {
dokId,
success: false,
chars: 0,
reason: `fetchDocumentDetails failed: ${err instanceof Error ? err.message : String(err)}`,
coverageState: 'not_indexed',
provenance,
dokId,
true,
{
requestedDate: typeof docRecord['datum'] === 'string' ? docRecord['datum'] as string : null,
Comment thread scripts/mcp-client/client.ts Outdated
try {
const response = await this.fetchDocumentDetails(dok_id, include_full_text);
const coverageState = inferDocumentCoverageState(response, {
requestedDate: options.requestedDate ?? extractDocumentDate(response),
Comment on lines +183 to +257
if (entry.resourceType === 'document_fulltext') {
const result = await client.fetchDocumentDetailsWithCoverage(
entry.resourceId,
true,
{
requestedDate: (entry.params['requestedDate'] as string | undefined) ?? null,
retrieval: 'retry_queue',
},
);

diagnostics.push({
tool: entry.tool,
query: { ...entry.params, dok_id: entry.resourceId, include_full_text: true },
resultCount: result.resultCount,
coverageState: result.coverageState,
provenance: result.provenance,
notes: entry.reason,
});

if (result.coverageState === 'full_text') {
resolved++;
resolvedDocuments[entry.resourceId] = result.document;
continue;
}

remaining.push({
...entry,
attemptCount: entry.attemptCount + 1,
coverageState: result.coverageState,
reason: entry.reason ?? `Deferred ${entry.tool} retry still ${result.coverageState}`,
lastAttemptAt,
});
retained++;
continue;
}

const votingParams = entry.params;
if (typeof votingParams !== 'object' || votingParams === null) {
remaining.push({
...entry,
attemptCount: entry.attemptCount + 1,
reason: 'retry queue entry has invalid voting params payload',
lastAttemptAt,
});
retained++;
continue;
}

const votingResult = await client.fetchVotingRecordsWithDiagnostics(
votingParams as FetchVotingFilters,
);

diagnostics.push({
tool: entry.tool,
query: { ...(entry.params as Record<string, unknown>) },
resultCount: votingResult.resultCount,
coverageState: votingResult.coverageState,
provenance: votingResult.provenance,
notes: entry.reason,
...(votingResult.signal ? { signal: votingResult.signal } : {}),
});

if (votingResult.resultCount > 0) {
resolved++;
continue;
}

remaining.push({
...entry,
attemptCount: entry.attemptCount + 1,
coverageState: votingResult.coverageState,
reason: votingResult.signal?.message ?? entry.reason,
lastAttemptAt,
});
retained++;
Comment on lines +231 to +233
const votingResult = await client.fetchVotingRecordsWithDiagnostics(
votingParams as FetchVotingFilters,
);
const coverageState = inferDocumentCoverageState(
{ ...docRecord, ...details },
{
requestedDate: typeof docRecord['datum'] === 'string' ? docRecord['datum'] as string : null,
Comment on lines +397 to +408
const notes = outcome?.reason
?? outcome?.filePath
?? (doc.contentFetched
? (typeof doc.summary === 'string' && doc.summary.trim().length > 0 ? 'summary present' : 'metadata-only payload')
: 'list payload only; get_dokument_innehall not attempted in this run');
return {
dokId,
coverageState: provenance.coverageState,
retrieval: provenance.retrieval,
tool: provenance.tool,
resultCount: provenance.resultCount,
notes,
Comment on lines +537 to +547
toolDiagnostics.push({
tool: task.source,
query,
resultCount: 0,
coverageState: 'search_empty',
provenance: buildMcpProvenance({
endpoint: client.baseURL,
tool: task.source,
query,
resultCount: 0,
coverageState: 'search_empty',
Comment thread scripts/download-parliamentary-data.ts Outdated
Comment on lines +637 to +643
const resolvedIds = new Set(Object.keys(retryDrain.resolvedDocuments));
for (const doc of allDocs) {
const dokId = extractDokId(doc, '');
if (!dokId || !resolvedIds.has(dokId)) continue;
Object.assign(doc, retryDrain.resolvedDocuments[dokId]);
}
console.log(` 🔁 Deferred queue restored full text for ${resolvedIds.size} document(s)`);
@pethers
Member

pethers commented May 15, 2026

@copilot apply changes based on the comments in this thread

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 11 comments.

resultCount,
coverageState,
});
Object.assign(record, attachCoverageMetadata(record, provenance));
Comment on lines +722 to +739
if (fullTextOutcomes) {
const docMap = new Map(allDocs.map(doc => [extractDokId(doc, ''), doc]));
for (const outcome of fullTextOutcomes) {
const doc = docMap.get(outcome.dokId);
if (!doc) continue;
if (outcome.coverageState !== 'full_text' && typeof doc.datum === 'string' && doc.datum.slice(0, 10) === date) {
queueEntries.push(createRetryQueueEntry({
resourceType: 'document_fulltext',
resourceId: outcome.dokId,
tool: 'get_dokument_innehall',
coverageState: outcome.coverageState,
docType,
params: { requestedDate: doc.datum.slice(0, 10), include_full_text: true },
reason: outcome.reason,
requestedAt: new Date().toISOString(),
}));
}
}
Comment thread scripts/download-parliamentary-data.ts Outdated
const record = doc as Record<string, unknown>;
if (!record['mcpCoverageState']) {
const coverageState = inferDocumentCoverageState(record, {
requestedDate: typeof doc.datum === 'string' ? doc.datum : null,
Comment on lines 821 to +829
let content = selectContent(docRecord);
const runDate = new Date().toISOString().slice(0, 10);

if (content.length <= FULL_TEXT_MIN_LENGTH) {
details = (await client.fetchDocumentDetails(dokId, true)) as Record<string, unknown>;
const detailsWithCoverage = await client.fetchDocumentDetailsWithCoverage(
dokId,
true,
{
requestedDate: runDate,
Comment on lines +217 to +229
} catch (drainErr) {
console.warn(
`[mcp-retry-queue] Document retry failed for ${entry.resourceId}:`,
drainErr instanceof Error ? drainErr.message : String(drainErr),
);
remaining.push({
...entry,
attemptCount: entry.attemptCount + 1,
reason: `Retry failed: ${drainErr instanceof Error ? drainErr.message : String(drainErr)}`,
lastAttemptAt,
});
retained++;
}
Comment on lines +299 to +304
const notes = diag.signal?.code
? `${diag.signal.code}: ${diag.signal.message}`
: (diag.notes ?? '');
lines.push(
`| ${diag.tool} | \`${JSON.stringify(diag.query)}\` | ${diag.resultCount} | ${diag.coverageState} | ${notes} |`,
);
Comment on lines +211 to +217
const NOT_INDEXED_ERROR_PATTERNS = [
'not found',
'404',
'not indexed',
'no document',
'ingen',
] as const;
Comment on lines +897 to +901
// Distinguish true indexing-gap errors from transient/operational failures
const isNotIndexed = ['not found', '404', 'not indexed', 'no document', 'ingen'].some(
(p) => msgLower.includes(p),
);
const state = isNotIndexed ? 'not_indexed' : 'fetch_error' as const;
Comment on lines +288 to +293
const updatedQueue: MCPRetryQueueFile = {
schema: MCP_RETRY_QUEUE_SCHEMA,
updatedAt: now.toISOString(),
entries: remaining,
};
saveMcpRetryQueue(updatedQueue, queuePath);
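
The excerpt above writes the queue file after every drain. A later reply in this thread reports that drainMcpRetryQueue skips the write when the queue was already empty and nothing was processed. A minimal sketch of that guard follows, assuming a synchronous drain with injected `retry` and `save` callbacks; every name and type here is hypothetical and is not the PR's code.

```typescript
// Hypothetical sketch: types and function names are illustrative, not the PR's code.
interface QueueEntry {
  resourceId: string;
  attemptCount: number;
}

interface QueueFile {
  updatedAt: string;
  entries: QueueEntry[];
}

type RetryOutcome = 'resolved' | 'retained';

// Drain the queue, then persist only if something was actually processed.
// `retry` and `save` are injected so the sketch stays free of network and file I/O.
function drainQueue(
  queue: QueueFile,
  retry: (entry: QueueEntry) => RetryOutcome,
  save: (next: QueueFile) => void,
  now: Date = new Date(),
): QueueFile {
  if (queue.entries.length === 0) {
    // Nothing to process: leave the file untouched so it does not change on every run.
    return queue;
  }
  const remaining: QueueEntry[] = [];
  for (const entry of queue.entries) {
    if (retry(entry) === 'retained') {
      // Unresolved entries stay queued with an incremented attempt counter.
      remaining.push({ ...entry, attemptCount: entry.attemptCount + 1 });
    }
  }
  const next: QueueFile = { updatedAt: now.toISOString(), entries: remaining };
  save(next);
  return next;
}
```

Returning the original object when the queue is empty keeps `updatedAt` stable, which is what stops a tracked queue file from showing a diff on otherwise idle runs.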
Comment on lines +260 to +262
if (votingResult.resultCount > 0) {
resolved++;
continue;
@pethers
Member

pethers commented May 15, 2026

@copilot apply changes based on the comments in this thread and resolve conflicts and improve

…etry-queue isolation, escaping)

Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/70723dd0-25d0-4003-80e3-19b0590ab06f

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Contributor Author

Copilot AI commented May 15, 2026

@copilot apply changes based on the comments in this thread and resolve conflicts and improve

Merged origin/main and applied all 11 review points in ede25f40:

  1. #1: Raw analysis/data/ files now stay byte-identical. data-persistence.ts strips the in-memory mcpCoverageState/mcpProvenance/mcpSignals annotations before writing JSON; provenance stays in the sidecar .meta.json and the manifest.
  2. #2: downloadAllDocuments(..., { analysisRunDate }) is now plumbed from the pipeline date, so the default enrichment path tags coverage relative to the run date and any not_indexed enriched docs surface to the queue / manifest.
  3. #3: buildDocumentCoverageSummary(allDocs, fullTextOutcomes, date) uses the pipeline date instead of doc.datum / wall-clock.
  4. #4: fetchFullTextForTopN(client, docs, topN, outputDir, { runDate }) accepts the analysis date; the caller in download-parliamentary-data.ts passes date through, so backfill/historical reruns no longer wall-clock-classify.
  5. #5 / #6: Retained document AND voteringar retries push a fetch_error diagnostic with the exact failed query + error message so ## MCP Query Diagnostics keeps surfacing the failure instead of collapsing to a retained counter.
  6. #7: Added escapeMarkdownCell (escapes |, collapses \n/\r/\t) and applied it to the diagnostics, coverage, and full-text tables so MCP error traces and |-bearing queries can't corrupt the table.
  7. #8 / #9: Replaced the bare-substring scan with isDocumentNotIndexedError (and classifyDocumentErrorAsNotIndexed in the client): transport sentinels (mcp server error, transport error, endpoint, econnrefused, etimedout, fetch failed, network, gateway, \b50[023]\b) force fetch_error; document-level "not found" is only accepted when the dok_id is echoed.
  8. #10: drainMcpRetryQueue now skips the queue-file write when the queue was already empty and there was nothing to process, so data/mcp-retry-queue.json no longer dirties on every workflow run.
  9. #11: Voting retries now expose resolvedVoteringar and the downloader appends recovered rows to data.votes for the current run, even when the original query is no longer in the current download selection.
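
The classification rule in item 7 can be sketched as follows. This is an illustration under stated assumptions, not the PR's isDocumentNotIndexedError: the function name `classifyDocumentError`, its signature, and the set of document-level phrases are hypothetical. The transport sentinels are copied from the list in item 7.

```typescript
// Hypothetical sketch of the rule in item 7; not the PR's isDocumentNotIndexedError.
type ErrorCoverage = 'not_indexed' | 'fetch_error';

// Transport-level markers taken from item 7: any match forces 'fetch_error'.
const TRANSPORT_SENTINELS: RegExp[] = [
  /mcp server error/,
  /transport error/,
  /endpoint/,
  /econnrefused/,
  /etimedout/,
  /fetch failed/,
  /network/,
  /gateway/,
  /\b50[023]\b/,
];

// Document-level phrases; the exact set used in the PR is an assumption here.
const NOT_FOUND_PHRASES: string[] = ['not found', 'not indexed', 'no document'];

function classifyDocumentError(message: string, dokId: string): ErrorCoverage {
  const msg = message.toLowerCase();
  // Operational failures win, even if the message also says "not found".
  if (TRANSPORT_SENTINELS.some((pattern) => pattern.test(msg))) return 'fetch_error';
  const mentionsMissing = NOT_FOUND_PHRASES.some((phrase) => msg.includes(phrase));
  // Only trust "not found" when the message echoes the requested document id.
  const echoesId = dokId.length > 0 && msg.includes(dokId.toLowerCase());
  return mentionsMissing && echoesId ? 'not_indexed' : 'fetch_error';
}
```

With a made-up id, `classifyDocumentError('Document HD01FiU20 not found', 'HD01FiU20')` yields `not_indexed`, while `classifyDocumentError('MCP server error: 404 not found', 'HD01FiU20')` yields `fetch_error` because the transport sentinel is checked first.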

Test runs after the merge: targeted (tests/mcp-retry-queue.test.ts, tests/auto-full-text-top-n.test.ts, tests/data-persistence.test.ts) = 87 passing; full vitest = 5795 passed / 34 skipped / 0 failed.

@github-actions github-actions Bot added html-css HTML/CSS changes schema Data schema changes translation Translation updates rtl RTL language support (Arabic, Hebrew) news News articles and content generation labels May 15, 2026
@pethers
Member

pethers commented May 15, 2026

@copilot resolve the merge conflicts in this pull request

…indexing-lag

# Conflicts:
#	scripts/download-parliamentary-data.ts
#	tests/auto-full-text-top-n.test.ts

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Contributor Author

Copilot AI commented May 15, 2026

@copilot resolve the merge conflicts in this pull request

Resolved in 46c38b2a. Merged origin/main (5 new commits including dependabot bumps and news articles) — conflicts were in scripts/download-parliamentary-data.ts (3 hunks: queue entries, fetchFullTextForTopN runDate param, and serializeDataManifest coverage args) and tests/auto-full-text-top-n.test.ts (import of serializeDataManifest). Kept our branch's coverage/provenance additions in all cases. Verified with a true 2-parent merge commit and 87/87 targeted tests passing.

Comment on lines +284 to +288
return str
.replace(/\r\n?/g, ' ')
.replace(/\n/g, ' ')
.replace(/\t/g, ' ')
.replace(/\|/g, '\\|')
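
The excerpt above appears to be the body of the escapeMarkdownCell helper mentioned earlier in this thread. A self-contained sketch of the same idea is shown below; the parameter type and the handling of non-string values are assumptions, and the two line-break replacements are merged into one pattern.

```typescript
// Sketch only: mirrors the excerpt above; the real helper's signature is an assumption.
function escapeMarkdownCell(value: unknown): string {
  const str = typeof value === 'string' ? value : String(value ?? '');
  return str
    .replace(/\r\n?|\n|\t/g, ' ') // collapse each line break or tab to a single space
    .replace(/\|/g, '\\|'); // escape pipes so they do not start a new table column
}

// Example: a query containing a pipe and a newline stays inside one table cell.
const row = `| ${escapeMarkdownCell('a|b\nc')} |`;
```

Here `row` is built from a cell whose pipe is backslash-escaped and whose newline has become a space, so the surrounding Markdown table keeps its column count.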
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 4 comments.

Comment on lines +875 to +887
const detailsWithCoverage = await client.fetchDocumentDetailsWithCoverage(
dokId,
true,
{
requestedDate: runDate,
},
);
details = detailsWithCoverage.document;
content = selectContent(details);
}

const coverageState = inferDocumentCoverageState(
{ ...docRecord, ...(details ?? {}) },
Comment on lines +672 to +679
const coverageState = inferDocumentCoverageState(
{ ...docRecord, ...details },
{
requestedDate: analysisRunDate,
fullTextRequested: true,
},
);
docRecord['mcpCoverageState'] = coverageState;
Comment on lines +713 to +721
let mergedVoteCount = 0;
for (const [queryKey, items] of Object.entries(retryDrain.resolvedVoteringar)) {
if (!Array.isArray(items) || items.length === 0) continue;
data.votes.push(...(items as RawDocument[]));
mergedVoteCount += items.length;
console.log(` 🗳️ Recovered ${items.length} voteringar from deferred queue (${queryKey})`);
}
if (mergedVoteCount > 0) {
console.log(` 🔁 Deferred queue restored ${mergedVoteCount} voteringar row(s) — appended to current-run output`);
Comment on lines +296 to +298
// the current download selection.
if (Array.isArray(votingResult.items)) {
resolvedVoteringar[entry.resourceId] = votingResult.items as unknown[];
@pethers
Member

pethers commented May 15, 2026

@copilot apply changes based on the comments in this thread

@pethers pethers merged commit 7efcdb4 into main May 15, 2026
18 of 20 checks passed
@pethers pethers deleted the copilot/mitigate-mcp-indexing-lag branch May 15, 2026 13:01
Copilot stopped work on behalf of pethers due to an error May 15, 2026 13:01

Labels

  • documentation: Documentation updates
  • html-css: HTML/CSS changes
  • news: News articles and content generation
  • refactor: Code refactoring
  • rtl: RTL language support (Arabic, Hebrew)
  • schema: Data schema changes
  • size-xl: Extra large change (> 1000 lines)
  • size-xs: Extra small change (< 10 lines)
  • testing: Test coverage
  • translation: Translation updates

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[MCP Client] Mitigate riksdag-regering MCP indexing lag and metadata-only coverage gaps for voteringar / full-text retrieval

4 participants