
test(router): add explicit coverage for llm → semantic routing#129

Open
Copilot wants to merge 6 commits into main from
copilot/add-coverage-for-llm-routing

Conversation

Contributor

Copilot AI commented Feb 26, 2026

The router test suite covered llm → metadata/graph and fallback/circuit behavior but had no assertion that semantic is accepted when the LLM returns it on a low-confidence rule path — leaving that branch silently unguarded against regressions.

Changes

  • api/src/services/query-router.test.ts — two new tests, no production code touched:
    • Ollama path: entity-pattern query ("who is AuthService") hits rule 5 at conf 0.7, mocked LLM responds {"strategy":"semantic","confidence":0.8} → asserts method === "llm", strategy === "semantic", confidence === 0.8, rule === "entity_pattern"
    • OpenAI parity: same scenario via choices[0].message.content shape → same outcome assertions
it("accepts semantic strategy returned by LLM on low-confidence entity_pattern path", async () => {
  vi.stubEnv("ROUTER_LLM_ENABLED", "true");
  vi.stubEnv("OLLAMA_URL", "http://localhost:11434");

  const fetchMock = vi.fn().mockResolvedValue({
    ok: true,
    json: async () => ({ response: '{"strategy":"semantic","confidence":0.8}' }),
  } as unknown as Response);
  vi.stubGlobal("fetch", fetchMock);

  const result = await classifyQuery({ query: "who is AuthService" });
  expect(result.method).toBe("llm");
  expect(result.strategy).toBe("semantic");
  expect(result.confidence).toBe(0.8);
  expect(result.rule).toBe("entity_pattern");
});
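
The OpenAI parity test asserts the same outcome from the chat-completions payload shape. The difference is purely structural: the routing JSON sits one level deeper, under choices[0].message.content, instead of under response. A minimal self-contained sketch of that extraction (the interface and helper names here are illustrative, not the router's actual identifiers):

```typescript
// Illustrative sketch: the chat-completions payload nests the routing JSON
// under choices[0].message.content, whereas the Ollama payload returns it
// directly under "response". Names below are assumptions for illustration.
interface ChatCompletionsPayload {
  choices: Array<{ message: { content: string } }>;
}

function extractRouting(payload: ChatCompletionsPayload): {
  strategy: string;
  confidence: number;
} {
  // The message content is a JSON string produced by the model.
  return JSON.parse(payload.choices[0].message.content);
}

const mockPayload: ChatCompletionsPayload = {
  choices: [
    { message: { content: '{"strategy":"semantic","confidence":0.8}' } },
  ],
};

const routing = extractRouting(mockPayload);
console.log(routing.strategy, routing.confidence); // semantic 0.8
```

In the actual parity test, the fetch mock would resolve to this payload shape while the assertions on method, strategy, confidence, and rule stay identical to the Ollama case.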
Original prompt

This section details the original issue you should resolve

<issue_title>test(router): add explicit coverage for non-explicit LLM routing to semantic</issue_title>
<issue_description>## Summary
Add explicit automated coverage for non-explicit llm -> semantic routing in api/src/services/query-router.test.ts, with one required generic path test and one preferred OpenAI parsing parity test.

Problem Statement

Current router tests cover llm -> metadata/graph, fallback/circuit behavior, and OpenAI model-selection behavior, but do not directly assert that semantic is accepted when returned by LLM on a low-confidence rule path.

Without this, regressions could silently reject or bias away from semantic while still passing existing tests.

Scope

In scope

  • Test-only updates in api/src/services/query-router.test.ts.
  • Required: generic/Ollama-path llm -> semantic assertion.
  • Preferred: OpenAI chat-completions llm -> semantic parsing assertion.

Out of scope

  • Production code changes in router implementation.
  • Routing threshold/circuit-breaker behavior changes.
  • API contract changes.

Implementation Approach (Single Issue)

Use one tightly-coupled implementation pass in the same test file:

  1. Add required generic/Ollama-path regression test:

    • Query triggers low-confidence rule path (e.g., entity pattern).
    • Mocked LLM returns valid JSON with strategy="semantic", confidence=0.8.
    • Assert:
      • result.method === "llm"
      • result.strategy === "semantic"
      • result.confidence === 0.8
  2. Add preferred OpenAI provider parity test:

    • OpenAI chat-completions response content contains JSON with strategy="semantic", confidence=0.8.
    • Assert same routing outcome (method, strategy, confidence).
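
The branch these steps exercise can be sketched abstractly. This is a hedged, self-contained model of the behavior under test, not the router's actual code: the threshold value and function names are assumptions chosen only to make the control flow concrete.

```typescript
// Hypothetical model of the branch under test: a rule match below the
// confidence threshold defers to the LLM, and whatever valid strategy the
// LLM returns (including "semantic") is accepted as the routing result.
type Strategy = "metadata" | "graph" | "semantic";

interface Classification {
  method: "rule" | "llm";
  strategy: Strategy;
  confidence: number;
  rule?: string;
}

// Assumed threshold for illustration; rule 5 fires at 0.7, below it.
const CONFIDENCE_THRESHOLD = 0.75;

function resolveWithLlm(
  ruleResult: Classification,
  llmResponse: { strategy: Strategy; confidence: number },
): Classification {
  // High-confidence rule matches short-circuit; the LLM is not consulted.
  if (ruleResult.confidence >= CONFIDENCE_THRESHOLD) return ruleResult;
  return {
    method: "llm",
    strategy: llmResponse.strategy,
    confidence: llmResponse.confidence,
    rule: ruleResult.rule, // originating rule is kept for traceability
  };
}

const resolved = resolveWithLlm(
  { method: "rule", strategy: "metadata", confidence: 0.7, rule: "entity_pattern" },
  { strategy: "semantic", confidence: 0.8 },
);
console.log(resolved.method, resolved.strategy, resolved.confidence);
```

A regression that rejects or biases away from "semantic" would surface here as resolveWithLlm returning something other than the LLM's strategy, which is exactly what the planned assertions guard against.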

Dependencies

Validation Plan

  • cd api && npm test -- src/services/query-router.test.ts

Risks and Mitigations

  • Risk: test accidentally uses non-LLM path.
    • Mitigation: use low-confidence rule query and assert method === "llm".
  • Risk: OpenAI mock shape mismatch.
    • Mitigation: use current parser shape (choices[].message.content).
  • Risk: environment leakage across tests.
    • Mitigation: rely on existing env/global reset hooks in test suite.

Decision Log (assumption -> resolution)

  • Work should be split for tracking -> Not needed; requirements are tightly coupled and belong in one issue.
  • OpenAI branch can be ignored -> Kept as preferred parity check, still same issue and same PR.
  • Code changes may be required -> Explicitly prohibited; test-only delivery.

Proposed Definition of Done

  • Required generic/Ollama-path unit test proves llm -> semantic on low-confidence rule path.
  • Preferred OpenAI provider unit test proves semantic acceptance from chat-completions payload.
  • api/src/services/query-router.test.ts passes locally/CI with added tests.
  • No production file changes outside tests.
  • PR/issue includes short validation evidence (command + pass summary).

AC/DoD/Non-goal Coverage Matrix

Item | Type (AC/DoD/Non-goal) | Status (Met/Partial/Unmet/Unverified) | Evidence (spec/tests/behavior) | Notes
Unit test fails if llm -> semantic path regresses | AC | Unverified | Planned test additions in api/src/services/query-router.test.ts | Required item
Test passes for both generic LLM fallback and (optionally) OpenAI provider branch | AC | Partial | Required generic + preferred OpenAI parity test in same issue | OpenAI remains preferred per original scope
No behavior changes required; test-only PR is acceptable | AC | Unverified | Scope and DoD constraints | Validate at PR diff
Required generic/Ollama-path unit test proves llm -> semantic on low-confidence rule path | DoD | Unverified | Proposed DoD item 1 | Added because source issue has no explicit DoD
Preferred OpenAI provider unit test proves semantic acceptance from chat-completions payload | DoD | Unverified | Proposed DoD item 2 | Added because source issue has no explicit DoD
api/src/services/query-router.test.ts passes locally/CI with added tests | DoD | Unverified | Proposed DoD item 3 | Added because source issue has no explicit DoD
No production file changes outside tests | DoD | Unverified | Proposed DoD item 4 | Added because source issue has no explicit DoD
PR/issue includes short validation evidence (command + pass summary) | DoD | Unverified | Proposed DoD item 5 | Added because source issue has no explicit DoD
Production router code changes | Non-goal | Unverified | Out-of-scope section | Must remain absent in PR diff
Routing threshold/circuit-breaker behavior changes | Non-goal | Unverified | Out-of-scope section | Must remain absent in PR diff
API contract changes | Non-goal | Unverified | Out-of-scope sect... |


@mfittko
Owner

mfittko commented Feb 26, 2026

Piggyback note for refinement governance update (applies to this PR context):

We updated the refiner agent prompt to enforce a KISS project-management heuristic for tightly-coupled small scopes.

Source update:

  • .github/agents/refiner.agent.md
  • New section: KISS Project-Management Heuristic

Key rule now in prompt:

  • Do not create sub-issues when work is tightly coupled, expected in one PR, and roughly <= ~500 LOC total change.
  • Prefer a single issue with required vs preferred checklist items and one AC/DoD/Non-goal matrix.
  • Split only when distinct owners, sequencing constraints, meaningful parallelization, or large/uncertain scope requires it.

Why this is relevant here:

Co-authored-by: mfittko <326798+mfittko@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add explicit coverage for non-explicit LLM routing to semantic" to "test(router): add explicit coverage for llm → semantic routing" on Feb 26, 2026
@mfittko mfittko marked this pull request as ready for review February 26, 2026 18:12
Copilot AI review requested due to automatic review settings February 26, 2026 18:12
Contributor

Copilot AI left a comment


Pull request overview

This PR adds explicit test coverage for the llm → semantic routing path in the query router, closing a gap where the LLM could return "semantic" strategy on low-confidence rule paths but had no regression protection. The changes include two new test cases and unrelated agent configuration updates.

Changes:

  • Added Ollama-path test verifying LLM can return semantic strategy on entity_pattern rule path
  • Added OpenAI-provider parity test for semantic strategy via chat-completions API
  • Updated refiner agent configuration with RFC gate workflow and KISS heuristic (unrelated to PR purpose)

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File | Description
api/src/services/query-router.test.ts | Added two new test cases for llm → semantic routing path: one for Ollama provider and one for OpenAI provider, both testing low-confidence entity_pattern query handling
.github/agents/refiner.agent.md | Added RFC gate workflow section and KISS project-management heuristic to refiner agent configuration (appears unrelated to test coverage work)


Comment on lines 34 to +98
@@ -65,6 +79,23 @@ If critical input is missing, ask concise clarification questions first.
- Add progress comments whenever substantive updates are made.
- Summarize decisions and rationale in each issue comment.

## KISS Project-Management Heuristic

Default to the simplest tracking structure that preserves clarity.

- **Do not create sub-issues** when work is tightly coupled, likely delivered in one PR, and estimated to stay small (rule of thumb: **~500 LOC or less** total change across touched files).
- In those cases, keep a **single issue** with:
- one clear implementation approach,
- required vs preferred checklist items,
- one AC/DoD/Non-goal matrix.
- Only split into sub-issues when at least one of these is true:
- distinct owners or teams are expected,
- independent sequencing/blocking relationships are required,
- meaningful parallelization is possible without merge contention,
- scope is large/uncertain enough that one issue reduces clarity.

When uncertain, choose the single-issue path first and justify any split explicitly in the decision log.


Copilot AI Feb 26, 2026


The PR description states this is a test-only change to add coverage for llm → semantic routing in the query router test suite. However, this file contains changes to the refiner agent configuration that add RFC gate workflow steps and a KISS project-management heuristic.

These changes appear unrelated to the stated purpose of the PR (adding test coverage) and should likely be in a separate PR focused on agent configuration improvements. Including unrelated changes makes it harder to review and understand the scope of each change.

Consider removing these agent configuration changes from this PR and submitting them separately if they are needed.

Comment on lines +137 to +138
If unresolved foundational decisions are present, refinement is not complete: it must stop at RFC issue(s) and explicitly mark deeper refinement as blocked until RFCs are resolved.


Copilot AI Feb 26, 2026


This change appears unrelated to the test coverage work described in the PR. The refiner agent configuration updates should be submitted in a separate PR if they are needed.

- Do not add unnecessary code examples.
- Prefer generic, technology-agnostic requirement language unless stack specifics are essential.
- Keep issue comments factual, decision-oriented, and audit-friendly.
\ No newline at end of file
- Keep issue comments factual, decision-oriented, and audit-friendly.

Copilot AI Feb 26, 2026


This line change appears to be whitespace-only or a minor formatting adjustment unrelated to the test coverage work. If this is part of broader agent configuration changes, those should be in a separate PR.



Development

Successfully merging this pull request may close these issues.

test(router): add explicit coverage for non-explicit LLM routing to semantic

3 participants