
Commit 2f6cdcf

docs: platform knowledge documentation + integration and e2e tests
Add QUERY_ROUTER.md source doc with bundled platform knowledge section. Add integration tests verifying the real platform-corpus.json (243 entries, 5 categories, keyword search). Add e2e tests exercising QueryRouter init, classification, and retrieval with platform knowledge enabled/disabled.
1 parent f457956 commit 2f6cdcf

3 files changed

Lines changed: 566 additions & 0 deletions

docs/QUERY_ROUTER.md

Lines changed: 173 additions & 0 deletions
@@ -0,0 +1,173 @@
AgentOS includes a `QueryRouter` that runs each user question through a three-stage pipeline:

1. classify the query into tier `0` through `3`
2. retrieve the right amount of context
3. generate a grounded answer from that context

## What Is Live Today

- Tier classification uses an LLM prompt with corpus topics, recent conversation history, and optional tool names.
- The router embeds local markdown docs into an in-memory vector store when an embedding provider is available.
- If embeddings are unavailable or vector search fails, the router falls back to keyword search automatically.
- Result metadata includes `tiersUsed` and `fallbacksUsed`.
- Lifecycle events cover classification, retrieval, research, generation, and route completion.

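
The vector-to-keyword fallback described above can be sketched as follows. This is a minimal illustration, not the AgentOS implementation: `Chunk`, `VectorSearch`, `keywordSearch`, and `retrieveWithFallback` are hypothetical names, and the term-count scoring is an assumed stand-in for the real keyword engine. Only the `keyword-fallback` strategy name is taken from the router's documented result metadata.

```ts
// Hypothetical shapes for illustration only; not the AgentOS types.
interface Chunk {
  id: string;
  content: string;
}

type VectorSearch = (query: string, topK: number) => Promise<Chunk[]>;

interface RetrievalOutcome {
  chunks: Chunk[];
  fallbacksUsed: string[];
}

// Assumed keyword scorer: ranks chunks by how many query terms they contain.
function keywordSearch(chunks: Chunk[], query: string, topK: number): Chunk[] {
  const terms = query.toLowerCase().split(/\W+/).filter((t) => t.length > 2);
  return chunks
    .map((chunk) => {
      const text = chunk.content.toLowerCase();
      return { chunk, hits: terms.filter((t) => text.includes(t)).length };
    })
    .filter((scored) => scored.hits > 0)
    .sort((a, b) => b.hits - a.hits)
    .slice(0, topK)
    .map((scored) => scored.chunk);
}

// Try vector search first; use keyword search when it is missing or throws.
async function retrieveWithFallback(
  chunks: Chunk[],
  query: string,
  topK: number,
  vectorSearch?: VectorSearch,
): Promise<RetrievalOutcome> {
  if (vectorSearch) {
    try {
      return { chunks: await vectorSearch(query, topK), fallbacksUsed: [] };
    } catch {
      // Vector search failed: fall through to the keyword path.
    }
  }
  return {
    chunks: keywordSearch(chunks, query, topK),
    fallbacksUsed: ['keyword-fallback'],
  };
}
```

Recording the strategy name next to the chunks lets a caller see after the fact which retrieval path actually ran.
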
## Current Limitations

The QueryRouter scaffold is ahead of the wired runtime in a few places:

- `graphExpand()` is now a built-in corpus-neighborhood heuristic, not yet a true GraphRAG engine.
- `rerank()` is now a built-in lexical heuristic reranker, not yet a cross-encoder service.
- `deepResearch()` is now a built-in local-corpus heuristic synthesis pass, not yet a web-backed research runtime.
- The router is useful today for query classification, vector retrieval, keyword fallback, heuristic graph expansion, heuristic reranking, heuristic local research synthesis, and grounded answer generation. It is not yet a full GraphRAG or web-research runtime.

## Host-Injected Runtime Hooks

You can replace the built-in heuristic branches without forking `QueryRouter`
by passing host-provided callbacks in the constructor:

- `graphExpand(seedChunks)` for GraphRAG or relationship expansion
- `rerank(query, chunks, topN)` for provider-backed reranking
- `deepResearch(query, sources)` for real multi-source research

When these hooks are supplied, `router.getCorpusStats()` reports the
corresponding runtime mode as `active` instead of the built-in `heuristic`
mode.

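
As a concrete illustration of a host-provided `graphExpand(seedChunks)` callback, here is a minimal sketch that pulls in every corpus chunk sharing a source document with a seed. The `Chunk` shape, the `expandSameDocument` helper, and the `makeGraphExpandHook` factory are assumptions made for this example; a real host would more likely query a graph store.

```ts
// Hypothetical chunk shape for illustration; the real type may carry more fields.
interface Chunk {
  id: string;
  heading: string;
  content: string;
  sourcePath: string;
}

// Return the seeds plus any corpus chunk that shares a sourcePath with a seed.
function expandSameDocument(corpus: Chunk[], seedChunks: Chunk[]): Chunk[] {
  const seedIds = new Set(seedChunks.map((c) => c.id));
  const seedPaths = new Set(seedChunks.map((c) => c.sourcePath));
  const neighbors = corpus.filter(
    (c) => seedPaths.has(c.sourcePath) && !seedIds.has(c.id),
  );
  return [...seedChunks, ...neighbors];
}

// Wrap the helper in the async, single-argument shape the hook list describes.
function makeGraphExpandHook(corpus: Chunk[]) {
  return async (seedChunks: Chunk[]): Promise<Chunk[]> =>
    expandSameDocument(corpus, seedChunks);
}
```

The function returned by `makeGraphExpandHook(corpus)` could then be passed as the `graphExpand` constructor option.
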
## Example

Runnable source: `packages/agentos/examples/query-router.mjs`

```ts
import { QueryRouter } from '@framers/agentos';

const router = new QueryRouter({
  knowledgeCorpus: ['./docs', './packages/agentos/docs'],
  availableTools: ['web_search', 'deep_research'],
});

await router.init();

console.log(router.getCorpusStats());

const result = await router.route('How does memory retrieval work?');

console.log(result.answer);
console.log(result.classification.tier);
console.log(result.tiersUsed);
console.log(result.fallbacksUsed);
console.log(result.sources);

await router.close();
```

### Host-Injected Runtime Example

Runnable source: `packages/agentos/examples/query-router-host-hooks.mjs`

```ts
import { QueryRouter } from '@framers/agentos';

// `extraGraphChunk` and `externalResearchChunks` are placeholders for chunks
// supplied by the host's own graph and research backends.
const router = new QueryRouter({
  knowledgeCorpus: ['./docs', './packages/agentos/docs'],
  graphEnabled: true,
  deepResearchEnabled: true,
  graphExpand: async (seedChunks) => [...seedChunks, extraGraphChunk],
  rerank: async (_query, chunks, topN) => chunks.slice(0, topN),
  deepResearch: async (query, sources) => ({
    synthesis: `Host-provided research for ${query}`,
    sources: externalResearchChunks,
  }),
});

await router.init();
console.log(router.getCorpusStats()); // graph/deepResearch/rerank runtime modes become active
```

## Bundled Platform Knowledge

The QueryRouter ships with **243 pre-built knowledge entries** that cover the entire AgentOS platform surface. These entries are auto-loaded at startup and merged into the corpus alongside your project docs, with no configuration required.

### What's Included

| Category | Count | Examples |
|----------|-------|----------|
| **Tools** | 105 | All channel adapters, productivity tools, orchestration tools |
| **Skills** | 79 | Every curated skill from the skills registry |
| **FAQ** | 30 | "How do I add voice?", "What models are supported?", "Does AgentOS support streaming?" |
| **API** | 14 | `generateText()`, `streamText()`, `agent()`, `agency()`, `embedText()`, `generateImage()` |
| **Troubleshooting** | 15 | Missing API keys, model not found, embedding init failures |

### How It Works

Platform knowledge is loaded from `knowledge/platform-corpus.json` inside the `@framers/agentos` package. During `init()`, these entries are converted to `CorpusChunk` objects and appended to the user corpus. Both the vector index and the keyword fallback index cover platform entries, so the entries remain searchable whether or not an embedding API key is available.

The platform knowledge layer sits beneath your project documentation:

```
  User project docs (your ./docs, ./guides, etc.)
+ Platform knowledge (243 entries: tools, skills, FAQ, API, troubleshooting)
+ GitHub repos (optional, indexed asynchronously after init)
= Complete corpus
```

This means an agent can answer questions like "What vector stores does AgentOS support?" or "How do I set up a Bluesky channel?" without any project-specific documentation, because the answer comes from the bundled platform knowledge.

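
The conversion and merge steps can be sketched as below. The `sourcePath` format follows the transform used in this commit's integration test; the `PlatformCorpusEntry` and `CorpusChunk` shapes are trimmed to the fields shown there, and `toCorpusChunks` and `mergeCorpus` are hypothetical helper names rather than the router's actual internals.

```ts
// Entry shape as it appears in the integration test for platform-corpus.json.
interface PlatformCorpusEntry {
  id: string;
  heading: string;
  content: string;
  category: string;
}

// Trimmed chunk shape; the real CorpusChunk type may have more fields.
interface CorpusChunk {
  id: string;
  heading: string;
  content: string;
  sourcePath: string;
}

// Convert raw platform entries into chunks with a `platform:` source path.
function toCorpusChunks(entries: PlatformCorpusEntry[]): CorpusChunk[] {
  return entries.map((entry) => ({
    id: entry.id,
    heading: entry.heading,
    content: entry.content,
    sourcePath: `platform:${entry.category}/${entry.id}`,
  }));
}

// Append platform chunks after the user's chunks unless the caller opts out.
function mergeCorpus(
  userChunks: CorpusChunk[],
  platformEntries: PlatformCorpusEntry[],
  includePlatformKnowledge = true,
): CorpusChunk[] {
  return includePlatformKnowledge
    ? [...userChunks, ...toCorpusChunks(platformEntries)]
    : [...userChunks];
}
```

Keeping the `platform:` prefix in `sourcePath` makes it easy to tell bundled entries apart from project docs in returned sources.
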
### Configuration

Platform knowledge is enabled by default. To disable it:

```typescript
const router = new QueryRouter({
  knowledgeCorpus: ['./docs'],
  includePlatformKnowledge: false,
});
```

### Regenerating Platform Knowledge

If you are contributing to AgentOS and need to update the bundled knowledge:

```bash
npm run build:knowledge
```

This regenerates `knowledge/platform-corpus.json` from the current tool manifests, skill registry, FAQ sources, and API documentation.

## Config Notes

- `knowledgeCorpus` is required.
- `init()` throws if `knowledgeCorpus` resolves to zero readable `.md` / `.mdx` sections.
- `availableTools` is optional and is only used to help the classifier reason about what the runtime can do.
- `apiKey` / `baseUrl` configure classifier and generator LLM calls. When omitted, QueryRouter prefers `OPENAI_API_KEY` and falls back to `OPENROUTER_API_KEY` with the OpenRouter compatibility base URL.
- `embeddingApiKey` / `embeddingBaseUrl` override only the embedding path when vector retrieval should use a different provider or credential. When omitted, embeddings fall back through `apiKey`, then `OPENAI_API_KEY`, then `OPENROUTER_API_KEY`.
- `githubRepos` optionally enables non-blocking GitHub corpus indexing after `init()`. Newly indexed repo chunks are merged back into the live corpus, keyword fallback, classifier topics, and the vector index when embeddings are active.
- `deepResearchEnabled` controls whether the tier-3 research branch is attempted; the default core implementation is a local-corpus heuristic, and hosts can still inject a real web-backed implementation.
- `onClassification` and `onRetrieval` are hooks for consumers that want lightweight runtime integration without reading the full event stream.
- `router.getCorpusStats()` returns a `QueryRouterCorpusStats` snapshot with configured path count, loaded chunk/topic/source counts, whether retrieval is running in `vector+keyword-fallback` or `keyword-only` mode, the embedding health field `embeddingStatus`, and the runtime-truth fields `graphRuntimeMode`, `rerankRuntimeMode`, and `deepResearchRuntimeMode`.
- `embeddingStatus: 'active'` means the vector index initialized successfully, `'disabled-no-key'` means init stayed keyword-only because no embedding credential was available, and `'failed-init'` means embedding bootstrap was attempted but failed and the router fell back to keyword-only mode.
- `graphRuntimeMode: 'heuristic'` means the built-in same-document / heading-overlap expansion is active; `'active'` is reserved for a host-injected hook or a future wired graph expansion service.
- `rerankRuntimeMode: 'heuristic'` means the built-in lexical reranker is active; `'active'` is reserved for a host-injected hook or a future wired reranker service.
- `deepResearchRuntimeMode: 'heuristic'` means the built-in local-corpus synthesis pass is active; `'active'` is reserved for a host-injected or future provider-backed research runtime.

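
The credential fallback order in the notes above can be written out as a small sketch. `CredentialConfig`, `Env`, `firstNonEmpty`, `resolveLlmKey`, and `resolveEmbeddingKey` are hypothetical names, and treating empty strings as missing is an assumption; the actual resolution code in `QueryRouter` may differ.

```ts
// Hypothetical subset of the router config that affects credentials.
interface CredentialConfig {
  apiKey?: string;
  embeddingApiKey?: string;
}

type Env = Record<string, string | undefined>;

// Return the first candidate that is a non-empty string (assumed rule).
function firstNonEmpty(...candidates: Array<string | undefined>): string | undefined {
  return candidates.find((value) => typeof value === 'string' && value.length > 0);
}

// Classifier/generator calls: apiKey, then OPENAI_API_KEY, then OPENROUTER_API_KEY.
function resolveLlmKey(config: CredentialConfig, env: Env): string | undefined {
  return firstNonEmpty(config.apiKey, env['OPENAI_API_KEY'], env['OPENROUTER_API_KEY']);
}

// Embeddings: embeddingApiKey first, then the same chain as the LLM key.
function resolveEmbeddingKey(config: CredentialConfig, env: Env): string | undefined {
  return firstNonEmpty(
    config.embeddingApiKey,
    config.apiKey,
    env['OPENAI_API_KEY'],
    env['OPENROUTER_API_KEY'],
  );
}
```

Under this sketch, an `undefined` result from `resolveEmbeddingKey` corresponds to the case the notes describe as `embeddingStatus: 'disabled-no-key'`.
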
## Result Metadata

`QueryResult` includes:

- `answer`: the generated answer text
- `classification`: the final classification result
- `sources`: citations built from retrieved chunks
- `tiersUsed`: the tiers actually exercised after fallbacks
- `fallbacksUsed`: retrieval/classification fallback strategy names such as `keyword-fallback` or `research-skip`
- `durationMs`: total end-to-end wall-clock time for classification, retrieval, and generation

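
For readers who prefer types, the fields above might look roughly like the sketch below. The field types are assumptions inferred from the descriptions and from the usage example (`result.answer`, `result.classification.tier`); the names `QueryResultSketch`, `ClassificationSketch`, and `summarizeResult` are hypothetical, and the real `QueryResult` type may differ.

```ts
// Assumed shapes inferred from the field list; not the published types.
interface ClassificationSketch {
  tier: 0 | 1 | 2 | 3;
}

interface QueryResultSketch {
  answer: string;
  classification: ClassificationSketch;
  sources: unknown[];
  tiersUsed: number[];
  fallbacksUsed: string[];
  durationMs: number;
}

// Build a one-line summary suitable for logs.
function summarizeResult(result: QueryResultSketch): string {
  const fallbacks =
    result.fallbacksUsed.length > 0 ? result.fallbacksUsed.join(', ') : 'none';
  return (
    `tier=${result.classification.tier} ` +
    `tiersUsed=[${result.tiersUsed.join(', ')}] ` +
    `fallbacks=${fallbacks} ` +
    `sources=${result.sources.length} ` +
    `durationMs=${result.durationMs}`
  );
}
```
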
## Events

The router records typed events for:

- `classify:start`
- `classify:complete`
- `classify:error`
- `retrieve:*`
- `research:*`
- `generate:*`
- `route:complete`

These events are intended for observability, audit trails, and future workbench/runtime inspection surfaces.
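
A consumer that filters these events by name could treat the `prefix:*` entries as wildcards. The sketch below is an assumption about how such filtering might be done on the consumer side: `RouterEventSketch`, `matchesEventPattern`, and `filterEvents` are hypothetical, and this document does not specify the concrete event names behind `retrieve:*`, `research:*`, or `generate:*`.

```ts
// Hypothetical minimal event shape; real events likely carry a payload.
interface RouterEventSketch {
  type: string;
  timestamp: number;
}

// Match an event type against an exact name or a `prefix:*` wildcard pattern.
function matchesEventPattern(pattern: string, eventType: string): boolean {
  if (pattern.endsWith(':*')) {
    // Drop the trailing `*` and keep the colon, e.g. `retrieve:*` -> `retrieve:`.
    return eventType.startsWith(pattern.slice(0, -1));
  }
  return pattern === eventType;
}

// Keep only the events whose type matches at least one pattern.
function filterEvents(
  events: RouterEventSketch[],
  patterns: string[],
): RouterEventSketch[] {
  return events.filter((event) =>
    patterns.some((pattern) => matchesEventPattern(pattern, event.type)),
  );
}
```
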
Lines changed: 187 additions & 0 deletions
@@ -0,0 +1,187 @@
/**
 * @fileoverview Integration tests for the bundled platform knowledge corpus.
 *
 * These tests exercise the REAL `knowledge/platform-corpus.json` file (not
 * mocked) to verify structural integrity, category coverage, specific entry
 * existence, and keyword-based searchability.
 *
 * No LLM calls or embedding APIs are needed; all assertions are against the
 * static corpus file and the KeywordFallback engine.
 *
 * @module @framers/agentos/query-router/__tests__/platform-knowledge.integration
 */

import { existsSync, readFileSync } from 'node:fs';
import { dirname, resolve } from 'node:path';
import { fileURLToPath } from 'node:url';
import { describe, expect, it, beforeAll } from 'vitest';

import { KeywordFallback } from '../KeywordFallback.js';
import type { CorpusChunk } from '../types.js';

// ---------------------------------------------------------------------------
// Locate the real platform corpus
// ---------------------------------------------------------------------------

const MODULE_DIR = dirname(fileURLToPath(import.meta.url));

/** Candidate paths where the corpus file may live relative to this test file. */
const CORPUS_CANDIDATES = [
  // From src/query-router/__tests__/ -> knowledge/
  resolve(MODULE_DIR, '../../../knowledge/platform-corpus.json'),
  // From dist/query-router/__tests__/ -> knowledge/
  resolve(MODULE_DIR, '../../../../knowledge/platform-corpus.json'),
];

/** Resolved path to the platform corpus, or null if not found. */
const corpusPath = CORPUS_CANDIDATES.find((p) => existsSync(p)) ?? null;

// ---------------------------------------------------------------------------
// Types for raw corpus entries
// ---------------------------------------------------------------------------

interface PlatformCorpusEntry {
  id: string;
  heading: string;
  content: string;
  category: string;
}

// ---------------------------------------------------------------------------
// Test suite
// ---------------------------------------------------------------------------

describe('Platform Knowledge Corpus — integration', () => {
  let entries: PlatformCorpusEntry[];
  let chunks: CorpusChunk[];
  let fallback: KeywordFallback;

  beforeAll(() => {
    expect(corpusPath).not.toBeNull();
    const raw = readFileSync(corpusPath!, 'utf-8');
    entries = JSON.parse(raw) as PlatformCorpusEntry[];

    // Convert to CorpusChunk format (same transform as QueryRouter.loadPlatformKnowledge)
    chunks = entries.map((entry) => ({
      id: entry.id,
      heading: entry.heading,
      content: entry.content,
      sourcePath: `platform:${entry.category}/${entry.id}`,
    }));

    fallback = new KeywordFallback(chunks);
  });

  // =========================================================================
  // Structural integrity
  // =========================================================================

  it('contains at least 200 entries', () => {
    expect(entries.length).toBeGreaterThanOrEqual(200);
  });

  it('has all 5 expected categories', () => {
    const categories = new Set(entries.map((e) => e.category));
    expect(categories).toContain('tools');
    expect(categories).toContain('skills');
    expect(categories).toContain('faq');
    expect(categories).toContain('api');
    expect(categories).toContain('troubleshooting');
  });

  it('every entry has non-empty id, heading, content, and category', () => {
    for (const entry of entries) {
      expect(entry.id).toBeTruthy();
      expect(entry.heading).toBeTruthy();
      expect(entry.content).toBeTruthy();
      expect(entry.category).toBeTruthy();
    }
  });

  // =========================================================================
  // Specific entry existence
  // =========================================================================

  it('contains the generateText() API entry', () => {
    const match = entries.find((e) => e.id === 'api:generateText');
    expect(match).toBeDefined();
    expect(match!.heading).toContain('generateText');
    expect(match!.category).toBe('api');
  });

  it('contains the "How do I add voice?" FAQ entry', () => {
    const match = entries.find((e) => e.id === 'faq:add-voice');
    expect(match).toBeDefined();
    expect(match!.heading.toLowerCase()).toContain('voice');
    expect(match!.category).toBe('faq');
  });

  it('contains the document-export tool reference', () => {
    const match = entries.find((e) => e.id === 'tool-ref:com.framers.productivity.document-export');
    expect(match).toBeDefined();
    expect(match!.category).toBe('tools');
  });

  it('contains the streamText() API entry', () => {
    const match = entries.find((e) => e.id === 'api:streamText');
    expect(match).toBeDefined();
    expect(match!.heading).toContain('streamText');
    expect(match!.category).toBe('api');
  });

  it('contains the "What models are supported?" FAQ entry', () => {
    const match = entries.find((e) => e.id === 'faq:supported-models');
    expect(match).toBeDefined();
    expect(match!.category).toBe('faq');
  });

  // =========================================================================
  // Keyword fallback search
  // =========================================================================

  it('finds document-export when searching "PDF generation"', () => {
    const results = fallback.search('PDF generation document export', 10);
    expect(results.length).toBeGreaterThan(0);
    const ids = results.map((r) => r.id);
    const hasDocExport = ids.some(
      (id) => id.includes('document-export') || id.includes('pdf')
    );
    expect(hasDocExport).toBe(true);
  });

  it('finds FAQ entry when searching "what models are supported"', () => {
    const results = fallback.search('what models are supported', 10);
    expect(results.length).toBeGreaterThan(0);
    const ids = results.map((r) => r.id);
    const hasFaq = ids.some((id) => id.includes('faq:'));
    expect(hasFaq).toBe(true);
  });

  it('finds streamText API entry when searching "streaming"', () => {
    const results = fallback.search('streaming text generation', 10);
    expect(results.length).toBeGreaterThan(0);
    const ids = results.map((r) => r.id);
    const hasStream = ids.some(
      (id) => id.includes('streamText') || id.includes('streaming')
    );
    expect(hasStream).toBe(true);
  });

  it('finds voice-related entries when searching "voice pipeline"', () => {
    const results = fallback.search('voice pipeline speech recognition', 10);
    expect(results.length).toBeGreaterThan(0);
    const ids = results.map((r) => r.id);
    const hasVoice = ids.some(
      (id) => id.includes('voice') || id.includes('stt') || id.includes('tts')
    );
    expect(hasVoice).toBe(true);
  });

  it('returns results with valid relevance scores', () => {
    const results = fallback.search('authentication tokens', 5);
    for (const result of results) {
      expect(result.relevanceScore).toBeGreaterThanOrEqual(0);
      expect(result.relevanceScore).toBeLessThanOrEqual(1);
    }
  });
});
