feat: replace orchestrator/query with synthesis agents#39

Merged
charlie83Gs merged 89 commits into main from synthesis-agents-refactor
Mar 27, 2026

Conversation

@charlie83Gs
Contributor

Summary

Major architectural refactoring that removes the deprecated orchestrator, query, and conversation systems and replaces them with a new synthesis-based research flow.

Removed (~10,880 lines):

  • Legacy orchestrator agents (wave planner, scope planner, sub-explorer)
  • Query mode (worker-query, QueryAgent, query tools)
  • Conversation system (worker-conversations, follow-up turns, resynthesis)
  • Chat/query/answer frontend UI (ChatPanel, QueryBar, ConversationHistory, etc.)
  • Renamed worker-orchestrator → worker-bottomup

Added (~3,645 lines):

  • worker-synthesis — new service with:
    • SynthesizerAgent (extends BaseAgent) with 8 navigation tools matching the MCP synthesizer pattern: search_graph, search_facts, get_node, get_edges, get_facts, get_dimensions, get_fact_sources, get_node_paths
    • SuperSynthesizerAgent for multi-scope meta-synthesis (reconnaissance → parallel dispatch → combine)
    • Document processing pipeline (sentence splitting, embedding, node text linking, fact embedding linking)
    • System prompts adapted from the reference MCP agent with 4-phase investigation strategy
  • Data model — SynthesisSentence, SentenceFact, SentenceNodeLink, SynthesisChild tables + migrations; supersynthesis NodeType; Visibility enum; visibility/creator_id on Node
  • API endpoints — 9 routes for synthesis CRUD (create, list, get document, get sentence facts, get nodes, delete, update visibility)
  • Frontend document view — /syntheses list page, /syntheses/[id] detail page with sentence-level fact panel, node list, sub-synthesis list

Test plan

  • Run just test-all to verify no backend regressions from cleanup
  • Run cd frontend && pnpm lint && pnpm type-check && pnpm test for frontend
  • Run uv sync --all-packages to verify workspace resolution with new package
  • Verify just worker starts without import errors
  • Test synthesis creation via POST /api/v1/syntheses
  • Test synthesis document retrieval via GET /api/v1/syntheses/{id}
  • Test super-synthesis creation via POST /api/v1/super-syntheses
  • Verify frontend /syntheses page renders
  • Verify frontend /syntheses/[id] document view renders with sentence interaction

🤖 Generated with Claude Code

charlie83Gs and others added 30 commits March 25, 2026 15:33
Phase 0A: Delete legacy orchestrator agents directory
- Move ScopePlan, ScopeBriefing, WaveAccumulator to bottom_up/state.py
- Move scout_impl to bottom_up/scout.py
- Move wave_planner to bottom_up/wave_planner.py
- Delete agents/ and prompts/ directories
- Delete agent-related tests

Phase 0B: Delete query mode and conversations
- Delete worker-query service entirely
- Delete worker-conversations service entirely
- Remove from worker-all imports and workflow registrations
- Remove conversations router and endpoints from API
- Remove ConversationState from kt-agents-core
- Remove synthesize_answer_impl (unused after query removal)
- Remove conversation tests from test_api.py

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Delete components/chat/ (ChatPanel, ChatInput, ChatMessage, etc.)
- Delete components/answer/ (AnswerView, ConfidenceIndicator, etc.)
- Delete components/query/ (QueryBar, QueryBudgetControls, QuickActions)
- Delete useConversation hook and conversation/[id] page
- Delete ResearchNodeDialog (depended on conversation flow)
- Inline ConfidenceIndicator as Badge in DimensionsTab and ConvergenceTab
- Replace home page query UI with simple landing page
- Rename sidebar "Query" to "Home"
- Update research components to not navigate to /conversation/

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Rename service directory and Python module
- Update all internal imports (kt_worker_orchestrator -> kt_worker_bottomup)
- Update pyproject.toml package name and dependencies across workspace
- Update docker-compose.yml, Dockerfile, justfile, helm charts, CI workflows
- Remove worker-query and worker-conversations from docker-compose, CI, helm
- Fix docstring references in worker-nodes composite pipeline

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 1: Data model changes for synthesis/supersynthesis features

- Add supersynthesis NodeType and Visibility enum to kt-config
- Add ALL_SYNTHESES_ID and ALL_SUPERSYNTHESES_ID default parents
- Add visibility and creator_id columns to Node model (graph-db)
- Add visibility and creator_id columns to WriteNode model (write-db)
- Add SynthesisSentence model (sentences in a synthesis document)
- Add SentenceFact model (sentence-to-fact links by embedding distance)
- Add SentenceNodeLink model (sentence-to-node links by text match)
- Add SynthesisChild model (supersynthesis-to-synthesis links)
- Add graph-db migration (zzae) for new tables and columns
- Add write-db migration (w030) for visibility/creator_id on write_nodes
- Add SynthesisDocumentRepository with full CRUD operations
- Add SynthesizerInput/Output and SuperSynthesizerInput/Output to kt-hatchet

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
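The Phase 1 visibility additions can be sketched as follows. This is an illustrative assumption, not the actual kt-config definition: the enum member values and the helper function are invented for the example, and only the names Visibility, visibility, and creator_id come from the commit above.

```python
from enum import Enum

class Visibility(str, Enum):
    """Sketch of the Visibility enum added to kt-config (values assumed)."""
    PRIVATE = "private"
    PUBLIC = "public"

def node_visible_to(visibility: Visibility, creator_id: str, user_id: str) -> bool:
    """Hypothetical check combining the new visibility/creator_id columns:
    a node is readable if it is public or the requester created it."""
    return visibility == Visibility.PUBLIC or creator_id == user_id
```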
Phase 2: Create the worker-synthesis service

New service: services/worker-synthesis/
- SynthesizerAgent (extends BaseAgent) with 8 navigation tools:
  search_graph, search_facts, get_node, get_edges, get_facts,
  get_dimensions, get_fact_sources, get_node_paths
- finish_synthesis tool for document submission
- System prompt adapted from MCP synthesizer agent with 4-phase
  investigation strategy and core principles
- Document processing pipeline: split sentences, embed, link nodes
  by text match, link facts by embedding similarity
- Hatchet workflow (synthesizer_wf) that creates synthesis node,
  runs agent, and processes the document
- Registered in worker-all and justfile

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 3: SuperSynthesizer for multi-scope meta-synthesis

- SuperSynthesizerAgent (extends BaseAgent) with tools:
  read_synthesis, get_synthesis_nodes, search_graph, get_node,
  finish_super_synthesis
- System prompt adapted from super-synthesize skill with
  cross-pollination, thematic reorganization, meta-pattern detection
- 3-task Hatchet workflow (super_synthesizer_wf):
  1. reconnaissance: LLM plans 3-7 thematic scopes from graph search
  2. run_sub_syntheses: dispatches synthesizer_wf in parallel
  3. combine: runs SuperSynthesizerAgent to produce meta-synthesis
- Registered in worker-all and worker-synthesis __main__

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 4: Backend API endpoints

- POST /api/v1/syntheses — dispatch synthesizer_wf
- POST /api/v1/super-syntheses — dispatch super_synthesizer_wf
- GET /api/v1/syntheses — list syntheses (paginated, visibility filter)
- GET /api/v1/syntheses/{id} — full document with sentences, fact counts,
  node links, sub-syntheses (for supersynthesis)
- GET /api/v1/syntheses/{id}/sentences/{pos}/facts — facts for sentence
  grouped by source
- GET /api/v1/syntheses/{id}/nodes — all referenced nodes
- DELETE /api/v1/syntheses/{id} — delete synthesis
- PATCH /api/v1/syntheses/{id} — update visibility

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 5: Frontend synthesis document viewer

New pages:
- /syntheses — list all syntheses with create button
- /syntheses/[id] — full document view

New components:
- SynthesisDocument — main document renderer with sentence-level
  interactivity (hover for fact count, click to open fact panel)
- SynthesisFactPanel — right panel showing facts grouped by source
  for selected sentence
- SynthesisNodeList — bottom section with all referenced nodes
- SubSynthesisList — child syntheses for supersynthesis docs
- CreateSynthesisDialog — form for creating new synthesis

Updates:
- Add synthesis TypeScript types to types/index.ts
- Add synthesis API methods to lib/api.ts
- Add "Syntheses" to sidebar navigation
- Update home page with syntheses card

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 6: Integration

- Add worker-synthesis Dockerfile
- Add worker-synthesis to docker-compose.yml
- Add worker-synthesis to CI build matrix
- Update CLAUDE.md comprehensively:
  - Update project description (document-based, not chat-based)
  - Replace worker-orchestrator/query/conversations with
    worker-bottomup and worker-synthesis
  - Update workflow documentation (synthesizer_wf, super_synthesizer_wf)
  - Update agent architecture (SynthesizerAgent, SuperSynthesizerAgent)
  - Add synthesis/supersynthesis node types
  - Update import names, scopes, code location tables
  - Update research flow description
  - Remove conversation/query references throughout

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Regenerate uv.lock after removing worker-query/worker-conversations
  packages (uv sync --frozen was failing on deleted distributions)
- Remove stale kt-worker-query and kt-worker-conversations from
  api/pyproject.toml dependencies and sources
- Add kt-worker-synthesis to api/pyproject.toml
- Fix frontend lint errors:
  - Replace <a> tags with <Link> from next/link (page.tsx,
    syntheses/[id]/page.tsx, ResearchBuildProgress.tsx)
  - Fix setState-in-effect pattern in syntheses/[id]/page.tsx
    (use useCallback + separate fetch function)
  - Remove unused Search import from Sidebar.tsx
  - Remove unused researchDialogOpen state and Research button
    from NodeDetailPanel.tsx
  - Remove unused result variable in SourceUploadForm.tsx

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Auto-fix 16 ruff import sorting issues across synthesis service
- Add 'supersynthesis' to expected node types in test_sanity.py

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Run ruff format on 9 files that had formatting issues
- Clean up leftover empty dirs from deleted services
  (worker-conversations, worker-query, worker-orchestrator)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…chet parents

Hatchet SDK v1 expects parent task references as function objects,
not string names. Fix parents= and ctx.task_output() calls in
super_synthesizer_wf.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The zzae migration was pointing to zzad as its parent, but
ggg8h9i0j1k2 and hhh9i0j1k2l3 were added after zzad on main,
creating a branch with two heads. Fix by setting zzae's
down_revision to hhh9i0j1k2l3 (the actual latest head).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… dialog

- Center list and detail pages with mx-auto + px-4
- Add synthesis type selector (Synthesis vs Super-Synthesis) to the
  create dialog with visual card-style toggle
- Regular synthesis shows exploration budget control
- Super-synthesis shows explanation of automatic scope planning
- Both types share topic and visibility inputs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The request() function already prepends API_PREFIX (/api/v1), so the
synthesis API functions should use bare paths like /syntheses, not
/api/v1/syntheses (which resulted in /api/v1/api/v1/syntheses -> 404).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
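The double-prefix bug above reduces to a one-line reproduction. The helper name is hypothetical (the real request() lives in the frontend's lib/api.ts); the sketch only shows why an already-prefixed path yields /api/v1/api/v1/…:

```python
API_PREFIX = "/api/v1"

def request_url(path: str) -> str:
    # The shared request() helper always prepends API_PREFIX.
    return API_PREFIX + path

# Wrong: passing an already-prefixed path doubles the prefix (-> 404).
assert request_url("/api/v1/syntheses") == "/api/v1/api/v1/syntheses"
# Right: API functions pass bare paths.
assert request_url("/syntheses") == "/api/v1/syntheses"
```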
Use hatchet.runs.aio_create(workflow_name=...) instead of importing
worker modules directly. This follows best practice — the API should
not depend on worker packages. Workers communicate via Hatchet only.

- Remove kt-worker-synthesis from api pyproject.toml dependencies
- Use hatchet.runs.aio_create("synthesizer_wf", input={...})
- Use hatchet.runs.aio_create("super_synthesizer_wf", input={...})

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…imports

Add dispatch_workflow() and run_workflow() helpers to kt_hatchet.client
that dispatch workflows by name string via hatchet.runs.aio_create().

Refactor all API endpoints to use these helpers instead of importing
worker modules directly. This eliminates cross-service dependencies
between the API and worker packages (except kt-worker-ingest which
is still imported for pipeline processing functions).

Files changed:
- kt_hatchet/client.py: add dispatch_workflow(), run_workflow()
- syntheses.py: already using dispatch_workflow
- graph_builder.py: auto_build_task
- edges.py: enrich_edge_task
- sources.py: reingest_source_wf (sync wait via run_workflow)
- nodes.py: rebuild_node_task, node_pipeline_wf, regenerate_composite_task
- seeds.py: build_composite_task (sync), node_pipeline_wf
- research.py: ingest_confirm_wf, ingest_decompose_wf, ingest_build_wf,
  bottom_up_prepare_wf, agent_select_wf
- api/pyproject.toml: remove kt-worker-bottomup, kt-worker-nodes deps

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- dispatch_workflow() now raises RuntimeError with a clear message
  when Hatchet rejects the request (e.g. no active worker)
- Synthesis endpoints catch RuntimeError and return 503 with
  actionable error message ("Ensure the worker is running")
- run_workflow() reuses dispatch_workflow() to avoid duplication

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The runs.aio_create() v1 REST API returns 500 errors in the current
Hatchet version. Switch dispatch_workflow() and run_workflow() to use
admin.aio_run_workflow() instead, which is the same gRPC-based method
that Workflow.aio_run_no_wait() uses internally and works correctly.

Key difference: admin.aio_run_workflow takes JSON-serialized input
(string) instead of a dict, and uses the gRPC v0 client.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…d bug

- Set max_tokens=32000 for both SynthesizerAgent and SuperSynthesizerAgent
  (default was 1000, causing the agent to hang when generating documents)
- Add get_model_kwargs() hook to BaseAgent for subclass overrides
- Fix search_facts: EmbeddingService has embed_batch(), not embed()

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
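The get_model_kwargs() hook described above might look roughly like this. The class bodies are a minimal sketch, not the real BaseAgent; only the hook name, the 1000-token default, and the 32000 override come from the commit:

```python
class BaseAgent:
    def get_model_kwargs(self) -> dict:
        # Default model kwargs; subclasses override this hook to adjust limits.
        return {"max_tokens": 1000}

class SynthesizerAgent(BaseAgent):
    def get_model_kwargs(self) -> dict:
        # Full documents need far more than the 1000-token default,
        # which made the agent appear to hang mid-generation.
        return {**super().get_model_kwargs(), "max_tokens": 32000}
```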
GraphEngine.create_node() doesn't accept node_id or definition.
Use create_node() then set_node_definition() separately, matching
the pattern used by the existing node pipeline.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the 4 synthesis tables (synthesis_sentences, sentence_facts,
sentence_node_links, synthesis_children) with a JSON document stored
in the WriteNode.metadata_ field under "synthesis_document".

This follows the project's dual-database architecture:
- Pipeline writes JSON to write-db metadata via GraphEngine
- Sync worker propagates metadata to graph-db
- API reads from graph-db Node.metadata_

Changes:
- Rewrite document_processing.py to return JSON dict instead of DB records
- Update workflows to store doc in WriteNode.metadata_
- Update API to read from node metadata
- Remove synthesis table models and repository
- Simplify zzae migration to only add visibility/creator_id columns
- Update frontend for flat fact responses

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Keep the synthesis document JSON in metadata_ (no separate tables),
but strip it from network responses in _build_node_response(). This
means wiki, node lists, and search results don't transfer the ~50KB
document blob. Only the /syntheses/{id} endpoint reads and returns it.

- Add _strip_synthesis_doc() to nodes.py — filters metadata before response
- Remove SynthesisDocument and WriteSynthesisDocument models (not needed)
- Keep the JSON-in-metadata approach: simple, follows write-db -> sync pattern

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
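The _strip_synthesis_doc() filter is conceptually simple; a minimal sketch, assuming the blob lives under a "synthesis_document" metadata key as described above:

```python
SYNTHESIS_DOC_KEY = "synthesis_document"

def strip_synthesis_doc(metadata: dict) -> dict:
    """Drop the ~50KB document blob before building a network response,
    so wiki, node lists, and search results stay lightweight."""
    return {k: v for k, v in metadata.items() if k != SYNTHESIS_DOC_KEY}
```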
Clicking a sentence in the synthesis document now shows a right panel
with:
- The sentence text (stripped of markdown links)
- Related nodes (clickable links to /nodes/{id}) resolved from the
  referenced_nodes lookup
- Closest facts (clickable links to /facts/{id}) with embedding
  similarity percentage

All data comes from the already-loaded document response — no extra
API calls needed. The badge on each sentence shows counts like "2n 5f"
(2 nodes, 5 facts).

Changes:
- Add fact_links (fact_id + distance) to SynthesisSentenceResponse
  in both API and frontend types
- Rewrite SynthesisDocument to show inline detail panel
- Also parse fact links [text](/facts/uuid) in sentence text
- Remove SynthesisFactPanel (no longer needed)
- Remove getSentenceFacts API call (data is already in the response)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Initial document load now only includes sentence text, fact_count,
and node_ids (lightweight). Fact links (fact_id + distance) are
fetched on demand when a sentence is clicked via the
/syntheses/{id}/sentences/{position}/facts endpoint.

This gives a more responsive initial experience — the document
renders immediately, and fact details load only when needed.

- Remove fact_links from SynthesisSentenceResponse (API + types)
- Restore getSentenceFacts API call for lazy loading
- Show loading spinner while fetching facts
- Node list still loads instantly (already in the document response)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add node_ids filter to QdrantFactRepository.search_similar() using
MatchAny on the node_ids payload field. The synthesis pipeline now
only searches facts linked to nodes the agent visited, keeping
results relevant and improving search performance.

- Add node_ids parameter to search_similar() and _build_filter()
- Import MatchAny from qdrant_client.models
- Fix argument order in link_facts_by_embedding call
- Fix r.id -> r.fact_id for FactSearchResult

Note: lefthook pyright check fails on pre-existing errors in
base.py, engine.py, results.py — not from this change.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace plain text sentence rendering with react-markdown that renders
the full definition with proper headings, paragraphs, bold, links,
and lists. Sentence-level interactivity is overlaid via custom markdown
components that match text nodes to sentences from the processed
document data.

- Use react-markdown with custom p, li, a, h1-h3 components
- InteractiveText component wraps text nodes with click handlers
  and badges when they match a sentence with facts/nodes
- Internal links (/nodes/*, /facts/*) render as Next.js Links
- Sentence matching uses first 80 chars as key for fuzzy matching
- Preserves lazy-loading of fact links on sentence click

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace per-text-node sentence matching (which broke on inline links/bold)
with paragraph-level matching. Each paragraph is matched to the sentences
it contains by comparing stripped plain text.

- buildParagraphSentenceMap strips markdown formatting from both
  paragraph and sentence text before matching
- Entire paragraphs are clickable, showing aggregated fact/node counts
- Right panel shows all sentences in the clicked paragraph
- Each sentence is expandable to see its nodes and lazy-loaded facts
- Links within paragraphs remain clickable (stopPropagation)
- Headings render properly without losing content

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
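The paragraph-level matching above can be sketched in a few lines. This is a simplified Python analogue of the TypeScript buildParagraphSentenceMap (the markdown stripping is deliberately crude and the function names are this sketch's own):

```python
import re

def strip_md(text: str) -> str:
    """Crudely remove inline links and emphasis markers, keeping plain text."""
    text = re.sub(r"\[([^\]]+)\]\([^)]*\)", r"\1", text)  # [text](url) -> text
    return re.sub(r"[*_`#]", "", text).strip()

def build_paragraph_sentence_map(paragraphs: list[str],
                                 sentences: list[str]) -> dict[int, list[int]]:
    """Map each paragraph index to the sentence indices it contains,
    comparing stripped plain text on both sides."""
    mapping: dict[int, list[int]] = {}
    for pi, para in enumerate(paragraphs):
        plain = strip_md(para)
        for si, sent in enumerate(sentences):
            if strip_md(sent) in plain:
                mapping.setdefault(pi, []).append(si)
    return mapping
```

Matching whole paragraphs instead of individual text nodes is what makes inline links and bold runs harmless: they disappear during stripping rather than splitting a sentence across nodes.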
… only

- Headings (h1, h2, h3) are now clickable like paragraphs, showing
  a badge with node count when they match sentences with linked nodes
- Right panel simplified: shows just the list of unique related nodes
  for the clicked section (no sentence cards, no fact loading)
- Removed unused fact loading state and imports (Loader2, FileText,
  SentenceFactLink, getSentenceFacts)
- All section types (p, h1-h3, li) use shared getSectionInfo helper

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
charlie83Gs and others added 23 commits March 27, 2026 08:10
Move the paragraph preview out of the scrollable area into a sticky
shrink-0 section above the nodes/facts. The paragraph stays visible
at the top of the dialog while scrolling through evidence below,
making it easy to compare the text to its linked facts.

- Subtle background (stone-50/50) to visually separate from content
- max-h-24 with overflow-y-auto for long paragraphs
- Show up to 400 chars (was 300)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Downloads an HTML file styled for print-to-PDF with:

Document body:
- Serif typography (Georgia) with academic styling
- Node references as blue superscript numbers [1]
- Fact citations as brown superscript numbers [1] inserted after
  the relevant sentence

Evidence Citations section (page break):
- Disclaimer about LLM fact extraction (atomic facts, pronoun
  replacement, minor differences from source text)
- Each fact: numbered, with type badge, full content, author,
  source title, and source URL

Annex: Node Definitions (page break):
- Each referenced node: numbered, concept name, type, definition
  text (truncated to 800 chars)

Workflow:
1. Loads full document from API
2. Lazy-loads fact details for all sentences with facts
3. Fetches node definitions from the nodes API
4. Generates structured HTML with print-optimized CSS
5. Downloads as .html file (user can print to PDF)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New API endpoint: GET /api/v1/prompts (public, no auth)
Returns all LLM system prompts used in the knowledge pipeline,
grouped by stage: Synthesis, Fact Decomposition, Node Pipeline,
Composite Nodes.

Each prompt includes: id, name, stage, purpose, and full text.
Loaded via lazy imports from worker packages (try/except for
graceful degradation if a package isn't installed).

Export update:
- Fetches prompts from the API during export
- Adds "Prompt Transparency" section after Node Annexes
- Groups prompts by pipeline stage with explanatory intro
- Shows exact system instructions in monospace code blocks
- Truncates prompts >3000 chars with pointer to API endpoint
- Intro explains why prompt transparency matters for research

API dependencies:
- Add kt-worker-synthesis and kt-worker-nodes to api pyproject.toml
  (needed for prompt string imports only, not workflow dispatch)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Section order: Body → Annexes (definitions) → Evidence → Prompts
Facts moved to after annexes since they are much longer.

Definitions:
- Show full definition text (was truncated to 800 chars)
- Render definitions as HTML via markdownToHTML

Links:
- Node references [N] now link to #node-N in annexes section
- Fact citations [N] now link to #fact-N in evidence section
- Remaining markdown links converted to <a> tags
- Anchor IDs added to annex and fact entries
- Styled: no underline by default, underline on hover

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
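One plausible shape for the link rewriting above, sketched with regexes; the real export numbers references as [N] and anchors them as #node-N / #fact-N, whereas this illustration anchors by the id found in the markdown link (an assumption for simplicity):

```python
import re

def link_citations(text: str) -> str:
    """Turn internal markdown links into in-document anchor links
    pointing at the annex and evidence sections."""
    text = re.sub(r"\[([^\]]+)\]\(/nodes/([\w-]+)\)", r'<a href="#node-\2">\1</a>', text)
    text = re.sub(r"\[([^\]]+)\]\(/facts/([\w-]+)\)", r'<a href="#fact-\2">\1</a>', text)
    return text
```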
Prompts change over time — truncating reduces transparency.
Remove the 3000-char limit and max-height/overflow CSS so
the full prompt text renders completely in the export.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The agent makes multiple tool calls per node visit (search, get_node,
get_facts, get_edges, get_dimensions), each taking 2 LangGraph steps.
The old formula (budget * 6 + 20) was too tight — budget 15 gave 110
which hit the limit.

New formula: budget * 20 + 50, minimum 200.
- Budget 15 → 350 (was 110)
- Budget 20 → 450 (was 140)
- Super-synth with 5 sub-syntheses → 150 (was 60)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Users can now select previous synthesis documents to include in
a super-synthesis, leveraging prior research alongside new scoped
investigations.

Changes:
- Add existing_synthesis_ids to SuperSynthesizerInput (Hatchet model)
- Add to CreateSuperSynthesisRequest (API + frontend types)
- API passes existing_synthesis_ids to workflow dispatch
- Workflow appends existing IDs to the sub-synthesis list before
  the combine step
- Frontend: super-synthesis dialog loads existing syntheses with
  checkbox selection, shows count of selected items

The super-synthesizer agent reads all sub-syntheses (new + existing)
and produces a combined meta-synthesis.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
With 8 tools and thorough investigation, the agent easily needs
40-60 LangGraph steps per node visited. Budget 15 now gets 900,
budget 20 gets 1200. Minimum 1000 for any run.

- Synthesizer: budget * 60, min 1000
- Super-synthesizer combine: sub_count * 60, min 1000

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When the user includes existing syntheses in a super-synthesis,
the reconnaissance task now reads their definitions and includes
summaries in the LLM scope planning prompt. The LLM is instructed
to design scopes that COMPLEMENT the existing research, not
duplicate it.

- Read existing synthesis definitions (first 500 chars each)
- Include them under "Existing Research — DO NOT overlap"
- Prompt explicitly says to not duplicate covered topics

Only user-selected syntheses are shown, not all syntheses.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Users can now set the exact number of scopes (sub-investigations)
for a super-synthesis. Set to 0 (default) to let the LLM decide
(3-7), or set a specific number to enforce it.

- Add scope_count to SuperSynthesizerInput, API request, and types
- Reconnaissance prompt uses exact count when provided
- Frontend: number input in super-synthesis dialog with explanation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
System prompt updates:
- New section "CRITICAL: Use Your FULL Exploration Budget" at the top
  of the investigation strategy
- Explicit budget allocation: 20% discovery, 50% neighbor exploration,
  25% deep evidence, 5% writing
- Strong emphasis on get_edges() → visit neighbors pattern
- "Do NOT start writing until you have used most of your budget"

Agent behavior:
- post_llm_hook now detects early finish attempts (<60% budget used)
  and redirects the agent to explore neighbors instead
- If agent tries to stop without tool calls and <70% budget used,
  nudges it to keep exploring with specific instructions
- Explicitly suggests get_edges() and neighbor visiting in nudges

This addresses the pattern where agents were only exploring 50% of
their budget before writing, producing shallow syntheses.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
List page:
- Sub-syntheses that belong to a supersynthesis are nested below
  their parent with a left border indent
- Top-level only shows standalone syntheses and supersyntheses
- Compact card style for nested children
- Supersynthesis cards show "X sub-syntheses" count

Dialog:
- Default scope count changed from 0 to 5

API:
- SynthesisListItem now includes sub_synthesis_ids

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The early-finish prevention (<60% budget) was causing loops where
the agent tried to finish, got blocked, tried again, got blocked...
burning through the recursion limit without making progress.

Changes:
- Recursion limit: budget * 100, minimum 2000 (was budget * 60, min 1000)
  Budget 15 → 2000, Budget 20 → 2000
- Remove finish_synthesis blocking — let the agent write when ready
- Reduce nudge threshold from 70% to 50% — only nudge when clearly
  underexplored, not when the agent has done decent investigation
- System prompt already strongly encourages full budget usage

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
QdrantFactRepository.search_similar() returns FactSearchResult objects
with fact_id/score/fact_type attributes, not raw Qdrant points with
id/payload. Every search_facts call was failing with:
  'FactSearchResult' object has no attribute 'id'

This caused agents to retry search_facts repeatedly, wasting
recursion steps without getting results.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
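The attribute mismatch is easy to reproduce. A sketch of the result type (fields from the commit; the formatter is illustrative):

```python
from dataclasses import dataclass

@dataclass
class FactSearchResult:
    """Repository search results expose fact_id/score/fact_type,
    not the raw Qdrant point's id/payload."""
    fact_id: str
    score: float
    fact_type: str

def format_results(results: list[FactSearchResult]) -> list[str]:
    # The bug: accessing r.id here raised AttributeError on every call.
    return [f"[{r.fact_type}] (score={r.score:.2f}) {r.fact_id}" for r in results]
```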
BaseAgent.agent_node now logs before each LLM call:
  [synthesizer] LLM call: 45/60 messages, ~12k tokens
  (trimmed count / total count, approximate token estimate)

SynthesizerAgent.check_budget_exhaustion now logs on every iteration:
  [synthesizer] budget: 8/15 visited (7 remaining), messages: 45

This helps diagnose:
- Token growth rate per tool call
- Whether messages are being trimmed
- Budget consumption patterns
- Potential nudge loops (many iterations with no budget change)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
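The two log lines quoted above imply formatters roughly like these (function names are this sketch's own; the output format matches the commit's examples):

```python
def llm_call_log(agent: str, trimmed: int, total: int, approx_tokens: int) -> str:
    """Per-LLM-call diagnostic: trimmed/total message counts and a rough token estimate."""
    return f"[{agent}] LLM call: {trimmed}/{total} messages, ~{approx_tokens // 1000}k tokens"

def budget_log(agent: str, visited: int, budget: int, messages: int) -> str:
    """Per-iteration budget diagnostic from check_budget_exhaustion."""
    return (f"[{agent}] budget: {visited}/{budget} visited "
            f"({budget - visited} remaining), messages: {messages}")
```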
Root cause: check_budget_exhaustion and post_llm_hook were causing
infinite loops:

1. check_budget_exhaustion returned a HumanMessage nudge at
   remaining <= 3. With route_nudges_to_agent=True, this routed
   back to agent_node, which called check_budget_exhaustion again,
   which returned another nudge → infinite loop burning 2 messages
   per iteration.

2. post_llm_hook nudged when the LLM responded without tool calls.
   If the LLM kept responding with text (no tools), each nudge
   routed back → another text response → another nudge → loop.

Fixes:
- check_budget_exhaustion: only nudge at budget exhaustion, and
  skip if the last message is already our nudge text. Removed the
  remaining<=3 nudge entirely.
- post_llm_hook: count recent consecutive HumanMessages. If 2+
  nudges already in the last 6 messages, stop nudging and let the
  agent end naturally.

This explains why 12 sub-syntheses hit recursion limit 2000 with
only 9/12 nodes visited — the loops were consuming ~1900 of the
2000 steps with zero progress.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The old query agent (worker-query) handled nudges correctly:
1. route_nudges_to_agent = False — nudges go to END, not back
   to agent, preventing infinite loops
2. Send-once pattern: scan message history for the nudge text
   before sending, so it's never sent twice

Applied to both SynthesizerAgent and SuperSynthesizerAgent:
- route_nudges_to_agent = False (was True, causing loops)
- check_budget_exhaustion: scans for 'BUDGET EXHAUSTED' in
  message history before sending nudge (send-once pattern)
- post_llm_hook: simplified since nudges now go to END

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
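The send-once pattern borrowed from the old query agent reduces to a history scan before sending. A simplified sketch (message history flattened to strings; the real check inspects LangGraph message objects):

```python
BUDGET_NUDGE = "BUDGET EXHAUSTED"

def should_send_nudge(history: list[str], visited: int, budget: int) -> bool:
    """Nudge only at budget exhaustion, and never twice: scanning the
    full history for the nudge text guarantees the send-once property."""
    if visited < budget:
        return False
    return not any(BUDGET_NUDGE in msg for msg in history)
```

Combined with route_nudges_to_agent = False, even a missed nudge cannot loop: the nudge routes to END instead of back into agent_node.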
Without nudge loops, actual step usage is predictable:
~5 tool calls per node × 2 steps each = ~10 steps/node + overhead.

New formula: budget * 30, minimum 500.
- Budget 12 → 500
- Budget 15 → 500
- Budget 20 → 600

Was budget * 100, min 2000 — that masked the nudge loop bug.
Now if the limit is hit, it's a real problem worth investigating.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
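The final recursion-limit formula, exactly as stated above (only the function name is this sketch's own):

```python
def recursion_limit(budget: int) -> int:
    """~5 tool calls per node x 2 LangGraph steps each = ~10 steps/node
    plus overhead, so budget * 30 with a floor of 500."""
    return max(budget * 30, 500)
```

Keeping the limit tight on purpose: if it is ever hit again, that signals a real problem rather than masking a nudge loop.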
The search_facts bug fix (b8d4b79) accidentally removed fact content
and node_ids from the output — the agent could only see fact IDs
and types, losing the ability to read what facts say and which
nodes they connect.

Now fetches fact content from the DB and node links from NodeFact
after the Qdrant similarity search. Output shows:
- [claim] (score=0.85) First 200 chars of fact content...
  nodes: [uuid1, uuid2]

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two export buttons in the synthesis header:
- HTML: downloads the formatted document as an .html file
  (single continuous document, good for archival)
- PDF: opens the document in a new window and triggers the
  browser print dialog (Save as PDF option)

Both generate the same HTML with all sections (body, annexes,
evidence citations, prompt transparency).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove unused score_map variable in search_facts tool
- Remove unused FileText and X imports from SynthesisDocument
- Remove unused useState import from NodeDetailPanel
- Auto-format 11 files with ruff format

All ruff check/format and frontend lint pass cleanly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
React 19 types ReactElement.props as {} not any, so .children
doesn't exist. Use any cast for this runtime introspection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@charlie83Gs charlie83Gs merged commit e10059a into main Mar 27, 2026
16 checks passed
@charlie83Gs charlie83Gs deleted the synthesis-agents-refactor branch March 27, 2026 16:47
charlie83Gs added a commit that referenced this pull request Mar 27, 2026
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
charlie83Gs added a commit that referenced this pull request Mar 27, 2026
… removal

The conversations module was removed in PR #39 but research.py and
export.py still imported _conversation_to_response from it, causing a
500 error on POST /api/v1/research/bottom-up/prepare.

Inline the helper functions in research.py and update export.py to
import from research instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>