feat(worker-sync): dispatch sync_graph_task via registry (Phase 5 #51)#287
charlie83Gs merged 2 commits into main from
Conversation
First Phase-5 consumer wiring: `sync_graph_task` in services/worker-sync/workflows/sync.py now routes through the registry-resolved `SyncProvider` declared by the graph's composition.

Wiring pattern (mirrors the P4 fact-decomposition wiring on PR #44):

1. Task entry calls `kt_hatchet.composition.resolve_sync_provider(state, graph_id)`.
2. When a provider resolves, the task calls `provider.initialize(services)` + `provider.sync_cycle(SyncContext)` with session factories + shared singletons (embedding service, qdrant client) packed into options.
3. When no provider resolves (worker hasn't loaded the plugin, or the task runs without a graph context), the task falls back to constructing `SyncEngine` directly — keeps rolling deployments safe.

The `DefaultSyncProvider` (kt-plugin-be-sync) unpacks the options dict into a `SyncEngine` under the hood, so observable behaviour matches the legacy path exactly — the plugin boundary is a registration adapter in this PR, not a code move. The `resolve_sync_provider` helper lives in `kt_hatchet.composition` alongside `resolve_fact_decomposition_provider` so every worker applies the same fallback rules + fail-fast semantics (WARN on unresolved id; raise on transport / state errors).

Tests: 4 new composition helper tests (registry lookup round-trip, None-graph-id short-circuit, empty-composition short-circuit, WARN on unresolved id). 18 worker-sync unit tests still green.
Non-blocking review items on the sync wiring PR:

- #4 Replace `assert worker_state.services is not None` with an explicit `if ... raise RuntimeError` so the invariant holds under `python -O` (which strips asserts) and we never dispatch against a silently-None services container.
- #5 Dedup the 'no plugin contributes this id' WARN from `resolve_sync_provider` per `(graph_id, provider_id)` pair. Sync dispatch is a cron that runs once per graph every minute; without dedup a rolling deployment fills the log with N×60 duplicate lines per hour per missing plugin. The first occurrence still fires at WARNING; subsequent occurrences drop to DEBUG so operators can still see them under a verbose logger but don't get paged on spam. The set stays bounded by (graph_id, provider_id) cardinality (handful × handful) — no eviction needed; a worker restart re-warns once, which is the signal operators want.
- #6 Surface `SyncResult.failures` in the task log line + emitted event. The legacy path doesn't have a failures counter (the engine's dead-letter inserts are logged inside the engine), so the field is 0 on the legacy branch and populated from the provider result on the registry branch. No behaviour change on legacy; parity for the provider contract.
- #7 Move the `from kt_worker_sync.sync_engine import SyncEngine` import inside the legacy-path branch — the provider-driven path never needs it, so we skip the import on workers whose plugin is registered. Micro-optimisation; clearer at the call site too.

Skipped:

- #1 PR body overstates scope: will amend the body on GitHub instead.
- #2 ABC lifecycle mismatch (init once vs per-task): flagged for a Phase-5 follow-up. The cache-on-WorkerState design needs more thought than fits in a review pass.
- #3 options-dict workaround: real design tension — per-graph session factories don't fit the `initialize(services)` once-per-worker contract. Phase-5 follow-up.
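The WARN dedup asked for in item #5 (once at WARNING per `(graph_id, provider_id)` key, DEBUG thereafter, no eviction) can be sketched like this. The function name, the return value, and the message wording are illustrative assumptions; only the keying and level-demotion behaviour come from the review item.

```python
import logging

logger = logging.getLogger("kt_hatchet.composition")

# Process-lifetime set, bounded by (graph_id, provider_id) cardinality.
# A worker restart clears it, so operators get re-warned once per deploy,
# which is exactly the signal the review item asks to preserve.
_warned: set[tuple[str, str]] = set()

def warn_unresolved(graph_id: str, provider_id: str) -> int:
    """Log 'no plugin contributes this id' at WARNING once per key, DEBUG after.

    Returns the level used, purely to make the behaviour observable in a test.
    """
    key = (graph_id, provider_id)
    level = logging.DEBUG if key in _warned else logging.WARNING
    _warned.add(key)
    logger.log(level, "no plugin contributes provider %s for graph %s",
               provider_id, graph_id)
    return level
```

Repeated calls for the same key demote to DEBUG while a new key still warns at WARNING, which is the "first occurrence pages, repeats stay visible under a verbose logger" split described above.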
Tests: 15 composition helper tests (2 new dedup tests — one asserting repeated resolves produce a single WARN, one asserting distinct keys warn independently), 18 worker-sync unit tests.
Applied review feedback in commit f91a07f:

- #4 Replaced the `assert` with an explicit `if ... raise RuntimeError` check.
- #5 Deduped the 'no plugin contributes this id' WARN in `resolve_sync_provider` per `(graph_id, provider_id)` pair.
- #6 Surfaced `SyncResult.failures` in the task log line and emitted event.
- #7 Moved the `SyncEngine` import inside the legacy-path branch.

Scope notes (follow-ups, not blockers): #2 (ABC lifecycle mismatch) and #3 (options-dict workaround) remain deferred to Phase-5 follow-ups; the PR body will be amended for #1.
Tests: 73 kt-hatchet (2 new dedup tests), 18 worker-sync. Full suite green via pre-push hook.
Summary
`sync_graph_task` routes through the registry-resolved `SyncProvider` declared by the graph's composition.

What changes

- `sync_graph_task` (services/worker-sync/workflows/sync.py) resolves `SyncProvider` via `kt_hatchet.composition.resolve_sync_provider(state, graph_id)`.
- On resolve: `await provider.initialize(services)` + `await provider.sync_cycle(ctx)` with session factories + shared singletons packed into `SyncContext.options`.
- On `None` (rolling deployment, plugin not loaded): falls back to direct `SyncEngine` construction. Keeps sync draining.
- `kt_hatchet.composition.resolve_sync_provider` helper added alongside `resolve_fact_decomposition_provider` — same fail-fast posture, same WARN-on-unresolved-id tolerant path.
- `DefaultSyncProvider` (kt-plugin-be-sync) unpacks the options dict into `SyncEngine` under the hood — observable behaviour matches the legacy path exactly. The plugin boundary is a registration adapter, not a code move.

Scope note
This PR wires the sync provider only. The other P5 providers (definition, source-cache, source-contribution) still run through their legacy call sites. Follow-up PRs will thread the remaining providers (same helper pattern, separate consumer sites). Dimensions / relations have ABC-vs-factory-signature tension that needs design work before wiring.
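The "registration adapter, not a code move" shape of `DefaultSyncProvider` can be sketched as below. The engine's constructor arguments, the option key names, and the result dict are assumptions made up for illustration; only the idea that the provider unpacks `SyncContext.options` into the existing engine comes from the PR.

```python
import asyncio
from typing import Any

class SyncEngine:
    """Stand-in for the real engine in kt_worker_sync; signature is assumed."""
    def __init__(self, session_factory: Any, embedding_service: Any,
                 qdrant_client: Any) -> None:
        self.session_factory = session_factory
        self.embedding_service = embedding_service
        self.qdrant_client = qdrant_client

    async def run_cycle(self) -> dict[str, int]:
        # Real engine would sync the graph; the result shape here is illustrative.
        return {"synced": 0, "failures": 0}

class DefaultSyncProvider:
    """Registration adapter: forwards the options bag into the legacy engine,
    so the registry path and the legacy path behave identically."""

    async def initialize(self, services: dict[str, Any]) -> None:
        # Once-per-worker hook; per-graph state arrives via SyncContext.options.
        self._services = services

    async def sync_cycle(self, ctx: Any) -> dict[str, int]:
        engine = SyncEngine(
            session_factory=ctx.options["session_factory"],
            embedding_service=ctx.options["embedding_service"],
            qdrant_client=ctx.options["qdrant_client"],
        )
        return await engine.run_cycle()
```

Because the adapter only re-plumbs arguments, the observable behaviour on the registry path is whatever the engine already did, which is the parity claim the PR makes. It also makes the #3 tension visible: the per-graph session factory travels in `options`, not in `initialize(services)`.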
Test plan
- `uv run --project libs/kt-hatchet pytest -x -v` (71 green — 4 new tests for `resolve_sync_provider`: registry round-trip, None-graph-id short-circuit, empty-composition short-circuit, WARN on unresolved id)
- `uv run --project services/worker-sync pytest -x -v` (18 green)
- `uv run --frozen ruff format --check .`
- `ruff check .`

🤖 Generated with Claude Code