
feat(worker-search): per-task search allowlist + fetch chain override (Phase 3 #38) #273

Merged
charlie83Gs merged 2 commits into main from feat/per-task-search-allowlist
Apr 20, 2026

@charlie83Gs
Contributor

Summary

  • ProviderRegistry.search_all() gains optional allowlist: Iterable[str] | None — filters the registered providers to the ids named, preserves legacy behaviour when omitted.
  • FetchProviderRegistry.fetch_many() gains optional chain: list[str] | None — applies the override to every URI in the batch.
  • web_search task resolves state.services.graph_config(graph_id) and threads composition.search_providers + composition.fetch_chain into both calls. Falls back to registry defaults when no composition is available.
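A minimal sketch of the per-call chain override on the fetch side. `SketchFetchRegistry`, the provider ids `plain_http` / `headless`, the return shape, and the "try each provider in chain order until one returns a body" semantics are all assumptions for illustration; only the `fetch_many(..., chain=...)` parameter and the fact that the override applies to every URI in the batch come from this PR.

```python
from typing import Callable, Optional

# Assumption: a fetcher returns the page body, or None when it cannot handle the URI.
Fetcher = Callable[[str], Optional[str]]


class SketchFetchRegistry:
    """Toy stand-in for a fetch registry; not the kt-providers implementation."""

    def __init__(self, default_chain: list[str]) -> None:
        self._fetchers: dict[str, Fetcher] = {}
        self._default_chain = list(default_chain)

    def register(self, provider_id: str, fetcher: Fetcher) -> None:
        self._fetchers[provider_id] = fetcher

    def fetch(self, uri: str, chain: Optional[list[str]] = None) -> Optional[tuple[str, str]]:
        # chain=None keeps the registry-wide default; a list overrides it for this call.
        order = self._default_chain if chain is None else chain
        for provider_id in order:
            fetcher = self._fetchers.get(provider_id)
            if fetcher is None:
                continue  # assumption: ids not registered here are skipped
            body = fetcher(uri)
            if body is not None:
                return provider_id, body
        return None

    def fetch_many(self, uris: list[str], chain: Optional[list[str]] = None):
        # The same override is applied to every URI in the batch.
        return {uri: self.fetch(uri, chain) for uri in uris}


registry = SketchFetchRegistry(default_chain=["plain_http", "headless"])
registry.register(
    "plain_http",
    lambda uri: None if uri.startswith("js:") else f"http body of {uri}",
)
registry.register("headless", lambda uri: f"rendered body of {uri}")

default_run = registry.fetch_many(["a", "js:b"])
override_run = registry.fetch_many(["a", "js:b"], chain=["headless"])
```

With the default chain, `"a"` is served by `plain_http` and `"js:b"` falls through to `headless`; with `chain=["headless"]`, both URIs go straight to `headless`.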

Why

First consumer-side wire-up for Phase 3 — proves per-task provider selection driven by GraphTypeComposition works end-to-end before rolling it into every pipeline phase.

Unknown ids in the allowlist are skipped silently so a graph type can declare providers that haven't rolled to every deployment yet. Empty allowlist → zero results (explicit over implicit).
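The allowlist semantics described above can be sketched as follows. This is an illustrative toy, not the `ProviderRegistry` code: `SketchRegistry`, `Result`, the synchronous call style, and the provider ids `a` / `b` are assumptions. It shows the four stated behaviours: omitted allowlist means all providers, unknown ids are skipped silently, an empty allowlist queries nothing, and iteration follows registration order with dedup by URI.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Result:
    uri: str
    provider_id: str


class SketchRegistry:
    """Toy registry; dicts preserve insertion (registration) order."""

    def __init__(self) -> None:
        self._providers: dict[str, Callable[[str], list[Result]]] = {}

    def register(self, provider_id: str, search_fn: Callable[[str], list[Result]]) -> None:
        self._providers[provider_id] = search_fn

    def search_all(self, query: str, allowlist: Optional[Iterable[str]] = None) -> list[Result]:
        if allowlist is None:
            selected = list(self._providers)  # legacy behaviour: every provider
        else:
            allowed = set(allowlist)
            # Registration order, not allowlist order; unknown ids simply never match.
            selected = [pid for pid in self._providers if pid in allowed]
        seen: set[str] = set()
        results: list[Result] = []
        for pid in selected:
            for result in self._providers[pid](query):
                if result.uri not in seen:  # dedup by URI, first provider wins
                    seen.add(result.uri)
                    results.append(result)
        return results


registry = SketchRegistry()
registry.register("a", lambda q: [Result("u1", "a"), Result("u2", "a")])
registry.register("b", lambda q: [Result("u2", "b"), Result("u3", "b")])

all_uris = [r.uri for r in registry.search_all("q")]
reordered = [r.uri for r in registry.search_all("q", allowlist=["b", "a"])]
only_b = [r.uri for r in registry.search_all("q", allowlist=["b", "not_deployed"])]
nothing = registry.search_all("q", allowlist=[])
```

Note that `allowlist=[]` and `allowlist=None` are deliberately different: the first selects no providers, the second selects all of them.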

Part of #38. Follow-ups will apply the same pattern to decompose_page_wf, ingest_build_wf, and node_pipeline_wf.

Test plan

  • uv run --project libs/kt-providers pytest libs/kt-providers/tests/test_providers.py libs/kt-providers/tests/fetch/test_registry.py -x -q — 30/30 green (4 new allowlist tests + 2 new fetch_many chain tests)
  • uv run --project libs/kt-providers pytest libs/kt-providers/tests/ --ignore=tests/integration -x -q — 170/170 green
  • uv run --project services/worker-search pytest services/worker-search/tests/ -x -q — 4/4 green
  • CI: full workspace tests

🤖 Generated with Claude Code

charlie83Gs and others added 2 commits April 20, 2026 16:02
… (Phase 3 #38)

``ProviderRegistry.search_all`` and ``FetchProviderRegistry.fetch_many``
now accept per-call ``allowlist`` / ``chain`` overrides. ``web_search``
resolves the running graph's ``GraphConfig`` via
``state.services.graph_config(graph_id)`` and threads
``composition.search_providers`` + ``composition.fetch_chain`` through
both calls. When the lookup fails or no composition is available the
workflow falls back to the registry-wide default — preserving the
legacy path for CLI callers and graphs whose plugin isn't registered
on this worker.

Unknown ids in the allowlist are skipped silently so a graph type can
declare providers that haven't rolled to every deployment yet. Empty
allowlist → no providers queried (explicit over implicit).

Tests: ``test_registry_search_all_allowlist_*`` cover filter, unknown-id
tolerance, empty-list, and multi-query semantics.
``test_fetch_many_honors_chain_override`` + default-chain guard cover the
fetch side.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…elper

Applies PR #273 review:

- Remove ``except Exception`` fallback around ``state.services.graph_config``.
  Silently widening to every registered provider on a resolver hiccup
  could route search through providers the graph type never authorized
  — violates the fail-fast rule in ``CLAUDE.md`` and the authorization
  invariant of composition-based tenancy. Errors propagate; the task
  fails and operators see the bug.
- Extract ``_resolve_composition_selectors(state, graph_id)`` helper
  so the wire-up is unit-testable without a Hatchet runtime. Four new
  tests in ``test_composition_selectors.py`` cover composition →
  allowlist/chain, ``graph_id=None`` fallback, ``services=None``
  fallback, and the must-have resolver-raises-propagates assertion.
- Tighten ``search_all`` docstring: iteration is registration order,
  not composition order (dedup by URI keeps the observable ordering
  stable).
- Add ``services/worker-search/tests/conftest.py`` installing the fake
  Hatchet env so tests can import ``workflows.search`` without a real
  Hatchet client token.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@charlie83Gs
Contributor Author

Review fixes pushed. All priority items applied:

  1. Fail-fast resolver (#1): removed the except Exception fallback around state.services.graph_config in web_search. A running graph with a composition MUST route through its authorized providers; silently widening on a resolver hiccup could leak search queries through providers the tenant never authorized. Errors propagate; the task fails and ops see the bug. Matches the CLAUDE.md fail-fast rule.
  2. Docstring fix (#2): the search_all docstring now reflects reality: iteration is registration order, not composition order. Dedup by URI keeps the visible ordering stable.
  3. Worker-search test (#3): extracted a _resolve_composition_selectors(state, graph_id) helper + 4 tests covering the composition→allowlist/chain happy path, graph_id=None fallback, services=None fallback, and the must-have resolver-raises-propagates assertion (pinning the authorization invariant).
  4. Zero-match warn (#4, optional): skipped for now. Composition validation at registration time is the better place for this (already done via validate_all_graph_types); a runtime warning would duplicate the signal. Can revisit if ops ask.
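For readers skimming the thread, the fail-fast behaviour in items 1 and 3 can be sketched roughly like this. It is a hedged illustration, not the merged helper: the function name `resolve_selectors_sketch`, the `(allowlist, chain)` tuple return shape, the `composition` attribute on the config object, and the provider ids are assumptions; only `state.services.graph_config(graph_id)`, `composition.search_providers`, and `composition.fetch_chain` come from the PR.

```python
from types import SimpleNamespace
from typing import Any, Optional

Selectors = tuple[Optional[list[str]], Optional[list[str]]]


def resolve_selectors_sketch(state: Any, graph_id: Optional[str]) -> Selectors:
    """Return (allowlist, chain), or (None, None) to mean registry defaults."""
    services = getattr(state, "services", None)
    if graph_id is None or services is None:
        return None, None  # legacy path: no graph context to consult
    # Deliberately no try/except: a failing resolver must fail the task
    # rather than silently widen to every registered provider.
    config = services.graph_config(graph_id)
    composition = getattr(config, "composition", None)
    if composition is None:
        return None, None
    return list(composition.search_providers), list(composition.fetch_chain)


def _broken_resolver(graph_id: str):
    raise RuntimeError("resolver unavailable")


# Hypothetical fixtures standing in for the worker state.
composition = SimpleNamespace(search_providers=["provider_a"], fetch_chain=["plain_http"])
good_state = SimpleNamespace(
    services=SimpleNamespace(
        graph_config=lambda gid: SimpleNamespace(composition=composition)
    )
)
broken_state = SimpleNamespace(services=SimpleNamespace(graph_config=_broken_resolver))

selectors = resolve_selectors_sketch(good_state, "g1")
no_graph = resolve_selectors_sketch(good_state, None)
no_services = resolve_selectors_sketch(SimpleNamespace(services=None), "g1")
```

Calling `resolve_selectors_sketch(broken_state, "g1")` raises `RuntimeError` instead of returning `(None, None)`, which is the property the resolver-raises-propagates test pins.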

Other notes:

  • conftest.py added to services/worker-search/tests/ installing the fake Hatchet env so workflows.search module import doesn't require a real token. Mirrors pattern in services/worker-nodes.
  • All 8 worker-search unit tests green; 172/172 kt-providers tests green.

@charlie83Gs charlie83Gs merged commit 87ed84a into main Apr 20, 2026
28 checks passed
@charlie83Gs charlie83Gs deleted the feat/per-task-search-allowlist branch April 20, 2026 22:36
