Feat/postgres multi step chains#87
Merged
Merged
Conversation
The REST extractor's hardcoded "well-known param name → SearchParamType"
match silently misclassified anything outside the list (e.g. Goal.target-date
fell through to a value-shape heuristic, which mongodb's strict typed query
builder then rejected with HTTP 400). Three other call sites carried similar
duplicate tables.
Replace the lot with a single deterministic resolver:
resolve_param_type(registry, resource_type, name, values)
1. registry.get_param(resource_type, name)
2. registry.get_param("Resource", name) ← global params
3. value-shape heuristic (unregistered custom only)
Plumbed via a new SearchProvider::search_param_registry() trait method that
every backend already had as a concrete accessor; REST handlers acquire it
from state.storage(). Hardcoded match tables in mongodb's build_search_params
and sqlite's chain_builder are gone; sqlite's dead-code infer_target_type
also removed.
Effect on the failing inferno mongodb test: Goal?patient&target-date now
resolves target-date as Date deterministically, build_date_filter parses the
value correctly, returns 200. The two DiagnosticReport patient&status 400s
were never caused by type inference (status was already Token) and need
separate triage.
Net: −150 lines of redundant hardcoded mappings, +30 lines of resolver,
1 trait method, 8 new unit tests covering registry hit / Resource-base
fallback / value heuristic / target lookup. All 642 persistence and 184
rest lib tests pass; clippy clean with project CI flags.
Postgres' ChainedSearchProvider was returning Ok(Vec::new()) for any forward chain longer than 2 segments and for nested _has reverse chains, silently producing empty result sets where SQLite returned correct matches. Backend- behavior divergence with no error signal — the kind of "simplified for now" stub PR #80 set out to find. Port the proven SQLite ChainQueryBuilder pattern to Postgres: nested SELECTs over search_index, registry-driven target-type resolution via resolve_param_targets(), the same Patient-default for ambiguous multi-target references SQLite uses. Postgres-specific syntax: $N placeholders, ILIKE for case-insensitive match, SUBSTRING(... FROM POSITION('/' IN ...) + 1) in place of SUBSTR + INSTR for the reverse-chain reference-id extraction. Verification: - 6 new unit tests in chain_builder covering parsing, SQL shape, explicit type modifier, ambiguous-target inference, empty-chain error, and reverse-chain SUBSTRING/POSITION emission - 2 new testcontainers-backed integration tests porting the SQLite three- level chain and reverse-chain terminal cases — both pass against a real Postgres instance - All 609 helios-persistence lib tests still green; clippy clean
Composite's resolve_chain was delegating to a stub (resolve_chain_iterative) that returned Ok(Vec::new()) for any forward chain — silent wrong-empty results regardless of backend capability. Same anti-pattern as PR #80 ("simplified implementation...for now, just return empty"). Replace with iterative SearchQuery resolution against composite's regular search routing (the analog of how resolve_reverse_chain already works): 1. Parse the chain via the registry to resolve target type per link. Falls back to the same hardcoded inference table SQLite/Postgres use for ambiguous multi-target references — keeps behaviour aligned. 2. Search the deepest target type for the terminal param. 3. Walk back: at each step, search the parent type for resources whose reference param matches the accumulated refs (multi-value OR). 4. Outermost step returns raw IDs. Cost is O(chain_depth) queries; the inner backends apply OR semantics on multi-value reference params natively. Verification: - 3 new unit tests using MockStorage (with search provider registered) prove the implementation runs to completion for 3-segment, 2-segment, and invalid (single-part) chains. End-to-end correctness comes from inferno against real backends. - 612 helios-persistence lib tests pass; clippy clean. Drops the resolve_chain_iterative stub.
The merge from main dropped the SearchParamType import in search_query_builder.rs and left an unformatted let binding in composite/storage.rs, both flagged by CI lint.
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
mauripunzueta
added a commit
that referenced
this pull request
May 5, 2026
…expand hot path Reverts the two spawn_blocking additions to process_expand that caused the run #86 regression (264K → vs 368K wRPS baseline): - populate_concept_flags(): per-system concept flag lookup (abstract/inactive) - used-codesystem emission: code_system_version_for_url call per system The SystemVersionCache / ConceptFlagsCache infrastructure added in run #87 to partially mitigate the pool pressure is also removed from mod.rs since it is no longer needed. The trait methods (concept_expansion_flags, code_system_version_for_url) and their SQLite implementations are retained but simplified back to direct spawn_blocking form (no caching) for future use. Restores run #84 performance baseline (~368K wRPS).
mauripunzueta
added a commit
that referenced
this pull request
May 9, 2026
Two parallel fixes to recover bench parity with run #87 (commit 6492389): 1. SNOMED implicit ValueSet preflight (VC01-03, EX01-03, EX08): When packages.fhir.org ships a CodeSystem stub for SNOMED CT (content=not-present) before the RF2 import lands a second row at (url, version, content=complete), the old resolver could pick the stub and every ?fhir_vs / ?fhir_vs=isa query keyed on its empty system_id silently returned zero rows. - resolve_system_id_with_version_cached / resolve_compose_system_id: primary sort by content tier (complete/supplement < fragment < not-present), then EXISTS(concepts), then version DESC, id ASC. - validate_fhir_vs, bfs_expand_page, bfs_isa_page, ensure_implicit_cache: JOIN concepts/concept_closure/concepts_fts through code_systems.url so the lookup is robust even if the resolver still picks a stub row. - ensure_concepts_fts_for_url: build the FTS5 index for every row matching the URL so URL-keyed FTS joins always have rows. 2. LK01-04 \$lookup perf (was -77 to -86% RPS @50VU): - operations/lookup.rs: replaced search(url, count=1) blob-load and full serde_json::from_str just to read .language with a dedicated trait method code_system_language() that runs ONE json_extract query and is memoised in a process-wide cache (invalidated alongside cs_id_cache on CRUD). - backends/sqlite/code_system.rs: gated fetch_synthesised_properties behind needs_synth (only runs when caller asked for parent/child/ inactive or *), and folded its 3 queries into one prepare_cached UNION ALL with a kind_ord sort key. Same wire output, 3x fewer round-trips per \$lookup. - traits + postgres backend: parallel code_system_language() impl for trait uniformity.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.