Skip to content

refactor(search): registry-driven search-param type resolution#80

Merged
smunini merged 1 commit into
mainfrom
feat/registry-driven-search-param-types
Apr 27, 2026
Merged

refactor(search): registry-driven search-param type resolution#80
smunini merged 1 commit into
mainfrom
feat/registry-driven-search-param-types

Conversation

@smunini
Copy link
Copy Markdown
Contributor

@smunini smunini commented Apr 27, 2026

Summary

  • Eliminates the hardcoded "well-known param name → SearchParamType" lookup in the REST extractor (and three duplicates in the SQLite + MongoDB backends) in favor of a single deterministic resolver that consults the SearchParameterRegistry already loaded from data/search-parameters-{r4,r4b,r5,r6}.json.
  • Fixes the Goal?patient=…&target-date=… HTTP 400 surfaced by Inferno against the mongodb backend (Bug 2 from the recent inferno run).
  • Net -150 lines of redundant hardcoded mappings, +30 lines of resolver. Eight new unit tests.

Why

The previous flow:

REST  --infer_param_type("target-date")--> ???
        not in hardcoded list → fall through to value-shape heuristic
                              → wrong type → mongodb's strict typed query
                                builder rejects with HTTP 400

Every storage backend already owned an Arc<RwLock<SearchParameterRegistry>> populated from canonical FHIR spec data (1,375 R4 entries with the correct param_type). The REST layer simply never had access to it. Three other places (sqlite/search/chain_builder.rs, sqlite/search_impl.rs, mongodb/search_impl.rs) carried additional hardcoded fallback tables that drifted from the REST one and from each other.

What

Single source of truthcrates/persistence/src/search/registry.rs:

  • resolve_param_type(registry, resource_type, name, values) — registry-by-type → registry-by-"Resource" (global params like _lastUpdated) → value-shape heuristic (only reached for genuinely unregistered custom params)
  • resolve_param_targets(registry, resource_type, name) — for reference target resolution

Trait surfaceSearchProvider::search_param_registry() added; one-line forwarding impl on every backend (sqlite, postgres, mongodb, elasticsearch, s3, composite). All five backends already owned the field.

REST extractorbuild_search_query{,_from_map} now take &SearchParameterRegistry. The 90-line infer_param_type hardcoded match plus its infer_from_value heuristic are deleted; the extractor delegates to the shared resolver. Three handler call sites (search.rs ×2, compartment.rs ×1) acquire the registry via state.storage().search_param_registry().read(), scoped tightly so the parking_lot guard never crosses an await.

Backend cleanup:

  • mongodb's build_search_params collapses the 9-line "registry-first then hardcoded fallback" block into one resolve_param_type call; removed the now-orphan lookup_param_type helper.
  • sqlite chain_builder's infer_param_type (~15 lines) deleted.
  • sqlite search_impl's dead-code infer_target_type and its test deleted (was #[allow(dead_code)], called nowhere in production).

Test plan

  • 642 helios-persistence lib tests pass with --features R4,sqlite,postgres,elasticsearch,mongodb.
  • 184 helios-rest lib tests pass.
  • cargo clippy --all-targets ... -- -D warnings clean with the project's CI flags.
  • 8 new unit tests in search::registry::tests and search_query_builder::tests cover: registry hit (Goal.target-date → Date), "Resource"-base fallback (_lastUpdated), value heuristic (date / reference / token / default), multi-target lookup, and the regression case via parse_search_parameter.
  • End-to-end smoke against in-memory sqlite via the hfs binary: GET /Goal?patient=x&target-date=ge2010-01-01 and GET /Goal?patient=x&target-date=2010 both return HTTP 200 (vs. the mongodb 400 they were failing with).
  • Confirm inferno mongodb workflow run no longer reports *_goal_patient_target_date_search_test as failing — needs an inferno.yml re-trigger after merge.

Out of scope

  • The two DiagnosticReport?patient=…&status=… HTTP 400 failures from the same inferno run are not fixed here. status was already correctly typed as Token by both the old hardcoded list and the registry, so this refactor doesn't change their behavior. Their root cause needs separate triage with the actual HFS error response body (Inferno's result_message only carries the HTTP status).
  • chain_builder::infer_target_type (which picks ONE target from the registry's declared multi-target list for chained queries) is intentionally left alone — that's target disambiguation, not type inference, and changing it would shift product behavior on chained queries.

The REST extractor's hardcoded "well-known param name → SearchParamType"
match silently misclassified anything outside the list (e.g. Goal.target-date
fell through to a value-shape heuristic, which mongodb's strict typed query
builder then rejected with HTTP 400). Three other call sites carried similar
duplicate tables.

Replace the lot with a single deterministic resolver:

  resolve_param_type(registry, resource_type, name, values)
    1. registry.get_param(resource_type, name)
    2. registry.get_param("Resource", name)            ← global params
    3. value-shape heuristic (unregistered custom only)

Plumbed via a new SearchProvider::search_param_registry() trait method that
every backend already had as a concrete accessor; REST handlers acquire it
from state.storage(). Hardcoded match tables in mongodb's build_search_params
and sqlite's chain_builder are gone; sqlite's dead-code infer_target_type
also removed.

Effect on the failing inferno mongodb test: Goal?patient&target-date now
resolves target-date as Date deterministically, build_date_filter parses the
value correctly, returns 200. The two DiagnosticReport patient&status 400s
were never caused by type inference (status was already Token) and need
separate triage.

Net: −150 lines of redundant hardcoded mappings, +30 lines of resolver,
1 trait method, 8 new unit tests covering registry hit / Resource-base
fallback / value heuristic / target lookup. All 642 persistence and 184
rest lib tests pass; clippy clean with project CI flags.
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 27, 2026

Codecov Report

❌ Patch coverage is 82.71605% with 42 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
crates/persistence/src/composite/storage.rs 0.00% 37 Missing ⚠️
crates/rest/src/handlers/search.rs 50.00% 3 Missing ⚠️
crates/persistence/src/search/registry.rs 97.95% 2 Missing ⚠️

📢 Thoughts on this report? Let us know!

@smunini smunini merged commit bfdc4f6 into main Apr 27, 2026
33 of 47 checks passed
smunini added a commit that referenced this pull request Apr 27, 2026
…include (#82)

Composite's extract_references for _include / _revinclude was doing hardcoded
JSON field-name matching (with a small `patient/subject/encounter/performer`
alias map). Custom search parameters and standard ones whose JSON field name
doesn't match the param name silently returned zero references — the same
"in a real implementation, FHIRPath would..." stub pattern PR #80 set out
to find.

Replace the body with a registry-first lookup:

  1. registry.get_param(resource_type, name)
  2. registry.get_param("Resource", name)
  3. SearchParameterExtractor::extract_for_param — runs the canonical
     FHIRPath expression and filters to IndexValue::Reference

The hardcoded alias map is retained but only fires for params that aren't
in the registry at all, so unregistered custom params keep working.

Verification:
- New test composite::storage::tests::test_extract_references_uses_registry_expression
  proves the registry's FHIRPath expression is what gets evaluated
- 643 helios-persistence lib tests pass (was 642)
- Clippy clean
smunini added a commit that referenced this pull request Apr 27, 2026
The merge of `main` into `feat/registry-driven-search-param-types`
combined two changes that touched adjacent code: #80 (registry-driven
type resolution, removed `SearchParamType` from imports) and #f9196dc1
(fix prefix-only-for-date/number/quantity, used `SearchParamType` in
the body). The merge resolved by keeping main's import list but the
branch's body, leaving the type referenced in `parse_search_parameter`
without a declaration. CI hit E0433 on both Linux and Windows.

Re-add `SearchParamType` to the `helios_persistence::types` use list.
smunini added a commit that referenced this pull request Apr 27, 2026
* refactor(search): make search-param type resolution registry-driven

The REST extractor's hardcoded "well-known param name → SearchParamType"
match silently misclassified anything outside the list (e.g. Goal.target-date
fell through to a value-shape heuristic, which mongodb's strict typed query
builder then rejected with HTTP 400). Three other call sites carried similar
duplicate tables.

Replace the lot with a single deterministic resolver:

  resolve_param_type(registry, resource_type, name, values)
    1. registry.get_param(resource_type, name)
    2. registry.get_param("Resource", name)            ← global params
    3. value-shape heuristic (unregistered custom only)

Plumbed via a new SearchProvider::search_param_registry() trait method that
every backend already had as a concrete accessor; REST handlers acquire it
from state.storage(). Hardcoded match tables in mongodb's build_search_params
and sqlite's chain_builder are gone; sqlite's dead-code infer_target_type
also removed.

Effect on the failing inferno mongodb test: Goal?patient&target-date now
resolves target-date as Date deterministically, build_date_filter parses the
value correctly, returns 200. The two DiagnosticReport patient&status 400s
were never caused by type inference (status was already Token) and need
separate triage.

Net: −150 lines of redundant hardcoded mappings, +30 lines of resolver,
1 trait method, 8 new unit tests covering registry hit / Resource-base
fallback / value heuristic / target lookup. All 642 persistence and 184
rest lib tests pass; clippy clean with project CI flags.

* feat(persistence/composite): registry-driven extract_references for _include (#82)

Composite's extract_references for _include / _revinclude was doing hardcoded
JSON field-name matching (with a small `patient/subject/encounter/performer`
alias map). Custom search parameters and standard ones whose JSON field name
doesn't match the param name silently returned zero references — the same
"in a real implementation, FHIRPath would..." stub pattern PR #80 set out
to find.

Replace the body with a registry-first lookup:

  1. registry.get_param(resource_type, name)
  2. registry.get_param("Resource", name)
  3. SearchParameterExtractor::extract_for_param — runs the canonical
     FHIRPath expression and filters to IndexValue::Reference

The hardcoded alias map is retained but only fires for params that aren't
in the registry at all, so unregistered custom params keep working.

Verification:
- New test composite::storage::tests::test_extract_references_uses_registry_expression
  proves the registry's FHIRPath expression is what gets evaluated
- 643 helios-persistence lib tests pass (was 642)
- Clippy clean

* fix(hts,subscriptions): make testcontainers watchdog feature non-Windows only

Extends the fix from #79 to two crates that were missed. The
`watchdog` feature on `testcontainers` 0.27 pulls in
`signal_hook::consts::SIGQUIT` and `signal_hook::iterator`, both of
which are gated to non-Windows. Under `cargo clippy --all-features`
on Windows, feature unification across the workspace re-activated
`watchdog` on `testcontainers` because `helios-hts` and
`helios-subscriptions` still requested it unconditionally, causing
E0432 "unresolved imports" on the Windows linting job.

Move the `watchdog`-enabled testcontainers dependency under
`[target.'cfg(not(windows))'.dev-dependencies]` in both crates and
add a plain testcontainers entry for Windows, mirroring the existing
split in `helios-persistence`.

* fix(rest): restore SearchParamType import dropped in merge

The merge of `main` into `feat/registry-driven-search-param-types`
combined two changes that touched adjacent code: #80 (registry-driven
type resolution, removed `SearchParamType` from imports) and #f9196dc1
(fix prefix-only-for-date/number/quantity, used `SearchParamType` in
the body). The merge resolved by keeping main's import list but the
branch's body, leaving the type referenced in `parse_search_parameter`
without a declaration. CI hit E0433 on both Linux and Windows.

Re-add `SearchParamType` to the `helios_persistence::types` use list.
smunini added a commit that referenced this pull request Apr 27, 2026
…ns (#81)

Postgres' ChainedSearchProvider was returning Ok(Vec::new()) for any forward
chain longer than 2 segments and for nested _has reverse chains, silently
producing empty result sets where SQLite returned correct matches. Backend-
behavior divergence with no error signal — the kind of "simplified for now"
stub PR #80 set out to find.

Port the proven SQLite ChainQueryBuilder pattern to Postgres: nested SELECTs
over search_index, registry-driven target-type resolution via
resolve_param_targets(), the same Patient-default for ambiguous multi-target
references SQLite uses. Postgres-specific syntax: $N placeholders, ILIKE for
case-insensitive match, SUBSTRING(... FROM POSITION('/' IN ...) + 1) in
place of SUBSTR + INSTR for the reverse-chain reference-id extraction.

Verification:
- 6 new unit tests in chain_builder covering parsing, SQL shape, explicit
  type modifier, ambiguous-target inference, empty-chain error, and
  reverse-chain SUBSTRING/POSITION emission
- 2 new testcontainers-backed integration tests porting the SQLite three-
  level chain and reverse-chain terminal cases — both pass against a real
  Postgres instance
- All 609 helios-persistence lib tests still green; clippy clean
smunini added a commit that referenced this pull request Apr 27, 2026
Composite's resolve_chain was delegating to a stub (resolve_chain_iterative)
that returned Ok(Vec::new()) for any forward chain — silent wrong-empty
results regardless of backend capability. Same anti-pattern as PR #80
("simplified implementation...for now, just return empty").

Replace with iterative SearchQuery resolution against composite's regular
search routing (the analog of how resolve_reverse_chain already works):

  1. Parse the chain via the registry to resolve target type per link.
     Falls back to the same hardcoded inference table SQLite/Postgres use
     for ambiguous multi-target references — keeps behaviour aligned.
  2. Search the deepest target type for the terminal param.
  3. Walk back: at each step, search the parent type for resources whose
     reference param matches the accumulated refs (multi-value OR).
  4. Outermost step returns raw IDs.

Cost is O(chain_depth) queries; the inner backends apply OR semantics on
multi-value reference params natively.

Verification:
- 3 new unit tests using MockStorage (with search provider registered) prove
  the implementation runs to completion for 3-segment, 2-segment, and
  invalid (single-part) chains. End-to-end correctness comes from inferno
  against real backends.
- 612 helios-persistence lib tests pass; clippy clean.

Drops the resolve_chain_iterative stub.
smunini added a commit that referenced this pull request Apr 27, 2026
* refactor(search): make search-param type resolution registry-driven

The REST extractor's hardcoded "well-known param name → SearchParamType"
match silently misclassified anything outside the list (e.g. Goal.target-date
fell through to a value-shape heuristic, which mongodb's strict typed query
builder then rejected with HTTP 400). Three other call sites carried similar
duplicate tables.

Replace the lot with a single deterministic resolver:

  resolve_param_type(registry, resource_type, name, values)
    1. registry.get_param(resource_type, name)
    2. registry.get_param("Resource", name)            ← global params
    3. value-shape heuristic (unregistered custom only)

Plumbed via a new SearchProvider::search_param_registry() trait method that
every backend already had as a concrete accessor; REST handlers acquire it
from state.storage(). Hardcoded match tables in mongodb's build_search_params
and sqlite's chain_builder are gone; sqlite's dead-code infer_target_type
also removed.

Effect on the failing inferno mongodb test: Goal?patient&target-date now
resolves target-date as Date deterministically, build_date_filter parses the
value correctly, returns 200. The two DiagnosticReport patient&status 400s
were never caused by type inference (status was already Token) and need
separate triage.

Net: −150 lines of redundant hardcoded mappings, +30 lines of resolver,
1 trait method, 8 new unit tests covering registry hit / Resource-base
fallback / value heuristic / target lookup. All 642 persistence and 184
rest lib tests pass; clippy clean with project CI flags.

* feat(persistence/postgres): support multi-step forward & reverse chains

Postgres' ChainedSearchProvider was returning Ok(Vec::new()) for any forward
chain longer than 2 segments and for nested _has reverse chains, silently
producing empty result sets where SQLite returned correct matches. Backend-
behavior divergence with no error signal — the kind of "simplified for now"
stub PR #80 set out to find.

Port the proven SQLite ChainQueryBuilder pattern to Postgres: nested SELECTs
over search_index, registry-driven target-type resolution via
resolve_param_targets(), the same Patient-default for ambiguous multi-target
references SQLite uses. Postgres-specific syntax: $N placeholders, ILIKE for
case-insensitive match, SUBSTRING(... FROM POSITION('/' IN ...) + 1) in
place of SUBSTR + INSTR for the reverse-chain reference-id extraction.

Verification:
- 6 new unit tests in chain_builder covering parsing, SQL shape, explicit
  type modifier, ambiguous-target inference, empty-chain error, and
  reverse-chain SUBSTRING/POSITION emission
- 2 new testcontainers-backed integration tests porting the SQLite three-
  level chain and reverse-chain terminal cases — both pass against a real
  Postgres instance
- All 609 helios-persistence lib tests still green; clippy clean

* feat(persistence/composite): iterative forward-chain resolution (#83)

Composite's resolve_chain was delegating to a stub (resolve_chain_iterative)
that returned Ok(Vec::new()) for any forward chain — silent wrong-empty
results regardless of backend capability. Same anti-pattern as PR #80
("simplified implementation...for now, just return empty").

Replace with iterative SearchQuery resolution against composite's regular
search routing (the analog of how resolve_reverse_chain already works):

  1. Parse the chain via the registry to resolve target type per link.
     Falls back to the same hardcoded inference table SQLite/Postgres use
     for ambiguous multi-target references — keeps behaviour aligned.
  2. Search the deepest target type for the terminal param.
  3. Walk back: at each step, search the parent type for resources whose
     reference param matches the accumulated refs (multi-value OR).
  4. Outermost step returns raw IDs.

Cost is O(chain_depth) queries; the inner backends apply OR semantics on
multi-value reference params natively.

Verification:
- 3 new unit tests using MockStorage (with search provider registered) prove
  the implementation runs to completion for 3-segment, 2-segment, and
  invalid (single-part) chains. End-to-end correctness comes from inferno
  against real backends.
- 612 helios-persistence lib tests pass; clippy clean.

Drops the resolve_chain_iterative stub.

* fix(rest,persistence): restore SearchParamType import and fix fmt

The merge from main dropped the SearchParamType import in
search_query_builder.rs and left an unformatted let binding in
composite/storage.rs, both flagged by CI lint.
smunini added a commit that referenced this pull request Apr 27, 2026
* refactor(search): make search-param type resolution registry-driven

The REST extractor's hardcoded "well-known param name → SearchParamType"
match silently misclassified anything outside the list (e.g. Goal.target-date
fell through to a value-shape heuristic, which mongodb's strict typed query
builder then rejected with HTTP 400). Three other call sites carried similar
duplicate tables.

Replace the lot with a single deterministic resolver:

  resolve_param_type(registry, resource_type, name, values)
    1. registry.get_param(resource_type, name)
    2. registry.get_param("Resource", name)            ← global params
    3. value-shape heuristic (unregistered custom only)

Plumbed via a new SearchProvider::search_param_registry() trait method that
every backend already had as a concrete accessor; REST handlers acquire it
from state.storage(). Hardcoded match tables in mongodb's build_search_params
and sqlite's chain_builder are gone; sqlite's dead-code infer_target_type
also removed.

Effect on the failing inferno mongodb test: Goal?patient&target-date now
resolves target-date as Date deterministically, build_date_filter parses the
value correctly, returns 200. The two DiagnosticReport patient&status 400s
were never caused by type inference (status was already Token) and need
separate triage.

Net: −150 lines of redundant hardcoded mappings, +30 lines of resolver,
1 trait method, 8 new unit tests covering registry hit / Resource-base
fallback / value heuristic / target lookup. All 642 persistence and 184
rest lib tests pass; clippy clean with project CI flags.

* feat(persistence/composite): registry-driven extract_references for _include (#82)

Composite's extract_references for _include / _revinclude was doing hardcoded
JSON field-name matching (with a small `patient/subject/encounter/performer`
alias map). Custom search parameters and standard ones whose JSON field name
doesn't match the param name silently returned zero references — the same
"in a real implementation, FHIRPath would..." stub pattern PR #80 set out
to find.

Replace the body with a registry-first lookup:

  1. registry.get_param(resource_type, name)
  2. registry.get_param("Resource", name)
  3. SearchParameterExtractor::extract_for_param — runs the canonical
     FHIRPath expression and filters to IndexValue::Reference

The hardcoded alias map is retained but only fires for params that aren't
in the registry at all, so unregistered custom params keep working.

Verification:
- New test composite::storage::tests::test_extract_references_uses_registry_expression
  proves the registry's FHIRPath expression is what gets evaluated
- 643 helios-persistence lib tests pass (was 642)
- Clippy clean

* fix(hts,subscriptions): make testcontainers watchdog feature non-Windows only

Extends the fix from #79 to two crates that were missed. The
`watchdog` feature on `testcontainers` 0.27 pulls in
`signal_hook::consts::SIGQUIT` and `signal_hook::iterator`, both of
which are gated to non-Windows. Under `cargo clippy --all-features`
on Windows, feature unification across the workspace re-activated
`watchdog` on `testcontainers` because `helios-hts` and
`helios-subscriptions` still requested it unconditionally, causing
E0432 "unresolved imports" on the Windows linting job.

Move the `watchdog`-enabled testcontainers dependency under
`[target.'cfg(not(windows))'.dev-dependencies]` in both crates and
add a plain testcontainers entry for Windows, mirroring the existing
split in `helios-persistence`.

* fix(rest): restore SearchParamType import dropped in merge

The merge of `main` into `feat/registry-driven-search-param-types`
combined two changes that touched adjacent code: #80 (registry-driven
type resolution, removed `SearchParamType` from imports) and #f9196dc1
(fix prefix-only-for-date/number/quantity, used `SearchParamType` in
the body). The merge resolved by keeping main's import list but the
branch's body, leaving the type referenced in `parse_search_parameter`
without a declaration. CI hit E0433 on both Linux and Windows.

Re-add `SearchParamType` to the `helios_persistence::types` use list.

* feat(persistence/postgres): support multi-step forward & reverse chains (#81)

Postgres' ChainedSearchProvider was returning Ok(Vec::new()) for any forward
chain longer than 2 segments and for nested _has reverse chains, silently
producing empty result sets where SQLite returned correct matches. Backend-
behavior divergence with no error signal — the kind of "simplified for now"
stub PR #80 set out to find.

Port the proven SQLite ChainQueryBuilder pattern to Postgres: nested SELECTs
over search_index, registry-driven target-type resolution via
resolve_param_targets(), the same Patient-default for ambiguous multi-target
references SQLite uses. Postgres-specific syntax: $N placeholders, ILIKE for
case-insensitive match, SUBSTRING(... FROM POSITION('/' IN ...) + 1) in
place of SUBSTR + INSTR for the reverse-chain reference-id extraction.

Verification:
- 6 new unit tests in chain_builder covering parsing, SQL shape, explicit
  type modifier, ambiguous-target inference, empty-chain error, and
  reverse-chain SUBSTRING/POSITION emission
- 2 new testcontainers-backed integration tests porting the SQLite three-
  level chain and reverse-chain terminal cases — both pass against a real
  Postgres instance
- All 609 helios-persistence lib tests still green; clippy clean
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant