Description
Background
Controlled keywords are attached at the experiment level (via the experiment_controlled_keywords bridge table), not directly on score sets. However, score sets are already searchable by keyword label through the keywords field in ScoreSetsSearch, which traverses ScoreSet → Experiment → keyword_objs → ControlledKeyword.
The filtering itself works, but the /score-sets/search/filter-options endpoint does not yet return available keyword options, so clients have no way to discover which keyword labels are valid filter values.
What Needs to Be Done
1. Implement keyword options query in src/mavedb/lib/score_sets.py
In fetch_score_sets_search_filter_options(), query the distinct set of ControlledKeyword objects reachable from published score sets by joining ScoreSet → Experiment → ExperimentControlledKeywordAssociation → ControlledKeyword. Restricting the query to published score sets keeps the returned options consistent with the search scope.
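As a rough illustration of the join described above, here is a minimal sketch in raw SQL run through Python's built-in sqlite3 module, not the project's SQLAlchemy models. Apart from experiment_controlled_keywords, every table and column name is a hypothetical stand-in, and treating a non-null published_date as "published" is an assumption. The real query should use whatever publication filter search_score_sets() already applies.

```python
import sqlite3

# Hypothetical, simplified schema. Names are illustrative stand-ins,
# not the actual MaveDB tables or columns.
SCHEMA = """
CREATE TABLE experiments (id INTEGER PRIMARY KEY);
CREATE TABLE scoresets (
    id INTEGER PRIMARY KEY,
    experiment_id INTEGER NOT NULL REFERENCES experiments(id),
    published_date TEXT  -- assumption: NULL means unpublished
);
CREATE TABLE controlled_keywords (
    id INTEGER PRIMARY KEY,
    "key" TEXT NOT NULL,
    label TEXT NOT NULL
);
CREATE TABLE experiment_controlled_keywords (
    experiment_id INTEGER NOT NULL REFERENCES experiments(id),
    controlled_keyword_id INTEGER NOT NULL REFERENCES controlled_keywords(id)
);
"""

# ScoreSet -> Experiment -> association -> ControlledKeyword, restricted to
# published score sets. DISTINCT collapses keywords reached via several score sets.
KEYWORD_OPTIONS_SQL = """
SELECT DISTINCT ck."key", ck.label
FROM scoresets AS ss
JOIN experiments AS e ON e.id = ss.experiment_id
JOIN experiment_controlled_keywords AS eck ON eck.experiment_id = e.id
JOIN controlled_keywords AS ck ON ck.id = eck.controlled_keyword_id
WHERE ss.published_date IS NOT NULL
ORDER BY ck."key", ck.label
"""


def fetch_keyword_options(conn):
    """Return (key, label) pairs for keywords reachable from published score sets."""
    return conn.execute(KEYWORD_OPTIONS_SQL).fetchall()


def build_demo_db():
    """Create an in-memory database with placeholder rows for demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO experiments (id) VALUES (?)", [(1,), (2,)])
    conn.executemany(
        "INSERT INTO scoresets (id, experiment_id, published_date) VALUES (?, ?, ?)",
        # Experiment 1 has two published score sets; experiment 2 has only a draft.
        [(1, 1, "2024-01-01"), (2, 1, "2024-02-01"), (3, 2, None)],
    )
    conn.executemany(
        'INSERT INTO controlled_keywords (id, "key", label) VALUES (?, ?, ?)',
        [(1, "Category A", "Label 1"), (2, "Category B", "Label 2")],
    )
    conn.executemany(
        "INSERT INTO experiment_controlled_keywords VALUES (?, ?)",
        [(1, 1), (2, 2)],
    )
    conn.commit()
    return conn
```

In the demo data, the keyword attached to experiment 1 is returned once even though two published score sets reach it, and the keyword attached only to the unpublished score set's experiment is excluded.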
2. Extend ScoreSetsSearchFilterOptionsResponse in src/mavedb/view_models/search.py
There are three design options for how to surface keywords in the response. This is the key design decision for this issue.
- Labels only (keywords: list[ScoreSetsSearchFilterOption]): consistent with how other filter options return flat string values, and requires no new view model. The tradeoff is that label alone strips all semantic context (key, system, code, description), so clients cannot group, sort, or display keywords meaningfully without hardcoding that knowledge themselves.
- Slim keyword objects (keywords: list[SlimControlledKeyword]): a new view model exposing only the fields relevant to search consumers (e.g. key, label, system, description), omitting internal fields like id, special, and dates. Gives clients enough context to group and render keywords without over-exposing the model. The tradeoff is that it still returns a flat heterogeneous list, so clients must group by key themselves if they want to present keywords by category.
- Pre-separated by semantic group: add one field per distinct key value (e.g. keyword_biological_processes, keyword_molecular_functions), each holding list[ScoreSetsSearchFilterOption]. This mirrors the existing pattern in ScoreSetsSearchFilterOptionsResponse where logically distinct categories have their own top-level fields, keeping the response self-describing and eliminating any client-side grouping logic. The tradeoff is that the response shape becomes data-driven: it is tied to the set of key values in the database, so adding a new keyword category requires a coordinated API and client change rather than being picked up automatically.
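To make the tradeoffs concrete, the sketch below shows what each option leaves for the client to do. It uses plain dataclasses rather than the project's view models. The SlimControlledKeyword fields follow the list given for the slim option, but their types, the helper names labels_only and group_labels_by_key, and the sample values are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class SlimControlledKeyword:
    # Hypothetical stand-in for the proposed view model; field types are assumed.
    key: str
    label: str
    system: Optional[str] = None
    description: Optional[str] = None


def labels_only(keywords: Iterable[SlimControlledKeyword]) -> List[str]:
    """Labels-only shape: a flat list of labels with no category context."""
    return sorted(kw.label for kw in keywords)


def group_labels_by_key(keywords: Iterable[SlimControlledKeyword]) -> Dict[str, List[str]]:
    """Grouping a client would do itself when given slim keyword objects.

    The pre-separated option would perform the equivalent step on the server
    and expose each group as its own response field.
    """
    grouped: Dict[str, List[str]] = defaultdict(list)
    for kw in keywords:
        grouped[kw.key].append(kw.label)
    return {key: sorted(labels) for key, labels in sorted(grouped.items())}


# Placeholder data, not real controlled keywords.
flat = [
    SlimControlledKeyword(key="Category B", label="Label 3"),
    SlimControlledKeyword(key="Category A", label="Label 2"),
    SlimControlledKeyword(key="Category A", label="Label 1"),
]
print(labels_only(flat))
print(group_labels_by_key(flat))
```

With labels only, the category information is lost before it reaches the client. With slim objects, the grouping step is a few lines on the client. With pre-separated fields, the same grouping happens on the server, but every new key value changes the response schema.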
Notes
- The search filter (keywords in ScoreSetsSearch) already exists and works; this issue is specifically about surfacing available keyword options through the filter options endpoint and adding these options to the UI.
- Keyword data lives at the experiment level by design; the traversal pattern using ScoreSet.experiment.has(Experiment.keyword_objs.any(...)) is already established and should be reused.
- Eager loading for keywords is already set up in search_score_sets() via selectinload on Experiment.keyword_objs joined to ExperimentControlledKeywordAssociation.controlled_keyword; the filter options query should follow the same pattern.