Add Controlled Keywords to Score Set Search Filter Options #693

@bencap

Background

Controlled keywords are attached at the experiment level (via the experiment_controlled_keywords bridge table), not directly on score sets. However, score sets are already searchable by keyword label through the keywords field in ScoreSetsSearch, which traverses ScoreSet → Experiment → keyword_objs → ControlledKeyword.

The filtering itself works, but the /score-sets/search/filter-options endpoint does not yet return available keyword options, so clients have no way to discover which keyword labels are valid filter values.

What Needs to Be Done

1. Implement keyword options query in src/mavedb/lib/score_sets.py

In fetch_score_sets_search_filter_options(), query the distinct set of ControlledKeyword objects reachable from published score sets by joining ScoreSet → Experiment → ExperimentControlledKeywordAssociation → ControlledKeyword. Restricting the query to published score sets keeps the returned options consistent with the search scope.
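
A rough sketch of the join described above, written as plain SQL against an in-memory SQLite database so that it is self-contained. The real implementation would presumably use the existing SQLAlchemy models instead. Only experiment_controlled_keywords is named in this issue; every other table name, column name, the NULL-means-unpublished convention, and the sample data are assumptions made for illustration.

```python
import sqlite3

# Hypothetical, simplified schema. Only experiment_controlled_keywords is named
# in the issue; all other table and column names are assumptions.
SCHEMA = """
CREATE TABLE experiments (id INTEGER PRIMARY KEY);
CREATE TABLE scoresets (
    id INTEGER PRIMARY KEY,
    experiment_id INTEGER REFERENCES experiments(id),
    published_date TEXT  -- assumption: NULL means unpublished
);
CREATE TABLE controlled_keywords (
    id INTEGER PRIMARY KEY,
    key TEXT,
    label TEXT
);
CREATE TABLE experiment_controlled_keywords (
    experiment_id INTEGER REFERENCES experiments(id),
    controlled_keyword_id INTEGER REFERENCES controlled_keywords(id)
);
"""

# ScoreSet -> Experiment -> association -> ControlledKeyword, published only.
# DISTINCT collapses keywords reached through several score sets or experiments.
KEYWORD_OPTIONS_SQL = """
SELECT DISTINCT ck.key, ck.label
FROM scoresets AS ss
JOIN experiments AS e ON e.id = ss.experiment_id
JOIN experiment_controlled_keywords AS eck ON eck.experiment_id = e.id
JOIN controlled_keywords AS ck ON ck.id = eck.controlled_keyword_id
WHERE ss.published_date IS NOT NULL
ORDER BY ck.key, ck.label
"""


def fetch_keyword_options(conn):
    """Return (key, label) pairs for keywords reachable from published score sets."""
    return conn.execute(KEYWORD_OPTIONS_SQL).fetchall()


def build_demo_db():
    """Create an in-memory database with made-up rows for demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO experiments (id) VALUES (?)", [(1,), (2,)])
    conn.executemany(
        "INSERT INTO scoresets VALUES (?, ?, ?)",
        [
            (1, 1, "2024-01-01"),  # published, experiment 1
            (2, 1, "2024-02-01"),  # published, experiment 1
            (3, 2, None),          # unpublished, experiment 2
        ],
    )
    conn.executemany(
        "INSERT INTO controlled_keywords VALUES (?, ?, ?)",
        [(1, "key_a", "label_1"), (2, "key_a", "label_2"), (3, "key_b", "label_3")],
    )
    conn.executemany(
        "INSERT INTO experiment_controlled_keywords VALUES (?, ?)",
        [(1, 1), (2, 2), (1, 3), (2, 3)],
    )
    return conn


if __name__ == "__main__":
    print(fetch_keyword_options(build_demo_db()))
```

In the demo data, label_2 is attached only to an experiment whose score set is unpublished, so it is excluded; label_1 is reached through two published score sets but appears once because of DISTINCT.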

2. Extend ScoreSetsSearchFilterOptionsResponse in src/mavedb/view_models/search.py

There are three design options for how to surface keywords in the response. This is the key design decision for this issue.

  • Labels only (keywords: list[ScoreSetsSearchFilterOption]) — consistent with how other filter options return flat string values and requires no new view model. The tradeoff is that label alone strips all semantic context (key, system, code, description), so clients cannot group, sort, or display keywords meaningfully without hardcoding that knowledge themselves.

  • Slim keyword objects (keywords: list[SlimControlledKeyword]) — a new view model exposing only the fields relevant to search consumers (e.g. key, label, system, description), omitting internal fields like id, special, and dates. Gives clients enough context to group and render keywords without over-exposing the model. The tradeoff is that it still returns a flat heterogeneous list, so clients must group by key themselves if they want to present keywords by category.

  • Pre-separated by semantic group: add one field per distinct key value (e.g. keyword_biological_processes, keyword_molecular_functions), each holding list[ScoreSetsSearchFilterOption]. This mirrors the existing pattern in ScoreSetsSearchFilterOptionsResponse where logically distinct categories have their own top-level fields, keeping the response self-describing and eliminating any client-side grouping logic. The tradeoff is that the response schema becomes coupled to the data: each field corresponds to a key value in the database, so adding a new keyword category requires a coordinated API and client change rather than being picked up automatically.
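
To make the tradeoffs concrete, here is a minimal sketch of the first two shapes and of the client-side grouping that the second one leaves to consumers. It uses plain dataclasses as stand-ins for the project's view models; the single `value` field on ScoreSetsSearchFilterOption, the helper names `labels_only` and `group_by_key`, and the sample keys and labels are all assumptions. The third option would amount to promoting each group produced by `group_by_key` to a fixed top-level response field.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScoreSetsSearchFilterOption:
    # Stand-in for the existing view model; the `value` field is an assumption.
    value: str


@dataclass(frozen=True)
class SlimControlledKeyword:
    # Fields follow the ones suggested in the issue (key, label, system, description).
    key: str
    label: str
    system: Optional[str] = None
    description: Optional[str] = None


def labels_only(keywords: list[SlimControlledKeyword]) -> list[ScoreSetsSearchFilterOption]:
    """Option 1: a flat, sorted list of unique labels with no semantic context."""
    return [ScoreSetsSearchFilterOption(label) for label in sorted({k.label for k in keywords})]


def group_by_key(
    keywords: list[SlimControlledKeyword],
) -> dict[str, list[ScoreSetsSearchFilterOption]]:
    """Grouping a client would do itself under option 2 (hypothetical helper)."""
    groups: dict[str, set[str]] = defaultdict(set)
    for keyword in keywords:
        groups[keyword.key].add(keyword.label)
    return {
        key: [ScoreSetsSearchFilterOption(label) for label in sorted(labels)]
        for key, labels in sorted(groups.items())
    }


if __name__ == "__main__":
    sample = [
        SlimControlledKeyword("key_a", "label_2"),
        SlimControlledKeyword("key_b", "label_3"),
        SlimControlledKeyword("key_a", "label_1"),
    ]
    print(labels_only(sample))
    print(group_by_key(sample))
```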

Notes

  • The search filter (keywords in ScoreSetsSearch) already exists and works — this issue is specifically about surfacing available keyword options through the filter options endpoint and adding these options to the UI.
  • Keyword data lives at the experiment level by design; the traversal pattern using ScoreSet.experiment.has(Experiment.keyword_objs.any(...)) is already established and should be reused.
  • Eager loading for keywords is already set up in search_score_sets() via selectinload on Experiment.keyword_objs joined to ExperimentControlledKeywordAssociation.controlled_keyword — the filter options query should follow the same pattern.

Metadata

    Labels

    app: backend (Task implementation touches the backend)
    app: frontend (Task implementation touches the frontend)
    type: enhancement (Enhancement to an existing feature)
