Description
The get_experiment_score_sets endpoint can return the same score set multiple times when multiple private (non-readable) score sets in a supersession chain resolve to the same published ancestor.
Steps to Reproduce
- Create an experiment with a published score set (A).
- Create two private score sets (B and C) that both supersede A (possible because superseded_score_set_id has no unique constraint).
- As an anonymous user, request GET /experiments/{urn}/score-sets.
Expected Behavior
Score set A appears once in the response.
Actual Behavior
Score set A appears twice, once for each private superseding score set that resolves back to A via find_superseded_score_set_tail.
Root Cause
In src/mavedb/routers/experiments.py:198-201, the endpoint:
- Queries for all score sets not superseded by anything (~ScoreSet.superseding_score_set.has())
- Calls find_superseded_score_set_tail on each, which walks backward through the supersession chain to find the most recent score set the user can read
- Collects the results into a list without deduplication
As a result, when multiple chain heads resolve to the same visible ancestor, that score set appears in the response once per head.
A contributing factor: superseded_score_set_id (src/mavedb/models/score_set.py:127) lacks a unique constraint, which allows supersession "forks" where multiple score sets claim to supersede the same one. That is arguably a separate issue and could be tracked on its own.
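The duplication can be illustrated without a database. The sketch below is not the mavedb code: `ScoreSet` is a simplified stand-in for the real model, and `find_readable_tail` and `anonymous_can_read` are hypothetical helpers that only approximate what find_superseded_score_set_tail and the permission check are described as doing above.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ScoreSet:
    # Simplified stand-in for the real model; only the fields needed here.
    id: int
    private: bool
    superseded_score_set: Optional["ScoreSet"] = None


def find_readable_tail(
    score_set: Optional[ScoreSet], can_read: Callable[[ScoreSet], bool]
) -> Optional[ScoreSet]:
    # Hypothetical approximation of find_superseded_score_set_tail:
    # walk backward through the chain until a readable score set is found.
    current = score_set
    while current is not None and not can_read(current):
        current = current.superseded_score_set
    return current


def anonymous_can_read(score_set: ScoreSet) -> bool:
    # Assumption for this sketch: anonymous users can read only published sets.
    return not score_set.private


a = ScoreSet(id=1, private=False)
b = ScoreSet(id=2, private=True, superseded_score_set=a)
c = ScoreSet(id=3, private=True, superseded_score_set=a)

# B and C are the chain heads: nothing supersedes them, so both are queried.
heads = [b, c]
tails = [find_readable_tail(head, anonymous_can_read) for head in heads]

# Collected without deduplication, as the endpoint is described as doing.
filtered_score_sets = [tail for tail in tails if tail is not None]
duplicate_ids = [score_set.id for score_set in filtered_score_sets]
print(duplicate_ids)
```

Both heads walk back to A, so A's ID ends up in the list once per head.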
Proposed Fix
- Deduplicate filtered_score_sets by score set ID after the tail-finding step.
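A minimal sketch of what the deduplication might look like, assuming each item exposes an `id` attribute and that the original order should be preserved. `dedupe_by_id` is a hypothetical helper name, and the `SimpleNamespace` objects stand in for real score sets.

```python
from types import SimpleNamespace


def dedupe_by_id(score_sets):
    # Keep the first occurrence of each ID and preserve the input order.
    seen_ids = set()
    unique = []
    for score_set in score_sets:
        if score_set.id not in seen_ids:
            seen_ids.add(score_set.id)
            unique.append(score_set)
    return unique


# Stand-ins for the tail-finding results: A (id=1) was reached from two heads.
a = SimpleNamespace(id=1)
d = SimpleNamespace(id=4)
filtered_score_sets = [a, d, a]

deduplicated = dedupe_by_id(filtered_score_sets)
deduplicated_ids = [score_set.id for score_set in deduplicated]
print(deduplicated_ids)
```

Keying on the ID rather than on object identity avoids relying on the ORM returning the same Python object for the same row, and keeping first-seen order leaves any ordering applied by the query intact.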