Summary
Introduce a lightweight calibration controls data model that stores the variant → clinical status pairs used as empirical ground truth when deriving a calibration's score thresholds. Surfacing these controls gives clinicians the evidence they need to assess whether a calibration is appropriate for their specific interpretation context.
Background
MaveDB calibrations define how functional scores map to clinical interpretations (e.g., scores above threshold X are pathogenic). These thresholds are derived from a set of variants with independently known clinical significance, the "controls." Today, this empirical basis is not stored in MaveDB, so clinicians cannot audit a calibration or gain confidence in it before using it for variant interpretation.
Design Decisions
| Decision | Choice | Rationale |
| --- | --- | --- |
| Scope of controls | Variants in the calibration's score set only | A control without a DMS score is meaningless for threshold derivation |
| Clinical status vocabulary | 2-tier ACMG: pathogenic or benign | Minimalist; excludes intermediate tiers (VUS, LP, LB) by design |
| Storage level | Flat list at calibration level | One pool of known controls; bin-level detail is not needed |
| PHI acknowledgment | controls_not_phi boolean stored on ScoreCalibration; gates publishing | Ensures submitter has explicitly affirmed no PHI before data goes public |
| Disease context | Optional disease field on ScoreCalibration | Disease context belongs at calibration level, not per-control (which would risk PHI via phenotype inference) |
| UI reusability | Shared CalibrationVariantAssignmentTable component parameterized by assignment options | Avoids duplicate UI for controls vs. class-based assignment; enables future inline editing of class assignments |
Out of Scope
Architecture
Data model (mavedb-api)
- New calibration_controls table: calibration_id (FK, cascade delete), variant_id (FK), clinical_status enum (pathogenic | benign), audit fields; UNIQUE on (calibration_id, variant_id)
- CalibrationControl ORM model at src/mavedb/models/calibration_control.py
- CalibrationControlStatus enum at src/mavedb/models/enums/calibration_control_status.py
- New columns on score_calibrations: disease VARCHAR NULL, controls_not_phi BOOLEAN NULL
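The constraints listed above can be tried out with a small standard-library sketch. It is illustrative only and is not the planned migration or ORM model: it uses an in-memory SQLite database, omits the audit fields (their shape is not specified here), stands in a one-column variants table, and replaces the database enum with a CHECK constraint. The helper names make_db and add_control are hypothetical.

```python
import enum
import sqlite3


class CalibrationControlStatus(str, enum.Enum):
    """Two-tier vocabulary from the design table."""
    PATHOGENIC = "pathogenic"
    BENIGN = "benign"


# Assumption: simplified stand-in schema; audit fields are omitted and
# the clinical_status enum is approximated with a CHECK constraint.
SCHEMA = """
CREATE TABLE score_calibrations (
    id INTEGER PRIMARY KEY,
    disease VARCHAR NULL,
    controls_not_phi BOOLEAN NULL
);
CREATE TABLE variants (id INTEGER PRIMARY KEY);
CREATE TABLE calibration_controls (
    id INTEGER PRIMARY KEY,
    calibration_id INTEGER NOT NULL
        REFERENCES score_calibrations(id) ON DELETE CASCADE,
    variant_id INTEGER NOT NULL REFERENCES variants(id),
    clinical_status TEXT NOT NULL
        CHECK (clinical_status IN ('pathogenic', 'benign')),
    UNIQUE (calibration_id, variant_id)
);
"""


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with the sketch schema (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys (and cascades) when this is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def add_control(
    conn: sqlite3.Connection,
    calibration_id: int,
    variant_id: int,
    status: CalibrationControlStatus,
) -> None:
    """Insert one control; a repeated (calibration, variant) pair raises IntegrityError."""
    conn.execute(
        "INSERT INTO calibration_controls (calibration_id, variant_id, clinical_status) "
        "VALUES (?, ?, ?)",
        (calibration_id, variant_id, status.value),
    )
```

The UNIQUE constraint means a variant can carry only one clinical status per calibration, and the cascade means deleting a calibration also removes its controls.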
API (mavedb-api)
- Pydantic models: CalibrationControlBase, CalibrationControlCreate, SavedCalibrationControl
- ScoreCalibration response extended with controls: list[SavedCalibrationControl], disease, controls_not_phi
- Validation: controls must reference variants in the calibration's score set
- Publishing gate: calibrations with controls require controls_not_phi = True
- Create/update endpoints accept controls via inline JSON array or controls_file CSV (replace semantics); providing both returns 422
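The three rules above (mutually exclusive inputs, score-set membership, and the publishing gate) can be sketched in plain Python. This is a minimal illustration under stated assumptions and not the Pydantic models or router code: the Control dataclass, the ControlsError exception, the function names, the CSV column names variant_id and clinical_status, and the duplicate check are all hypothetical.

```python
import csv
import io
from dataclasses import dataclass
from typing import Optional

ALLOWED_STATUSES = {"pathogenic", "benign"}


class ControlsError(ValueError):
    """Hypothetical error; an API layer could translate it into a 422 response."""


@dataclass(frozen=True)
class Control:
    variant_id: str       # assumption: variants identified by a string key
    clinical_status: str  # "pathogenic" or "benign"


def parse_controls_csv(text: str) -> list[Control]:
    """Parse CSV text with assumed columns variant_id and clinical_status."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        Control(row["variant_id"].strip(), row["clinical_status"].strip().lower())
        for row in reader
    ]


def resolve_controls(
    inline: Optional[list[Control]], controls_file: Optional[str]
) -> Optional[list[Control]]:
    """Accept inline controls or CSV text, but reject a request carrying both."""
    if inline is not None and controls_file is not None:
        raise ControlsError("provide controls inline or as controls_file, not both")
    if controls_file is not None:
        return parse_controls_csv(controls_file)
    return inline


def validate_controls(controls: list[Control], score_set_variant_ids: set[str]) -> None:
    """Check statuses, score-set membership, and (as an assumption) duplicates."""
    seen: set[str] = set()
    for control in controls:
        if control.clinical_status not in ALLOWED_STATUSES:
            raise ControlsError(f"unknown clinical status: {control.clinical_status}")
        if control.variant_id not in score_set_variant_ids:
            raise ControlsError(f"variant not in score set: {control.variant_id}")
        if control.variant_id in seen:
            raise ControlsError(f"duplicate control: {control.variant_id}")
        seen.add(control.variant_id)


def can_publish(controls: list[Control], controls_not_phi: Optional[bool]) -> bool:
    """A calibration with controls is publishable only if controls_not_phi is True."""
    return not controls or controls_not_phi is True
```

Under replace semantics, the list returned by resolve_controls would overwrite the calibration's existing controls rather than be merged with them. Because controls_not_phi is nullable, the gate treats both None and False as "not acknowledged".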
UI (mavedb-ui)
- New CalibrationVariantAssignmentTable component: reusable inline table with CSV upload, parameterized by assignment options
- Controls section in CalibrationFields.vue wired to draft.controls
- PHI acknowledgment checkbox shown when controls are present
- Controls table in calibration viewer (read-only mode of CalibrationVariantAssignmentTable)
Child Issues
Backend (mavedb-api)
Frontend (mavedb-ui)
Follow-up
Dependency Graph
#748 (enum)
└─→ #749 (schema + migration)
└─→ #750 (view models)
└─→ #751 (validation)
└─→ #752 (PHI gate)
└─→ #753 (router wiring)
└─→ mavedb-ui#679 (reusable component)
└─→ mavedb-ui#681 (controls editor wiring)
└─→ mavedb-ui#677 (PHI checkbox)
└─→ mavedb-ui#678 (controls viewer)