Issues with the current database implementation of within-subject experiments:
- “Exclude if reached” doesn’t work: for within-subject experiments (and only for those), monitored_decision_point is used to store enrollment data.
- ‘monitored_decision_point_log’ table records redundant and unused information every time /mark is called, making it the largest table in the production database with (currently) exactly 0 records that contain useful information.
- Nothing guarantees that the “stack” of conditions returned by /assign matches what is actually marked. In at least one scenario, where ‘monitored_decision_point_log’ holds entries for the decision point from before the current experiment started, the implementation operates on false information: it counts those older entries toward the ‘count’ of conditions that have already been marked in the current experiment.
- Within-subject enrollment data is spread across many different tables (including the ones above, which hold huge numbers of records), which makes querying within-subject experiment data unnecessarily complex and resource-intensive.
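To make the stale-count scenario concrete, here is a minimal sketch of the failure mode. It is not the actual implementation: the LogEntry shape, its field names, and both counting functions are hypothetical, and the timestamp filter is shown only to contrast with the unscoped count, not as the proposed fix.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class LogEntry:
    """Hypothetical stand-in for one row of the log table."""
    user_id: str
    site: str
    target: str
    created_at: datetime


def marked_count_unscoped(logs, user_id, site, target):
    # Counts every log entry for the decision point, including entries
    # written before the current experiment started.
    return sum(
        1 for e in logs
        if (e.user_id, e.site, e.target) == (user_id, site, target)
    )


def marked_count_scoped(logs, user_id, site, target, experiment_started_at):
    # Counts only entries written since the current experiment started.
    return sum(
        1 for e in logs
        if (e.user_id, e.site, e.target) == (user_id, site, target)
        and e.created_at >= experiment_started_at
    )


if __name__ == "__main__":
    started = datetime(2024, 3, 1)
    logs = [
        LogEntry("u1", "lesson", "quiz", datetime(2024, 1, 10)),  # stale
        LogEntry("u1", "lesson", "quiz", datetime(2024, 2, 5)),   # stale
        LogEntry("u1", "lesson", "quiz", datetime(2024, 3, 2)),   # current
    ]
    print(marked_count_unscoped(logs, "u1", "lesson", "quiz"))
    print(marked_count_scoped(logs, "u1", "lesson", "quiz", started))
```

With two stale entries and one current entry, the unscoped count reports three marks where only one belongs to the current experiment.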
Recommended architectural changes:
- Stop using monitored_decision_point_log (at least for within-subject enrollment information, which seems like its only use).
- Use monitored_decision_point only to support “Exclude if reached” functionality.
- Create a new table ‘repeated_enrollments’, which:
  - exists in a many-to-one relationship with ‘individual_enrollments’
  - uses the ‘uniquifier’ scheme previously employed to correlate ‘mark’ records with ‘log’ records, but makes the uniquifier non-nullable (as it is in the ‘log’ table)
  - references the ‘experiment_condition’ table, rather than adding ‘site’ and ‘target’ as arbitrary strings
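One possible shape for the proposed table, sketched with Python's built-in sqlite3 module so it can be run as-is. SQLite is used only for the demonstration and says nothing about the production database. All column names, the trimmed-down parent tables, and the UNIQUE constraint on (individual_enrollment_id, uniquifier) are assumptions for illustration; only the table names, the many-to-one relationship, the non-nullable uniquifier, and the reference to ‘experiment_condition’ come from the list above.

```python
import sqlite3

SCHEMA = """
-- Trimmed-down stand-ins for the existing tables (columns are assumed).
CREATE TABLE experiment_condition (
    id TEXT PRIMARY KEY
);
CREATE TABLE individual_enrollments (
    id            TEXT PRIMARY KEY,
    user_id       TEXT NOT NULL,
    experiment_id TEXT NOT NULL
);

-- Proposed table: many rows per individual enrollment.
CREATE TABLE repeated_enrollments (
    id                       INTEGER PRIMARY KEY,
    individual_enrollment_id TEXT NOT NULL
        REFERENCES individual_enrollments(id),
    condition_id             TEXT NOT NULL
        REFERENCES experiment_condition(id),
    uniquifier               TEXT NOT NULL,
    -- Assumption: a uniquifier identifies one mark within an enrollment.
    UNIQUE (individual_enrollment_id, uniquifier)
);
"""


def open_db():
    """Create an in-memory database with the sketched schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


INSERT_REPEATED = (
    "INSERT INTO repeated_enrollments "
    "(individual_enrollment_id, condition_id, uniquifier) VALUES (?, ?, ?)"
)

if __name__ == "__main__":
    conn = open_db()
    conn.execute("INSERT INTO experiment_condition (id) VALUES ('cond-a')")
    conn.execute(
        "INSERT INTO individual_enrollments (id, user_id, experiment_id) "
        "VALUES ('ie-1', 'u1', 'exp-1')"
    )
    # Two marks for the same enrollment: the many-to-one relationship.
    conn.execute(INSERT_REPEATED, ("ie-1", "cond-a", "1"))
    conn.execute(INSERT_REPEATED, ("ie-1", "cond-a", "2"))
    try:
        conn.execute(INSERT_REPEATED, ("ie-1", "cond-a", None))
    except sqlite3.IntegrityError as exc:
        print("rejected:", exc)
```

Under this sketch, all within-subject enrollment data for a user can be read with a single join from ‘individual_enrollments’ to ‘repeated_enrollments’, and a row with a missing uniquifier or an unknown condition is rejected by the database rather than by application code.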