
Updates pk creation & adds merging to process log #69

Merged
sangeetabhatia03 merged 5 commits into main from prepare-redcap-id-log-update on May 15, 2025

Collaborator

@tristan-myles tristan-myles commented May 13, 2025

This PR combines two updates (sorry!)

  1. It adds to the process report the details of which article IDs were updated because they were extracted over multiple forms, and cleans up the CLI output related to form continuations.
  2. It updates the creation of the [table]_access_id primary keys (pks). These are used so that downstream orderly tasks are compatible with data extracted using REDCap.

Due to the format of the REDCap data, the only unique identifier is the record_id (which gets renamed to Article_ID). Previously, the [table]_access_id pks were created as auto-incrementing integers, on the assumption that the data were ordered by time. However, this assumption was incorrect. Consequently, if new data are extracted using an existing form, the [table]_access_id of existing rows may change. This is problematic because these ids are needed by the fixed double extraction files when merging them with the matching and single extraction files.

The solution is to assign an id based on the unique record id and the order of extraction of the data in REDCap, i.e. record_id_extraction_number. A potential flaw with this approach is that we will be unable to tell if someone deletes a record and then recreates a record in its place. However, this seems like an edge case.
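As a rough illustration of the idea (not the code in this PR), the sketch below builds a `record_id_extraction_number` style key in plain Python. The function name `assign_access_ids`, the `extraction_order` field, and the `model` table name are hypothetical and chosen only for the example. It assumes each row carries some value that reflects its extraction order within a record.

```python
from collections import defaultdict


def assign_access_ids(rows, table="model"):
    """Return copies of `rows` with a '<table>_access_id' key added.

    Hypothetical helper: assumes every row has a 'record_id' and an
    'extraction_order' value that orders rows within a record.
    """
    key = f"{table}_access_id"
    # Sort first so the numbering does not depend on the export order.
    ordered = sorted(rows, key=lambda r: (r["record_id"], r["extraction_order"]))
    counters = defaultdict(int)
    out = []
    for row in ordered:
        counters[row["record_id"]] += 1
        access_id = f"{row['record_id']}_{counters[row['record_id']]}"
        out.append({**row, key: access_id})
    return out


# First export: one extraction for record 12 and one for record 15.
first_export = [
    {"record_id": 12, "extraction_order": 1},
    {"record_id": 15, "extraction_order": 1},
]

# Later export: a new extraction for record 12 appears before record 15's row.
later_export = [
    {"record_id": 12, "extraction_order": 1},
    {"record_id": 12, "extraction_order": 2},
    {"record_id": 15, "extraction_order": 1},
]


def id_lookup(rows):
    """Map (record_id, extraction_order) to the assigned access id."""
    return {
        (r["record_id"], r["extraction_order"]): r["model_access_id"]
        for r in assign_access_ids(rows)
    }


ids_before = id_lookup(first_export)
ids_after = id_lookup(later_export)

# Rows that already existed keep the same id after the new extraction is added.
unchanged = all(ids_after[k] == v for k, v in ids_before.items())
```

With an auto-incrementing integer, the row for record 15 would move from 2 to 3 once the new extraction for record 12 is added. With the composite key it stays `15_1`.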

The ideal solution would be to assign a primary key to every outbreak, model, and parameter instance in REDCap and to use that instead of the [table]_access_id throughout the code base.

- PK is created using both the record ID and the article(/record) ID
- Assuming rows aren't deleted and new entries created in their place
  this should be unique. Ideally, a unique ID should be created in
  REDCap.
- The data are sorted prior to UUID creation
@sangeetabhatia03 sangeetabhatia03 merged commit aaba5e7 into main May 15, 2025
0 of 2 checks passed
@sangeetabhatia03 sangeetabhatia03 deleted the prepare-redcap-id-log-update branch May 15, 2025 19:43
2 participants