Feedback on PR from @evan-phelps (thanks!!):
Great question. We have currently done central linking of waveform files to OMOP data, but we expect sites to do more linkage locally as the project progresses. That statement is not referring to PPRL; that topic will likely need to be covered in a different SOP, as it would relate to matching patients across sites in addition to linking their data modes together.
Agreed. I'll soften the wording and emphasize that it's perfectly fine to decouple those identifiers.
The main concern is having identifier values clash once they reach the PROCEDURE_OCCURRENCE table. If, as you suggest in (2), we decouple the fileid values from the procedure_occurrence_id values, this range allocation is moot and the fileid values can be arbitrary as long as they're unique. But we do expect the data engineers to ensure that procedure_occurrence_id remains a proper primary key with no duplicate values after inserting data from the registry tables.
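To make the primary-key expectation concrete, here is a minimal sketch of a pre-insert check. The function name `find_id_clashes` and the idea of pulling both ID sets into memory are assumptions for illustration only; a site would more likely run the equivalent check in SQL against its own tables.

```python
from collections import Counter


def find_id_clashes(existing_ids, new_ids):
    """Hypothetical helper: report problems before inserting new rows.

    existing_ids: procedure_occurrence_id values already in the table.
    new_ids: candidate IDs derived from the registry tables.
    Returns (duplicates within new_ids, new IDs that already exist).
    """
    counts = Counter(new_ids)
    # IDs that appear more than once among the candidates themselves.
    internal_dupes = sorted(i for i, n in counts.items() if n > 1)
    # Candidate IDs that would collide with rows already present.
    existing = set(existing_ids)
    clashes = sorted(i for i in counts if i in existing)
    return internal_dupes, clashes


if __name__ == "__main__":
    dupes, clashes = find_id_clashes([1, 2, 3], [3, 4, 4])
    print("duplicates among new IDs:", dupes)
    print("clashes with existing IDs:", clashes)
```

If both lists come back empty, the insert keeps procedure_occurrence_id unique.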
Many sites are facing similar issues, and we have modified the OMOP CDM DDL on the central cloud accordingly to handle bigints. The 2B+ selection was arbitrary, and mostly stems from the OHDSI convention for custom concept id assignments. If you're already using bigints you can go wild with your ID selections :) It would just be useful to know what ranges you end up using so we can sort them out centrally.
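For sites deciding on ranges, a rough sketch of the arithmetic may help. The names `describe_range` and `ranges_overlap` are hypothetical, and the example ranges are made up; the only fixed facts are the 2,000,000,000 floor mentioned above and the signed 32-bit maximum of 2,147,483,647.

```python
INT32_MAX = 2**31 - 1            # 2,147,483,647: largest signed 32-bit integer
CUSTOM_ID_FLOOR = 2_000_000_000  # the "2B+" starting point discussed above


def describe_range(start, count):
    """Hypothetical helper: summarize an ID range of `count` values from `start`."""
    end = start + count - 1
    return {"start": start, "end": end, "needs_bigint": end > INT32_MAX}


def ranges_overlap(a, b):
    """True if two ranges produced by describe_range share any ID."""
    return a["start"] <= b["end"] and b["start"] <= a["end"]


if __name__ == "__main__":
    # Illustrative allocations only, not real site assignments.
    site_a = describe_range(CUSTOM_ID_FLOOR, 100_000_000)
    site_b = describe_range(CUSTOM_ID_FLOOR + 100_000_000, 100_000_000)
    print(site_a)
    print(site_b)
    print("overlap:", ranges_overlap(site_a, site_b))
```

Only about 147 million values fit between 2B and the 32-bit limit, which is one reason the bigint change matters once several sites share the space.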
Agreed. Will update wording accordingly.
This depends somewhat on the file format chosen, so it was intentionally vague. Now that WFDB looks like the winning format, I can provide guidance and examples.
Absolutely. I will add this important caveat. We need to establish a solid feedback loop here, though, in order to design cohort definitions dependent on multimodal data that apply across all sites.
Agreed. Will update accordingly.