Create matrix of project deliverables and the software packages/repos that support those deliverables #10

Open
jshoughtaling opened this issue Jun 6, 2024 · 1 comment
To eliminate ambiguity about which development efforts support which deliverables, we will catalog the deliverables, align each one with the packages that support it, and display the result on this site.

jshoughtaling commented Jun 6, 2024

@del42 - I created a collapsible table structure below to capture items per team and per major deliverable.

Let me know if you think this is a reasonable way to represent the data and I'll do it for the other teams and then publish it on the developer site.
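As a rough sketch of how the same matrix could be kept as data and rendered into collapsible tables for the developer site, the snippet below builds one aim from a list of rows. The `Subaim` class and the `render_aim` function are hypothetical names introduced for illustration, and the four column names are taken from the table header used below; nothing here reflects an existing tool in the project.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Subaim:
    """One row of the matrix; an empty string means 'not yet assigned'."""
    label: str
    involved: str = ""
    software: str = ""
    documentation: str = ""


def render_aim(title: str, rows: list[Subaim]) -> str:
    """Render one aim as a collapsible block wrapping a Markdown table."""
    lines = [
        "<details>",
        f"<summary>{escape(title)}</summary>",
        "",  # blank line so the table is parsed as Markdown inside <details>
        "| SUBAIM | INVOLVED | SOFTWARE | DOCUMENTATION |",
        "| --- | --- | --- | --- |",
    ]
    for r in rows:
        lines.append(
            f"| {r.label} | {r.involved} | {r.software} | {r.documentation} |"
        )
    lines += ["", "</details>"]
    return "\n".join(lines)


# Example row copied from aim 4 of the matrix in this issue.
aim4 = [
    Subaim("4.1) Prepare OMOP Data", "JARED",
           "container-apps [etl]", "Central Processing SOP"),
]
print(render_aim("4. Structured Data", aim4))
```

Keeping the rows as data would let the other teams' matrices reuse the same renderer, at the cost of maintaining a small script alongside the issue text.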

Data Acquisition

Expand Aims
1. Site startup
| SUBAIM | INVOLVED | SOFTWARE | DOCUMENTATION |
| --- | --- | --- | --- |
| 1.1) Project Readiness | | | |
| 1.2) Ensure IRB readiness | | | |
2. Cohort Sampling and Size Justification
| SUBAIM | INVOLVED | SOFTWARE | DOCUMENTATION |
| --- | --- | --- | --- |
| 2.1) Identify intellectual property concerns at each site | | | |
| 2.2) Ensure diversity in site data | | | |
| 2.3) Obtain population-representative inferences | JARED | container-apps [atlas] | |
3. Federation process to access data from all patients
| SUBAIM | INVOLVED | SOFTWARE | DOCUMENTATION |
| --- | --- | --- | --- |
| 3.1) Ensure federated process to identify cohorts and request data | JARED | container-apps [atlas] | |
| 3.2) Document federated processes, with tooling module | | | |
| 3.3) Define and prepare meta dataset | JARED | container-apps [atlas] | |
4. Structured Data
| SUBAIM | INVOLVED | SOFTWARE | DOCUMENTATION |
| --- | --- | --- | --- |
| 4.1) Prepare OMOP Data | JARED | container-apps [etl] | Central Processing SOP |
| 4.2) Prepare to query OMOP data | JARED | container-apps [etl] | |
5. Obtaining High-resolution physiological data
| SUBAIM | INVOLVED | SOFTWARE | DOCUMENTATION |
| --- | --- | --- | --- |
| 5.1) Integrate OMOP vocabulary for physiological data | JARED | chorus-mapping [vocab] | SOP |
| 5.2) Standardize signal processing in waveforms | BRIAN | chorus_waveform | |
| 5.3) Quality control | | | |
6. Collection and processing of image data
| SUBAIM | INVOLVED | SOFTWARE | DOCUMENTATION |
| --- | --- | --- | --- |
| 6.1) Ensure collection of image data at all sites | | | |
| 6.2) Ensure generation of metadata from images | | | |
| 6.3) Ensure collection of image data at all sites | | | |
| 6.4) Ensure linkage to other data domain | JARED | container-apps [registry] | MM SOP |
7. Collection and Processing of Clinical notes
| SUBAIM | INVOLVED | SOFTWARE | DOCUMENTATION |
| --- | --- | --- | --- |
| 7.1) Customize NLP algorithm | | | |
| 7.2) Validate the NLP tool using honest brokers | | | |
| 7.3) Unstructured EHR extraction | | | |
| 7.4) Implement NLP tool | | | |
8. Mining for socioeconomic status data
| SUBAIM | INVOLVED | SOFTWARE | DOCUMENTATION |
| --- | --- | --- | --- |
| 8.1) Identify common data SDoH elements from collective experience of sites | | | |
| 8.2) Identify and resolve discrepancies in performance of SDoH elements | | | |
9. Collection of contextual SDoH data
| SUBAIM | INVOLVED | SOFTWARE | DOCUMENTATION |
| --- | --- | --- | --- |
| 9.1) Review SDoH variables and data sources | | | |
| 9.2) Prepare geospatial crosswalk datasets | | | |
10. Linkage among EHR
| SUBAIM | INVOLVED | SOFTWARE | DOCUMENTATION |
| --- | --- | --- | --- |
| 10.0) Design and communicate linkage SOP | JARED | | MM SOP |
| 10.1) Retrieve EHR data through MRNs at each site | JARED | container-apps [registry] | MM SOP |
| 10.2) Minimal EHR dataset elements created and communicated to all sites | | | |
| 10.3) EHR data extracted | JARED | | |
| 10.4) Site-specific EHR extracts validated | | | |
| 10.5) Implement tools to accurately link EHR and physiologic data | JARED | container-apps [registry] | |
| 10.6) Gap analysis of linkage performed | JARED | | |
11. Deidentification of data
| SUBAIM | INVOLVED | SOFTWARE | DOCUMENTATION |
| --- | --- | --- | --- |
| 11.1) Prepare safe harboring approach | | | |
| 11.2) Apply SOP on safe harboring approach | | | |
| 11.3) Quality control of linked safe-harbor datasets | | | |
12. Event Annotation
| SUBAIM | INVOLVED | SOFTWARE | DOCUMENTATION |
| --- | --- | --- | --- |
| 12.1) Develop semi-automatic predictive monitoring model to identify events | | | |
| 12.2) Individual chart review of potential events by clinicians | | | |
| 12.3) Develop phenotype algorithm in OHDSI Phenotype repository | JARED | container-apps [atlas] | |
| 12.4) Implement phenotype algorithm in OHDSI Phenotype repository | JARED | container-apps [atlas] | |
| 12.5) Generate silver-standard labels at scale automatically | JARED | | |
| 12.6) Implement annotation pipeline to combine physiological data for resolved timeline | | | |
| 12.7) Store results both locally and centrally | | | |
| 12.8) Create datasheet describing each cohort | JARED | container-apps [atlas], CHoRUSReports | |
13. Site-specific metadata
| SUBAIM | INVOLVED | SOFTWARE | DOCUMENTATION |
| --- | --- | --- | --- |
| 13.1) Encode de-identified hospital number on each data set | | | |
| 13.2) Report site-specific metadata from full patient dataset | | | |
14. Quality control
| SUBAIM | INVOLVED | SOFTWARE | DOCUMENTATION |
| --- | --- | --- | --- |
| 14.1) Write protocols for quality control checks at all sites | JARED | CHoRUSReports | Quality Central |
| 14.2) Quantify variability when it exists | JARED | CHoRUSReports | Quality Central |
| 14.3) Use the Data Quality Dashboard from OHDSI to evaluate datasets | JARED | container-apps [ares, www-dgs] | Quality Central |
15. Privacy Check
| SUBAIM | INVOLVED | SOFTWARE | DOCUMENTATION |
| --- | --- | --- | --- |
| 15.1) Examine unique identifiability with combinations of attributes to check potential risk of linkability | | | |
| 15.2) Evaluate l-diversity and t-closeness | | | |
| 15.3) For unstructured data, extract concepts through whitelist mechanism and test for linkability | | | |
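To make subaims 15.1 and 15.2 concrete, here is a minimal sketch of two of the checks on a toy table: the smallest equivalence-class size over a set of quasi-identifiers (a value of 1 means some record is uniquely identifiable) and distinct l-diversity of a sensitive attribute. The column names and records are invented for illustration, the function names are hypothetical, and t-closeness is not covered here.

```python
from collections import defaultdict


def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same values on every quasi-identifier."""
    groups = defaultdict(list)
    for rec in records:
        groups[tuple(rec[q] for q in quasi_identifiers)].append(rec)
    return list(groups.values())


def k_anonymity(records, quasi_identifiers):
    """Size of the smallest equivalence class; 1 means a unique record exists."""
    return min(len(g) for g in equivalence_classes(records, quasi_identifiers))


def distinct_l_diversity(records, quasi_identifiers, sensitive):
    """Fewest distinct sensitive values found in any equivalence class."""
    return min(
        len({rec[sensitive] for rec in g})
        for g in equivalence_classes(records, quasi_identifiers)
    )


# Toy records (hypothetical, not project data).
toy = [
    {"age_band": "60-69", "zip3": "021", "diagnosis": "sepsis"},
    {"age_band": "60-69", "zip3": "021", "diagnosis": "stroke"},
    {"age_band": "70-79", "zip3": "021", "diagnosis": "sepsis"},
    {"age_band": "70-79", "zip3": "021", "diagnosis": "sepsis"},
]
qi = ["age_band", "zip3"]
print(k_anonymity(toy, qi))                        # 2
print(distinct_l_diversity(toy, qi, "diagnosis"))  # 1
```

In the toy data every combination of quasi-identifiers occurs twice, so no record is unique, yet the 70-79 group carries a single diagnosis and therefore leaks it. That gap is why l-diversity is evaluated in addition to uniqueness.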
16. Generate synthetic data
| SUBAIM | INVOLVED | SOFTWARE | DOCUMENTATION |
| --- | --- | --- | --- |
| 16.1) Use generative adversarial network (GAN) approaches to produce site-specific synthetic data sets | | | |
| 16.2) Post the data and the source code | | | |
17. Hold-out validation dataset
| SUBAIM | INVOLVED | SOFTWARE | DOCUMENTATION |
| --- | --- | --- | --- |
| 17.1) Set aside 20% of patients for algorithm testing | | | |
| 17.2) Pursue qualification of reserved dataset for FDA's Medical Device Development Tool program | | | |
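One possible way to reserve the 20% of patients in subaim 17.1 is a deterministic, patient-level assignment based on a salted hash, sketched below. This is an assumption about how the split could be done, not a description of the project's method; the `is_holdout` name, the salt value, and the ID format are hypothetical. A hash-based split yields approximately 20% rather than an exact count.

```python
import hashlib


def is_holdout(patient_id: str, fraction: float = 0.20,
               salt: str = "holdout-v1") -> bool:
    """Assign a patient to the hold-out set based on a salted hash of the ID.

    The same patient always lands on the same side of the split, so all of a
    patient's records stay together and reruns do not reshuffle the cohort.
    """
    digest = hashlib.sha256(f"{salt}:{patient_id}".encode("utf-8")).digest()
    # Map the first 8 bytes of the digest to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < fraction


# Hypothetical patient IDs, for illustration only.
ids = [f"patient-{i}" for i in range(10_000)]
held_out = [pid for pid in ids if is_holdout(pid)]
print(len(held_out) / len(ids))  # close to 0.20, not exactly 0.20
```

Splitting by patient rather than by record avoids leaking one patient's data across the training and test sets; if an exact 20% is required, a seeded shuffle of the full patient list would be the alternative.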
18. Data storage platform - CHoRUS
| SUBAIM | INVOLVED | SOFTWARE | DOCUMENTATION |
| --- | --- | --- | --- |
| 18.1) Organize files by year/month, then Subject ID, data modalities | JARED | container-apps [etl] | Central Processing SOP |
| 18.2) Store dataset in site-specific staging area | JARED | container-apps [etl] | Central Processing SOP |
| 18.3) Implement automated data integrity check; review error logs | JARED | CHoRUSReports | Quality Central |
| 18.4) Copy files to data lake using secure FTP | JARED | container-apps [etl] | Central Processing SOP |
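The layout in 18.1 and the integrity check in 18.3 can be illustrated with a small sketch. It assumes that year and month are separate nested directories, that the site name is the top-level staging folder, and that integrity is checked by comparing SHA-256 checksums; none of these details are confirmed by the Central Processing SOP, and all function names, site names, and file names are hypothetical.

```python
import hashlib
from datetime import date
from pathlib import PurePosixPath


def staging_path(site: str, recorded: date, subject_id: str,
                 modality: str, filename: str) -> PurePosixPath:
    """Build site/year/month/subject/modality/filename (assumed layout)."""
    return PurePosixPath(
        site, f"{recorded:%Y}", f"{recorded:%m}", subject_id, modality, filename
    )


def sha256_hex(data: bytes) -> str:
    """Checksum recorded when a file is written to the staging area."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, expected_hex: str) -> bool:
    """Compare the checksum after transfer against the recorded one."""
    return sha256_hex(data) == expected_hex


path = staging_path("site-a", date(2024, 6, 6), "S0001", "waveform", "rec.h5")
print(path)  # site-a/2024/06/S0001/waveform/rec.h5

payload = b"example bytes"
recorded_checksum = sha256_hex(payload)
print(is_intact(payload, recorded_checksum))         # True
print(is_intact(payload + b"!", recorded_checksum))  # False
```

Files whose checksum does not match after the copy in 18.4 would be the ones written to the error log for review.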

Projects
Status: In Progress