Merge pull request #16 from chorus-ai/reorganization
Reorganize the site to reflect the SOP categories
jshoughtaling committed Apr 22, 2024
2 parents 6fcfb75 + 9855fff commit 1171172
Showing 50 changed files with 438 additions and 304 deletions.
5 changes: 3 additions & 2 deletions README.md
@@ -26,6 +26,7 @@ The following SOPs are under development:
| Privacy | Xioqian | https://github.com/chorus-ai/Chorus_SOP/issues/12 |
| Common data elements | Tezcan | https://github.com/chorus-ai/Chorus_SOP/issues/13 |
| Safe Harboring Approach | Xioqian | https://github.com/chorus-ai/Chorus_SOP/issues/14 |
| OMOP Mapping | Polina | https://github.com/chorus-ai/Chorus_SOP/issues/15 |

## How to contribute to the SOP documentation

@@ -34,10 +35,10 @@ The following SOPs are under development:
## How to edit the contents of online documentation site

1. Navigate to the target page of online documentation site, and click `Edit this page` button at the bottom.
2. A github editing page will pop up for you to edit the contents, make your own edition and commit your changes to a new branch.
2. A GitHub editing page will open. Make your edits and commit your changes to a new branch.
3. Submit the pull request

if you would like to add more topics which are not existed, you can add your request to [`issues`](https://github.com/chorus-ai/data_acq_SOP/issues) as well.
If you would like to add topics that are not yet represented, please add your request to [`issues`](https://github.com/chorus-ai/Chorus_SOP/issues).

## Resources
- [MDX](https://mdxjs.com/)
2 changes: 2 additions & 0 deletions sop-website/.gitignore
@@ -18,3 +18,5 @@
npm-debug.log*
yarn-debug.log*
yarn-error.log*

/.idea
10 changes: 0 additions & 10 deletions sop-website/blog/2023-06-28/index.md

This file was deleted.

12 changes: 12 additions & 0 deletions sop-website/blog/2024-04-22/index.md
@@ -0,0 +1,12 @@
---
slug: welcome
title: Welcome
authors: [jared]
tags: [hello, docusaurus]
---

Hey - welcome to the new CHoRUS SOP site!

The purpose of this site is to aggregate all *validated* documentation related to Standards, Data Acquisition, and Tooling into a single location.

If you have any comments, questions or concerns, please feel free to draft a blog post or create an issue in the [GitHub repository](https://github.com/chorus-ai/Chorus_SOP)!
8 changes: 7 additions & 1 deletion sop-website/blog/authors.yml
@@ -3,4 +3,10 @@ chester:
name: Ziyuan Guan
title: System Manager and Data Analyst, Intelligent Critical Care Center, Department of Medicine, UFL
url: https://github.com/Chesterguan
image_url: https://github.com/Chesterguan.png
image_url: https://github.com/Chesterguan.png

jared:
name: Jared Houghtaling
title: Co-Investigator, Clinical and Translational Sciences Institute (CTSI), Tufts Medical Center
url: https://github.com/jshoughtaling
image_url: https://avatars.githubusercontent.com/u/67749079?s=400&u=98960d673f9340f2e005bee7e9a24affe798d14d&v=4
7 changes: 7 additions & 0 deletions sop-website/docs/Central-Processing/Central-Processing.mdx
@@ -0,0 +1,7 @@
---
title: Central Processing
id: Central Processing
description: An SOP for processing data deliveries on the central cloud
---


8 changes: 8 additions & 0 deletions sop-website/docs/Central-Processing/_category_.json
@@ -0,0 +1,8 @@
{
"label": "Central Processing",
"position": 1,
"link": {
"type": "generated-index",
"description": "Learn how your data is processed centrally"
}
}
@@ -0,0 +1,7 @@
---
title: Common Data Elements
id: Common Data Elements
description: An SOP for general conventions related to common data elements
---


8 changes: 8 additions & 0 deletions sop-website/docs/Common-Data-Elements/_category_.json
@@ -0,0 +1,8 @@
{
"label": "Common Data Elements",
"position": 2,
"link": {
"type": "generated-index",
"description": "Learn about handling common data elements"
}
}
Empty file.
Empty file.
Empty file.
Empty file.
9 changes: 0 additions & 9 deletions sop-website/docs/Data-Cohort/_category_.json

This file was deleted.

7 changes: 7 additions & 0 deletions sop-website/docs/Data-Quality/Data-Quality.mdx
@@ -0,0 +1,7 @@
---
title: Data Quality
id: Data Quality
description: An SOP for evaluating the quality of data extracts
---


8 changes: 8 additions & 0 deletions sop-website/docs/Data-Quality/_category_.json
@@ -0,0 +1,8 @@
{
"label": "Data Quality",
"position": 3,
"link": {
"type": "generated-index",
"description": "Learn about applying quality checks to your data"
}
}
5 changes: 5 additions & 0 deletions sop-website/docs/Data-Requests/Data-Requests.mdx
@@ -0,0 +1,5 @@
---
title: Data Requests
id: Data Requests
description: An SOP for elements related to data requests
---
8 changes: 8 additions & 0 deletions sop-website/docs/Data-Requests/_category_.json
@@ -0,0 +1,8 @@
{
"label": "Data Requests",
"position": 4,
"link": {
"type": "generated-index",
"description": "Learn about the process for submitting data requests"
}
}
Empty file.
8 changes: 0 additions & 8 deletions sop-website/docs/Data-Serving/_category_.json

This file was deleted.

212 changes: 0 additions & 212 deletions sop-website/docs/Data-Uploading/Broad DA SOP.mdx

This file was deleted.

8 changes: 0 additions & 8 deletions sop-website/docs/Data-Uploading/_category_.json

This file was deleted.

@@ -7,7 +7,7 @@ description: Uploading to Central Data Warehouse

# Uploading to Central Data Warehouse

[Download Data Uploading SOP doc](https://github.com/chorus-ai/data_acq_SOP/blob/main/sop-website/docs/Data-Uploading/Data%20Upload%20SOP.docx)
[Download Data Uploading SOP doc](https://github.com/chorus-ai/Chorus_SOP/blob/main/sop-website/docs/Data-Uploading/Data%20Upload%20SOP.docx)

## Purpose

8 changes: 8 additions & 0 deletions sop-website/docs/Data-Uploads/_category_.json
@@ -0,0 +1,8 @@
{
"label": "Data Uploading",
"position": 5,
"link": {
"type": "generated-index",
"description": "Learn how to upload your data into the central cloud"
}
}
Empty file.
8 changes: 0 additions & 8 deletions sop-website/docs/Featured-Tools/_category_.json

This file was deleted.

@@ -1,11 +1,10 @@
---
title: EHR data processing
id: ehr-data-processing
description: EHR data elements
title: Flowsheet Data
id: Flowsheet Data
description: An SOP for handling data from flowsheet orders
---



[High-Priority Flowsheet Data Elements](https://docs.google.com/spreadsheets/d/1Bu0wCXa8uJhWIY_CRp1bZG7ehy1pr-D3/edit?usp=sharing&ouid=102935042242647676321&rtpof=true&sd=true)

<div width="100%">
8 changes: 8 additions & 0 deletions sop-website/docs/Flowsheet-Data/_category_.json
@@ -0,0 +1,8 @@
{
"label": "Flowsheet Data",
"position": 6,
"link": {
"type": "generated-index",
"description": "Learn how to handle flowsheet data properly"
}
}
48 changes: 48 additions & 0 deletions sop-website/docs/Freetext-Data/Freetext-Data.mdx
@@ -0,0 +1,48 @@
---
title: Freetext Data
id: Freetext Data
description: An SOP for processing freetext information in clinical notes
---


**Purpose**

This document is for data-contributing sites to better understand CHoRUS obligations with respect to unstructured EHR data.


## **Unstructured EHR data**

As specified in the exploratory calls with sites, CHoRUS is requesting a tokenized version of the notes, not the notes or reports themselves. Here, “tokens” refers to clinical entities (e.g., diagnoses, procedures, medications) mentioned in the text, along with associated metadata (e.g., the document ID, the OMOP Athena concept code associated with the clinical entity, negation, certainty).

- Final deliverable

The expected final product is the table NOTE\_NLP of the OMOP5.4 Common Data Model.
The specific output should follow the [following standard](http://ohdsi.github.io/CommonDataModel/cdm54.html#NOTE_NLP).

Please note that although CHoRUS is not requiring the actual notes, only the NOTE table includes linkage information to the PERSON table. Sites therefore also have to produce a NOTE table that includes, at a minimum, the `person_id` and `note_id` fields, so that the NOTE\_NLP table can be linked to the PERSON table.
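
As a rough illustration of this linkage requirement, the Python sketch below checks that every NOTE\_NLP row points at a `note_id` that exists in the NOTE table. The class names, the helper `unlinked_note_nlp_rows`, and all sample values are hypothetical, and only a small subset of the OMOP 5.4 columns is modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Note:
    """Minimal NOTE row: just the two fields needed for linkage."""
    note_id: int
    person_id: int


@dataclass(frozen=True)
class NoteNlp:
    """Subset of NOTE_NLP columns; the real table has more fields."""
    note_nlp_id: int
    note_id: int
    note_nlp_concept_id: int
    lexical_variant: str
    term_exists: str


def unlinked_note_nlp_rows(notes, note_nlp_rows):
    """Return NOTE_NLP rows whose note_id has no matching NOTE row."""
    known_note_ids = {note.note_id for note in notes}
    return [row for row in note_nlp_rows if row.note_id not in known_note_ids]


# Hypothetical placeholder values, not real Athena concept IDs.
notes = [Note(note_id=1, person_id=100)]
note_nlp_rows = [
    NoteNlp(10, 1, 1001, "type 2 diabetes", "Y"),
    NoteNlp(11, 2, 1002, "aspirin", "Y"),  # note_id 2 is missing from NOTE
]
orphans = unlinked_note_nlp_rows(notes, note_nlp_rows)
```

Any row returned by such a check could not be traced back to a person, so it would need to be fixed before delivery.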

- Initial extraction

The initial extraction will be site-dependent but will follow the same general process. A list of patients will be generated from the structured EHR data. Each admission has corresponding admission and discharge dates. The initial step is to request or perform an extraction of all reports and notes generated during the visit defined by these dates. The types of notes and reports to extract are listed in the table below.

|Tier 1 (suggested OMOP concept-ID; LOINC code)|Tier 2|
| :- | :- |
|**Notes:**<p>- History and Physical/admission (3030023; 34117-2)</p><p>- Progress notes (3000735; 11506-3)</p><p>- Medical Consults\* (3020785;11488-4)</p><p>- OR, procedure reports (3018897;28570-0)</p><p>- Discharge summaries (3020091; 18842-5)</p><p></p>**Reports:**<p>- Radiology\* (40771183; 68604-8 – 45879817; LA16679-5)</p><p>- Cardiac echo\*\* (3018897; 28570-0)</p><p>- EEG\*\* (3018897; 28570-0)</p><p>- Surgical pathology/Cytology (3031451; 34819-3)</p><p></p>|**Notes:**<p>- Nursing assessments</p><p>- Social work</p><p>- Behavioral health</p><p>- PT</p><p>- OT</p><p>- Nutrition</p><p>- Pharmacy</p><p>- Wound care</p><p></p>**Reports:**<p>- PFTs</p><p>- EMGs</p><p>- Other</p><p></p><p></p>|

> \* Finer concept\_ids are defined in LOINC OMOP for consults from specific services. Feel free to use these finer concept\_ids.
>
> \*\* No specific OMOP concept\_ids exist for these types of reports. Your site might have adopted a different concept\_id. If so, please share it through discussions.
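
The extraction step described above can be sketched in Python as a filter over candidate notes: keep those dated within the admission/discharge window whose note concept is in Tier 1. This is only a sketch under assumptions: the dictionary keys, the function `notes_for_visit`, and the choice to store the suggested concept ID in `note_class_concept_id` are illustrative, not CHoRUS requirements.

```python
from datetime import date

# Tier 1 note concept_ids copied from the table above. Radiology is left out
# here because the table lists a pair of concepts for it.
TIER1_NOTE_CONCEPT_IDS = {3030023, 3000735, 3020785, 3018897, 3020091, 3031451}


def notes_for_visit(notes, admission_date, discharge_date,
                    allowed_concept_ids=TIER1_NOTE_CONCEPT_IDS):
    """Keep notes written during the visit whose concept is in the allowed set."""
    return [
        note for note in notes
        if admission_date <= note["note_date"] <= discharge_date
        and note["note_class_concept_id"] in allowed_concept_ids
    ]


# Hypothetical sample data.
notes = [
    {"note_id": 1, "note_date": date(2024, 2, 1), "note_class_concept_id": 3030023},
    {"note_id": 2, "note_date": date(2024, 2, 3), "note_class_concept_id": 3000735},
    {"note_id": 3, "note_date": date(2024, 2, 20), "note_class_concept_id": 3020091},
    {"note_id": 4, "note_date": date(2024, 2, 2), "note_class_concept_id": 0},
]
selected = notes_for_visit(notes, date(2024, 2, 1), date(2024, 2, 5))
```

Here note 3 falls outside the visit window and note 4 has no Tier 1 concept, so neither is selected.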

- De-identification (Optional)

The second step in the process is to produce a de-identified version of the UEHR data. Whether this step is necessary is dictated by local data governance rules and is therefore site-dependent. The process of de-identification may be transparent to investigators, as institutional data scientists may already provide a de-identified version of all UEHR data. More likely, though, it will be the investigator's task to produce this de-identified version. De-identification is the process of producing a version of the note devoid of PHI. If elements of dates are preserved, the result is a “limited” dataset. If elements of dates are not preserved, but dates are time-shifted, the resulting dataset is “safe harbor”. This is a general definition of those terms and is relevant to all data, not only UEHR.
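
To make the idea of time-shifted dates concrete, here is a minimal Python sketch that shifts every date for a person by the same per-person offset, so intervals between that person's events are preserved. The function name `shift_dates` and the offsets are hypothetical. How offsets are generated and stored is a local governance decision, and date shifting alone does not remove other PHI.

```python
from datetime import date, timedelta


def shift_dates(events, offset_days_by_person):
    """Shift each (person_id, date) event by that person's fixed day offset."""
    return [
        (person_id, event_date + timedelta(days=offset_days_by_person[person_id]))
        for person_id, event_date in events
    ]


# Hypothetical events and offsets.
events = [
    (1, date(2024, 1, 10)),
    (1, date(2024, 1, 15)),
    (2, date(2024, 3, 1)),
]
offsets = {1: -30, 2: 12}
shifted = shift_dates(events, offsets)
```

Because both events for person 1 move by the same offset, the five-day gap between them is unchanged after shifting.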

Investigators are encouraged to find out whether there is an institution-favored method of de-identifying text, or an institutional policy on verifying the results of text de-identification. We note that CHoRUS is not requesting de-identified text. Free software to de-identify text is available. Examples include [NLM Scrubber](https://lhncbc.nlm.nih.gov/scrubber/), [UCSF’s Philter](https://github.com/BCHSI/philter-ucsf), and [Microsoft’s Presidio](https://microsoft.github.io/presidio/). As these approaches are computational/algorithmic, additional evaluation is required. Such an evaluation would need a gold standard annotation of a subset of notes (the i2b2 de-identification task would be a good reference for such an annotation task) and a comparison of results. In the absence of a specific institutional policy, it is generally recommended that at least 25 notes of each type be manually reviewed to verify the results of the de-identification method.
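
One possible way to draw the manual-review sample mentioned above is sketched below in Python: group note IDs by note type and randomly pick up to 25 from each group. The function `sample_for_review`, its parameters, and the sample data are hypothetical. A fixed seed is used only so the draw can be reproduced.

```python
import random
from collections import defaultdict


def sample_for_review(notes, per_type=25, seed=0):
    """Pick up to `per_type` note IDs per note type for manual review."""
    ids_by_type = defaultdict(list)
    for note_id, note_type in notes:
        ids_by_type[note_type].append(note_id)
    rng = random.Random(seed)
    return {
        note_type: sorted(rng.sample(ids, min(per_type, len(ids))))
        for note_type, ids in ids_by_type.items()
    }


# Hypothetical data: 30 progress notes and 10 discharge summaries.
notes = [(i, "progress") for i in range(30)]
notes += [(100 + i, "discharge") for i in range(10)]
review_sample = sample_for_review(notes)
```

Note types with fewer than 25 notes are reviewed in full, which is the conservative reading of the recommendation.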

At this stage, it is important to become familiar with the OMOP NOTE and NOTE\_NLP table structures. At the very least, every de-identified note will require a unique “note\_id” and must be linked with a date, a person, and an occurrence, as well as a note type and possibly subtypes. Thus, in the process of de-identification, enough information must be preserved to link each note with the other data domains.

Importantly, the basic contract that data generation sites executed with CHoRUS specified that a limited dataset, with elements of dates preserved, should be delivered to CHoRUS. However, many institutions will not allow this. Please communicate to CHoRUS what your contract specifies and whether you will be delivering a limited or a safe harbor dataset. See additional information on this topic under the structured EHR rubric.

- Producing standard concepts (tokens) from notes – the OHNLP pipeline

The ultimate UEHR data deliverable to CHoRUS is an OMOP NOTE\_NLP table, which, for each note\_id, may contain a large number of concepts. There is an excellent series of office hours presented by Andrew Wen on the installation of the [OHNLP software](https://github.com/chorus-ai/OHNLP4CHoRUS) and the verification of its performance. You are strongly encouraged to review these lectures carefully, as well as the provided examples.
8 changes: 8 additions & 0 deletions sop-website/docs/Freetext-Data/_category_.json
@@ -0,0 +1,8 @@
{
"label": "Freetext Data",
"position": 7,
"link": {
"type": "generated-index",
"description": "Learn how to handle freetext data properly"
}
}
68 changes: 68 additions & 0 deletions sop-website/docs/Imaging-Data/Imaging-Data.mdx
@@ -0,0 +1,68 @@
---
title: Imaging Data
id: Imaging Data
description: An SOP for processing medical images
---


**Purpose**

This document is for data-contributing sites to better understand CHoRUS obligations with respect to imaging data.

## **Imaging data**


### Data Processing Steps

1. Identify data extraction partners and methods (e.g., CTSI)
2. Identify the sources of the data (e.g., PACS, site research repository)
3. Determine which image modalities would need to be extracted, date range, and size
4. Confirm the type of data format you will receive (e.g., DICOM)
5. Confirm the type of IDs used to link this data to the patient (e.g., name) or other hospital ID (e.g., MRN)
6. Map relevant local metadata to CHoRUS metadata using the OMOP format
7. Identify relevant metadata not covered by CHoRUS
8. Data de-identification and crosswalk to PHI
9. Data quality review (e.g., data missingness)
10. Upload imaging data to the CHoRUS cloud
11. Upload all data tables and metadata files to your site-specific cloud enclave within the CHoRUS cloud environment.

### Imaging Metadata OMOP CDM

Files will be organized hierarchically by patient, modality, study\_UID, and series\_UID.
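
A small Python sketch of this hierarchy follows. It builds a directory path from the four levels in the stated order. The function name `series_dir`, the root folder, and the sample identifiers are hypothetical, and the exact folder naming expected by CHoRUS is not specified here.

```python
from pathlib import PurePosixPath


def series_dir(root, person_id, modality, study_uid, series_uid):
    """Build root/<patient>/<modality>/<study_UID>/<series_UID>."""
    return PurePosixPath(root) / str(person_id) / modality / study_uid / series_uid


# Hypothetical identifiers.
path = series_dir("images", 42, "CT", "1.2.3", "1.2.3.4")
```

With these inputs the resulting path is `images/42/CT/1.2.3/1.2.3.4`.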

- Data Summary by Modality

We are asking for a summary of radiology occurrence count and radiology image count by imaging modality (e.g., CT, MR). Within those groups, we would like to have a breakdown of manufacturers and body part imaged (e.g., brain, chest).
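
The requested summary could be computed along the lines of the Python sketch below. It assumes that one radiology occurrence corresponds to one distinct study UID within a modality. That assumption, the dictionary keys `manufacturer` and `body_part`, and the function `summarize_by_modality` are illustrative only.

```python
from collections import Counter


def summarize_by_modality(images):
    """Count occurrences (distinct studies) and images per modality,
    plus a modality / manufacturer / body part breakdown of images."""
    studies = {(img["modality"], img["image_study_UID"]) for img in images}
    occurrence_counts = Counter(modality for modality, _ in studies)
    image_counts = Counter(img["modality"] for img in images)
    breakdown = Counter(
        (img["modality"], img["manufacturer"], img["body_part"]) for img in images
    )
    return occurrence_counts, image_counts, breakdown


# Hypothetical image records.
images = [
    {"modality": "CT", "image_study_UID": "1.2.3", "manufacturer": "VendorA", "body_part": "chest"},
    {"modality": "CT", "image_study_UID": "1.2.3", "manufacturer": "VendorA", "body_part": "chest"},
    {"modality": "MR", "image_study_UID": "1.2.4", "manufacturer": "VendorB", "body_part": "brain"},
]
occurrences, image_totals, breakdown = summarize_by_modality(images)
```

In this sample the two CT images belong to a single study, so CT has one occurrence and two images.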

**Table X. Radiology Occurrence**

|Field|Required|Data type|Description|
| :- | :- | :- | :- |
|image\_occurrence\_id (PK)|Yes|integer|The unique key given to an imaging study record (often referred to as the accession number or imaging order number).|
|person\_id (FK)|Yes|integer|The person\_id of the Person for whom the procedure is recorded. This may be a system-generated code.|
|procedure\_occurrence\_id (FK)|Yes|integer|The unique key given to a procedure record for a person. Link to the Procedure\_occurrence table.|
|visit\_occurrence\_id (FK)|No|integer|The unique key given to the visit record for a person. Link to the Visit\_occurrence table.|
|anatomic\_site\_concept\_id (FK)|No|integer|Anatomical location of the imaging procedure by the medical acquisition device (gross anatomy). It maps the ANATOMIC\_SITE\_SOURCE\_VALUE to a Standard Concept in the Spec Anatomic Site domain. This should be coded at the lowest level of granularity.|
|wadors\_uri|No|varchar (max)|A Web Access to DICOM Objects via RESTful Web Services Uniform Resource Identifier at the study level.|
|local\_path|Yes|varchar (max)|Universal Naming Convention (UNC) path to the folder containing the image object file, accessed via a storage block access protocol (e.g., \\Server\Directory).|
|image\_occurrence\_date|Yes|date|The date the imaging procedure occurred.|
|image\_study\_UID|Yes|varchar (250)|DICOM Study UID|
|image\_series\_UID|Yes|varchar (250)|DICOM Series UID|
|modality|Yes|varchar (250)|DICOM-defined value (e.g., US, CT, MR, PT, DR, CR, NM)|
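
As a hedged illustration of how the "Required" column above might be enforced before upload, the Python sketch below lists the required fields from the table and reports which ones are missing or empty in a row. The function `missing_required_fields` and the sample row are hypothetical and are not part of any CHoRUS tooling.

```python
# Fields marked "Yes" in the Required column of the table above.
REQUIRED_FIELDS = (
    "image_occurrence_id",
    "person_id",
    "procedure_occurrence_id",
    "local_path",
    "image_occurrence_date",
    "image_study_UID",
    "image_series_UID",
    "modality",
)


def missing_required_fields(row):
    """Return the required field names that are absent, None, or empty in `row`."""
    return [name for name in REQUIRED_FIELDS if row.get(name) in (None, "")]


# Hypothetical row with two problems: no local_path and an empty modality.
row = {
    "image_occurrence_id": 1,
    "person_id": 100,
    "procedure_occurrence_id": 7,
    "image_occurrence_date": "2024-04-22",
    "image_study_UID": "1.2.3",
    "image_series_UID": "1.2.3.4",
    "modality": "",
}
problems = missing_required_fields(row)
```

A check like this only covers presence of values. Data types and foreign-key links would need separate validation.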

**Table X. Radiology Image**

|Field|Required|Data type|Description|
| :- | :- | :- | :- |
|image\_feature\_id (PK)|Yes|integer|The unique key given to an imaging feature.|
|person\_id (FK)|Yes|integer|The person\_id of the Person for whom the procedure is recorded. This may be a system-generated code.|
|image\_occurrence\_id (FK)|Yes|integer|The unique key of the Image\_occurrence table.|
|table\_concept\_id|Yes|integer|The concept\_id of the domain table in which the feature is stored (Measurement, Observation, etc.). This concept should be used with the table\_row\_id.|
|table\_row\_id|Yes|integer|The row\_id in the domain table where the feature is stored.|
|image\_feature\_concept\_id|Yes|integer|Concept\_id from a standard vocabulary, often a LOINC or RadLex code, for the image feature.|
|image\_feature\_type\_concept\_id|Yes|integer|This field can be used to determine the provenance of the imaging features (e.g., DICOM SR, algorithms used on images).|
|image\_finding\_concept\_id|No|integer|RadLex or other terms for groupings of image features (e.g., nodule).|
|image\_finding\_num|No|integer|Integer for linking related image features. It should not be interpreted as an order of clinical relevance.|
|anatomic\_site\_concept\_id|No|integer|This is the site on the body where the feature was found. It maps the ANATOMIC\_SITE\_SOURCE\_VALUE to a Standard Concept in the Spec Anatomic Site domain.|
|alg\_system|No|varchar (max)|URI of the algorithm that extracted the features, including version information.|
|alg\_datetime|No|datetime|The date and time of the algorithm processing.|
8 changes: 8 additions & 0 deletions sop-website/docs/Imaging-Data/_category_.json
@@ -0,0 +1,8 @@
{
"label": "Imaging Data",
"position": 8,
"link": {
"type": "generated-index",
"description": "Learn how to handle imaging data properly"
}
}
7 changes: 7 additions & 0 deletions sop-website/docs/OMOP-Mapping/OMOP-Mapping.mdx
@@ -0,0 +1,7 @@
---
title: OMOP Mapping
id: OMOP Mapping
description: An SOP for general conventions related to mapping data to OMOP CDM format
---


8 changes: 8 additions & 0 deletions sop-website/docs/OMOP-Mapping/_category_.json
@@ -0,0 +1,8 @@
{
"label": "OMOP Mapping",
"position": 9,
"link": {
"type": "generated-index",
"description": "Learn how to map terms to OMOP CDM format"
}
}
7 changes: 7 additions & 0 deletions sop-website/docs/Privacy/Privacy.mdx
@@ -0,0 +1,7 @@
---
title: Privacy
id: Privacy
description: An SOP for general conventions related to data privacy
---


8 changes: 8 additions & 0 deletions sop-website/docs/Privacy/_category_.json
@@ -0,0 +1,8 @@
{
"label": "Privacy",
"position": 10,
"link": {
"type": "generated-index",
"description": "Learn about best practices related to privacy"
}
}
@@ -0,0 +1,5 @@
---
title: Safe Harboring
id: Safe Harboring
description: An SOP for general conventions related to safe harboring
---
8 changes: 8 additions & 0 deletions sop-website/docs/Safe-Harboring-Approach/_category_.json
@@ -0,0 +1,8 @@
{
"label": "Safe Harboring Approach",
"position": 11,
"link": {
"type": "generated-index",
"description": "Learn about best practices related to safe harboring"
}
}