GitHub - FHIR-Aggregator/CDA2FHIR: Translating Cancer Data Commons (CDA) to FHIR (Fast Healthcare Interoperability Resources) format.

CDA2FHIR

Translating Cancer Data Commons (CDA) to 🔥 FHIR (Fast Healthcare Interoperability Resources) format.

Usage

Installation

From source

# clone repo & setup virtual env
python3 -m venv venv
. venv/bin/activate
pip install -e .

Transform to FHIR

Data

To run the transformer, make sure the raw CDA data is located in the ./data/raw/ directory. To obtain the raw data, please contact cancerdataaggregator @ gmail.
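Before running the transformer, it can help to confirm that the raw data is in place. The following is a minimal sketch: the helper name raw_data_ready is hypothetical and not part of cda2fhir, and it only checks that the directory exists and holds at least one file, not that the contents are valid CDA data.

```python
from pathlib import Path


def raw_data_ready(raw_dir="./data/raw"):
    """Return True if raw_dir exists and contains at least one regular file.

    Hypothetical helper, not part of cda2fhir; it does not inspect file contents.
    """
    path = Path(raw_dir)
    if not path.is_dir():
        return False
    # rglob("*") also walks subdirectories, in case the raw data is nested.
    return any(p.is_file() for p in path.rglob("*"))


if __name__ == "__main__":
    print("raw data found" if raw_data_ready() else "no raw data in ./data/raw")
```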

Usage: cda2fhir transform [OPTIONS]

Options:
  -s, --save               Save FHIR ndjson to CDA2FHIR/data/META folder.
                           [default: True]
  -v, --verbose
  -ns, --n_samples TEXT    Number of samples to randomly select - max 100.
  -nd, --n_diagnosis TEXT  Number of diagnosis to randomly select - max 100.
  -nf, --n_files TEXT      Number of files to randomly select - max 100.
  -f, --transform_files    Transform CDA files to FHIR DocumentReference and Group.
  -p, --path TEXT          Path to save the FHIR NDJSON files. default is
                           CDA2FHIR/data/META.
  --help                   Show this message and exit.

Example

cda2fhir transform
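As a sketch of how the options above combine, the helper below assembles a cda2fhir transform command line from the flags listed in the help text. The function name build_transform_command is hypothetical, and the range check assumes that the "max 100" limit in the help text is inclusive and that values start at 1.

```python
import shlex


def build_transform_command(n_samples=None, n_diagnosis=None, n_files=None,
                            transform_files=False, path=None, verbose=False):
    """Build an argument list for `cda2fhir transform` (hypothetical helper)."""
    cmd = ["cda2fhir", "transform"]
    for flag, value in (("--n_samples", n_samples),
                        ("--n_diagnosis", n_diagnosis),
                        ("--n_files", n_files)):
        if value is None:
            continue
        # Assumption: the documented "max 100" limit is inclusive, minimum is 1.
        if not 1 <= int(value) <= 100:
            raise ValueError(f"{flag} must be between 1 and 100, got {value}")
        cmd += [flag, str(value)]
    if transform_files:
        cmd.append("--transform_files")
    if path is not None:
        cmd += ["--path", str(path)]
    if verbose:
        cmd.append("--verbose")
    return cmd


if __name__ == "__main__":
    # Only prints the command; pass the list to subprocess.run to execute it.
    print(shlex.join(build_transform_command(n_samples=10, path="./out")))
```

Returning a list rather than a single string keeps the arguments safe to pass to subprocess.run without shell quoting.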

NOTE: If you want to validate your FHIR data with GEN3, first work through the GEN3 tracker user guide, setup, and documentation before running the cda2fhir commands.

FHIR data validation

Disable gen3-client

mv ~/.gen3/gen3_client_config.ini ~/.gen3/gen3_client_config.ini-xxx
mv ~/.gen3/gen3-client ~/.gen3/gen3-client-xxx

Run validate

time cda2fhir validate
{'summary': {'Specimen': 721837, 'Observation': 731005, 'ResearchStudy': 423, 'BodyStructure': 163, 'Condition': 95262, 'ResearchSubject': 160649, 'Patient': 138738}}

real    5m
user    5m
sys     0m5.1s

Restore gen3-client

mv ~/.gen3/gen3-client-xxx ~/.gen3/gen3-client
mv ~/.gen3/gen3_client_config.ini-xxx ~/.gen3/gen3_client_config.ini
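The disable and restore steps can also be paired so that the restore always happens, even if validation fails. The context manager below is a minimal sketch of that idea; the name temporarily_renamed and the default "-xxx" suffix are illustrative choices that mirror the mv commands above, not part of cda2fhir or gen3-client.

```python
import contextlib
import os


@contextlib.contextmanager
def temporarily_renamed(paths, suffix="-xxx"):
    """Rename each existing path to path + suffix, then restore on exit.

    Hypothetical helper mirroring the mv commands; missing paths are skipped.
    """
    moved = []
    try:
        for path in paths:
            if os.path.exists(path):
                os.rename(path, path + suffix)
                moved.append(path)
        yield moved
    finally:
        # Restore in reverse order, even if the wrapped code raised.
        for path in reversed(moved):
            os.rename(path + suffix, path)
```

One way to use it is to pass the expanded paths of ~/.gen3/gen3_client_config.ini and ~/.gen3/gen3-client, and run cda2fhir validate (for example through subprocess.run) inside the with block.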

The cda2fhir validate command validates your FHIR entities and the references between them. It also generates a summary count of all entities in each ndjson file.
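To work with the summary programmatically, the printed line can be parsed back into a dictionary. This is a sketch under one assumption: that the summary is printed as a Python dict literal of the form shown in the sample output above. The function name parse_summary is hypothetical.

```python
import ast


def parse_summary(line):
    """Parse a summary line like the one in the sample output.

    Assumption: the line is a Python dict literal shaped as
    {'summary': {resource_type: count, ...}}.
    """
    data = ast.literal_eval(line.strip())
    return data["summary"]


if __name__ == "__main__":
    sample = ("{'summary': {'Specimen': 721837, 'Observation': 731005, "
              "'ResearchStudy': 423, 'BodyStructure': 163, 'Condition': 95262, "
              "'ResearchSubject': 160649, 'Patient': 138738}}")
    counts = parse_summary(sample)
    # List resource types from most to least frequent, then the overall total.
    for resource_type, count in sorted(counts.items(), key=lambda kv: -kv[1]):
        print(f"{resource_type:16} {count:>8}")
    print(f"{'total':16} {sum(counts.values()):>8}")
```

ast.literal_eval is used instead of json.loads because the sample output uses single quotes, which are not valid JSON.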

NOTE: Because of the size of the current data, this process may take 5 minutes or more, depending on your platform and compute power.

Testing

The current integration tests run on all data and may take approximately 2 hours.

pytest --cov
