Fully connecting the Observational Health Data Science and Informatics (OHDSI) initiative with the world of linked open data
This work was conceptualized for, and mostly carried out at, the Biomedical Linked Annotation Hackathon 5 in Kashiwa, Japan.
We are very grateful for the support we received for this work.
There are three versions of this utility: OHDSI2RDF_dict.py, OHDSI2RDF.py and OHDSI2RDF_mp.py. The first uses a dictionary, the second is single-threaded, and the third uses multiprocessing. The second seems slower than the others, so try each one to see which performs best on your machine.
Assumptions: This program assumes that the OHDSI vocabulary CSV files have been extracted into the folder from which you run this code and that they keep their standard uppercase file names. It also assumes that you have the Ananke mappings in the standard CSV file provided.
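Before starting a long run, it can help to confirm that the expected files are in place. The following is a minimal sketch, not part of the utility itself. The file names in REQUIRED_FILES are an assumption derived from the Version 1.0 relationships (Vocabulary, Domain, Concept_class, Concept), and missing_vocabulary_files is a hypothetical helper name. The Ananke mapping file is left out because its name is not given here.

```python
from pathlib import Path

# Assumed file names, derived from the relationships covered in Version 1.0.
# Adjust this tuple if your vocabulary download uses different names.
REQUIRED_FILES = ("VOCABULARY.csv", "DOMAIN.csv", "CONCEPT_CLASS.csv", "CONCEPT.csv")


def missing_vocabulary_files(folder=".", required=REQUIRED_FILES):
    """Return the names in `required` that are not regular files in `folder`."""
    folder = Path(folder)
    return [name for name in required if not (folder / name).is_file()]
```

Calling missing_vocabulary_files() from the folder where you plan to run the program returns an empty list when every assumed file is present.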
How to run
python OHDSI2RDF_dict.py >> OHDSI2RDF.ttl
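The program writes its output to the screen (standard output), so be sure to redirect it to a file as shown above. Because >> appends to the file, remove any existing OHDSI2RDF.ttl before rerunning.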
NOTE: With enough RAM this runs in about 15 minutes
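To compare the three versions on your own machine, a small timing harness may be useful. The sketch below is only an illustration: run_and_capture and compare_versions are hypothetical helper names, and the output file names are assumptions. Unlike the >> redirection shown earlier, it overwrites the output file on each run.

```python
import subprocess
import sys
import time


def run_and_capture(command, output_path):
    """Run `command`, write its standard output to `output_path`, return elapsed seconds."""
    start = time.perf_counter()
    with open(output_path, "w", encoding="utf-8") as out:
        # check=True raises CalledProcessError if the command exits with an error.
        subprocess.run(command, stdout=out, check=True)
    return time.perf_counter() - start


def compare_versions(scripts=("OHDSI2RDF_dict.py", "OHDSI2RDF.py", "OHDSI2RDF_mp.py")):
    """Time each script in turn, writing e.g. OHDSI2RDF_dict.ttl next to it."""
    timings = {}
    for script in scripts:
        output_path = script.replace(".py", ".ttl")  # assumed naming scheme
        timings[script] = run_and_capture([sys.executable, script], output_path)
    return timings
```

Calling compare_versions() from the folder that holds the scripts and the vocabulary files returns a dictionary mapping each script name to its run time in seconds.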
Version 1.0: Relationships included: Vocabulary, Domain, Concept_class, Concept
- Add ancestor and Synonym relationships
- Improve CUI assigning from Ananke source
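To give a rough idea of what turning a Concept row into Turtle can look like, here is a minimal sketch. It is not the code used by the OHDSI2RDF scripts. The example.org namespace, the property IRIs and the helper names are all hypothetical. The column names follow the OMOP CONCEPT table, and the tab delimiter is an assumption about how the vocabulary files are formatted.

```python
import csv
import io
from urllib.parse import quote

BASE = "http://example.org/ohdsi/"  # hypothetical namespace, for illustration only
RDFS_LABEL = "<http://www.w3.org/2000/01/rdf-schema#label>"


def iri(kind, identifier):
    """Build an IRI, percent-encoding characters such as spaces in the identifier."""
    return f"<{BASE}{kind}/{quote(identifier, safe='')}>"


def escape_literal(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def concept_to_turtle(row):
    """Render one concept row (a dict) as a Turtle statement with four predicates."""
    label = escape_literal(row["concept_name"])
    return "\n".join([
        f'{iri("concept", row["concept_id"])} {RDFS_LABEL} "{label}" ;',
        f'    {iri("property", "vocabulary")} {iri("vocabulary", row["vocabulary_id"])} ;',
        f'    {iri("property", "domain")} {iri("domain", row["domain_id"])} ;',
        f'    {iri("property", "concept_class")} {iri("concept_class", row["concept_class_id"])} .',
    ])


def concepts_from_text(text):
    """Parse tab-delimited concept rows (assumed format) into dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE))
```

Each row links the concept to its vocabulary, domain and concept class, which mirrors the four relationship types listed for Version 1.0.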