Surfacing Semantic Data from Clinical Notes in Electronic Health Records for Tailored Care, Trial Recruitment and Clinical Research

SemEHR

Surfacing Semantic Data from Clinical Notes in Electronic Health Records for Tailored Care, Trial Recruitment and Clinical Research

Updates

  • (26 Feb 2018) An actionable transparency model has been implemented to derive a confidence/accuracy value for each annotation on a per-cohort basis. The value is based on the syntactic, semantic and contextual characteristics of the sentence and document that contain the annotation. (A working paper on this technique will be shared soon.)
  • (9 Feb 2018) Patient Phenome UI implemented to support populating the Genomics England (GeL) 100k phenome model for patients recruited to rare disease studies. HPO Phenome Model
  • (22 Dec 2017) An application paper describing SemEHR, titled “SemEHR: A General-purpose Semantic Search System to Surface Semantic Data from Clinical Notes for Tailored Care, Trial Recruitment and Clinical Research”, has been accepted by JAMIA.
  • (17 Nov 2017) Documentation for running the SemEHR pipeline and for API access is available in the wiki (https://github.com/CogStack/SemEHR/wiki).
  • (14 Sept 2017) An abstract describing the SemEHR toolkit has been accepted for presentation at the UK Public Health Science Conference 2017 and will be published by The Lancet.
  • (24 Apr 2017) A SemEHR instance has been deployed on MIMIC-III data on 3 VMs of KCL's Rosalind HPC cluster to facilitate research on these open ICU EHRs.
  • (19 Apr 2017) The SemEHR instance on SLaM's CRIS data is supporting the discovery of associations between liver diseases and addictions.
  • (21 Mar 2017) A SemEHR instance has been deployed at King's College Hospital to support patient recruitment for rare diseases in Genomics England's 100,000 Genomes Project.
  • (12 Oct 2016) SemEHR was initiated as an effort to use EU KConnect results to support research on anonymised EHR data of South London and Maudsley Hospital.

Intro

Built upon off-the-shelf toolkits, including a Natural Language Processing (NLP) pipeline (Bio-Yodie) and an enterprise search system (CogStack), SemEHR implements a generic information extraction (IE) and retrieval infrastructure that identifies contextualised mentions of a wide range of biomedical concepts in unstructured clinical notes. Its IE functionality features an adaptive and iterative NLP mechanism through which specific requirements and fine-tuning can be met on a per-study basis. NLP annotations are then assembled at patient level and extended with clinical and EHR-specific knowledge to populate a panorama for each patient, which comprises a) longitudinal semantic data views and b) structured medical profile(s). The semantic data is served via ontology-based search and analytics interfaces to facilitate clinical studies.
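The patient-level assembly step can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the `Mention` record, its field names and the `build_longitudinal_views` helper are hypothetical and do not reflect SemEHR's actual index schema or code. It only shows the idea of grouping document-level annotations by patient and ordering them in time.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Mention:
    """One contextualised concept mention (hypothetical shape, not SemEHR's schema)."""
    patient_id: str
    doc_id: str
    doc_date: date
    concept: str   # illustrative concept label
    context: str   # illustrative context label, e.g. "positive" or "historical"


def build_longitudinal_views(mentions):
    """Group document-level mentions by patient, ordered by document date."""
    views = defaultdict(list)
    for mention in mentions:
        views[mention.patient_id].append(mention)
    for patient_mentions in views.values():
        patient_mentions.sort(key=lambda m: m.doc_date)
    return dict(views)


# Toy data: two patients, with the mentions supplied out of date order.
example_mentions = [
    Mention("p1", "d2", date(2017, 3, 1), "hepatitis C", "positive"),
    Mention("p1", "d1", date(2016, 5, 9), "hepatitis C", "historical"),
    Mention("p2", "d3", date(2017, 1, 20), "hepatitis C", "negated"),
]
example_views = build_longitudinal_views(example_mentions)
for patient_id, timeline in example_views.items():
    print(patient_id, [(m.doc_date.isoformat(), m.context) for m in timeline])
```

A real deployment would read annotations from the search index rather than from an in-memory list; the list is used here only to keep the sketch self-contained.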

System Architecture

With SemEHR, the clinical variables hidden in clinical notes are surfaced via a set of search interfaces. Answering a clinical question (e.g. finding patients with hepatitis C), which previously might have required a dedicated NLP effort, becomes one or a few Google-style searches. For each search, SemEHR pulls out the cohort of relevant patients, populates patient-level summaries with the numbers of contextualised concept mentions (e.g. the 2nd patient has 16 mentions of the disease in total, of which 15 are positive and 1 is historical), and links each mention to its original clinical note.
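As a rough illustration of such a patient-level summary, the sketch below counts mentions of one concept per patient, broken down by context, and keeps the ids of the source notes. The dictionary keys (`patient_id`, `doc_id`, `concept`, `context`) and the `summarise_cohort` function are hypothetical names chosen for this example; they are not SemEHR's API, and the toy data simply mirrors the 16-mention example above.

```python
from collections import Counter


def summarise_cohort(mentions, concept):
    """Summarise mentions of `concept` per patient.

    Returns {patient_id: {"total": int, "by_context": Counter, "doc_ids": list}}.
    Patients without any mention of `concept` are left out of the cohort.
    """
    summaries = {}
    for m in mentions:
        if m["concept"] != concept:
            continue
        summary = summaries.setdefault(
            m["patient_id"],
            {"total": 0, "by_context": Counter(), "doc_ids": []},
        )
        summary["total"] += 1
        summary["by_context"][m["context"]] += 1
        summary["doc_ids"].append(m["doc_id"])  # link back to the source note
    return summaries


# Toy data (assumed): patient p2 has 15 positive mentions and 1 historical mention.
toy_mentions = (
    [{"patient_id": "p1", "doc_id": "n0", "concept": "hepatitis C", "context": "positive"}]
    + [{"patient_id": "p2", "doc_id": f"n{i}", "concept": "hepatitis C", "context": "positive"}
       for i in range(1, 16)]
    + [{"patient_id": "p2", "doc_id": "n16", "concept": "hepatitis C", "context": "historical"}]
    + [{"patient_id": "p3", "doc_id": "n17", "concept": "asthma", "context": "positive"}]
)

cohort = summarise_cohort(toy_mentions, "hepatitis C")
print(cohort["p2"]["total"], dict(cohort["p2"]["by_context"]))
```

Keeping the note ids alongside the counts is one simple way to let a user drill down from a summary figure to the text that produced it.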

Publications

SemEHR: surfacing semantic data from clinical notes in electronic health records for tailored care, trial recruitment, and clinical research. Honghan Wu, Giulia Toti, Katherine I Morley, Zina Ibrahim, Amos Folarin, Ismail Kartoglu, Richard Jackson, Asha Agrawal, Clive Stringer, Darren Gale, Genevieve M Gorrell, Angus Roberts, Matthew Broadbent, Robert Stewart, Richard J B Dobson. The Lancet, Volume 390, S97. 10.1016/S0140-6736(17)33032-5

SemEHR: A General-purpose Semantic Search System to Surface Semantic Data from Clinical Notes for Tailored Care, Trial Recruitment and Clinical Research. Honghan Wu, Giulia Toti, Katherine I Morley, Zina Ibrahim, Amos Folarin, Ismail Kartoglu, Richard Jackson, Asha Agrawal, Clive Stringer, Darren Gale, Genevieve M Gorrell, Angus Roberts, Matthew Broadbent, Robert Stewart, Richard J B Dobson. Journal of the American Medical Informatics Association, 2017. 10.1093/jamia/ocx160

Questions?

Email Honghan Wu (honghan.wu@gmail.com)