Surfacing Semantic Data from Clinical Notes in Electronic Health Records for Tailored Care, Trial Recruitment and Clinical Research
Type Name Latest commit message Commit time
.idea new Dec 1, 2016
UI updated ui Jan 24, 2019
analysis fixed a parameter issue Mar 26, 2019
docker using latest open jdk to replace oracle jdk May 20, 2019
index_settings Create semehr_es6_concept_mapping.txt Jul 3, 2018
onto_res Update NCBO_sparql_queries.MD Aug 15, 2018
resources sample medical records Dec 19, 2018
rnn_ann merged May 14, 2018
studies sepa Apr 24, 2019
testing new UI for KCH 100k use case Mar 10, 2017
trans_anns dbscan model for classification Sep 11, 2018
umls_api umls authentication Feb 28, 2018
LICENSE Create LICENSE Feb 26, 2019
README.md updated Dec 20, 2018
__init__.py umls cardic disorders May 7, 2018
ann_converter.py updated May 21, 2019
ann_post_rules.py autoimmune study version 3 Jan 8, 2019
ann_validation.py annotation validation data preprocessor May 22, 2017
autoimmune.py escape speical charactor Dec 13, 2016
cohort_helper.py ignore empty content cohort docs Apr 14, 2019
cohortanalysis.py support document medata data filtering for es based studies Apr 16, 2019
concept_mapping.py allow skip certain parent-sub concept relations to be ignored in comp… Jun 12, 2018
concept_processing_100k.py post rule implemented Apr 21, 2017
doc_set_analyzer.py Merge branch 'master' of https://github.com/KHP-Informatics/autoimmun… Nov 23, 2017
docanalysis.py name splitext Apr 29, 2019
entity_centric_es.py ignore ssl certificate verification Nov 12, 2018
gb_utils.py utils for phenome model Feb 10, 2018
gel_util.py export mapped study concepts to a flat json Jun 20, 2018
mimic_indexer.py changed setting folder Jan 19, 2018
mimicdao.py merged May 14, 2018
ontotextapi.py merged May 14, 2018
prospector.py fixed bugs Dec 1, 2016
pubmed_data.py merged May 14, 2018
semehr_processor.py remove a special char Apr 17, 2019
sqldbutils.py each process one connection Apr 16, 2019
study_analyzer.py support tsv file based study concept definition Apr 17, 2019
utils.py fixed a bug in multi-process function on large output files Apr 16, 2019

README.md

SemEHR

Surfacing Semantic Data from Clinical Notes in Electronic Health Records for Tailored Care, Trial Recruitment and Clinical Research

updates

  • (20 Dec 2018) SemEHR is now easy to run: a Docker version of SemEHR has been released. See the docker folder.
  • (26 Feb 2018) An actionable transparency model has been implemented to derive a confidence/accuracy value for each annotation on a cohort basis. This value is based on the syntactic, semantic and contextual characteristics of the sentence and document containing the annotation. (A working paper about this technique will be shared soon.)
  • (9 Feb 2018) Patient Phenome UI implemented to support populating the 100k Genomics England (GeL) phenome model for patients recruited for rare disease studies. See: HPO Phenome Model
  • (22 Dec 2017) An application paper describing SemEHR has been accepted by JAMIA, titled “SemEHR: A General-purpose Semantic Search System to Surface Semantic Data from Clinical Notes for Tailored Care, Trial Recruitment and Clinical Research”.
  • (17 Nov 2017) Documentation for running the SemEHR pipeline and for API access has been put in the wiki: https://github.com/CogStack/SemEHR/wiki.
  • (14 Sept 2017) An abstract describing the SemEHR toolkit has been accepted for presentation at the UK Public Health Science Conference 2017 and will be published by The Lancet.
  • (24 Apr 2017) A SemEHR instance has been deployed on MIMIC-III data on 3 VMs of KCL's Rosalind HPC cluster to facilitate research on these open ICU EHRs.
  • (19 Apr 2017) The instance of SemEHR on SLaM's CRIS data is supporting the discovery of associations between liver diseases and addictions.
  • (21 Mar 2017) A SemEHR instance has been deployed at King's College Hospital to support patient recruitment for rare diseases in Genomics England's 100,000 Genomes Project.
  • (12 Oct 2016) SemEHR was initiated as an effort to make use of EU KConnect results to support research on anonymised EHR data of South London and Maudsley Hospital.

Intro

Built upon off-the-shelf toolkits including a Natural Language Processing (NLP) pipeline (Bio-Yodie) and an enterprise search system (CogStack), SemEHR implements a generic information extraction (IE) and retrieval infrastructure by identifying contextualised mentions of a wide range of biomedical concepts from unstructured clinical notes. Its IE functionality features an adaptive and iterative NLP mechanism where specific requirements and fine-tuning can be fulfilled and realised on a study basis. NLP annotations are further assembled at patient level and extended with clinical and EHR-specific knowledge to populate a panorama for each patient, which comprises a) longitudinal semantic data views and b) structured medical profile(s). The semantic data is serviced via ontology-based search and analytics interfaces to facilitate clinical studies.
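The patient-level assembly step described above can be illustrated with a minimal sketch. This is not SemEHR's actual code: the `Annotation` fields, the context labels and the `longitudinal_view` helper are all hypothetical names chosen for illustration, and the concept identifier is a placeholder string rather than a real ontology code.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Annotation:
    """One contextualised concept mention (hypothetical schema, not SemEHR's)."""
    patient_id: str
    doc_id: str
    doc_date: date
    concept: str   # placeholder concept label; a real system would use an ontology ID
    context: str   # assumed labels, e.g. "positive", "negated", "historical"


def longitudinal_view(annotations):
    """Group document-level annotations by patient and concept, ordered by document date."""
    view = defaultdict(lambda: defaultdict(list))
    for ann in sorted(annotations, key=lambda a: a.doc_date):
        view[ann.patient_id][ann.concept].append(
            (ann.doc_date, ann.doc_id, ann.context)
        )
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {pid: dict(concepts) for pid, concepts in view.items()}


# Illustrative input: three annotations from three documents.
anns = [
    Annotation("p1", "doc2", date(2015, 6, 1), "hepatitis_c", "positive"),
    Annotation("p1", "doc1", date(2014, 1, 10), "hepatitis_c", "historical"),
    Annotation("p2", "doc3", date(2016, 3, 5), "hepatitis_c", "negated"),
]
view = longitudinal_view(anns)
for when, doc_id, context in view["p1"]["hepatitis_c"]:
    print(when.isoformat(), doc_id, context)
```

Keeping the document ID with every timeline entry means each point in a patient's view can still be traced back to the note it came from.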

System Architecture

With SemEHR, the clinical variables hidden in clinical notes are surfaced via a set of search interfaces. Answering a typical clinical question (e.g. finding patients with hepatitis C), which previously might have involved NLP, becomes one or a few Google-style searches. For each search, SemEHR pulls out the cohort of relevant patients, populates patient-level summaries with the numbers of contextualised concept mentions (e.g. the 2nd patient has 16 mentions of the disease in total, of which 15 are positive and 1 is historical), and links each mention to its original clinical note.
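To make the patient-level summary concrete, here is a small sketch of how per-patient mention counts could be derived from a flat list of mentions. It is an illustration under assumptions, not the SemEHR search API: the dictionary keys, the context labels and the `summarise_cohort` function are hypothetical, and the data is made up to mirror the 16/15/1 example above.

```python
from collections import Counter, defaultdict


def summarise_cohort(mentions, concept):
    """Count mentions of `concept` per patient, broken down by context.

    `mentions` is an iterable of dicts with the (assumed) keys
    patient_id, doc_id, concept and context.
    """
    summary = defaultdict(
        lambda: {"total": 0, "by_context": Counter(), "docs": set()}
    )
    for m in mentions:
        if m["concept"] != concept:
            continue  # only patients mentioning the searched concept join the cohort
        entry = summary[m["patient_id"]]
        entry["total"] += 1
        entry["by_context"][m["context"]] += 1
        entry["docs"].add(m["doc_id"])  # keep the link back to the source note
    return dict(summary)


# Made-up data: patient p2 has 15 positive and 1 historical mention.
mentions = (
    [
        {"patient_id": "p2", "doc_id": f"d{i}", "concept": "hepatitis_c", "context": "positive"}
        for i in range(15)
    ]
    + [{"patient_id": "p2", "doc_id": "d15", "concept": "hepatitis_c", "context": "historical"}]
    + [{"patient_id": "p1", "doc_id": "d99", "concept": "asthma", "context": "positive"}]
)

cohort = summarise_cohort(mentions, "hepatitis_c")
print(cohort["p2"]["total"], dict(cohort["p2"]["by_context"]))
```

Patient p1 is excluded because none of their mentions match the searched concept, which is the cohort-selection behaviour the paragraph describes.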

Publications

SemEHR: surfacing semantic data from clinical notes in electronic health records for tailored care, trial recruitment, and clinical research. Honghan Wu, Giulia Toti, Katherine I Morley, Zina Ibrahim, Amos Folarin, Ismail Kartoglu, Richard Jackson, Asha Agrawal, Clive Stringer, Darren Gale, Genevieve M Gorrell, Angus Roberts, Matthew Broadbent, Robert Stewart, Richard J B Dobson. The Lancet, Volume 390, S97. 10.1016/S0140-6736(17)33032-5

SemEHR: A General-purpose Semantic Search System to Surface Semantic Data from Clinical Notes for Tailored Care, Trial Recruitment and Clinical Research. Honghan Wu, Giulia Toti, Katherine I Morley, Zina Ibrahim, Amos Folarin, Ismail Kartoglu, Richard Jackson, Asha Agrawal, Clive Stringer, Darren Gale, Genevieve M Gorrell, Angus Roberts, Matthew Broadbent, Robert Stewart, Richard J B Dobson. Journal of the American Medical Informatics Association, 2017. 10.1093/jamia/ocx160

Questions?

Email Honghan Wu (honghan.wu@gmail.com)
