Data and system for family history extraction from a synthetic corpus of Norwegian clinical text. The paper describing this work, entitled "Iterative development of family history annotation guidelines using a synthetic corpus of clinical text", was presented at the LOUHI workshop, which was co-located with EMNLP 2018.
The co-authors of the paper are Pål Brekke, Øystein Nytrø, and Lilja Øvrelid. The work is funded by the BIGMED project.
Code and data for experiments
The results reported in the paper can be replicated by running the following commands.
Train and test the SVM with 5-fold cross-validation for entity recognition:
python3 svm_ner.py all_sentences.vert.parse.entity 5
Train and test the SVM with 5-fold cross-validation for relation extraction:
python3 uio2rel.py pal_annotate all_sentences.vert.parse 5
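
For readers unfamiliar with the evaluation scheme, the following is a minimal sketch of how 5-fold cross-validation partitions data, assuming the trailing 5 in the commands above is the number of folds. It is not taken from svm_ner.py or uio2rel.py; the function name k_fold_splits, the interleaved split, and the placeholder sentences are all assumptions made for illustration, and the actual scripts may split the data differently.

```python
from typing import Iterator, List, Sequence, Tuple


def k_fold_splits(items: Sequence[str], k: int) -> Iterator[Tuple[List[str], List[str]]]:
    """Yield (train, test) pairs; each item lands in exactly one test fold.

    Hypothetical helper for illustration only, not part of this repository.
    """
    if k < 2 or k > len(items):
        raise ValueError("k must be between 2 and the number of items")
    for fold in range(k):
        # Item i is held out in fold i % k (a simple interleaved split).
        test = [item for i, item in enumerate(items) if i % k == fold]
        train = [item for i, item in enumerate(items) if i % k != fold]
        yield train, test


if __name__ == "__main__":
    # Placeholder data standing in for the sentences of the corpus.
    sentences = [f"sentence_{i}" for i in range(10)]
    for n, (train, test) in enumerate(k_fold_splits(sentences, 5), start=1):
        print(f"fold {n}: {len(train)} train, {len(test)} test")
```

In each fold a model would be trained on the train portion and scored on the held-out test portion, and the scores would then be averaged over the five folds.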