Data and system for family history extraction from a synthetic corpus of Norwegian clinical text. The paper describing this work, "Iterative development of family history annotation guidelines using a synthetic corpus of clinical text", was presented at the LOUHI workshop, which was co-located with EMNLP 2018.
The co-authors of the paper are Pål Brekke, Øystein Nytrø, and Lilja Øvrelid. The work is funded by the BIGMED project.
Requirements:
- Scikit-learn
The results reported in the paper can be replicated with the commands below.
Train and test the SVM with 5-fold cross-validation for entity recognition:
python3 svm_ner.py all_sentences.vert.parse.entity 5
Train and test the SVM with 5-fold cross-validation for relation extraction:
python3 uio2rel.py pal_annotate all_sentences.vert.parse 5
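For readers unfamiliar with the evaluation setup, the following is a minimal sketch of 5-fold cross-validation of an SVM with scikit-learn. It is not the code in svm_ner.py or uio2rel.py: the toy data from make_classification, the linear kernel, the macro-F1 scoring, and the helper name five_fold_scores are all assumptions made for illustration. It also assumes that the trailing 5 in the commands above is the number of folds.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC


def five_fold_scores(X, y, n_folds=5, seed=0):
    """Return one macro-F1 score per fold (hypothetical helper)."""
    # Stratified folds keep the label distribution similar in every split.
    cv = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=seed)
    # A linear-kernel SVM; the kernel used in the paper's scripts is not stated here.
    clf = SVC(kernel="linear")
    # Train on n_folds - 1 folds and test on the held-out fold, once per fold.
    return cross_val_score(clf, X, y, cv=cv, scoring="f1_macro")


# Toy stand-in data; the real scripts read features from the corpus files.
X, y = make_classification(n_samples=200, n_features=20, random_state=0)
scores = five_fold_scores(X, y)
print("per-fold macro-F1:", scores)
print("mean macro-F1:", scores.mean())
```

Averaging the per-fold scores gives a single figure that is less sensitive to one particular train/test split, which matters for a small corpus.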