Annotation guidelines, data, and code for family history extraction from synthetic corpus of clinical text
Python
20_Paper_final.pdf
README.md
all_sentences.vert.entity
all_sentences.vert.parse
all_sentences.vert.parse.entity
interannotator_agreement.py
ner_predicted.txt
svm_ner.py
synthetic_data.zip
uio2rel.py


NorSynthClinical

Data and system for family history extraction from a synthetic corpus of Norwegian clinical text. The paper describing this work, entitled "Iterative development of family history annotation guidelines using a synthetic corpus of clinical text", was presented at the LOUHI workshop, which was co-located with EMNLP 2018.

The co-authors of the paper are Pål Brekke, Øystein Nytrø, and Lilja Øvrelid. The work is funded by the BIGMED project.

Requirements

  • Scikit-learn

Code and data for experiments

The results reported in the paper can be replicated with the commands below.

Train and test the SVM with 5-fold cross-validation for entity recognition:

  • python3 svm_ner.py all_sentences.vert.parse.entity 5

Train and test the SVM with 5-fold cross-validation for relation extraction:

  • python3 uio2rel.py pal_annotate all_sentences.vert.parse 5
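
Both commands above end with the argument 5, which appears to be the number of cross-validation folds. For readers unfamiliar with the procedure, the following is a minimal, standard-library-only sketch of how k-fold splitting works in general. It is not the code in svm_ner.py or uio2rel.py; the function name `k_fold_splits`, the interleaved fold assignment, and the placeholder sentences are all assumptions made for illustration.

```python
def k_fold_splits(items, k):
    """Yield (train, test) pairs so that each item is held out exactly once.

    Hypothetical helper for illustration; the repository scripts may
    split their data differently.
    """
    if not 2 <= k <= len(items):
        raise ValueError("k must be between 2 and the number of items")
    # Interleaved assignment: item i goes to fold i % k.
    folds = [items[i::k] for i in range(k)]
    for held_out in range(k):
        test = folds[held_out]
        # Training data is everything outside the held-out fold.
        train = [x for j, fold in enumerate(folds) if j != held_out for x in fold]
        yield train, test


# Placeholder data standing in for annotated sentences.
sentences = [f"sentence_{i}" for i in range(12)]
splits = list(k_fold_splits(sentences, 5))

for n, (train, test) in enumerate(splits, start=1):
    print(f"fold {n}: train={len(train)} test={len(test)}")
```

In each round a model would be trained on `train` and scored on `test`, and the five scores averaged. Splitting at the sentence level, as sketched here, keeps tokens from one sentence out of both sets at once; whether the scripts do the same is not stated in this README.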