Annotation guidelines, data, and code for family history extraction from synthetic corpus of clinical text

ltgoslo/NorSynthClinical

NorSynthClinical

Data and system for family history extraction from a synthetic corpus of Norwegian clinical text. The paper describing this work, "Iterative development of family history annotation guidelines using a synthetic corpus of clinical text", was presented at the LOUHI workshop, co-located with EMNLP 2018.

The co-authors of the paper are Pål Brekke, Øystein Nytrø, and Lilja Øvrelid. The work is funded by the BIGMED project.

Requirements

  • Scikit-learn

Code and data for experiments

The results reported in the paper can be replicated with the following commands.

Train and test an SVM with 5-fold cross-validation for entity recognition:

  • python3 svm_ner.py all_sentences.vert.parse.entity 5

Train and test an SVM with 5-fold cross-validation for relation extraction:

  • python3 uio2rel.py pal_annotate all_sentences.vert.parse 5
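
For readers unfamiliar with the evaluation setup, the following is a minimal sketch of 5-fold cross-validation of an SVM classifier with scikit-learn. It is not the code in `svm_ner.py` or `uio2rel.py`: the toy tokens, the label names, the `token_features` helper, and the choice of `LinearSVC` are all assumptions made for illustration. The trailing `5` in the commands above appears to be the number of folds, which corresponds to `cv=5` here.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy tokens with hypothetical labels (NOT taken from the NorSynthClinical corpus).
tokens = [
    ("mother", "FAMILY"), ("father", "FAMILY"), ("brother", "FAMILY"),
    ("sister", "FAMILY"), ("uncle", "FAMILY"),
    ("diabetes", "CONDITION"), ("asthma", "CONDITION"), ("cancer", "CONDITION"),
    ("stroke", "CONDITION"), ("epilepsy", "CONDITION"),
    ("the", "O"), ("has", "O"), ("and", "O"), ("was", "O"), ("with", "O"),
]


def token_features(word):
    """Hypothetical feature extractor: simple surface features of one token."""
    return {"lower": word.lower(), "suffix3": word[-3:], "length": len(word)}


X = [token_features(word) for word, _ in tokens]
y = [label for _, label in tokens]

# DictVectorizer one-hot encodes the string features; LinearSVC is a linear SVM.
model = make_pipeline(DictVectorizer(), LinearSVC())

# cv=5 splits the data into five stratified folds; each fold is held out once
# for testing while the model is trained on the remaining four.
scores = cross_val_score(model, X, y, cv=5)
print("accuracy per fold:", scores)
print("mean accuracy:", scores.mean())
```

Each of the five scores is the accuracy on one held-out fold, and their mean is the figure usually reported. The scores from this toy data say nothing about the results in the paper.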
