Sense-annotated corpora from the Semantic Processing Across Domains project. This project pools together the data from several articles related to sense annotation for Danish corpora.
This repository contains three main folders:
supersenses contains the all-words supersense-annotated corpus. The folder has two subfolders: official_distribution, with the files used for training and testing in the noted articles, and all_annotations, with all the annotations generated by each annotator prior to adjudication. The corpus is made up of six domains from the ClarinDK corpus plus the test section of the Danish Dependency Treebank (DDT).
lexicalsample contains the lexical-sample annotations for a regular, dictionary-based sense inventory and for a supersense-clustered inventory.
active_learning contains the annotations resulting from "Active Learning for Sense Annotation".
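The folder layout described above can be checked against a local copy with a short script. This is only a minimal sketch: the folder names come from the description in this README, while the helper name missing_folders and the assumption that the script runs from the root of a local clone are illustrative and not part of the repository.

```python
from pathlib import Path

# Folder names as described in this README. The two subfolders are
# assumed to sit directly under supersenses, as the description states.
EXPECTED_FOLDERS = [
    "supersenses/official_distribution",
    "supersenses/all_annotations",
    "lexicalsample",
    "active_learning",
]


def missing_folders(repo_root):
    """Return the expected folders that are absent under repo_root."""
    root = Path(repo_root)
    return [name for name in EXPECTED_FOLDERS if not (root / name).is_dir()]


if __name__ == "__main__":
    # "." assumes the current directory is the root of a local clone.
    for name in missing_folders("."):
        print("missing:", name)
```

The script prints nothing when all four folders are present. It makes no assumption about the file formats inside each folder.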
The following publications make use of this resource or document its construction.
@inproceedings{olsenetal2015,
title={Coarse-Grained Sense Annotation of Danish across Textual Domains},
author={Olsen, Sussi and Pedersen, Bolette Sandford and Mart{\'\i}nez Alonso, H{\'e}ctor and Johannsen, Anders},
booktitle={Proceedings of the workshop on Semantic resources and semantic annotation for Natural Language Processing and the Digital Humanities at NODALIDA},
pages={37},
year={2015}
}
@inproceedings{martinezalonsoetal2015supersenses,
title={Supersense tagging for Danish},
author={Mart{\'\i}nez Alonso, H{\'e}ctor and Johannsen, Anders and Olsen, Sussi and Nimb, Sanni and S{\o}rensen, Nicolai Hartvig and Braasch, Anna and S{\o}gaard, Anders and Pedersen, Bolette Sandford},
booktitle={Nordic Conference of Computational Linguistics NODALIDA 2015},
pages={21},
year={2015}
}
@inproceedings{martinezalonsoetal2016,
title={An empirically grounded expansion of the supersense inventory},
author={Mart{\'\i}nez Alonso, H{\'e}ctor and Johannsen, Anders and Olsen, Sussi and Nimb, Sanni and Pedersen, Bolette Sandford},
booktitle={Global Wordnet Conference 2016},
note={to appear},
year={2016}
}
@inproceedings{martinezalonsoetal2015active,
title={Active learning for sense annotation},
author={Mart{\'\i}nez Alonso, H{\'e}ctor and Plank, Barbara and Johannsen, Anders and S{\o}gaard, Anders},
booktitle={Nordic Conference of Computational Linguistics NODALIDA 2015},
pages={245},
year={2015}
}