Sense-annotated corpora from the Semantic Processing Across Domains project

coastalcph/semdax

SemDaX corpora

Sense-annotated corpora from the Semantic Processing Across Domains project. This repository pools the data from several articles on sense annotation of Danish corpora.

This repository contains three main folders:

  1. supersenses contains the all-words supersense-annotated corpus. It contains a folder official_distribution with the files used for training and testing in the articles listed below, and a folder all_annotations with all the annotations produced by each annotator prior to adjudication. The corpus is made up of six domains from the ClarinDK corpus plus the test section of the Danish Dependency Treebank (DDT).
  2. lexicalsample contains the lexical-sample annotations for a regular, dictionary-based sense inventory and for a supersense-clustered inventory.
  3. active_learning contains the annotations resulting from "Active Learning for Sense Annotation".
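
As a quick sanity check after cloning, the folder layout described above can be inspected with a short script. This is only a sketch: the folder names come from the list above, while the clone location (`semdax`) and the helper name `summarize_layout` are assumptions made for illustration. Nothing is assumed about the file formats inside the folders, so files are only counted.

```python
from pathlib import Path

# Folder names taken from the list above; nothing else about the layout is assumed.
EXPECTED = {
    "supersenses": ["official_distribution", "all_annotations"],
    "lexicalsample": [],
    "active_learning": [],
}


def summarize_layout(root):
    """Return {relative_folder: file count, or None if the folder is missing}."""
    root = Path(root)
    summary = {}
    for top, subfolders in EXPECTED.items():
        for rel in [top] + [f"{top}/{sub}" for sub in subfolders]:
            folder = root / rel
            if folder.is_dir():
                # Count regular files at any depth below this folder.
                summary[rel] = sum(1 for p in folder.rglob("*") if p.is_file())
            else:
                summary[rel] = None
    return summary


if __name__ == "__main__":
    # "semdax" is a hypothetical path to a local clone of the repository.
    for folder, count in summarize_layout("semdax").items():
        print(folder, "missing" if count is None else f"{count} files")
```

A folder reported as missing most likely means the script was pointed at the wrong directory rather than at the root of the clone.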

The following publications make use of this resource or document its construction.

@inproceedings{olsenetal2015,
  title={Coarse-Grained Sense Annotation of Danish across Textual Domains},
  author={Olsen, Sussi and Pedersen, Bolette Sandford and Mart{\'\i}nez Alonso, H{\'e}ctor and Johannsen, Anders},
  booktitle={Proceedings of the workshop on Semantic resources and semantic annotation for Natural Language Processing and the Digital Humanities at NODALIDA},
  pages={37},
  year={2015}
}

@inproceedings{martinezalonsoetal2015supersenses,
  title={Supersense tagging for Danish},
  author={Mart{\'\i}nez Alonso, H{\'e}ctor and Johannsen, Anders and Olsen, Sussi and Nimb, Sanni and S{\o}rensen, Nicolai Hartvig and Braasch, Anna and S{\o}gaard, Anders and Pedersen, Bolette Sandford},
  booktitle={Nordic Conference of Computational Linguistics NODALIDA 2015},
  pages={21},
  year={2015}
}

@inproceedings{martinezalonsoetal2016,
  title={An empirically grounded expansion of the supersense inventory},
  author={Mart{\'\i}nez Alonso, H{\'e}ctor and Johannsen, Anders and Olsen, Sussi and Nimb, Sanni and Pedersen, Bolette Sandford},
  booktitle={Global Wordnet Conference 2016},
  year={2016},
  note={to appear}
}

@inproceedings{martinezalonsoetal2015active,
  title={Active learning for sense annotation},
  author={Mart{\'\i}nez Alonso, H{\'e}ctor and Plank, Barbara and Johannsen, Anders and S{\o}gaard, Anders},
  booktitle={Nordic Conference of Computational Linguistics NODALIDA 2015},
  pages={245},
  year={2015}
}
