If you use these data, please cite:
- the original source
List, Johann-Mattis and Jelena Prokić. (2014). A benchmark database of phonetic alignments in historical linguistics and dialectology. In: Proceedings of the International Conference on Language Resources and Evaluation (LREC), 26-31 May 2014, Reykjavik. 288-294.
- the derived dataset, using the DOI of the specific released version you used
This dataset is licensed under a CC-BY-4.0 license.
Available online at https://zenodo.org/record/11880/files/germanic.zip
- Varieties: 538 (linked to 61 different Glottocodes)
- Concepts: 590 (linked to 297 different Concepticon concept sets)
- Lexemes: 50,095
- Sources: 11
- Synonymy: 1.06
- Cognacy: 50,095 cognates in 750 cognate sets (0 singletons)
- Cognate Diversity: 0.00
- Invalid lexemes: 0
- Tokens: 216,493
- Segments: 753 (2 BIPA errors, 2 CLTS sound class errors, 752 CLTS modified)
- Inventory size (avg): 51.73
Name | GitHub user | Description | Role |
---|---|---|---|
Johann-Mattis List | @LinguList | maintainer | Author |
Jelena Prokić | | DataCollector | Author |
The following CLDF datasets are available in cldf:
- CLDF Wordlist at cldf/cldf-metadata.json
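
As a rough illustration of how the wordlist might be inspected, the sketch below reads the metadata file with the Python standard library, locates the form table, and counts lexemes, varieties, and concepts. It assumes the form table is a plain, uncompressed, comma-separated CSV file stored next to the metadata, and that its columns carry the standard CLDF `propertyUrl` annotations. The helper names (`find_table`, `column_name`, `summarize_wordlist`) are hypothetical and not part of this dataset or of any library.

```python
import csv
import json
from pathlib import Path

# Base URI of the CLDF ontology terms used in the metadata.
CLDF_TERMS = "http://cldf.clld.org/v1.0/terms.rdf#"


def find_table(metadata, component):
    """Return the table description declared as the given CLDF component."""
    for table in metadata.get("tables", []):
        if table.get("dc:conformsTo") == CLDF_TERMS + component:
            return table
    raise KeyError(f"no table conforming to {component}")


def column_name(table, term):
    """Return the local column name annotated with the given CLDF term."""
    for column in table["tableSchema"]["columns"]:
        if column.get("propertyUrl") == CLDF_TERMS + term:
            return column["name"]
    raise KeyError(f"no column annotated with {term}")


def summarize_wordlist(metadata_path):
    """Count rows, distinct languages, and distinct concepts in the form table.

    Assumption: the table URL is a path relative to the metadata file and
    points to an uncompressed CSV file with a header row.
    """
    metadata_path = Path(metadata_path)
    metadata = json.loads(metadata_path.read_text(encoding="utf-8"))
    forms = find_table(metadata, "FormTable")
    language_col = column_name(forms, "languageReference")
    concept_col = column_name(forms, "parameterReference")

    varieties, concepts, lexemes = set(), set(), 0
    forms_path = metadata_path.parent / forms["url"]
    with forms_path.open(encoding="utf-8", newline="") as handle:
        for row in csv.DictReader(handle):
            lexemes += 1
            varieties.add(row[language_col])
            concepts.add(row[concept_col])
    return {
        "lexemes": lexemes,
        "varieties": len(varieties),
        "concepts": len(concepts),
    }
```

Calling `summarize_wordlist("cldf/cldf-metadata.json")` from the repository root should give counts comparable to the statistics listed above. They may differ slightly, because varieties and concepts are counted here only when at least one form refers to them.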