Lexibank Analysed
How to cite
If you use these data, please cite
- the original source
List, Johann-Mattis; Forkel, Robert; Greenhill, Simon J.; Rzymski, Christoph; Englisch, Johannes; and Russell D. Gray (2021): Lexibank: A publicly available repository of standardized lexical datasets with automatically computed phonological and lexical features for more than 2000 language varieties [Dataset, Version 1.0]. Geneva: Zenodo.
- the derived dataset, using the DOI of the particular released version you used
Description
This dataset is licensed under a CC-BY-4.0 license.
Available online at https://lexibank.clld.org
Statistics
- Varieties: 2,029
- Concepts: 3,033
- Lexemes: 709,638
- Sources: 75
- Synonymy: 1.14
- Invalid lexemes: 0
- Tokens: 3,857,425
- Segments: 1,486 (0 BIPA errors, 0 CLTS sound class errors, 1,478 CLTS modified)
- Inventory size (avg): 37.46
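
This README does not define the Synonymy figure. One plausible reading, given here as an assumption and not as the method that generated the number above, is the mean number of lexemes per concept within each variety, averaged over all varieties. The sketch below illustrates that reading on toy rows; `synonymy_index` is a hypothetical helper, and `Language_ID` and `Parameter_ID` are the standard CLDF FormTable column names.

```python
from collections import Counter, defaultdict


def synonymy_index(forms):
    """Average, over varieties, of the mean number of forms per concept.

    Assumed definition for illustration only; it may differ from the
    computation that produced the statistic reported in this README.
    """
    # Count the forms recorded for each concept within each variety.
    per_variety = defaultdict(Counter)
    for row in forms:
        per_variety[row["Language_ID"]][row["Parameter_ID"]] += 1
    if not per_variety:
        return 0.0
    # Forms per attested concept, computed separately for each variety.
    ratios = [sum(c.values()) / len(c) for c in per_variety.values()]
    return sum(ratios) / len(ratios)


# Toy rows (hypothetical data, not taken from the dataset).
forms = [
    {"Language_ID": "a", "Parameter_ID": "hand"},
    {"Language_ID": "a", "Parameter_ID": "hand"},
    {"Language_ID": "a", "Parameter_ID": "foot"},
    {"Language_ID": "b", "Parameter_ID": "hand"},
]
print(round(synonymy_index(forms), 2))
```

Under this reading, a value of 1.0 would mean that every variety has exactly one lexeme per attested concept, and larger values indicate more synonyms.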
Possible Improvements:
- Languages linked to bookkeeping languoids in Glottolog:
Contributors
Name | GitHub user | Description | Role |
---|---|---|---|
Johann-Mattis List | @LinguList | maintainer | Author |
Robert Forkel | @xrotwang | maintainer | Author |
Simon J. Greenhill | @simongreenhill | maintainer | Author |
Christoph Rzymski | @chrzyki | maintainer | Author |
Johannes Englisch | @johenglisch | maintainer | Author |
Russell D. Gray | | maintainer | Author |
CLDF Datasets
The following CLDF datasets are available in cldf: