lexibank/sagartst

CLDF dataset derived from Sagart et al.'s "Sino-Tibetan Database of Lexical Cognates" from 2019

How to cite

If you use these data, please cite:

  • the original source

    Laurent Sagart, Guillaume Jacques, Yunfan Lai, and Johann-Mattis List (2019): Sino-Tibetan Database of Lexical Cognates. Jena: Max Planck Institute for the Science of Human History.

  • the derived dataset, using the DOI of the specific released version you used

Description

This dataset is licensed under a CC-BY-4.0 license.

Available online at http://dighl.github.io/sinotibetan/

Conceptlists in Concepticon:

Notes

Statistics

Coverage: Glottolog 100%, Concepticon 100%, Source 100%, BIPA 100%, CLTS SoundClass 100%

  • Varieties: 50 (linked to 48 different Glottocodes)
  • Concepts: 250 (linked to 250 different Concepticon concept sets)
  • Lexemes: 12,179
  • Sources: 25
  • Synonymy: 1.06
  • Cognacy: 12,179 cognates in 5,120 cognate sets (3,468 singletons)
  • Cognate Diversity: 0.41
  • Invalid lexemes: 0
  • Tokens: 60,455
  • Segments: 459 (0 BIPA errors, 0 CLTS sound class errors, 454 CLTS modified)
  • Inventory size (avg): 51.26
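
The list above reports a Cognate Diversity of 0.41 without defining it. A minimal sketch, assuming the measure is (cognate sets − concepts) / (lexemes − concepts): this reading reproduces the reported value from the counts above, but the formula is an assumption and is not stated in this README. The function name `cognate_diversity` and the singleton share are illustrative additions.

```python
def cognate_diversity(cognate_sets: int, lexemes: int, concepts: int) -> float:
    """Assumed definition: (cognate sets - concepts) / (lexemes - concepts).

    0 would mean one cognate set per concept (all words cognate);
    1 would mean every lexeme forms its own cognate set.
    """
    return (cognate_sets - concepts) / (lexemes - concepts)


# Counts copied from the statistics list above.
LEXEMES = 12_179
CONCEPTS = 250
COGNATE_SETS = 5_120
SINGLETONS = 3_468

diversity = cognate_diversity(COGNATE_SETS, LEXEMES, CONCEPTS)

# Share of cognate sets that contain a single lexeme (illustrative extra).
singleton_share = SINGLETONS / COGNATE_SETS

print(f"Cognate diversity: {diversity:.2f}")            # 0.41
print(f"Singleton share of sets: {singleton_share:.2f}")  # 0.68
```

Under this reading, roughly two thirds of the cognate sets are singletons, which explains why the diversity value sits well above zero.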

Contributors

Name               | GitHub user | Description     | Role
-------------------|-------------|-----------------|---------------
Laurent Sagart     |             | cognate coding  | Author
Guillaume Jacques  |             | cognate coding  | Author
Yunfan Lai         |             | data management | Author
Johann-Mattis List | @LinguList  | maintainer      | Author, Editor

CLDF Datasets

The following CLDF datasets are available in cldf: