Cognates

A major use case of wordlists in historical linguistics is identifying and assembling cognate sets. Assigning forms to cognate sets is itself a (rather large) step in analyzing wordlist data, but it also serves as an intermediate step before the cognate sets are fed into further analyses, e.g. to determine language relatedness. Being able to exchange data on cognate judgements related to wordlists is therefore important, and is covered by CLDF.

It is recommended to add columns or other metadata describing the method used for the cognacy judgements and the alignments, but no clear standard for these has evolved yet.

Partial Cognates

Like cognates, partial cognates refer to a form in a wordlist. But to make it possible to annotate parts of a form, the segmentation of the form MUST contain morpheme boundaries, i.e. the FormTable MUST contain a segments property, and the secondary separator of the column description is used to delimit morphemes, e.g.

{
    "name": "Segments",
    "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#segments",
    "separator": "+"
}

The scope of a partial cognate judgement can then be annotated by enumerating the relevant morphemes, e.g. 1 2 3 to indicate that the first three morphemes of a form are assigned to a cognate set. Ranges can optionally be written in shortcut notation, e.g. 1:3.
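As an illustration, a morpheme slice could be resolved against a segmented form roughly as follows. This is a minimal sketch, not part of the specification: the function names are hypothetical, the indices are assumed to be 1-based with inclusive ranges (inferred from 1 2 3 and 1:3 being given as equivalent), and the raw cell is assumed to hold whitespace-separated segments with + as the morpheme boundary.

```python
def parse_segment_slice(spec: str) -> list[int]:
    """Expand a slice such as "1 2 3" or "1:3" into morpheme indices.

    Assumption: indices are 1-based and ranges include both endpoints.
    """
    indices: list[int] = []
    for token in spec.split():
        if ":" in token:
            start, end = (int(part) for part in token.split(":"))
            indices.extend(range(start, end + 1))
        else:
            indices.append(int(token))
    return indices


def select_morphemes(segments: str, spec: str, boundary: str = "+") -> list[list[str]]:
    """Return the morphemes of a segmented form that a slice refers to.

    Assumption: `segments` is the raw cell value, e.g. "h a n d + ʃ u".
    """
    # Split the form into morphemes, then each morpheme into its segments.
    morphemes = [morpheme.split() for morpheme in segments.split(boundary)]
    return [morphemes[index - 1] for index in parse_segment_slice(spec)]
```

For a made-up form such as "h a n d + ʃ u", the slice 2 would select only the second morpheme, so a cognate judgement carrying that slice would apply to that part of the form alone.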

The default description of the cognate table is available in CognateTable-metadata.json.

Example

IE-CoR's CognateTable is described here: https://github.com/lexibank/iecor/blob/v1.0/cldf/cldf-metadata.json#L352-L436

CognateTable: cognates.csv

| Name/Property | Datatype | Cardinality | Description |
| --- | --- | --- | --- |
| ID | string | singlevalued | A unique identifier for a row in a table. To allow usage of identifiers as path components of URLs, IDs must only contain alphanumeric characters, underscore and hyphen. |
| Form_ID | string | singlevalued | References the form which is judged to belong to a cognate set. References FormTable. |
| Cognateset_ID | string | singlevalued | References the cognate set a form is judged to belong to. References CognatesetTable. |
| Segment_Slice | list of string (separated by a space) | multivalued | Specifies the slice of morphemes of the form in case of partial cognacy. |
| Alignment | list of string (separated by a space) | multivalued | The segments of the form aligned with respect to all other forms in the cognate set. |
| Source | list of string (separated by ;) | multivalued | List of source specifications, of the form <source_ID>[], e.g. http://glottolog.org/resource/reference/id/318814[34], or meier2015[3-12] where meier2015 is a citation key in the accompanying BibTeX file. |