lessersunda/lexirumah-data

lexirumah-data

This repository maintains the data underlying the LexiRumah CLLD database, together with the pylexirumah Python package, which provides an API for accessing, manipulating and publishing the database content.

CLDF

The cldf directory contains the dataset in CLDF 1.0 Wordlist format. Beyond the forms, which are cross-linked to lects (with Glottolog IDs) and concepts (with Concepticon references), it includes cognate judgements (automatically coded for the time being; manual changes will be documented) and a borrowing table.
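As a rough illustration of the cross-linking described above, the following sketch joins each form to the Glottolog ID of its lect and the Concepticon reference of its concept, using only the Python standard library. It does not use pylexirumah. The file names (forms.csv, languages.csv, parameters.csv) and column names (Language_ID, Parameter_ID, Form, Glottocode, Concepticon_ID) are the CLDF default names and are assumptions here; the metadata file in the cldf directory is the authoritative description of the actual tables and columns. The helper names read_table and forms_with_context are hypothetical.

```python
import csv
from pathlib import Path


def read_table(path):
    """Read one CSV table into a list of dicts keyed by column name."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def forms_with_context(cldf_dir):
    """Yield each form together with its lect's Glottocode and its
    concept's Concepticon ID.

    Assumption: tables and columns use the CLDF default names; adjust
    them to whatever the dataset's metadata file declares.
    """
    cldf_dir = Path(cldf_dir)
    # Index lects and concepts by their ID so forms can be joined to them.
    lects = {row["ID"]: row for row in read_table(cldf_dir / "languages.csv")}
    concepts = {row["ID"]: row for row in read_table(cldf_dir / "parameters.csv")}
    for form in read_table(cldf_dir / "forms.csv"):
        lect = lects.get(form["Language_ID"], {})
        concept = concepts.get(form["Parameter_ID"], {})
        yield {
            "form": form["Form"],
            "glottocode": lect.get("Glottocode"),
            "concepticon_id": concept.get("Concepticon_ID"),
        }
```

Calling `list(forms_with_context("cldf"))` from the repository root would then give one dictionary per form, provided the assumed file and column names match the dataset.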

Non-CLDF

In addition to the CLDF dataset, we retain data that has not (yet) been merged into it. The noncldf folder contains the sociolinguistic profiles of many of the speakers who contributed word lists as informants.

The keraf subfolder contains the original digitizations of the word lists from Keraf (1978) for reference. The forms in the cldf directory may have been normalized to IPA, and some concepts have been merged with close-but-not-perfect synonyms.

The sulawesi subfolder contains wordlists from South-East Sulawesi, provided by David Mead, as well as the draft for a script to import these lects into LexiRumah.

pylexirumah

tests

The tests directory contains tests for functionality in pylexirumah.