ISchema as a shortcut for similar orthographies #18

LinguList · 2016-12-03T13:46:16Z

Lingpy distinguishes "schemas" for sound classes, each of which includes:

one routine for segmentation
one routine for conversion to sound classes (and a default sound class model)
one default routine for the scoring function in alignments

Currently, lingpy has two schemas: "ipa" and "asjp", the latter operating on the ASJP alphabet.

We should add an additional schema in lingpy3, along with the possibility for users to register new schemas:

plain ipa (assuming that the orthography is more or less regular IPA)
fuzzy ipa (assuming a messy IPA, with aspiration not written as superscript, etc., requiring a segmentation function based on a clean_string strategy)
asjp

More schemas are possible, for example "starling", as all the data of the Tower of Babel is in its own IPA version. The main argument for schemas is that it is too time-consuming to write individual orthography profiles for all datasets, while many datasets are consistent enough to be analysed by an enhanced function that is simpler than a full-fledged orthography profile.
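
To make the proposal a bit more concrete, here is a minimal sketch of a schema as a bundle of three routines plus a user-facing registry. All names (`Schema`, `register_schema`, `get_schema`, the "toy" schema) are hypothetical and chosen for illustration only; none of them is taken from lingpy or lingpy3.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Schema:
    """Hypothetical bundle of the three routines a schema provides."""
    name: str
    segment: Callable[[str], List[str]]            # string -> segments
    to_classes: Callable[[List[str]], List[str]]   # segments -> sound classes
    score: Callable[[str, str], float]             # default alignment scorer


# Registry of known schemas, keyed by name (an assumption, not lingpy API).
_SCHEMAS: Dict[str, Schema] = {}


def register_schema(schema: Schema) -> None:
    """Add a schema to the registry; refuse to overwrite an existing name."""
    if schema.name in _SCHEMAS:
        raise ValueError(f"schema {schema.name!r} is already registered")
    _SCHEMAS[schema.name] = schema


def get_schema(name: str) -> Schema:
    """Look up a registered schema by name."""
    return _SCHEMAS[name]


# A deliberately trivial user-defined schema: one character per segment,
# consonant/vowel classes, and an identity-based scorer.
register_schema(Schema(
    name="toy",
    segment=list,
    to_classes=lambda segs: ["V" if s in "aeiou" else "C" for s in segs],
    score=lambda a, b: 1.0 if a == b else -1.0,
))
```

A user would then call `get_schema("toy")` and apply `segment`, `to_classes`, and `score` in turn; a real "ipa" or "asjp" schema would plug in proper routines in the same three slots.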

SimonGreenhill · 2016-12-03T18:22:49Z

a sensible 'broad phonemic' schema would be great too.

LinguList · 2016-12-03T18:26:09Z

I'd assume that we could cover this more or less in "fuzzy" ipa, as this schema will cover cases like:

thoxther > th o x th e r

And phonemic transcriptions are usually much lazier about writing unusual Unicode characters than other ones. Or do you have other specific cases in mind?
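
For the `thoxther` case above, a rough sketch of what a "fuzzy" segmentation could do is a greedy match that prefers known two-character units over single characters. The function name `segment_fuzzy` and the digraph list are assumptions made up for this example; this is not the clean_string routine from lingpy.

```python
def segment_fuzzy(word, digraphs=("th", "kh", "ph", "ts", "ch", "sh")):
    """Greedily split a string, treating listed digraphs as one segment.

    The digraph inventory is a placeholder; a real schema would need a
    much larger, carefully chosen list.
    """
    segments = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in digraphs:
            # Aspirates and affricates written as plain letter sequences.
            segments.append(pair)
            i += 2
        else:
            segments.append(word[i])
            i += 1
    return segments


print(" ".join(segment_fuzzy("thoxther")))  # th o x th e r
```

The greedy strategy is the simplest option and will mis-segment genuine clusters such as a "t" followed by "h" across a morpheme boundary, which is the price for not writing a full orthography profile.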

LinguList added the design questions label Dec 3, 2016
