Collection of scripts and data for computational phonology

Abstract phoneme manipulation for historical linguistics

Feature system for language phylogeny

In this work I propose a model for calculating dimensionless distances between phonemes ("segments"), in which sounds are described with mostly binary distinctive features. The feature system extends the one used in Phoible by Moran, McCloy, and Wright (2014), which is based on the one by Hayes (2009).
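The text above does not fix a particular distance formula. As a minimal sketch, assuming a normalized Hamming distance over binary features (an assumption, not necessarily the measure used in this project), the idea can be illustrated as follows. The function name `feature_distance` and the toy vectors are hypothetical; the vectors reuse the manner of articulation features tabulated later in this document.

```python
def feature_distance(a, b):
    """Fraction of shared binary features on which two segments disagree.

    Assumption: a normalized Hamming distance, which is dimensionless and
    ranges from 0.0 (identical) to 1.0 (opposite on every shared feature).
    """
    shared = sorted(set(a) & set(b))
    if not shared:
        raise ValueError("segments share no features")
    mismatches = sum(a[f] != b[f] for f in shared)
    return mismatches / len(shared)


# Hypothetical toy vectors using only the manner of articulation features.
MANNER = ["syllabic", "vocalic", "approximant", "sonorant",
          "continuant", "delayed release"]
stop = {f: False for f in MANNER}
vowel = {f: True for f in MANNER}
fricative = dict(stop, **{"continuant": True, "delayed release": True})

print(feature_distance(stop, vowel))      # every feature differs
print(feature_distance(stop, fricative))  # two of six features differ
```

With weighted or scalar features the same interface could be kept while replacing the mismatch count with a sum of per-feature differences.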

It is necessary to state that this articulatory model was developed for the language phylogeny work in my independent research. I have no intention of bringing back the descriptive goals of early 20th-century phonetics, when some linguists believed that every speech sound could be categorized within a finite set of distinctive features; nor do I support the idea that the features adopted are a consequence of some "deep" set of traits or the extension of some "mental representation" of sounds. In short, this is just a way of describing speech sounds for my purposes.

The feature system presented here is in no way definitive or stable, particularly considering the goal of allowing autosegmental descriptions in the future.

File, still under development, is used for the generation of a preliminary table of features for future manual review and extension.

Airstream features

The first subset of features is related to airstream mechanisms, i.e., the method by which airflow is created in the vocal tract.

Work on this subset is still very preliminary, as non-pulmonic sounds still need to be fully integrated (most previous systems offer no support for such sounds, which are needed for global coverage).


  • pulmonic: pulmonic egressive sounds, where the air is pushed by the ribs and the diaphragm. All sounds with the exception of non-pulmonic consonants (ejectives, implosives, and clicks) are considered pulmonic.
  • glottalic.egressive: where the air is moved by the compression produced by an upward movement of the glottis. Used for ejective and ejective-like consonants; corresponds to Phoible's raisedLarynx feature.
  • glottalic.ingressive: where the air is moved by the rarefaction caused by a downward movement of the glottis. The name is used for an opposition with the glottalic.egressive feature above, as the air does not necessarily flow inward. Used for implosives and implosive-like consonants; corresponds to Phoible's loweredLarynx feature.
  • lingual.ingressive: where the air is moved by the rarefaction caused by a downward movement of the tongue, usually at a velar position. Used for clicks; corresponds to Phoible's click feature.
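The correspondence with Phoible described in the list above can be sketched as a small lookup. This is only an illustration: the Phoible-side names are copied as written in this document (the identifiers in an actual Phoible release may be spelled differently), and the function name `airstream_features` is hypothetical. Following the definition of pulmonic above, a sound is treated as pulmonic when none of the three non-pulmonic features is set.

```python
# Correspondence stated in the list above; Phoible-side names are copied
# as written there and may differ from the identifiers in a given release.
PHOIBLE_TO_AIRSTREAM = {
    "raisedLarynx": "glottalic.egressive",
    "loweredLarynx": "glottalic.ingressive",
    "click": "lingual.ingressive",
}


def airstream_features(phoible_features):
    """Derive the four airstream features from Phoible-style booleans."""
    result = {
        name: bool(phoible_features.get(source, False))
        for source, name in PHOIBLE_TO_AIRSTREAM.items()
    }
    # Pulmonic is the default: everything that is not an ejective,
    # an implosive, or a click.
    result["pulmonic"] = not any(result.values())
    return result


print(airstream_features({"raisedLarynx": True}))  # an ejective
print(airstream_features({}))                      # a plain pulmonic sound
```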

TODO: check if it is better to rename glottalic.egressive and glottalic.ingressive, using names that, like in Phoible, refer to the position of the larynx and not to an abstract description of the air movement (particularly for the confusion regarding "ingressive").

Phonation features

For the time being a single feature is defined, vibration, which should later be extended to allow descriptions such as murmur, slack, modal, stiff, creaky, glottalized and (maybe) ballistic. I might end up using scalars (i.e., a level of vibration; otherwise we might need two sublevels to deal with lenis/fortis gradations).

Creaky and glottalized might possibly be defined outside phonation itself, just using the glottis as a passive articulator, with creaky for sonorants (should also consider cases like Danish stød). Must also study how to treat the relation between murmured and aspirated sounds.

TODO: decide how many levels to use (only two, voiced and voiceless? three, with open glottis, sweet spot, and closed glottis? four, with voiceless, breathy, modal, and closed? seven, with voiceless, breathy, slack, modal, stiff, creaky, and closed?) -- spread_glottis is a potential feature for aspirated sounds (and murmur?); for constricted_glottis, I should consider how this interferes with ejectives and glottalized sounds (or should I remove glottalic.egressive?). Should also consider stiff/slack vocal cords.
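If the scalar option is chosen, the seven-level alternative listed in the TODO above could be sketched as an ordered scale. This is purely illustrative: the ordering follows the order in which the levels are listed, while the equal spacing between levels and the names `Phonation` and `phonation_distance` are assumptions.

```python
from enum import IntEnum


class Phonation(IntEnum):
    """Seven-level option from the TODO, in the order listed there."""
    VOICELESS = 0
    BREATHY = 1
    SLACK = 2
    MODAL = 3
    STIFF = 4
    CREAKY = 5
    CLOSED = 6


def phonation_distance(a, b):
    """Distance on the scale, normalized to the range 0.0 to 1.0.

    Assumption: adjacent levels are equally spaced.
    """
    return abs(a - b) / (len(Phonation) - 1)


print(phonation_distance(Phonation.VOICELESS, Phonation.CLOSED))
print(phonation_distance(Phonation.BREATHY, Phonation.MODAL))
```

A two-, three-, or four-level system would only change the members of the enumeration; the distance function stays the same.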

TODO: Consider on-set?

Laterality feature

For the time being, there is just a single lateral feature, which might be replaced in the future by some extension to the place of articulation (but using just lateral is probably easier and enough). As per HC, "Lateral sounds, the most familiar of which is [l], are produced with the tongue placed in such a way as to prevent the airstream from flowing outward through the centre of the mouth, while allowing it to pass over one or both sides of the tongue; central sounds do not invoke such a constriction."

Manner of articulation features

                  vowels  glides  liquids  nasals  fricatives  affricates  stops
syllabic          +       -       -        -       -           -           -
vocalic           +       +       -        -       -           -           -
approximant       +       +       +        -       -           -           -
sonorant          +       +       +        +       -           -           -
continuant        +       +       +        +       +           -           -
delayed release   +       +       +        +       +           +           -

Some features, such as vocalic and delayed release, are not in line with what is traditionally used, partly for reasons of symmetry (we decided that manner of articulation features would always be positive for vowels and negative for oral stops, for consistency). Delayed release might be replaced in the future by features relating to sound contour.
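The manner table and the symmetry decision can be checked mechanically. The sketch below transcribes the table into a dictionary and looks up the feature vector of a sound class; the names `MANNER_ROWS` and `manner_vector` are hypothetical, and the values are copied from the table rather than taken from any file in this repository.

```python
MANNER_CLASSES = ["vowels", "glides", "liquids", "nasals",
                  "fricatives", "affricates", "stops"]

# One string per feature, one character per class, in the order above.
MANNER_ROWS = {
    "syllabic":        "+------",
    "vocalic":         "++-----",
    "approximant":     "+++----",
    "sonorant":        "++++---",
    "continuant":      "+++++--",
    "delayed release": "++++++-",
}


def manner_vector(sound_class):
    """Return the manner features of a sound class as booleans."""
    column = MANNER_CLASSES.index(sound_class)
    return {feature: row[column] == "+"
            for feature, row in MANNER_ROWS.items()}


# Symmetry: vowels are positive and stops negative for every feature.
print(all(manner_vector("vowels").values()))
print(any(manner_vector("stops").values()))
print(manner_vector("affricates"))
```

Each row is a run of "+" followed by a run of "-", so the six features form a scale on which every class differs from its neighbour by exactly one feature.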

Clicks, implosives and ejective stops are equivalent to stops; ejective fricatives are equivalent to fricatives; ejective affricates are equivalent to affricates. The differences between these are specified by airstream features.

Liquids don't include trills and flaps, which we consider a type of stop -- the distinction between the many categories needs further features. We assume that trills and flaps are different, as we understand that trills have a vibration produced by a contact between the articulators which is not strong enough to hold the air turbulence (they are "weak stops"), while in a flap there is no buildup of air pressure and, consequently, no release burst. We are using two temporary features, flap and trill, to distinguish between plain stops, flaps/taps, and trills.


  • syllabic
  • vocalic: vocalic sounds (or "non-consonantal sounds") are produced without any constriction of the airflow, i.e., without complete or partial obstruction of the vocal tract. Vowels and glides (even though they usually present a resonant stricture) are vocalic.
  • sonorant: sonorant sounds are produced with continuous, non-turbulent airflow in the vocal tract, in such a way that the air pressure on both sides of any constriction is approximately equal to the air pressure outside the mouth. Vowels, glides and some consonants (approximants, nasals, flaps and most trills) are sonorant.
  • continuant: continuant sounds are produced with an incomplete closure of the vocal tract, allowing the airstream to flow through the midsagittal region of the oral tract. Continuant sounds are, essentially, vowels, approximants, and fricatives; non-continuant sounds are stops.
  • delayed release: a distinctive feature representing how quickly the closure in a non-continuant consonant is released. It separates stops, which are −delayed release, from affricates, which are +delayed release.

We consider /ʋ/ a glide, following

NOTE: work is needed to differentiate between sibilant and non-sibilant fricatives and affricates.

TODO: Check everything!!!

Place of articulation features

These features are used to specify the place of articulation of a sound, i.e., the one or more (in the case of co-articulated sounds, such as clicks) points of contact where one or more obstructions occur in the vocal tract between an active articulator and a passive location. Different levels of features allow for different levels of detail in the specification of the articulators.

These features are mostly used with consonants, but the unified model adopted here allows some vocalic properties, particularly roundedness, to be expressed with the same set of features (we consider rounding, especially protruded rounding, as the vocalic equivalent of consonantal labialization, also in line with the effect of rounded vowels on the labialization of consonants and vice-versa). It is important to note, in this sense, that labiality presents a deeper level of detail, related to endolabial (the inner surface of the lips, associated with protrusion) and exolabial (the outer surface of the lips, associated with compression) articulators.

The distinction between active and passive articulators is mostly a working one. It follows the tendency to treat as the first class the articulators in the lower part of the vocal tract and as the second class those in the upper part; the two can also be differentiated by the fact that the first class is made up of discrete points and the second forms a continuum.

Active articulation

Level one

A collection of non-exclusive binary features, indicating when a given active articulator is in action. One or more (when there is coarticulation) features can be positive.

We assume that, if not explicitly noted, alveolar consonants from IPA (such as /n/, /t/, /d/) are laminal. Both laminal and apical entries should, however, be added to our database (such as /t̺/ and /d̺/ opposed to /t̻/ and /d̻/). We also assume that IPA retroflex consonants are subapical and hard palate (if apical, it should be specified).

An important distinction is between exolabial and endolabial articulators. It is important to allow for this distinction, as there are two types of sound (especially vowel) rounding/labialization: protrusion, when the corners of the mouth are drawn together and the lips protrude like a tube, and compression, when the lips are drawn together horizontally. On the subject,

Catford (1982:172) observes that back and central rounded vowels, such as German /o/ and /u/, are typically protruded, whereas front rounded vowels such as German /ø/ and /y/ are typically compressed. Back or central compressed vowels and front protruded vowels are uncommon,[3] and a contrast between the two types has been found to be phonemic in only one instance.[4]

Articulations of spread, compressed and protruded vowels

             Front                            Central                          Back
             spread  compressed  protruded    spread  compressed  protruded    spread  compressed  protruded
Semivowel    j       ɥ           ɥ̫            j̈       ɥ̈           ẅ            ɰ       ɰᵝ/wᵝ       w
Close        i       y           y̫            ɨ       ÿ           ʉ            ɯ       ɯᵝ/uᵝ       u
Near Close   ɪ       ʏ           ʏ̫            ɪ̈       ʏ̈           ʊ̈            ʊ͍       ʊᵝ          ʊ̫
Close Mid    e       ø           ø̫            ɘ       ø̈           ɵ            ɤ                   o
Open Mid     ɛ       œ           œ̫            ɜ       œ̈           ɞ            ʌ                   ɔ


  • exolabial
  • endolabial
  • laminal (tongue blade)
  • apical (tongue tip)
  • subapical (underside of tongue)
  • dorsal (tongue body)
  • radical (tongue root)
  • laryngeal (larynx)

Level two

A collection of non-exclusive binary features, indicating when a given class of active articulator is in action. One or more (when there is coarticulation) features can be positive.


  • labial (features endolabial, exolabial)
  • coronal (features laminal, apical, subapical)
  • guttural (features dorsal, radical, laryngeal)

Level three

A collection of non-exclusive binary features, indicating when a given supraclass of active articulation is in action. One or more (when there is coarticulation) features can be positive.


  • front (features labial, coronal)
  • back (feature guttural)
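The three levels of active articulation form a hierarchy, so the level-two and level-three features can be derived from the level-one features. The sketch below assumes that a higher-level feature is positive whenever at least one of its member features is positive, which is one reading of the lists above; the names `LEVEL_TWO`, `LEVEL_THREE`, and `expand_active` are hypothetical.

```python
# Membership copied from the level-two and level-three lists above.
LEVEL_TWO = {
    "labial": ("endolabial", "exolabial"),
    "coronal": ("laminal", "apical", "subapical"),
    "guttural": ("dorsal", "radical", "laryngeal"),
}
LEVEL_THREE = {
    "front": ("labial", "coronal"),
    "back": ("guttural",),
}


def expand_active(level_one):
    """Add the implied higher-level features to a set of positive features.

    Assumption: a class is positive when any of its members is positive.
    """
    positive = set(level_one)
    # Level two must be resolved before level three, which depends on it.
    for level in (LEVEL_TWO, LEVEL_THREE):
        for parent, members in level.items():
            if positive.intersection(members):
                positive.add(parent)
    return positive


print(sorted(expand_active({"laminal"})))
print(sorted(expand_active({"exolabial", "dorsal"})))  # coarticulation
```

Because the features are non-exclusive, a coarticulated sound such as the second example ends up positive for both front and back.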

TODO: check if coronal consonants should be set as front, particularly considering that the feature is also used for vowels.

Passive articulation

Level one

A collection of non-exclusive binary features, indicating when a given passive articulator is in action. One or more (when there is coarticulation) features can be positive.


  • upper.lip
  • upper.teeth
  • alveolar.ridge
  • postalveolar
  • hard.palate
  • soft.palate
  • uvula
  • pharynx
  • epiglottis
  • glottis

Level two


  • high: "High sounds are produced by raising the body of the tongue toward the palate; nonhigh sounds are produced without such a gesture." (HC) [+high] refers to palatals, velars, palatalised consonants, velarised consonants, high vowels, semi-vowels. [-high] refers to all other sounds.
  • low: "Low sounds are produced by drawing the body of the tongue down away from the roof of the mouth; nonlow sounds are produced without such a gesture." [+low] refers to low vowels, pharyngeal consonants, pharyngealised consonants.



  • acute (coronals, except linguolabial, plus alveolo-palatal; front vowels)
  • grave (all others)

NOTE: theoretical work must be done for deciding if secondary articulation changes the acuteness (and, if it does, how it works).

Missing feature subsets

TODO: nasalization (lowering of the velum), lateralization, contour (particularly diphthongs and tones), etc.


Additional material


VERY, VERY PRELIMINARY -- this is just a personal note on how to proceed, there are many things to review, particularly for low vowels.

                +front   +front                     +back    +back
                -labial  +labial  -labial  +labial  -labial  +labial
+high  +tense   i        y        ɨ        ʉ        ɯ        u
+high  -tense   ɪ        ʏ        ɪ̈        ʊ̈        ɯ̽        ʊ
       +tense   e        ø        ɘ        ɵ        ɤ        o
       -tense   ɛ        œ        ɜ        ɞ        ʌ        ɔ
+low   +tense   æ        æʷ       ɐ        ɞ̞        ɑ˔       ɒ˔
+low   -tense   a        ɶ        ä        ɒ̈        ɑ        ɒ

TODO: explain decision regarding the schwa.