# Example of Usage of Distinctive Feature Classes

For the project I've created a couple of classes that help us think about distinctive segmental features. I've organized them in a way closer to "feature geometry", meaning that a segment can be thought of as a structure containing three sub-structures: the `Major` class features, the `Laryngeal` class features, and the `Place` class features. Later, we might add some quantification of the distribution of that sound (the environments it occurs in, and the semantic content of those environments), but that is a job for later!

## Major Class

A common theme you'll see with these classes is that each one is represented as a `TypedDict`, meaning that at runtime each instance is a plain `dict`. Whether that is a good idea is up for debate, but I chose it at least at this stage because it gives us *named fields* while still letting us convert the structure to a vector through some quick methods on the `dict`.
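As a minimal sketch of that trade-off (the class name `LaryngealSketch` is a hypothetical stand-in, not the project's actual `Laryngeal` definition), a `TypedDict` gives type checkers named fields while the runtime object stays an ordinary `dict`:

```python
from typing import TypedDict


class LaryngealSketch(TypedDict):
    # Hypothetical stand-in for the project's Laryngeal class.
    voice: int
    spr_gl: int
    cons_gl: int


seg: LaryngealSketch = {'voice': 1, 'spr_gl': 0, 'cons_gl': 0}

# At runtime a TypedDict instance is an ordinary dict ...
is_plain_dict = type(seg) is dict

# ... so named access and conversion to a flat vector both work.
voiced = seg['voice']
vector = list(seg.values())
print(is_plain_dict, voiced, vector)
```

Because dictionaries preserve insertion order, `list(seg.values())` yields the features in the order they were written, which is what makes the quick vector conversion possible.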

In [1]:
from distfeat import Major, Laryngeal, Place, Sound, SArray
import numpy as np

# going to describe the high-front, unrounded vowel [i]
i_major: Major = {
            'syll': 1,
            'cons': 0,
            'son': 1,
            'contin': 1,
            'del_rel': -1,
            'lat': -1,
            'nasal': 0
            }
i_major

{'syll': 1,
 'cons': 0,
 'son': 1,
 'contin': 1,
 'del_rel': -1,
 'lat': -1,
 'nasal': 0}

Note the three-fold distinction in the possible values! For these feature sets we use a *balanced ternary* encoding, so that each feature takes a value in the set {-1, 0, 1}, where:

- -1 := the *lack of specification* for that feature
- 0 := the *negative* value for that feature, e.g. $[-cons]$
- 1 := the *positive* value for that feature, e.g. $[+cons]$

This distinction ought to hold for all other feature classes. Ultimately, we will normalize these vectors so that their values lie in \[0,1\]!
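The normalization scheme is not fixed yet. One simple possibility, assumed here purely for illustration, is a linear rescale that sends -1 to 0, 0 to 0.5, and 1 to 1. The function name `rescale_ternary` is hypothetical and not part of `distfeat`:

```python
import numpy as np


def rescale_ternary(values):
    """Map values from the range [-1, 1] onto [0, 1] linearly.

    Assumption: a plain min-max rescale; the project may choose
    a different normalization later.
    """
    arr = np.asarray(values, dtype=float)
    return (arr + 1.0) / 2.0


# Unspecified (-1), negative (0), and positive (1) feature values.
rescaled = rescale_ternary([-1, 0, 1])
print(rescaled)
```

Under this assumption an unspecified feature becomes 0, so whether "unspecified" should really sit at one end of the scale is a design question worth settling before committing to it.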

## Laryngeal Class

The `Laryngeal` class is smaller than the `Major` class, as it captures only the laryngeal features of a segment. Thus:

In [2]:
# describing the laryngeal class features of high-front, unrounded vowel
i_lary: Laryngeal = {
        'voice': 1,
        'spr_gl': 0,
        'cons_gl': 0,
        }
i_lary

{'voice': 1, 'spr_gl': 0, 'cons_gl': 0}

The `Laryngeal` class follows the same balanced ternary values as the `Major` class.

## Place Class

The `Place` class represents those features which characterize a segment's place of articulation.

Thus:

In [3]:
# describing the place class features of the high-front, unrounded vowel
i_pl: Place = {
        'ant': -1,
        'cor': -1,
        'dist': -1,
        'high': 1,
        'low': 0,
        'back': 0,
        'round': 0,
        'tense': 1,
        'atr': 1,
        'labial': -1
        }
i_pl

{'ant': -1,
 'cor': -1,
 'dist': -1,
 'high': 1,
 'low': 0,
 'back': 0,
 'round': 0,
 'tense': 1,
 'atr': 1,
 'labial': -1}

## Sound Class and SArray Class

The `Sound` class combines the classes above into a single structure, which ought to fully describe a single sound. It is paired with the `SArray` class, which turns the sound into a `np.ndarray` and so allows vector operations on the sound itself.

In [4]:
# the high-front, unrounded vowel
i: Sound = {'major': i_major, 'laryngeal': i_lary,
            'place': i_pl}
i

{'major': {'syll': 1,
  'cons': 0,
  'son': 1,
  'contin': 1,
  'del_rel': -1,
  'lat': -1,
  'nasal': 0},
 'laryngeal': {'voice': 1, 'spr_gl': 0, 'cons_gl': 0},
 'place': {'ant': -1,
  'cor': -1,
  'dist': -1,
  'high': 1,
  'low': 0,
  'back': 0,
  'round': 0,
  'tense': 1,
  'atr': 1,
  'labial': -1}}

In [5]:
# [i] as an np.ndarray
i_arr = SArray(i)
print(i_arr)

[ 1  0  1  1 -1 -1  0  1  0  0 -1 -1 -1  1  0  0  0  1  1 -1]
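The printed array is consistent with concatenating the `major`, `laryngeal`, and `place` values in that order. The sketch below reproduces that layout with a hypothetical helper, `flatten_sound`. It is an assumption about the behaviour, not the actual `SArray` implementation:

```python
import numpy as np


def flatten_sound(sound):
    """Concatenate the feature values of each sub-structure, in order.

    Hypothetical helper: assumes the vector is simply major +
    laryngeal + place, relying on dict insertion order.
    """
    values = []
    for group in ('major', 'laryngeal', 'place'):
        values.extend(sound[group].values())
    return np.array(values)


# The same [i] described above, written out so the block runs alone.
sound = {
    'major': {'syll': 1, 'cons': 0, 'son': 1, 'contin': 1,
              'del_rel': -1, 'lat': -1, 'nasal': 0},
    'laryngeal': {'voice': 1, 'spr_gl': 0, 'cons_gl': 0},
    'place': {'ant': -1, 'cor': -1, 'dist': -1, 'high': 1, 'low': 0,
              'back': 0, 'round': 0, 'tense': 1, 'atr': 1, 'labial': -1},
}
vec = flatten_sound(sound)
print(vec.shape)
```

A flat vector like this loses the field names, so any code that consumes it has to agree on the feature order.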


# Some Interactions with the Distinctive Feature Classes

I've made some basic functions that interact with these classes. Unfortunately, because of the way that `TypedDict`s work, these cannot be methods of the classes themselves, so they are better characterized as standalone helper functions.
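To illustrate the helper-function pattern, here is a minimal sketch of a module-level function that builds a random feature dictionary. The name `random_features` and the `rng` parameter are hypothetical, and the real functions in `contrast` may work differently:

```python
import random

# Key names copied from the printed output of the random major features.
MAJOR_KEYS = ('syll', 'cons', 'son', 'contin',
              'del_rel', 'lat', 'nas', 'stri')


def random_features(keys, rng=random):
    """Return a dict mapping each key to a float drawn from [-1, 1].

    Hypothetical helper: a plain function that takes and returns
    dicts, since a TypedDict cannot carry methods of its own.
    """
    return {key: rng.uniform(-1.0, 1.0) for key in keys}


# A seeded generator keeps the example reproducible.
sample = random_features(MAJOR_KEYS, random.Random(0))
print(sample)
```

Keeping the logic in free functions means the data stays a plain `dict`, at the cost of having to pass it around explicitly.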

In [6]:
from contrast import *

# Generate random features with field values in the inclusive range
# [-1,1]
s_m = get_major()
s_l = get_laryngeal()
s_p = get_place()

print(s_m, s_l, s_p)

{'syll': 0.3155209004829065, 'cons': -0.44489899358910834, 'son': 0.44882410715035626, 'contin': 0.9369988330225454, 'del_rel': -0.10070939890026209, 'lat': -0.8279857933630579, 'nas': 0.4622664497728348, 'stri': 0.670664963658449} {'voice': 0.3165470653648128, 'spr_gl': -0.8712717257313585, 'cons_gl': 0.16676658621140494} {'ant': 0.40567853166743406, 'cor': -0.35156675596109266, 'dist': -0.7981605743152864, 'high': 0.10309722937926158, 'low': 0.20480185276132423, 'back': -0.38785082203673493, 'round': -0.7711276014332822, 'lab': 0.9752025545651644, 'atr': 0.3291183202805945}


In [7]:
# generate a random sound with those functions above
s = get_sound()
s

{'major': {'syll': -0.3092064008246027,
  'cons': 0.2882982007074286,
  'son': 0.5961184277874261,
  'contin': 0.7161049551474488,
  'del_rel': -0.7390706110380216,
  'lat': -0.801156309822197,
  'nas': -0.8218930498205794,
  'stri': 0.7743176640565435},
 'laryngeal': {'voice': 0.4051999690616497,
  'spr_gl': 0.02928907991021501,
  'cons_gl': 0.9771942230080641},
 'place': {'ant': -0.3336571049250836,
  'cor': -0.6479587855357527,
  'dist': -0.8504016951791349,
  'high': 0.6890549865797899,
  'low': 0.9861424317295757,
  'back': -0.5661455260681778,
  'round': -0.1349874103025659,
  'lab': -0.45884006435084745,
  'atr': -0.9971633824972423}}

In [8]:
# generate a collection of sounds, length
# is taken as an optional parameter. if not specified,
# then length is a random natural number in the range (1,10)
ws0 = get_word(2)
ws0

[{'major': {'syll': -0.6577418831219006,
   'cons': 0.5735024602908076,
   'son': 0.1655204659582663,
   'contin': 0.03654736847035145,
   'del_rel': -0.7128766517023506,
   'lat': -0.28073768739879745,
   'nas': 0.7302273041735217,
   'stri': 0.44320652766877267},
  'laryngeal': {'voice': 0.48421393412028113,
   'spr_gl': -0.8211473190861971,
   'cons_gl': -0.3962921013130647},
  'place': {'ant': -0.35108319442362523,
   'cor': 0.24276843225371092,
   'dist': 0.6849671948720606,
   'high': 0.9437513624556282,
   'low': 0.8276260704690179,
   'back': 0.5047953185773573,
   'round': 0.15109227232740774,
   'lab': 0.804114507530715,
   'atr': -0.7350275600689693}},
 {'major': {'syll': 0.35864902567657064,
   'cons': -0.4187294919666251,
   'son': 0.2519242351796609,
   'contin': -0.18896688666743322,
   'del_rel': -0.10424084583397852,
   'lat': -0.3074396187568307,
   'nas': 0.0917656236497324,
   'stri': -0.13997531312703715},
  'laryngeal': {'voice': -0.933971918243879,
   'spr_gl': -

In [9]:
# convert a word (a list of sounds) to a np.ndarray
w_array = SArray_word(ws0)

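One way a word could be turned into an array is to flatten each sound to a row and stack the rows into a 2-D array. The sketch below assumes every sound has the same number of features. `stack_word` is a hypothetical name and not the implementation of `SArray_word`:

```python
import numpy as np


def stack_word(word):
    """Stack a list of sounds into a 2-D array, one row per sound.

    Assumption: every sound has the same features in the same order,
    so all rows have equal length and the result is rectangular.
    """
    rows = [
        [value
         for group in ('major', 'laryngeal', 'place')
         for value in sound[group].values()]
        for sound in word
    ]
    return np.array(rows, dtype=float)


# Two tiny made-up sounds with one feature per group, for illustration.
toy_word = [
    {'major': {'syll': 0.5}, 'laryngeal': {'voice': -0.25},
     'place': {'high': 1.0}},
    {'major': {'syll': -1.0}, 'laryngeal': {'voice': 0.0},
     'place': {'high': 0.75}},
]
word_array = stack_word(toy_word)
print(word_array.shape)
```

If the sounds in a word had different numbers of features, the rows would be ragged and NumPy could not build a rectangular array from them, so it is worth checking that all sounds share one schema.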
