# Example of Usage of Distinctive Feature Classes

For the project I've created a couple of classes that help us think about distinctive segmental features. I've organized them along the lines of "feature geometry", meaning that a segment can be thought of as a structure containing three sub-structures: the `Major` class features, the `Laryngeal` class features, and the `Place` class features. Down the line, we might add some quantification of the distribution of each sound (the environments it occurs in, and the semantic content of those environments), but that is a job for later!

## Major Class

A common theme you'll see with these classes is that each one is represented as a `TypedDict`, meaning that at runtime they are plain `dict`s. Whether that is a good idea is up for debate, but I chose it at least at this stage because it gives us *named fields* while still letting us convert a feature set to a vector through some quick methods on the `dict`.
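
To make the `TypedDict` point concrete, here is a minimal sketch using a hypothetical `Toy` class (not part of `distfeat`; its fields are picked only for illustration). It shows that an instance is an ordinary `dict` at runtime, and that `dict.values()` yields the feature values in insertion order, which is what makes the conversion to a vector cheap.

```python
from typing import TypedDict


class Toy(TypedDict):
    # hypothetical two-field feature class, for illustration only
    syll: int
    cons: int


toy: Toy = {'syll': 1, 'cons': 0}

# at runtime a TypedDict instance is just a dict
is_plain_dict = type(toy) is dict

# dicts preserve insertion order, so values() gives a stable vector
toy_vector = list(toy.values())
```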

In [2]:
from distfeat import Major, Laryngeal, Place, Sound, SArray
import numpy as np

# going to describe the high-front, unrounded vowel [i]
i_major: Major = {
    'syll': 1,
    'cons': 0,
    'son': 1,
    'contin': 1,
    'del_rel': -1,
    'lat': -1,
    'nasal': 0
}
i_major

{'syll': 1,
 'cons': 0,
 'son': 1,
 'contin': 1,
 'del_rel': -1,
 'lat': -1,
 'nasal': 0}

Note the three-way distinction in the possible values of each field! For these feature sets, we're using a *balanced ternary* encoding, so that each feature takes a value in the set {-1, 0, 1}, where:

- -1 := the *lack of specification* for that feature
- 0 := the *negative* value for that feature, e.g. $[-cons]$
- 1 := the *positive* value for that feature, e.g. $[+cons]$

The same three-way distinction holds for the other feature classes. Ultimately, we will normalize these vectors so that their values fall in the interval \[0,1\]!
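
One way that normalization could work is a linear rescale, mapping each value `x` to `(x + 1) / 2`. This is only a sketch under that assumption, since the exact scheme isn't settled yet; `normalize` is a hypothetical helper and not part of `distfeat`.

```python
def normalize(features: dict) -> dict:
    """Linearly rescale balanced-ternary values from [-1, 1] to [0, 1]."""
    return {name: (value + 1) / 2 for name, value in features.items()}


# hypothetical laryngeal features covering all three ternary values
lary = {'voice': 1, 'spr_gl': 0, 'cons_gl': -1}
lary_norm = normalize(lary)
# unspecified (-1) -> 0.0, negative (0) -> 0.5, positive (1) -> 1.0
```

Under this mapping the unspecified value lands at the bottom of the range, so whether it should instead sit apart from the negative/positive pair is a design choice still to be made.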

## Laryngeal Class

The `Laryngeal` class is smaller than the `Major` class, as it only captures the laryngeal features of a segment. Thus:

In [3]:
# describing the laryngeal class features of high-front, unrounded vowel
i_lary: Laryngeal = {
    'voice': 1,
    'spr_gl': 0,
    'cons_gl': 0,
}
i_lary

{'voice': 1, 'spr_gl': 0, 'cons_gl': 0}

The `Laryngeal` class follows the same balanced ternary values as the `Major` class.

## Place Class

The `Place` class represents those features which characterize a segment's place of articulation.

Thus:

In [4]:
# describing the place class features of the high-front, unrounded vowel
i_pl: Place = {
    'ant': -1,
    'cor': -1,
    'dist': -1,
    'high': 1,
    'low': 0,
    'back': 0,
    'round': 0,
    'tense': 1,
    'atr': 1,
    'labial': -1
}
i_pl

{'ant': -1,
 'cor': -1,
 'dist': -1,
 'high': 1,
 'low': 0,
 'back': 0,
 'round': 0,
 'tense': 1,
 'atr': 1,
 'labial': -1}

## Sound Class and SArray Class

The `Sound` class combines the classes above into a single structure, which ought to fully describe a single sound. It is paired with the `SArray` class, which turns a sound into an `np.ndarray` and so allows vector operations on the sound itself.

In [5]:
# the high-front, unrounded vowel
i: Sound = {'major': i_major, 'laryngeal': i_lary,
            'place': i_pl}
i

{'major': {'syll': 1,
  'cons': 0,
  'son': 1,
  'contin': 1,
  'del_rel': -1,
  'lat': -1,
  'nasal': 0},
 'laryngeal': {'voice': 1, 'spr_gl': 0, 'cons_gl': 0},
 'place': {'ant': -1,
  'cor': -1,
  'dist': -1,
  'high': 1,
  'low': 0,
  'back': 0,
  'round': 0,
  'tense': 1,
  'atr': 1,
  'labial': -1}}

In [11]:
# [i] as an np.ndarray
i_arr = SArray(i)
print(i_arr)

[ 1  0  1  1 -1 -1  0  1  0  0 -1 -1 -1  1  0  0  0  1  1 -1]
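
The flattening that `SArray` performs can be pictured as concatenating the values of the three sub-structures in order. The sketch below is an assumption about that behaviour, not the actual `distfeat` code: `flatten_sound` is a hypothetical helper, and it returns a plain list, which `np.asarray` would turn into an `np.ndarray`.

```python
def flatten_sound(sound: dict) -> list:
    """Concatenate the feature values of each class, in insertion order."""
    return [value
            for feature_class in sound.values()   # major, laryngeal, place
            for value in feature_class.values()]


# a trimmed-down sound with two features per class, for illustration only
toy_sound = {
    'major': {'syll': 1, 'cons': 0},
    'laryngeal': {'voice': 1, 'spr_gl': 0},
    'place': {'high': 1, 'low': 0},
}
toy_flat = flatten_sound(toy_sound)
```

Applied to the full `i` defined above, the same ordering (7 major, 3 laryngeal, 10 place values) gives the 20 values printed for `SArray(i)`.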


# Some Interactions with the Distinctive Feature Classes

I've written some basic functions for working with these classes. Unfortunately, because of the way `TypedDict`s work, they are not methods of the classes themselves and are better characterized as helper functions.

In [19]:
from contrast import *

# Generate random features with field values in the inclusive range
# [-1,1]
s_m = get_major()
s_l = get_laryngeal()
s_p = get_place()

print(s_m, s_l, s_p)

{'syll': 0.5944886322483747, 'cons': -0.32209673877364065, 'son': -0.6460358278551759, 'contin': -0.5172388751404655, 'del_rel': -0.6233677331893781, 'lat': -0.31630359569446753, 'nas': 0.5575090736553139, 'stri': -0.8114990102845376} {'voice': -0.885039736327877, 'spr_gl': 0.476752086207316, 'cons_gl': -0.4476272314811749} {'ant': 0.2501481129274419, 'cor': -0.26360104013656116, 'dist': 0.6232864420505086, 'high': -0.7865328699470413, 'low': 0.8604793945204725, 'back': 0.14424669408637358, 'round': -0.8026032326130048, 'lab': -0.2788866751922243, 'atr': -0.09808466658421766}
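
For a sense of what these generators do, here is a minimal sketch that draws each field uniformly from [-1, 1]. It is an assumption about the behaviour of `get_major` and friends, not their actual implementation; `random_features` and the key list passed to it are hypothetical.

```python
import random


def random_features(keys: list) -> dict:
    """Return a dict mapping each key to a uniform draw from [-1, 1]."""
    return {key: random.uniform(-1.0, 1.0) for key in keys}


random.seed(0)  # seeded only so the sketch is reproducible
rand_lary = random_features(['voice', 'spr_gl', 'cons_gl'])
```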


In [20]:
# generate a random sound with those functions above
s = get_sound()
s

{'major': {'syll': 0.3389194169057621,
  'cons': -0.8503699803253189,
  'son': -0.8492850255244191,
  'contin': -0.5656520662263931,
  'del_rel': 0.7818845642355654,
  'lat': 0.23253599868979835,
  'nas': 0.5432520141299644,
  'stri': -0.9135319357046807},
 'laryngeal': {'voice': -0.852999512484016,
  'spr_gl': 0.17423600485959678,
  'cons_gl': -0.8751218959341012},
 'place': {'ant': -0.35851296151771916,
  'cor': 0.7193375048472699,
  'dist': 0.8045572509691035,
  'high': -0.6323360999982552,
  'low': 0.2442296592485358,
  'back': 0.6453694390863822,
  'round': 0.13236483742816785,
  'lab': 0.1856416603298734,
  'atr': -0.4299742888825886}}

In [21]:
# generate a collection of sounds (a word); the length
# is an optional parameter. if not specified,
# the length is a random natural number in the range (1,10)
ws0 = get_word(2)
ws0

[{'major': {'syll': -0.97562350980242,
   'cons': -0.8666981177332695,
   'son': 0.7640975530544623,
   'contin': 0.07782818944514269,
   'del_rel': 0.9057092007205871,
   'lat': 0.7054786669670641,
   'nas': 0.14639486331012042,
   'stri': -0.3406383697575184},
  'laryngeal': {'voice': 0.9746168386772831,
   'spr_gl': -0.6091844038342582,
   'cons_gl': 0.26052232209126314},
  'place': {'ant': 0.6203505994604575,
   'cor': -0.5154643922463169,
   'dist': -0.2358227631089571,
   'high': -0.3674471036050748,
   'low': 0.30137673030704826,
   'back': 0.16471150202883167,
   'round': 0.18024545020612748,
   'lab': 0.8467085437564017,
   'atr': -0.4379142373692946}},
 {'major': {'syll': -0.35166483058090936,
   'cons': -0.7777841887018868,
   'son': -0.5626011600341512,
   'contin': -0.9130617691612466,
   'del_rel': 0.9004600723330087,
   'lat': 0.6423148001720105,
   'nas': 0.6062100085482633,
   'stri': -0.6143283030237219},
  'laryngeal': {'voice': -0.4947563429291941,
   'spr_gl': 0.07

In [None]:
# convert a word (a list of sounds) to an np.ndarray
w_array = SArray_word(ws0)
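
One plausible layout for a converted word is a matrix with one row per sound. The sketch below illustrates that idea with plain lists; it is an assumption, not a description of what `SArray_word` actually returns, and both `flatten_sound` and `word_to_rows` are hypothetical helpers.

```python
def flatten_sound(sound: dict) -> list:
    """Concatenate the feature values of each class, in insertion order."""
    return [v for cls in sound.values() for v in cls.values()]


def word_to_rows(word: list) -> list:
    """Flatten every sound in the word, giving one row per sound."""
    return [flatten_sound(sound) for sound in word]


# a toy two-sound word with one feature per class, for illustration only
toy_word = [
    {'major': {'syll': 0}, 'laryngeal': {'voice': 0}, 'place': {'high': -1}},
    {'major': {'syll': 1}, 'laryngeal': {'voice': 1}, 'place': {'high': 1}},
]
toy_rows = word_to_rows(toy_word)
```

Passing `toy_rows` to `np.asarray` would give a 2-D array of shape (number of sounds, number of features).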