In [127]:
# https://stackoverflow.com/questions/30312566/python-how-to-get-values-from-a-dictionary-from-pandas-series
# https://towardsdatascience.com/apply-and-lambda-usage-in-pandas-b13a1ea037f7
import pandas as pd
import json

In [128]:
with open('features.json') as json_file:
    feature_matrix = json.load(json_file)
feature_names = {feature for segment,features in feature_matrix.items() for feature in features}
print(feature_matrix['a'])
print(feature_matrix['a']['Syllabic'])

{'Syllabic': '1', 'Sonorant': '1', 'High': '0', 'Low': '1', 'Back': '1'}
1


In [129]:
df = pd.DataFrame()
df['token1'] = ['a', 'e', 'i']
df['token2'] = ['b', 'd', 's']

Series cannot be used directly as dictionary keys because they are not hashable. This method first uses `map()` to select the dictionary that corresponds to the segment from **feature_matrix** and then uses `apply()` with a lambda function to get the feature of interest. We specify a default value to avoid a key error with features not specified for the given segment (e.g., consonantal features and vowels).

**TODO**: Create different features depending on whether the column is consonantal or vocalic, that is, 10 features for consonantal columns and only 5 for vocalic.

In [129]:
for column_label in ['token1', 'token2']:
    for feature in feature_names:
        df[column_label + '_' + feature] = df['token1'].map(feature_matrix).apply(lambda x: x.get(feature, 'missing'))
df

Unnamed: 0,token1,token2,token1_Sonorant,token1_Back,token1_Palatalized,token1_Nasal,token1_Voiced,token1_Syllabic,token1_Anterior,token1_Continuant,...,token2_Nasal,token2_Voiced,token2_Syllabic,token2_Anterior,token2_Continuant,token2_High,token2_Lateral,token2_Low,token2_Delayedrelease,token2_Coronal
0,a,b,1,1,missing,missing,missing,1,missing,missing,...,missing,missing,1,missing,missing,0,missing,1,missing,missing
1,e,d,1,0,missing,missing,missing,1,missing,missing,...,missing,missing,1,missing,missing,0,missing,0,missing,missing
2,i,s,1,0,missing,missing,missing,1,missing,missing,...,missing,missing,1,missing,missing,1,missing,0,missing,missing
