For this notebook to run correctly, you need to have completed the first steps of the scone-phobia tutorial (up to and including "Prepare minimal-pair scores"). In particular, you need a local folder containing minimal-pair scores for the `AMnnet1_tri2_smbr_LMmonomodel__BUCtrain__WSJtest__KLdis.txt` and `AMnnet1_tri2_smbr_LMmonomodel__CSJtrain__WSJtest__KLdis.txt` ABXpy results files from https://osf.io/jpd74/.

`AMnnet1_tri2_smbr_LMmonomodel__BUCtrain__WSJtest__KLdis.txt` contains discrimination scores for an Automatic Speech Recognition (ASR) system trained on the Buckeye corpus of American English and tested on the Wall Street Journal corpus of American English. `AMnnet1_tri2_smbr_LMmonomodel__CSJtrain__WSJtest__KLdis.txt` contains scores for the same ASR system tested on the same American English corpus, but trained on the Corpus of Spontaneous Japanese.

In this example notebook, we look at the discrimination of American English /r/ and /l/ by American English-trained vs. Japanese-trained models. If our models are [anything like humans](https://en.wikipedia.org/wiki/Perception_of_English_/r/_and_/l/_by_Japanese_speakers#Perception), Japanese-trained models should have a much harder time making this distinction than American English-trained ones.

To test this, we apply the [RL_AmEnglish](https://github.com/Thomas-Schatz/scone-phobia/blob/master/scone_phobia/analyses/RL_AmEnglish.py) analysis. For each model, it computes the discriminability of American English /r/ and /l/, along with two controls: the discriminability of American English /w/ and /y/, which Japanese listeners are not expected to have trouble with, and the average discriminability of all American English consonant contrasts. Results are plotted as a seaborn catplot.

In [2]:
%matplotlib inline
from scone_phobia import apply_analysis
from scone_phobia.analyses.RL_AmEnglish import RL_AmEnglish as AE_RL
import scone_phobia.metadata.add_metadata as add_metadata
import seaborn

# Local folder where minimal-pair scores have been computed.
# Change as appropriate.
mp_folder = '../../mpscores'

# select relevant models among all those potentially in mp_folder
mAE = 'AMnnet1_tri2_smbr_LMmonomodel__BUCtrain__WSJtest__KLdis'
mJ = 'AMnnet1_tri2_smbr_LMmonomodel__CSJtrain__WSJtest__KLdis'
filt = lambda mp_fname: mAE in mp_fname or mJ in mp_fname

# we launch analysis without resampling in this example.
df_rl = apply_analysis(AE_RL,
                       mp_folder,
                       filt=filt,
                       add_metadata=add_metadata.language_register,
                       resampling=False)

# display the results DataFrame
df_rl

| | contrast | dissimilarity | error | model type | test language | test register | test set | training language | training register | training set |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | L-R | KL | 24.020161 | AMnnet1_tri2_smbr_LMmono | American English | Read | WSJ | Japanese | Spontaneous | CSJ |
| 1 | W-Y | KL | 5.773129 | AMnnet1_tri2_smbr_LMmono | American English | Read | WSJ | Japanese | Spontaneous | CSJ |
| 2 | L-R | KL | 0.953476 | AMnnet1_tri2_smbr_LMmono | American English | Read | WSJ | American English | Spontaneous | BUC |
| 3 | W-Y | KL | 0.803419 | AMnnet1_tri2_smbr_LMmono | American English | Read | WSJ | American English | Spontaneous | BUC |
| 4 | all_C | KL | 2.472895 | AMnnet1_tri2_smbr_LMmono | American English | Read | WSJ | American English | Spontaneous | BUC |
| 5 | all_C | KL | 8.834495 | AMnnet1_tri2_smbr_LMmono | American English | Read | WSJ | Japanese | Spontaneous | CSJ |
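
One way to read the table is to compare, for each contrast, the error of the Japanese-trained model with that of the American English-trained model. The sketch below is not part of scone-phobia. It copies the `error` values shown above into a plain dictionary (the names `errors` and `ratios` are made up for this illustration) and computes the Japanese-to-American-English error ratio per contrast.

```python
# Error values copied by hand from the results table above,
# keyed by (contrast, training language). Illustrative only.
errors = {
    ("L-R", "Japanese"): 24.020161,
    ("W-Y", "Japanese"): 5.773129,
    ("all_C", "Japanese"): 8.834495,
    ("L-R", "American English"): 0.953476,
    ("W-Y", "American English"): 0.803419,
    ("all_C", "American English"): 2.472895,
}

# Ratio of Japanese-trained error to American English-trained error.
# A larger ratio means the contrast is comparatively harder for the
# Japanese-trained model.
ratios = {}
for contrast in ("L-R", "W-Y", "all_C"):
    japanese = errors[(contrast, "Japanese")]
    american = errors[(contrast, "American English")]
    ratios[contrast] = japanese / american

for contrast, ratio in ratios.items():
    print(f"{contrast}: {ratio:.2f}")
```

With these numbers, the ratio is largest for L-R, which is the pattern the human data would lead us to expect. Working from the DataFrame itself, a similar comparison could probably be obtained with `df_rl.pivot(index="contrast", columns="training language", values="error")`, assuming each (contrast, training language) pair appears exactly once, as it does here.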


In [None]:
# plot the results
g = seaborn.catplot(x="training language",
                    y="error",
                    hue="contrast",
                    data=df_rl,
                    kind="bar",
                    order=['American English', 'Japanese'],
                    hue_order=['L-R', 'W-Y', 'all_C'],
                    legend=True)