VoxClamantisProject/analysis

analysis

R scripts

voxclamantis_vowels_epitran_wikipron.R: analysis script for vowel midpoint F1 and F2

This script takes as input the Epitran and WikiPron midpoint F1 and F2 files, the per-utt-mcd scores files, and reading_info.csv (inventory). It returns the counts of tokens, families, and languages presented in the paper, the correlation tables, and the correlation scatterplot. It can optionally save text files of the midpoint F1 and F2 values after outlier exclusion, which serve as input to the Python dispersion analysis.

voxclamantis_sibilants_epitran_wikipron.R: analysis script for sibilant mid-frequency peak

This script takes as input the Epitran and WikiPron sibilant info.csv and sibilant.csv files, the per-utt-mcd scores files, and reading_info.csv (inventory). It returns the counts of tokens, families, and languages presented in the paper, the correlation between the mean mid-frequency peaks of /s/ and /z/, and the correlation scatterplot.

Python scripts

The Python scripts in this folder have a number of dependencies, detailed in the environment.yml file. To install them with conda, run:

$ conda env create -f environment.yml

To activate the environment, run:

$ conda activate wild

extract_info.py: preprocessing script for vowel dispersion analysis

This script takes as input the folder containing the epiwiki dispersion files (with file names ending in formants_mid_fin.csv) and outputs a preprocessed tsv file with per language--vowel dispersion entropies. Run it with the command:

$ python extract_info.py --src-path <src-path> --tgt-file <tgt-file>

In this command, <src-path> is the data source path and <tgt-file> is the output filename for the extracted info. For example:

$ python extract_info.py --src-path data/vowels/midpoint_f1_f2/dispersion_input_epiwiki/ --tgt-file data/vowels/midpoint_f1_f2/preprocessed.tsv
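For orientation, the argument handling and file discovery described above could look roughly like the following. This is a minimal sketch, not the contents of extract_info.py: only the two flags and the formants_mid_fin.csv suffix come from this README, while the helper names parse_args and find_input_files are hypothetical.

```python
import argparse
from pathlib import Path

# File-name ending given in this README for the dispersion input files.
SUFFIX = "formants_mid_fin.csv"


def parse_args(argv=None):
    """Parse the two documented flags (hypothetical helper, not the script's code)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--src-path", required=True,
                        help="folder containing the dispersion input files")
    parser.add_argument("--tgt-file", required=True,
                        help="filename of the preprocessed tsv output")
    return parser.parse_args(argv)


def find_input_files(src_path):
    """Return the files directly inside src_path whose names end in SUFFIX, sorted."""
    return sorted(p for p in Path(src_path).iterdir() if p.name.endswith(SUFFIX))
```

argparse exposes the flags as args.src_path and args.tgt_file, so find_input_files(args.src_path) would list the files to preprocess under these assumptions.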

analyse_dispersion.py: analysis script for dispersion entropy vs number of vowel categories correlations

This script takes as input the preprocessed file generated by extract_info.py and prints the Pearson and Spearman correlations (with their respective p-values) to the terminal. Run it with the command:

$ python analyse_dispersion.py --info-file <info-file> --formants-type <formants-type>

Here, <formants-type> is either erb or hz, and <info-file> is the extract_info.py output. For example:

$ python analyse_dispersion.py --info-file data/vowels/midpoint_f1_f2/preprocessed.tsv --formants-type erb
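To make the two reported statistics concrete, here is a small standard-library sketch of the Pearson and Spearman coefficients. It is an illustration only and may differ from how analyse_dispersion.py computes them. The variable names and values are made up, and p-values are omitted (SciPy's scipy.stats.pearsonr and scipy.stats.spearmanr return a coefficient together with a p-value).

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Made-up illustration values, not results from the paper.
n_vowel_categories = [3, 5, 5, 7, 9]
dispersion_entropy = [1.2, 1.9, 2.1, 2.0, 3.4]
print("Pearson:", pearson(n_vowel_categories, dispersion_entropy))
print("Spearman:", spearman(n_vowel_categories, dispersion_entropy))
```

Pearson measures linear association, while Spearman only depends on the rank order, so the two can diverge when the relationship is monotonic but not linear.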
