Processing scripts and datasets for a project on medial vowels in Kaytetye

fauxneticien/kaytetye-medial-vowels

Kaytetye medial vowels: dataset and processing scripts

Nay San, Michael Proctor, Mark Harvey, Myfany Turpin, Alison Ross

This repository contains the datasets and the processing and analysis scripts for a study of medial vowels in Kaytetye, an Australian Aboriginal language.

Directory structure

The repository follows a 'Cookiecutter Data Science' directory structure.

├── data
│   ├── external                             <- Metadata to be joined onto primary data
│   │   ├── consonants.csv                   <- Categorized IPA consonant labels
│   │   ├── headwords.csv                    <- Identities of Kaytetye words analyzed
│   │   ├── raw-data_hashes.csv              <- Commit hashes of raw data sub-repositories at time of analysis
│   │   └── vowels.csv                       <- Categorized IPA vowel labels
│   ├── processed                            <- Intermediate data that has been transformed
│   │   ├── micl_results.csv                 <- Results of Gaussian Mixture Model fits
│   │   └── vowels_med_analysis.csv          <- Primary data set of project
│   └── raw                                  <- Empty. Please get in contact for access to raw audio/TextGrid data, hosted in private Git repos.
├── reports
│   ├── figures                              <- PDF figures used in manuscript
│   └── manuscript
│       ├── method.nb.html                   <- Method section, rendered to HTML
│       ├── method.Rmd                       <- RMarkdown code used to generate Method section
│       ├── results+discussion.nb.html       <- Results and Discussion sections, rendered to HTML
│       └── results+discussion.Rmd           <- RMarkdown code used to generate Results and Discussion sections
├── src                                      <- Source code for helper scripts
│   ├── visualization                        <- Helper scripts for figures
│   ├── data                                 <- Helper scripts for data analysis
│   │   ├── git                              <- Fetches/updates sub-repos in data/raw if you have access to them
│   │   ├── make-micl_results.R              <- Generates data/processed/micl_results.csv
│   │   ├── make-vowels_med_analysis.nb.html <- Outline of how data/processed/vowels_med_analysis.csv was generated
│   │   ├── make-vowels_med_analysis.Rmd     <- Generates data/processed/vowels_med_analysis.csv from TextGrid data in data/raw
│   │   └── mclust_helpers.R                 <- Helper functions for make-micl_results.R
│   └── setup.R                              <- Installs necessary R packages
├── .gitignore                               <- Files to be left untracked
├── .gitmodules                              <- Git URLs of linked private sub-repos in data/raw (use src/data/git/pull_all_submodules.sh to pull)
├── kayt_vowels.Rproj                        <- R Project file
├── LICENSE                                  <- MIT license
└── README.md                                <- This README file
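
Because the raw data lives in private sub-repositories listed in .gitmodules, it can help to check whether any of them are still uninitialized before running scripts that read from data/raw. The sketch below relies only on the standard behaviour of `git submodule status`, which prefixes uninitialized submodules with `-`. The function name `count_uninitialized` is illustrative and not part of this repository, and the contents of src/data/git/pull_all_submodules.sh are not assumed here.

```shell
#!/bin/sh
# Sketch: count uninitialized submodules from `git submodule status` output.
# `count_uninitialized` is a hypothetical helper, not a script in this repo.
count_uninitialized() {
    # Reads status lines on stdin; lines starting with '-' mark
    # submodules that have not been initialized yet.
    grep -c '^-'
}

# Typical use, run from the repository root (left commented out here so the
# sketch has no side effects):
#   git submodule status | count_uninitialized
#
# With access to the private repos, standard Git can then fetch them:
#   git submodule update --init --recursive
```

A count of 0 suggests every linked sub-repository has been checked out. Any other number means the scripts that need data/raw will not find their inputs yet.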

Recommended setup

To reproduce any part of the analyses, we recommend using the rocker/verse Docker image and then running src/setup.R to install the remaining required packages.

  1. Install and launch Docker

  2. Clone or download this repository, e.g. to ~/Downloads/kaytetye-medial-vowels (note this file path, as it is used in the next step)

  3. Open up a Terminal instance and run:

    docker run -it --rm \
       -p 8787:8787 \
       -e PASSWORD=kayv \
       -v ~/Downloads/kaytetye-medial-vowels:/home/rstudio rocker/verse
    
  4. Go to localhost:8787 in your browser and log in with username rstudio and password kayv

  5. In the RStudio console, run source("src/setup.R"). All analyses except those requiring access to data/raw are reproducible in this environment.
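
If the repository was cloned somewhere other than ~/Downloads/kaytetye-medial-vowels, the mount path in step 3 has to change accordingly. The following sketch prints the step 3 command for an arbitrary clone location and refuses to do so if the directory does not contain src/setup.R. The function name `docker_cmd` and its arguments are illustrative assumptions, not part of this repository. It only prints the command and does not start Docker.

```shell
#!/bin/sh
# Sketch: print the `docker run` command from step 3 for a given clone path.
# `docker_cmd` is a hypothetical helper, not a script shipped with this repo.
docker_cmd() {
    repo_dir="$1"            # path to the local clone of this repository
    password="${2:-kayv}"    # RStudio password; defaults to the one in step 3

    # Basic sanity check: src/setup.R should exist in a valid clone.
    if [ ! -f "$repo_dir/src/setup.R" ]; then
        echo "error: $repo_dir does not look like a clone (src/setup.R missing)" >&2
        return 1
    fi

    # Print the command rather than running it, so it can be inspected first.
    printf 'docker run -it --rm -p 8787:8787 -e PASSWORD=%s -v "%s:/home/rstudio" rocker/verse\n' \
        "$password" "$repo_dir"
}

# Example (commented out): docker_cmd "$HOME/Downloads/kaytetye-medial-vowels"
```

Printing the command instead of executing it keeps the sketch free of side effects. The printed line can be pasted into a terminal once the path looks right.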