
A Computational Approach to Foreign Accent Classification

Emily Ahn

Wellesley College Undergraduate Senior Thesis // May 2016

The full write-up (PDF) can be found in the writeup/ folder or in the Wellesley College repositories.

1. Folders in this repository:


  • Contains forced alignments and transcriptions for the text-dependent classifier
  • The 7 subfolders correspond to each of the 7 transcribed accents
  • Simple transcriptions of speech files are organized by accent. Each is a .csv file compiled from transcription tasks released on Amazon Mechanical Turk and then cleaned up by hand by the author. Note: some transcriptions still contain errors.

trans-results/ and untrans-results/

  • Console print logs of results from the text-dependent "trans" (transcribed) classifier and the text-independent "untrans" (untranscribed) classifier


  • Contains lists of filenames that were split into train and test data, via a randomized 75:25 split
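A randomized 75:25 split of this kind can be sketched as follows. This is a minimal illustration, not the repository's own script: the function name `split_train_test`, the fixed seed, and the example filenames are all assumptions made for the example.

```python
import random

def split_train_test(filenames, train_fraction=0.75, seed=0):
    """Shuffle a copy of the filenames and split it into (train, test)."""
    shuffled = list(filenames)
    # A seeded generator makes the split reproducible (the seed is an assumption).
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical filenames, for illustration only.
files = [f"speaker{i:02d}.wav" for i in range(20)]
train, test = split_train_test(files)
print(len(train), len(test))  # 15 5
```

Writing each list to its own text file would give filename lists like the ones stored in this folder.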


  • Script and data (in csv format) to test GMM classification based on 3 vowel formants for AR, CZ, and IN accents

2. Main scripts

  • Text-independent (untranscribed) Classifier
    • || full script; loads data, trains models, classifies test data
    • || modularizes training only, stores models in directory
    • || modularizes testing only
  • Text-dependent (transcribed) Classifier
    • || prepares data by converting forced alignments of speech into PLP features (sorted by accent and phoneme)
    • || full script; GMM classification of transcribed phonemes
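A common way to do GMM classification is to fit one mixture model per accent on the training data and then assign a test sample to the accent whose model gives it the highest log-likelihood. The sketch below illustrates that idea under loud assumptions: it is univariate and pure Python for brevity (the real features are multidimensional PLP vectors), the function names are hypothetical, and the numbers are made-up toy values, not data from this project.

```python
import math

def gauss(x, mean, var):
    """Density of a univariate Gaussian."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm_1d(data, k=2, iters=50):
    """Fit a k-component univariate GMM with EM; returns (weights, means, variances)."""
    data = sorted(data)
    n = len(data)
    # Initialise means at evenly spaced quantiles, with a shared variance.
    means = [data[int((i + 0.5) * n / k)] for i in range(k)]
    mu = sum(data) / n
    var0 = max(sum((x - mu) ** 2 for x in data) / n, 1e-6)
    variances = [var0] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            p = [w * gauss(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means, and variances (floored for stability).
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            variances[j] = max(
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj, 1e-6
            )
    return weights, means, variances

def log_likelihood(model, sample):
    """Total log-likelihood of a sample under a fitted GMM (floored to avoid log(0))."""
    weights, means, variances = model
    return sum(
        math.log(max(sum(w * gauss(x, m, v) for w, m, v in zip(weights, means, variances)), 1e-300))
        for x in sample
    )

def classify(models, sample):
    """Return the accent label whose model scores the sample highest."""
    return max(models, key=lambda accent: log_likelihood(models[accent], sample))

# Toy training values per accent (hypothetical, for illustration only).
training = {
    "AR": [0.0, 0.2, -0.1, 0.1, 2.0, 2.1, 1.9, 2.2],
    "CZ": [10.0, 10.2, 9.9, 10.1, 12.0, 12.1, 11.9, 12.2],
}
models = {accent: fit_gmm_1d(values) for accent, values in training.items()}
print(classify(models, [0.1, 1.9, 0.2]))  # AR
```

Keeping `fit_gmm_1d` separate from `classify` mirrors the split between the training-only and testing-only scripts listed above: the fitted models could be stored after training and loaded again for testing.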

3. Miscellaneous scripts

  • || takes the average of each dimension of the PLP vector across all time windows of a given sound file
  • || performs univariate GMM classification of the AR, HI, and MA accents
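The per-dimension averaging described in the first item above reduces a sound file's sequence of PLP frames to a single vector. A minimal sketch, assuming each frame is a list of floats of equal length; the function name `mean_plp_vector` and the example values are hypothetical:

```python
def mean_plp_vector(frames):
    """Average each PLP dimension over all time windows (frames) of one file."""
    if not frames:
        raise ValueError("need at least one frame")
    n_dims = len(frames[0])
    if any(len(frame) != n_dims for frame in frames):
        raise ValueError("all frames must have the same number of dimensions")
    return [sum(frame[d] for frame in frames) / len(frames) for d in range(n_dims)]

# Two made-up 3-dimensional frames, for illustration only.
frames = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
print(mean_plp_vector(frames))  # [2.0, 3.0, 4.0]
```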