ldamatch example: matching regular and irregular verbs on standard covariates

cslu-nlp/ldamatch_verb_example

Verb matching example

match.R is an example of attempting to match regular and irregular English verbs on frequency and several related attributes. The data set contains more than ten times as many regular verbs as irregular ones, and the irregulars tend to have significantly higher frequencies, so finding a well-matched subset is somewhat challenging.

Results

  1. Heuristic 1: doesn't work (convergence failure).
  2. Heuristic 2: doesn't work (stack limit error).
  3. Heuristic 3: produces good results (10/1203).
  4. Heuristic 4: produces good results.
  5. Exhaustive (bigbird14): not feasible.

Word pronunciations

The verbs.tsv file contains all the verbs from elp_words_merged.csv and english_irregulars.csv, sorted alphabetically. The verb_pronunciations_1.tsv file contains the same verbs together with their pronunciations (separated by a tab character), generated with https://tophonetics.com/. Missing pronunciations were taken from https://www.lexico.com/en/definition/abolish and https://www.macmillandictionary.com/dictionary/american/ and manually edited to conform to the other entries. We updated the n.syll variable by counting the phones that are vowels, since each syllable contains exactly one vowel phone.
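The vowel-counting step can be illustrated with a minimal Python sketch. It is not the code used to build the files in this repository; it assumes that each line holds a verb, a tab, and an IPA transcription in American English style, and that a diphthong counts as a single vowel phone. The vowel inventory, the function names and the sample lines are all hypothetical.

```python
import re

# Assumed vowel inventory (hypothetical, American English IPA).
# Diphthongs are listed first so that they match as one phone.
DIPHTHONGS = ["aɪ", "aʊ", "eɪ", "oʊ", "ɔɪ"]
MONOPHTHONGS = "aeiouɑæɒɔəɛɜɝɚɪʊʌ"

# Regex alternation tries the diphthongs before any single vowel symbol.
VOWEL_PHONE = re.compile("|".join(DIPHTHONGS) + "|[" + MONOPHTHONGS + "]")


def count_syllables(ipa):
    """Return the number of vowel phones in an IPA transcription."""
    return len(VOWEL_PHONE.findall(ipa))


def syllables_by_verb(lines):
    """Map each verb to its syllable count.

    Each line is expected to look like 'verb<TAB>pronunciation'.
    Blank lines are skipped.
    """
    counts = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        verb, ipa = line.split("\t", 1)
        counts[verb] = count_syllables(ipa)
    return counts


# Illustrative input only, not taken from verb_pronunciations_1.tsv.
sample = ["abolish\təˈbɑlɪʃ", "create\tkriˈeɪt", "run\trʌn"]
result = syllables_by_verb(sample)
```

Stress marks and consonants are ignored because they never match the pattern. Syllabic consonants written without a vowel symbol would be undercounted under these assumptions, so transcriptions of that kind would need separate handling.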
