
manexagirrezabal/athenarhythm


AthenaRhythm

This project addresses the task of automatically assigning primary stress to out-of-vocabulary words in English. This work forms a necessary component of a scansion system for English poetry (https://github.com/manexagirrezabal/zeuscansion). We have developed three approaches, based on (1) word similarity, (2) hand-written linguistic rules and (3) machine learning. The first and third approaches require a stress-annotated corpus to train a model for stress assignment, while the linguistic approach relies on grapheme-to-phoneme conversion, a syllabification procedure and hand-written stress assignment rules. The linguistic approach proves to be the most effective, but the machine learning approach is not far behind.
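As a rough illustration of the word-similarity idea, the sketch below assigns an unseen word the stress of its nearest neighbour in a stress-annotated lexicon. It is only a minimal sketch under stated assumptions: the similarity measure used here is plain Levenshtein edit distance, which may differ from the measure used in this project, and `TOY_LEXICON`, `edit_distance`, `closest_word` and `predict_stress` are hypothetical names introduced for the example.

```python
# Hypothetical toy lexicon: word -> 1-based index of the syllable
# that carries primary stress. A real lexicon would come from a
# stress-annotated pronunciation dictionary.
TOY_LEXICON = {
    "nation": 1,
    "station": 1,
    "relation": 2,
    "begin": 2,
    "window": 1,
}


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def closest_word(word: str, lexicon: dict) -> str:
    """Return the lexicon entry with the smallest edit distance to `word`.

    Ties are broken by lexicon order (an arbitrary choice for this sketch).
    """
    return min(lexicon, key=lambda known: edit_distance(word, known))


def predict_stress(word: str, lexicon: dict) -> int:
    """Look the word up directly, or fall back to its nearest neighbour."""
    if word in lexicon:
        return lexicon[word]
    return lexicon[closest_word(word, lexicon)]


if __name__ == "__main__":
    query = "stations"  # not in the toy lexicon
    print(query, closest_word(query, TOY_LEXICON), predict_stress(query, TOY_LEXICON))
```

Copying a stress index from a neighbour only makes sense when both words have comparable syllable structure, so a fuller version would probably need to compare syllable counts or align the words from their endings.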

Dependencies

Closest Word Finding approach

Linguistic approach

Machine learning approach

Training corpus

We used the NETtalk pronunciation dictionary to train and test our models. Each entry in the dictionary records the word, its pronunciation, its number of syllables, its possible parts of speech, and the location of primary and secondary stress. It can be downloaded from http://dingo.sbs.arizona.edu/~hammond/lsasummer11/newdic
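The fields listed above could be held in a small record type, as in the following sketch. The file's actual layout is not described here, so the tab-separated format, the `-` marker for a missing secondary stress, the sample line and the names `DictionaryEntry` and `parse_entry` are all hypothetical and would need adapting to the real file.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DictionaryEntry:
    word: str
    pronunciation: str
    syllable_count: int
    parts_of_speech: Tuple[str, ...]
    primary_stress: int              # 1-based syllable index (assumed convention)
    secondary_stress: Optional[int]  # None when the word has no secondary stress


def parse_entry(line: str) -> DictionaryEntry:
    """Parse one line of a HYPOTHETICAL tab-separated layout:

    word <TAB> pronunciation <TAB> syllables <TAB> POS[,POS...] <TAB> primary <TAB> secondary
    """
    word, pronunciation, syllables, pos, primary, secondary = line.rstrip("\n").split("\t")
    return DictionaryEntry(
        word=word,
        pronunciation=pronunciation,
        syllable_count=int(syllables),
        parts_of_speech=tuple(pos.split(",")),
        primary_stress=int(primary),
        secondary_stress=None if secondary == "-" else int(secondary),
    )


if __name__ == "__main__":
    # Made-up sample line; the pronunciation string is a placeholder,
    # not real NETtalk notation.
    print(parse_entry("banana\tb@n&n@\t3\tN\t2\t-"))
```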

References

Agirrezabal, M., Heinz, J., Hulden, M., & Arrieta, B. (2014, January). Assigning stress to out-of-vocabulary words: three approaches. In Proceedings on the International Conference on Artificial Intelligence (ICAI) (p. 1). The Steering Committee of The World Congress in Computer Science, Computer Engineering and Applied Computing (WorldComp).

