Commit

Add mention of Persephone
Persephone is designed for situations where training data is limited, perhaps to as little as an hour of transcribed speech. Such limitations are common in the documentation of low-resource languages. Such small amounts of data are enough to train a model that can aid transcription, yet this technology has not been widely adopted. The goal of Persephone is thus to make state-of-the-art phonemic transcription accessible to people involved in language documentation. It is more flexible than CMU-Sphinx in that it handles a wider range of phenomena (including linguistic tone), and it yields good results, as reported in recent work: Adams, Oliver, Trevor Cohn, Graham Neubig, Hilaria Cruz, Steven Bird & Alexis Michaud. 2018. Evaluating phonemic transcription of low-resource tonal languages for language documentation. Proceedings of LREC 2018 (Language Resources and Evaluation Conference), 3356–3365. Miyazaki. https://halshs.archives-ouvertes.fr/halshs-01709648.
alexis-michaud committed Sep 15, 2018
1 parent 5377427 commit e0bb28c
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -246,6 +246,7 @@ Looking for resources for code languages? Take a look at [the awesome lists coll
* [pathway](https://github.com/sillsdev/pathway) - Preparing language data for publication.
* [pdfdroplet](https://github.com/sillsdev/pdfdroplet) - Library and GUI for imposition of PDF pages (e.g. 2-up) http://software.sil.org/pdfdroplet/.
* [pepper](https://github.com/korpling/pepper) - Pepper is a pluggable, Java-based, open source converter framework for linguistic data.
* [Persephone](https://github.com/persephone-tools/persephone) - Persephone aims to make state-of-the-art phonemic transcription accessible to people involved in language documentation, who have a training corpus of about one to four hours of transcribed speech.
* [phonology-assistant](https://github.com/sillsdev/phonology-assistant) - Phonology Assistant is a discovery tool. Provided with a corpus of phonetic data, it automatically charts the sounds and through its searching capabilities, helps a user discover and test the rules of sound in a language.
* [pressagio](https://github.com/cidles/pressagio) - Pressagio is a library that predicts text based on n-gram models. For example, you can send a string and the library will return the most likely word completions for the last token in the string.
* [PrimerPro](https://github.com/sillsdev/PrimerPro) - The purpose of PrimerPro is to assist the literacy worker in the development of primers for a given language.
