Fetching latest commit…
Cannot retrieve the latest commit at this time.
|Type||Name||Latest commit message||Commit time|
|Failed to load latest commit information.|
To compile the rules and generate the final FSTs, just run ./make.sh. The exported FSTs are defined in the grammar file g2p.grm. The two FST files G2P1 and G2P2 are defined for being used with the Python transcriber script. They are split in two parts to minimize size on disk, but you may just export a single FST file, if you wish. After succesful compilation, you might test the rules using either: - tester.sh: a very simple interactive loop. Just write in or paste the words/sentences you want to test. - transcribe.sh: which will transcribe a file with words or sentences given as input. Please, take the following in account: - you will be able only to test words or sentences that have previously been normalized. "Normalization" means in this case that the set of characters used in the input string must be contained in the input alphabet defined in the first FST (in_feeder in alphabets.grm). This is done already when you use scripts/tts_transcriber.py - second, the strings should contain a stress marker (the "+" char) for optimal accuracy. This information comes either from the exceptions lexicon or from the stress prediction model. Thanks: The implementation of the rules would not have been possible without the Russian language advice from my colleague at Yandex Anastasiya Polkanova. Sources: - Chew, Peter A. (2003): A Computational Phonology of Russian. Dissertation.com - Jones, Daniel & Ward, Dennis (1969): The Phonetics of Russian. Cambridge University Press