Add icelandic g2p #4384
Thank you for your contribution!
Could you fix the CI and add a unit test? Just extend this part:
espnet/test/espnet2/text/test_phoneme_tokenizer.py
Lines 65 to 68 in 047d0c4
def test_text2tokens(phoneme_tokenizer: PhonemeTokenizer):
    if phoneme_tokenizer.g2p_type is None:
        input = "HH AH0 L OW1 W ER1 L D"
        output = ["HH", "AH0", "L", "OW1", " ", "W", "ER1", "L", "D"]
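One way such an extension could look is sketched below. It is a rough illustration only: `StubTokenizer` stands in for `PhonemeTokenizer` so the snippet runs without ESPnet, the g2p type name `"g2p_is"` is a hypothetical placeholder, and the tokens `<ph1>`/`<ph2>` are dummies rather than real output of the Icelandic g2p. In the actual test, the new case would be one more branch next to the existing `g2p_type is None` case, with the expected phonemes taken from a real run.

```python
from typing import Callable, Dict, List, Optional, Tuple


class StubTokenizer:
    """Stand-in for PhonemeTokenizer so this sketch runs without ESPnet."""

    def __init__(self, g2p_type: Optional[str], g2p: Callable[[str], List[str]]):
        self.g2p_type = g2p_type
        self._g2p = g2p

    def text2tokens(self, line: str) -> List[str]:
        return self._g2p(line)


# One (input, expected tokens) pair per g2p_type.
# "g2p_is" and its placeholder tokens are assumptions, not real g2p output.
CASES: Dict[Optional[str], Tuple[str, List[str]]] = {
    "g2p_is": ("halló", ["<ph1>", "<ph2>"]),
}


def check_text2tokens(tokenizer: StubTokenizer) -> bool:
    """Return True if the tokenizer reproduces the expected tokens for its g2p_type."""
    if tokenizer.g2p_type not in CASES:
        raise NotImplementedError(f"no test case for {tokenizer.g2p_type!r}")
    text, expected = CASES[tokenizer.g2p_type]
    return tokenizer.text2tokens(text) == expected


# Fake backend returning the placeholder tokens, standing in for the real g2p.
icelandic = StubTokenizer("g2p_is", lambda text: ["<ph1>", "<ph2>"])
print(check_text2tokens(icelandic))
```

Raising `NotImplementedError` for an unknown `g2p_type` means a newly added g2p without a test case fails loudly instead of being skipped silently.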
Tests are failing for some other feature, but the linter and tests pass for my code.
@simpleoier, it seems that this error comes from s3prl.
Sure. I'll take a look at it.
@G-Thor could you merge master? CI is fixed.
Codecov Report
@@            Coverage Diff             @@
##           master    #4384      +/-   ##
==========================================
- Coverage   82.58%   82.56%   -0.02%
==========================================
  Files         469      469
  Lines       40196    40209      +13
==========================================
+ Hits        33194    33197       +3
- Misses       7002     7012      +10
This pull request adds an optional Icelandic g2p, installable via tools/installers/install_ice_g2p.sh.
Information and code for the g2p system can be found here and here.
This g2p is used for phonemization in my recipe for the Talromur Icelandic TTS corpus, which is in a separate pull request.