CFSD, SECS metrics for TTS #5235
Codecov Report
@@ Coverage Diff @@
## master #5235 +/- ##
=======================================
Coverage 74.43% 74.43%
=======================================
Files 642 642
Lines 57611 57611
=======================================
Hits 42885 42885
Misses 14726 14726
Very cool!
Could you update the doc by adding an example command?
https://github.com/espnet/espnet/tree/master/egs2/TEMPLATE/tts1#evaluation
@kan-bayashi
Can you add a test script (perhaps in a later PR)?
@sw005320
Thanks!
Hi, I tried to implement the following TTS objective metrics (#1665) by simply reusing the related ESPnet scripts and pretrained models.
(Example TTS paper where those metrics are used: ADAPTER-BASED EXTENSION OF MULTI-SPEAKER TEXT-TO-SPEECH MODEL FOR NEW SPEAKERS)
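For anyone unfamiliar with SECS: assuming it stands for speaker embedding cosine similarity, as in the paper above, the core computation can be sketched roughly as below. This is only an illustration, not the script added in this PR. The function names `cosine_similarity` and `mean_secs` are hypothetical, and the embeddings are assumed to have already been extracted from reference and synthesized audio by some pretrained speaker model.

```python
import math


def cosine_similarity(a, b, eps=1e-8):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # eps guards against division by zero for all-zero embeddings
    return dot / max(norm_a * norm_b, eps)


def mean_secs(ref_embs, gen_embs):
    """Average cosine similarity over paired (reference, generated) embeddings.

    Assumption: ref_embs[i] and gen_embs[i] belong to the same utterance.
    """
    if len(ref_embs) != len(gen_embs):
        raise ValueError("need one generated embedding per reference embedding")
    if not ref_embs:
        raise ValueError("no embeddings given")
    scores = [cosine_similarity(r, g) for r, g in zip(ref_embs, gen_embs)]
    return sum(scores) / len(scores)
```

Under this reading, a higher value means the synthesized speech sits closer to the reference speaker in embedding space. Which speaker model produces the embeddings is a separate choice that this sketch does not cover.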
Example results on the LJSpeech eval set:
How about this?