Merge of coqui voices broke stuff #126
Hi again, Error: ❗ XTTS can only generate text with a maximum of 400 tokens. ... Retrying (0 retries left)
Hmm, I might need to fully revert this then. I have not tested with really
long text, so I have not run into exceeding the token limit.
I think it's fair to keep it under this issue as the merge of coqui voices
sure did break stuff!
For what it's worth though, it is working for me with the current chunk size
with epubs; maybe I need to test with text longer than what I have sent to
it so far.
…On Sat, Dec 23, 2023 at 2:37 PM danielw97 ***@***.***> wrote:
Hi again,
If you'd rather have a separate issue for this let me know, although in my
testing just now after pulling your most recent commit I'm getting the
following error, as xtts is getting sent a bigger text chunk than it can
handle I believe.
This is using one of the coqui studio voices, btw.
Error: ❗ XTTS can only generate text with a maximum of 400 tokens. ...
Retrying (0 retries left)
This is a longer paragraph, although using a finetuned model last week
with the same book didn't have this problem.
Thanks for all of your work.
Other than that everything seems to be working. I wonder, is it possible to incorporate the same segmenting code that is used with xtts (if it isn't already)? I assume the limits are the same.
Also, at least in my mind, the same text processing should be used, as I believe this is basically xtts under the hood, unless I am incorrect.
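The segmenting idea discussed here can be sketched roughly as follows: group sentences into chunks that stay under the 400-token budget from the error message. This is a hypothetical illustration, not the project's actual code; the whitespace split is a naive stand-in for XTTS's real tokenizer, and `chunk_text` is an invented name.

```python
import re

# XTTS limit mentioned in the error message (counted here with a naive
# whitespace tokenizer, NOT the real XTTS tokenizer).
MAX_TOKENS = 400

def chunk_text(text, max_tokens=MAX_TOKENS):
    """Group sentences into chunks whose rough token count stays under max_tokens."""
    # Split on sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        # Flush the current chunk if adding this sentence would overflow.
        if current and count + n > max_tokens:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each chunk could then be sent to the model separately, so no single request exceeds the limit.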
I'm going to reopen this one until things are sorted out. The difference has to do with how I'm calling XTTS in these two cases. The way I use the xtts cloning model is with their inference streaming approach, but the docs don't indicate how you can use that with their voices.
Okay, thanks.
Haha, yes, I have asked on Discord, no answer yet though (and someone else just asked the same question today). I put a potential fix in the branch "fixes" if you want to try it out when you get a chance. As for the holidays, it's OK, this is relaxing and I always sleep better after fixing some bugs :) Thanks, and happy holidays to you too!
Thanks, I've got some time this evening and will test this now.
Thanks, I appreciate your testing and all your feedback! You should see this, indicating the right xtts version:
Yes, that's what I got in the end.
Excellent! I'll figure out what's going on with other languages hopefully tonight and merge this branch. |
Found the problem: it happens with Coqui voices when reading plain text (rather than epub) and specifying a language other than English. On line 164 I replace all periods with commas if language != en, and that seems to break something along the way (maybe it confuses the segmenter that splits everything into individual sentences). Replacing periods with commas did seem to help for non-English languages, where it would otherwise pronounce the period at the end of sentences as "dot" or some variation of that. I changed that replacement to happen just before the sentence is sent to TTS; hopefully it is still effective for other languages.
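The fix described above can be sketched as: segment the text first, then rewrite only the trailing period right before each sentence is handed to TTS, so the sentence segmenter still sees normal punctuation. This is a minimal sketch under those assumptions; `preprocess_for_tts` is an invented name, not the project's actual function.

```python
def preprocess_for_tts(sentence, language):
    """Swap a sentence-final period for a comma on non-English text,
    which (per the discussion above) stops XTTS from voicing it as "dot".
    Applied per sentence, after segmentation, just before synthesis."""
    if language != "en" and sentence.endswith("."):
        return sentence[:-1] + ","
    return sentence
```

Doing this after segmentation means the period-to-comma swap can no longer confuse the sentence splitter, which was the bug described here.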
I believe things are all fixed now, please log bugs as always :)
Due to a lack of complete testing, the merge that made studio voices work also broke a bunch of other stuff. This merge fixes that, and also includes a test script that I will use in the future to validate a few common use cases. It would be nice to add some real tests in CI, but the test runners do not have a GPU, so they would not be that useful.