[Bug] Portuguese TTS model on XTTS is pronouncing the "." (dot) character when it happens in a text #2952

Closed
Subarasheese opened this issue Sep 15, 2023 · 19 comments
Assignees
Labels
bug Something isn't working

Comments

@Subarasheese

Subarasheese commented Sep 15, 2023

Describe the bug

Hello,

It seems a bit of an "oopsie" was made when handling the Portuguese dataset: the PT-BR model now pronounces the "." character as "ponto" every time we enter sentences like:

"Olá, sou seu novo clone de voz. Faça o possível para carregar um áudio de qualidade."

Here is the output: https://vocaroo.com/1404xnr0Vkmc

It was not supposed to say "ponto"...

It goes like:

"Olá, sou seu novo clone de voz ponto Faça o possível para carregar um áudio de qualidade ponto"

But it should not be like that.

To Reproduce

Set the client to Portuguese (pt), then type anything that includes a "." (dot).

Expected behavior

The dot should not be pronounced. The purpose of "." is to indicate the end of a declarative sentence or to separate certain elements in written text.

Logs

None

Environment

git clone https://huggingface.co/spaces/coqui/xtts
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python app.py

Additional context

No response

@Subarasheese Subarasheese added the bug Something isn't working label Sep 15, 2023
@Subarasheese
Author

@Edresson

@Edresson
Contributor

Hi @Subarasheese, thanks for reporting this bug. We plan to fix this issue soon. As a workaround, I noticed that adding a space between the word and the period fixes the issue.
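
A minimal sketch of that workaround as a preprocessing step, applied to the text before it is passed to the model. The helper name `space_before_period` and the regular expression are illustrative assumptions, not part of Coqui TTS; the pattern skips decimals and ellipses, which may or may not be what you want.

```python
import re

# Match a period that directly follows a non-space, non-period character and
# ends a sentence (followed by whitespace or end of text). Decimals such as
# "3.14" and ellipses such as "..." are left untouched.
_SENTENCE_PERIOD = re.compile(r"(?<=[^\s.])\.(?=\s|$)")


def space_before_period(text: str) -> str:
    """Insert a space before each sentence-final period (hypothetical helper)."""
    return _SENTENCE_PERIOD.sub(" .", text)


if __name__ == "__main__":
    print(space_before_period("Olá, sou seu novo clone de voz. Faça o possível."))
```

The result can then be used as the `text` argument in place of the original string.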

@Subarasheese
Author

Subarasheese commented Sep 15, 2023

Hi @Subarasheese, thanks for reporting this bug. We plan to fix this issue soon. As a workaround, I noticed that adding a space between the word and the period fixes the issue.

Thank you.
I have a question, out of curiosity: can the dataset used to train the Portuguese model be found online, or did Coqui use a private/internal dataset for Portuguese?

@stale

stale bot commented Oct 17, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.

@stale stale bot added the wontfix This will not be worked on but feel free to help. label Oct 17, 2023
@Edresson Edresson self-assigned this Oct 20, 2023
@stale stale bot removed the wontfix This will not be worked on but feel free to help. label Oct 20, 2023
@Inc44

Inc44 commented Oct 28, 2023

A similar error exists in other languages, such as French, Russian and Japanese.
The problem appears in model xtts_v1.1, coqui 0.19.0, python 3.11.5.

@Subarasheese
Author

Subarasheese commented Nov 7, 2023

@Edresson The workaround (space before the dot) is not working on XTTS v2. It still says "dot" (ponto).
Previously the workaround worked every time, if I recall correctly.

@erogol
Member

erogol commented Nov 8, 2023

We don't actually know why it happens. If anyone has any ideas, let us know

@Dhrog

Dhrog commented Nov 9, 2023

I experienced the same problem with xtts-v2 using German.

@Subarasheese
Author

We don't actually know why it happens. If anyone has any ideas, let us know

Are you guys sure there isn't an issue with the dataset? What were your sources?

@brambox

brambox commented Nov 9, 2023

I'm also getting 'ponto' when fine-tuning.

@Dhrog

Dhrog commented Nov 9, 2023

I used the example code and read the text from a file. I installed Coqui TTS yesterday, so it is still overwhelming right now.
The sound file is attached. At one point you can hear: "Punkt dot"
It quite often happens that there are long gaps between sentences. Not sure if there is a connection to this issue?


# -*- coding: utf-8 -*-
import sys
from pathlib import Path

import torch
from TTS.api import TTS

# Usage: python script.py <input_text_file> <output_wav_file>
# Read the input as raw bytes, interpret backslash escapes (e.g. "\n"), then
# round-trip through latin-1 so UTF-8 characters such as umlauts survive.
raw = Path(sys.argv[1]).read_bytes()
text = raw.decode("unicode_escape").encode("latin-1").decode("utf-8")
print(text)

file_output = sys.argv[2]
speaker_wav = "Data/RefClips/4.wav"

# Get device
device = "cuda" if torch.cuda.is_available() else "cpu"

# List available 🐸TTS models
# print(TTS().list_models())

# Init TTS
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2").to(device)

# Run TTS
# ❗ Since this is a multilingual voice-cloning model, we must set the target speaker_wav and language
# Text to speech, returning a list of amplitude values
wav = tts.tts(text=text, speaker_wav=speaker_wav, language="de")
# Text to speech, written to a file
tts.tts_to_file(text=text, speaker_wav=speaker_wav, language="de", file_path=file_output)

umlaut.zip

@Inc44

Inc44 commented Nov 9, 2023

As a temporary fix, you can replace periods "." with exclamation marks "!".

@Edresson
Contributor

As a temporary fix, you can replace periods "." with exclamation marks "!".

In general, using ".." instead of "." also works for Portuguese.

@erogol erogol closed this as completed Nov 23, 2023
@wonka929

wonka929 commented Dec 9, 2023

Italian has the same issue.
Apart from the workarounds, did you find a stable fix?

The ".." method does not work, and neither does "!".

Thanks

PS: For Italian, replacing "." with "\n" works.

@fcrescio

This bug is still present, at least for Italian. Another workaround is to replace "." with ";".

@Fgabz

Fgabz commented Mar 15, 2024

We have the same issue in French.

@danielmzak

In Czech (xtts_v2 model) try replacing "." with ";\n" - this will make the ends of sentences sound more natural.
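
The per-language replacements reported so far in this thread could be collected in one lookup table, roughly as sketched below. The table contents are only the anecdotal reports from the comments above (none verified here, and some commenters found that a replacement that helped one language did not help another); the names `REPORTED_REPLACEMENTS` and `replace_sentence_periods` are hypothetical.

```python
import re

# Replacements reported by commenters in this thread. These are anecdotal,
# not an official fix, and may behave differently across model versions.
REPORTED_REPLACEMENTS = {
    "pt": "..",   # reported for Portuguese
    "it": ";",    # reported for Italian ("\n" was also reported to work)
    "cs": ";\n",  # reported for Czech with xtts_v2
}

# Sentence-final period: not preceded by whitespace or another period,
# and followed by whitespace or end of text.
_SENTENCE_PERIOD = re.compile(r"(?<=[^\s.])\.(?=\s|$)")


def replace_sentence_periods(text: str, language: str) -> str:
    """Swap sentence-final periods for the replacement reported for `language`.

    Text is returned unchanged when no replacement has been reported.
    """
    replacement = REPORTED_REPLACEMENTS.get(language)
    if replacement is None:
        return text
    # A function replacement avoids any escape processing of the string.
    return _SENTENCE_PERIOD.sub(lambda _match: replacement, text)
```

For example, `replace_sentence_periods("Ciao. Come stai.", "it")` produces `"Ciao; Come stai;"`.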

@lincoln157nascimento

Does anyone have a solution to the problem?

@abhisirka2001

abhisirka2001 commented Aug 6, 2024

Solution: Replacing the full stops (.) in the text with "|" works for Portuguese, and it also adds a pause after each sentence. Using a space instead of a full stop doesn't add a pause.
However, using "|" in place of full stops won't work for longer texts, so use shorter prompts of fewer than 400 tokens with "|".
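
One way to apply this to longer inputs is to split the text into short chunks and synthesize each chunk separately. The sketch below is only an illustration: `chunk_with_pipes` is a hypothetical helper, and it approximates "tokens" by whitespace-separated words, which is an assumption and not how the XTTS tokenizer counts. Choose a limit well below 400 if you rely on this approximation.

```python
import re


def chunk_with_pipes(text: str, max_tokens: int = 400) -> list[str]:
    """Split text into chunks whose sentences end in "|" instead of ".".

    Token counts are approximated by whitespace-separated words (assumption).
    A single sentence longer than max_tokens still becomes its own chunk.
    """
    # Split on sentence-final periods and drop empty pieces.
    sentences = [s.strip() for s in re.split(r"\.(?:\s+|$)", text) if s.strip()]

    chunks: list[str] = []
    current: list[str] = []
    current_len = 0
    for sentence in sentences:
        n = len(sentence.split())  # crude stand-in for the model's tokenizer
        # Start a new chunk when adding this sentence would exceed the limit.
        if current and current_len + n > max_tokens:
            chunks.append("| ".join(current) + "|")
            current, current_len = [], 0
        current.append(sentence)
        current_len += n
    if current:
        chunks.append("| ".join(current) + "|")
    return chunks
```

Each returned chunk can then be passed as the `text` argument of a separate synthesis call, and the resulting audio segments concatenated afterwards.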
