xttsv2 model sometimes (almost 10% of the time) produces extra noise. [Bug] #3598

Closed
seetimee opened this issue Feb 21, 2024 · 9 comments
Labels
bug (Something isn't working), wontfix (This will not be worked on but feel free to help.)

Comments

@seetimee

Describe the bug

For example, if it generates a 30-second audio clip, the first 15 seconds are normal audio and the last 15 seconds are noise.

To Reproduce

# Assumption: `tts` is a TTS.api.TTS instance with the XTTS v2 model loaded.
tts.tts_to_file(
    text=text,
    language=lang_code,
    speaker_wav=speaker_wav,
    speed=speed,
    file_path=temp_file,
    split_sentences=split_sentences,
)

Expected behavior

No response

Logs

No response

Environment

{
    "CUDA": {
        "GPU": [
            "NVIDIA GeForce RTX 3080"
        ],
        "available": true,
        "version": "11.8"
    },
    "Packages": {
        "PyTorch_debug": false,
        "PyTorch_version": "2.1.2+cu118",
        "TTS": "0.22.0",
        "numpy": "1.23.0"
    },
    "System": {
        "OS": "Linux",
        "architecture": [
            "64bit",
            "ELF"
        ],
        "processor": "x86_64",
        "python": "3.10.8",
        "version": "#148-Ubuntu SMP Mon Oct 17 16:02:06 UTC 2022"
    }
}

Additional context

No response

@seetimee added the bug label on Feb 21, 2024
@TangHaitao1994

The default length of the training data clips is about 15 seconds.
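
If the roughly 15-second training length is what limits clean output, one possible workaround is to keep each synthesized piece of text short. This is an assumption, not something confirmed in this thread. The sketch below groups sentences into chunks under a character budget. The helper name `chunk_text` and the 200-character default are hypothetical and would need tuning per language and voice.

```python
import re


def chunk_text(text: str, max_chars: int = 200) -> list[str]:
    """Group sentences into chunks of at most max_chars characters.

    Hypothetical helper. A single sentence longer than max_chars is kept
    whole rather than cut mid-sentence.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the budget: start a new chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be passed to `tts_to_file` on its own and the resulting audio files joined. The `split_sentences` argument in the call from the report appears to serve a similar purpose inside the library.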

@seetimee
Author

Is my training reference audio too long? (Would less than 5 minutes be better?)

@kaveenkumar

This occurs more often in French (FR) than in other languages.

If you synthesize 10 samples of a given "text" for a given "voice", 2-3 of those 10 samples contain the "text" plus "some other blabbering".
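
One way to catch such samples automatically, assuming the extra speech makes a clip noticeably longer than its clean siblings, is to compare each clip's duration against the median of the batch. This is only a sketch under that assumption. The function name `flag_long_outliers` and the `1.3` tolerance are hypothetical, and the durations in the usage lines are made up for illustration.

```python
from statistics import median


def flag_long_outliers(durations: list[float], tolerance: float = 1.3) -> list[int]:
    """Return indices of clips longer than tolerance * median duration.

    Hypothetical helper. It relies on most clips in the batch being clean,
    so that the median reflects a normal duration.
    """
    if not durations:
        return []
    typical = median(durations)
    return [i for i, d in enumerate(durations) if d > tolerance * typical]


# Made-up durations (in seconds) for 10 samples of the same text and voice.
example = [5.0, 5.2, 4.9, 9.8, 5.1, 5.0, 12.3, 5.3, 4.8, 5.1]
suspect = flag_long_outliers(example)  # indices of clips worth re-synthesizing
```

Flagged clips could simply be synthesized again. A duration check will not catch noise that replaces speech without lengthening the clip.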

@Sharrnah

Sharrnah commented Mar 4, 2024

Same with German.
English seems fine, so my guess is that the training for some languages caused these issues.

For German, I think I could sometimes hear phrases like "Punkt. Neues Kapitel" ("Period. New chapter") or similar.
So maybe the training data included spoken remarks that the readers added, and the AI "learned" them for these languages.

(Just guessing here.)

stale bot commented Apr 22, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.

stale bot added the wontfix label on Apr 22, 2024
@seetimee
Author

Still watching this issue.

stale bot removed the wontfix label on Apr 24, 2024
@Sharrnah

I think I found something that confirms my guess.
In Italian, if you add a . (dot) at the end of a sentence, the model very often speaks that dot as "punkto".

In my opinion, it should not pronounce punctuation marks.
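
A possible workaround, untested in this thread, is to strip the final period from the text before it is passed to the model. The sketch below does only that. The helper name `strip_trailing_period` is hypothetical, and removing the period may change the intonation at the end of the sentence.

```python
import re


def strip_trailing_period(text: str) -> str:
    """Remove periods and whitespace at the very end of the text.

    Hypothetical helper. Periods inside the text and other closing
    punctuation such as '?' or '!' are left untouched.
    """
    return re.sub(r"[.\s]+$", "", text)


cleaned = strip_trailing_period("Ciao, come stai. ")
```

Whether this actually stops the spoken "punkto" depends on the model and is not verified here.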

@CRochaVox

I also have a problem with noise after punctuation, and in long sentences the model commonly says 'ponto' for the period in Portuguese.

Does anyone know if it is possible to reduce this by fine-tuning the model?

stale bot commented Jul 7, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.

stale bot added the wontfix label on Jul 7, 2024
stale bot closed this as completed on Jul 18, 2024