Dataset licence #31

Open
C00reNUT opened this issue May 6, 2022 · 7 comments

C00reNUT commented May 6, 2022

Hello,
thank you for making this amazing TTS model public. It is by far the best-quality TTS model I have tried so far.

I would like to ask about the licensing of the dataset you used for training. Am I guessing correctly that you used your own selection of LibriVox recordings?

I'm asking just to be sure that I can use the outputs in a commercial setting, since all LibriVox recordings are in the public domain.

Owner

neonbjb commented May 6, 2022

The dataset consists of thousands of audiobooks and podcasts that were scraped from the web. Many are copyrighted, which is why I am not releasing the dataset.

If you know or believe that the laws in your jurisdiction treat ML models as extensions of their datasets, then you should consider Tortoise license-encumbered and should not use it for commercial purposes.

neonbjb closed this as completed May 6, 2022
Author

C00reNUT commented May 6, 2022

Thank you for the clarification.

Just one more thing: I am asking because I would like to use the train_ voices.

> This repo comes with several pre-packaged voices. Voices prepended with "train_" came from the training set and perform far better than the others. If your goal is high quality speech, I recommend you pick one of them. If you want to see what Tortoise can do for zero-shot mimicking, take a look at the others.

I just want to be sure that they are not an exact 1:1 copy of the original voice. The model's generalization might be fine according to the law, but I wouldn't be so sure about an exact voice match.

Owner

neonbjb commented May 6, 2022

This is a good point. You should not use any of the pre-packaged voices for business purposes for the time being. I will re-open this and investigate which voices have copyrights attached to them and remove them.

neonbjb reopened this May 6, 2022
Owner

neonbjb commented May 6, 2022

FYI: LibriTTS and HiFiTTS datasets were used to train Tortoise. If you are looking for license-free voices that will work very well with this program, use one of those.

Author

C00reNUT commented May 6, 2022

> FYI: LibriTTS and HiFiTTS datasets were used to train Tortoise. If you are looking for license-free voices that will work very well with this program, use one of those.

Excellent, that is very valuable information. There should be plenty of public domain options; it will just take a bit of trial and error.

Aspie96 commented Jul 11, 2022

Just as a (probably) dumb (related) question: is there any reason to favour those datasets over LibriSpeech or some other dataset based on LibriVox (maybe a public domain one, since LibriSpeech is not exactly public domain)?

Owner

neonbjb commented Jul 11, 2022

Not a dumb question; this is something that took me some pain to figure out. ASR-focused datasets are often poor for TTS because they are missing punctuation and have bad splitting (e.g. not split on sentences). These are both important cues for a TTS system. Both of these apply to LibriSpeech.
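
As a rough illustration of the two cues mentioned above, here is a minimal sketch that checks whether a transcript line carries punctuation and ends on a sentence boundary. The function name `transcript_cues` and both sample strings are hypothetical; the samples are made up for illustration and are not taken from any dataset. It is a quick heuristic, not how Tortoise filters its training data.

```python
import re


def transcript_cues(text: str) -> dict:
    """Report simple text cues that matter for TTS training transcripts."""
    stripped = text.strip()
    return {
        # Any common punctuation mark present in the line?
        "has_punctuation": bool(re.search(r"[.,;:!?]", stripped)),
        # Does the clip end on a sentence boundary?
        "ends_on_sentence": stripped.endswith((".", "!", "?")),
        # Normalized ASR transcripts are often upper-cased.
        "all_caps": stripped.isupper(),
    }


# Hypothetical samples: one normalized ASR-style line, one TTS-style line.
asr_style = "HE HOPED THERE WOULD BE STEW FOR DINNER TURNIPS AND CARROTS"
tts_style = "He hoped there would be stew for dinner, turnips and carrots."

print(transcript_cues(asr_style))
print(transcript_cues(tts_style))
```

A line that fails the first two checks gives a TTS model no hint about pauses or intonation, which is the problem described above.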

I believe LibriSpeech intersects with LibriTTS, so the model should work equally well with voices from either dataset.

zachwe pushed a commit to zachwe/tortoise-tts that referenced this issue Sep 12, 2023

3 participants