German Voice for RHVoice #24

Open
winman3000 opened this issue Dec 9, 2015 · 45 comments
Labels
Data: Languages/Pronunciation Data: Voices <Documentation> internal info, manuals and help (P5 - Long-term) Long-term WIP, may stay on the list for a while.

Comments

@winman3000

It would be nice to have RHVoice in German. What do you need for the German language?

@abitrolly
Contributor

Training data: text, audio recordings of someone reading that text, and a mapping that maps the audio to the text.

@winman3000
Author

If I understand you right, you need a native speaker who reads a text, and
an audio file of that reading, right?

And what do you mean by:

[...] and a mapping that maps the audio to the text.

You need the spoken text as a text file?

What criteria should the text to be read meet?

Is it necessary that this text is recorded in a studio or could it be
with a headset too?

Sorry for the beginner questions, but I am a beginner in this area.

@abitrolly
Contributor

If I understand you right, you need a native speaker who reads a text, and
an audio file of that reading, right?

Yes.

You need the spoken text as a text file?

Yes.

What criteria should the text to be read meet?

That depends on the language. Basically, the text should be chosen so that the audio for it contains all possible combinations of sounds, or at least covers the most common ones.
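
A rough sketch of what "cover the combinations" could mean in practice, in Python. This is only an illustration under loud assumptions: it uses adjacent letter pairs as a crude stand-in for sound pairs (a real script would work on phonemes), and the function names `letter_pairs` and `select_prompts` are made up here, not part of RHVoice.

```python
def letter_pairs(sentence):
    """Adjacent letter pairs of a sentence (assumption: a crude proxy for sound pairs)."""
    letters = [c for c in sentence.lower() if c.isalpha()]
    return {a + b for a, b in zip(letters, letters[1:])}


def select_prompts(candidates, limit):
    """Greedily pick up to `limit` sentences, each adding the most unseen pairs."""
    covered, chosen = set(), []
    remaining = list(candidates)
    while remaining and len(chosen) < limit:
        # Take the sentence that contributes the most pairs not yet covered.
        best = max(remaining, key=lambda s: len(letter_pairs(s) - covered))
        gain = letter_pairs(best) - covered
        if not gain:
            break  # nothing left adds new coverage
        chosen.append(best)
        covered |= gain
        remaining.remove(best)
    return chosen, covered
```

Given a large pool of candidate sentences, this keeps the recording script short while still touching many different combinations; sentences that add nothing new are skipped.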

Is it necessary that this text is recorded in a studio or could it be
with a headset too?

It is better to record in a studio at the highest quality, because then you can convert the audio to different formats. But it is possible to record with a headset too.

Mind you that software is dumb: it doesn't know where words start in your audio, so you will have to mark up the audio files with the text manually (the mapping between text and audio).
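
To make "mapping" a bit more concrete, here is a minimal Python sketch of one way such a mapping could be represented and sanity-checked. The `Segment` layout and `check_alignment` helper are hypothetical and are not the label format RHVoice's training pipeline actually uses; the time values in any example would come from manual marking.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds from the beginning of the recording
    end: float    # seconds from the beginning of the recording
    word: str     # the word spoken in that span


def check_alignment(segments, transcript):
    """Return a list of problems; an empty list means the mapping is consistent."""
    problems = []
    if [s.word for s in segments] != transcript.split():
        problems.append("words in segments do not match the transcript")
    previous_end = 0.0
    for s in segments:
        if s.end <= s.start:
            problems.append(f"{s.word}: end is not after start")
        if s.start < previous_end:
            problems.append(f"{s.word}: overlaps the previous segment")
        previous_end = s.end
    return problems
```

A check like this only catches bookkeeping mistakes; whether the marked times really match the audio still has to be verified by listening.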

@winman3000
Author

OK, now I am closer to understanding you. :)

Mind you that software is dumb - it doesn't know where words start in
your audio, so you will have to mark audio files with text.

How can I do this? I am a blind person. If I give you a spoken text and
the text file, would it be possible for you to create the voice? I am
a beginner, so I am not able to create a German voice database.

If it is too difficult to create a German voice database, we need
someone who can create such a database. I don't think I am the right
person to create one.

@abitrolly
Contributor

@winman3000 how do you primarily use RHVoice? Maybe there is already a voice for your platform, or training databases may already be available. I think that any training material for an HTS-type synthesizer will do.

And to answer your question: I personally am unlikely to create the voice, because this project is not funded and I am afraid there are still a lot of missing bits to fill in, which takes time. But if we get training data, it is at least possible to see what comes next.

@winman3000
Author

winman3000 commented Dec 10, 2015 via email

@abitrolly
Contributor

abitrolly commented Dec 10, 2015 via email

@winman3000
Author

winman3000 commented Dec 10, 2015 via email

@nshmyrev

It's better to give a link to marytts/marytts#440

You can also contact marytts developers on the mailing list http://www.dfki.de/mailman/cgi-bin/listinfo/mary-users

Voice building for RHVoice is not trivial; you would probably be better off just using the NVDA plugin for OpenMary with OpenMary voices. OpenMary also has better synthesis quality due to its mixed excitation vocoder and advanced NLP components.

@winman3000
Author

winman3000 commented Dec 10, 2015 via email

@winman3000
Author

winman3000 commented Dec 10, 2015 via email

@abitrolly
Contributor

abitrolly commented Dec 10, 2015 via email

@abitrolly
Contributor

abitrolly commented Dec 10, 2015 via email

@winman3000
Author

As for Voxforge files, I need to check if audio files are annotated,

As far as I know, these files are annotated.

and if
RHVoice training pipeline can handle this format.

It would be great if you could check this.

Thank you very much for your help!

@abitrolly
Contributor

Some analysis. Prompts.tgz contains the text that should be dictated. The file master_prompts_train_16kHz-16bit contains lines like:

ralfherzog-20080131-de71/mfc/de71-62 DIE AUSGABEN KONNTEN GESPART WERDEN
ralfherzog-20080131-de71/mfc/de71-63 MAN WIRD AUF DEN NÄCHSTEN ABSCHWUNG WARTEN MÜSSEN
ralfherzog-20080131-de71/mfc/de71-64 DA MUSS MAN AUF ANDERE EREIGNISSE WARTEN

I am not really sure what the label ralfherzog-20080131-de71/mfc/de71-63 means. ralfherzog looks like the name of a text, and everything else is still a mystery.

@abitrolly
Contributor

Okay. http://www.repository.voxforge1.org/downloads/de/Trunk/Audio/Original/48kHz_16bit/ contains ralfherzog-20080131-de71.tgz. The download is very slow, so I can't yet see what's inside. It looks like it should be the audio for the text, and the name is just an identifier of the form what-when-shortid. mfc/de71-64 is still unclear: 64 looks like a line number and de71 like a short text identifier, but it is not clear what mfc is.
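
If that reading of the label is right, splitting a prompt line into its parts could look roughly like this in Python. The field names (`name`, `date`, `short_id`, `subdir`, `utterance`) are my own guesses based on the what-when-shortid interpretation above, not documented VoxForge terminology, and the pattern assumes the first part contains no hyphen.

```python
import re

# Assumed layout: <name>-<YYYYMMDD>-<short_id>/<subdir>/<utterance> <TEXT>
PROMPT_RE = re.compile(
    r"^(?P<name>[^-\s]+)-(?P<date>\d{8})-(?P<short_id>[^/\s]+)"
    r"/(?P<subdir>[^/\s]+)/(?P<utterance>\S+)\s+(?P<text>.+)$"
)


def parse_prompt(line):
    """Split one prompt line into a dict of its guessed components."""
    match = PROMPT_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised prompt line: {line!r}")
    return match.groupdict()
```

Lines that do not fit the guessed layout raise an error instead of being silently mis-parsed, which makes it easy to see how well the guess holds across the whole file.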

@winman3000
Author

winman3000 commented Dec 13, 2015 via email

@nshmyrev

I would not be that pessimistic, and would try the well-documented OpenMary voice import procedure first.

@abitrolly
Contributor

@nshmyrev

@abitrolly latest wiki is on github:

https://github.com/marytts/marytts/wiki/VoiceImportToolsTutorial

that one on opendfki might be slightly outdated.

@abitrolly
Contributor

@nshmyrev thanks. Just need to get some free time now.

@alex19EP alex19EP added (P5 - Long-term) Long-term WIP, may stay on the list for a while. <Documentation> internal info, manuals and help Data: Languages/Pronunciation Data: Voices labels Jun 16, 2020
@rugk

rugk commented Oct 16, 2020

Why not just use Common Voice?
https://commonvoice.mozilla.org/de/datasets has 19 GB of spoken German data.

@maniyax
Member

maniyax commented Oct 16, 2020

@rugk Hello!
Sorry, but it's very bad quality.
Besides, the text is missing from the dataset.

I'm not sure about the German dataset, but the Russian dataset includes a lot of different speakers.

@beqabeqa473
Contributor

beqabeqa473 commented Oct 17, 2020 via email

@rugk

rugk commented Oct 17, 2020

Ah okay, thanks, this makes sense. Of course, yeah, their idea was to collect many voices…

@winman3000
Author

Is there any progress on this in the meantime? Has anyone tried to import the voices from MaryTTS for RHVoice?

If there is no existing data we can use for this project, we will have to do everything from scratch. Unfortunately I have to repeat my questions:

  1. How long does the audio need to be?
  2. How exactly do you do the mapping if you have the text to go with it?
  3. Is there really no material we could use for a German voice?

I would love to finish the project so that we finally have a German voice as well...

@citizenserious

I have found some voices including German, I use them with VocalizerEx2 TTS. They are working fine, but I do not trust VocalizerEx2. Would be nice to have these voices in RHVoice (:

https://vocalizer-nvda.com/downloads

@zstanecic
Contributor

zstanecic commented Dec 7, 2021 via email

@uncle-ben-devel

Any progress?
I'd like to get involved, as I very much like the voice output of this application and its openness.
I am a native German speaker. How can I contribute voice / training data? I also have decent recording equipment.

@winman3000
Author

As far as I know, there is no progress on this. Apparently it is not enough to make voice recordings; modules must also be adapted in C++, i.e. libraries must be developed for the German language.

@nshmyrev

nshmyrev commented Aug 14, 2022

You need to ask Thorsten https://github.com/thorstenMueller/Thorsten-Voice, he will make the voice for you.

@winman3000
Author

Maybe we should contact Thorsten Müller and ask if he would be willing to port his voice to RHVoice. We will definitely need someone to write the modules for German in C++. Unfortunately, I do not have programming skills myself.

@zstanecic
Contributor

zstanecic commented Aug 15, 2022 via email

@thorstenMueller

Hi, and sorry for joining the discussion late.

I tried to figure out the steps required to add the German language based on my Thorsten-Voice dataset. The "Polish" language was referenced, but honestly I'm not sure what to do.
@zstanecic Would you mind helping me with the first steps?

@zstanecic
Contributor

zstanecic commented Jan 7, 2023 via email

@thorstenMueller

Hi @zstanecic ,

what I did so far:

  • Installed foma in version 0.9.18
  • Created the folder /data/languages/German
  • Copied the following files from the "Polish" directory into that "German" folder: graph.txt, labelling.xml, language.conf, language.info, locale.info, phonemes.xml

Am I right that ...

  • I have to adjust all of these files (more or less) to German?
  • running "foma" (I do not know this tool at all yet) will create several .fst and dt files?

Once this is done, I guess I have to create a "Thorsten" folder in the /data/voices folder? And do whatever magic is needed there ;-).
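
For anyone who wants to repeat the copy step listed above, here is a small Python sketch. The function name `seed_language` is made up for illustration, and the location of the languages directory is an assumption (pass whatever path holds the "Polish" folder in your checkout); it does nothing beyond copying the six listed files and does not run foma.

```python
import shutil
from pathlib import Path

# The six files named in the list above.
FILES = [
    "graph.txt", "labelling.xml", "language.conf",
    "language.info", "locale.info", "phonemes.xml",
]


def seed_language(languages_dir, source="Polish", target="German"):
    """Copy the starter files from `source` to `target`; return the names copied."""
    src = Path(languages_dir) / source
    dst = Path(languages_dir) / target
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in FILES:
        # Skip files missing in the source and never overwrite existing edits.
        if (src / name).is_file() and not (dst / name).exists():
            shutil.copy2(src / name, dst / name)
            copied.append(name)
    return copied
```

Because existing files in the target folder are never overwritten, the function can be re-run safely after some of the copies have already been adapted to German.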

@zstanecic
Contributor

zstanecic commented Mar 11, 2023 via email

@thorstenMueller

Thanks for your quick reply and offer of support. As I've never worked with foma, this is really highly appreciated.
GitHub mail doesn't show your email address, so please contact me here and we can talk by email then.
https://www.thorsten-voice.de/en/contact/

@zstanecic
Contributor

zstanecic commented Mar 11, 2023 via email

@thorstenMueller

Hi,
thanks @zstanecic for the really nice and helpful video chat this Sunday. Based on that, I forked the repo and created a "german" branch. Inside its /src/scripts/German folder I copied some foma files from the Polish language that have to be changed or created from scratch for German.
https://github.com/thorstenMueller/RHVoice/tree/german/src/scripts/German

  • g2p.foma
  • gpos.foma
  • lseq.foma
  • spell.foma
  • stress.foma

This basic work has nothing to do with the actual voice; it's more a basic grammar setup and rules for the German language. Honestly, I cannot work this all out by myself. I'll read the foma regex reference in the foma wiki and contribute my actual voice. But I need help (probably from German-speaking people) to adjust the basic grammar and rules.

So any help on this is appreciated if we'd like to add German / Thorsten-Voice to RHVoice.
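
To give potential helpers an idea of what kind of rules are meant, here is a toy Python illustration of spelling-to-sound rules. It is deliberately not foma and not how g2p.foma works internally; the three rules and the name `toy_g2p` are just an example, and letters without a rule are passed through unchanged, which is wrong for real German.

```python
# Toy rules only: longest spellings first, so "sch" wins before single letters.
RULES = [("sch", "ʃ"), ("ei", "aɪ"), ("ie", "iː")]


def toy_g2p(word):
    """Map a word to a list of symbols using the toy rules above."""
    word = word.lower()
    out = []
    i = 0
    while i < len(word):
        for spelling, sound in RULES:
            if word.startswith(spelling, i):
                out.append(sound)
                i += len(spelling)
                break
        else:
            # No rule matched: keep the letter as-is (placeholder behaviour).
            out.append(word[i])
            i += 1
    return out
```

The real work is writing the complete, correctly ordered rule set (plus stress, letter sequences and spelling rules) in foma, which is where German speakers can help.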

@BluePixel4k

@thorstenMueller thx for your effort. Maybe I can help a bit with adjusting the basic grammar and rules. But I don't know exactly how.

@svnpsc

svnpsc commented Aug 26, 2023

@Foexle11

any news?

@thorstenMueller

Not from my side yet. The topic is absolutely interesting, but my remaining free time is (as usual) limited.

@citizenserious

What about the Piper voices?
