Spanish Language #3

Open
ryanleesipes opened this issue Feb 10, 2016 · 17 comments

Comments

@ryanleesipes

Need to develop the ability to support the Spanish language.

@javiercani

Hi, I'm also interested in this. Have you found anything recently? I am a software developer and I want to contribute to the project.

@forslund
Collaborator

@javiercani any help is appreciated. If you feel up to the challenge, adding more languages is a good research area. There are some resources around the web for getting started (such as http://homepages.inf.ed.ac.uk/jyamagis/software/page54/page54.html).

There is currently work underway to update the scripts for building pronunciation lexicons (pull request #17), and once they are fully working they might be something to build upon for more languages.

@javiercani

OK @forslund, I will read more and come back later. Thanks.

@AnderRasoVazquez

The Spanish spoken in Spain and Latin American Spanish are different; supporting both would be great. Keep up the good work!

@adocampo

I've just installed mimic and played with the English voices. Has there been any progress on Spanish, or is there any guidance on how to help improve it?

@forslund
Collaborator

@zeehio is working on Spanish support and has made significant progress on the architecture. Last I heard, he was looking into phonetic dictionaries.

@zeehio, are there any suitable tasks for contributors that can be split from the main task?

@forslund
Collaborator

@malevolent, @zeehio outlined in #86 some improvements that would be good to work towards once that PR is merged.

@adocampo

Perfect! If there is something I can do, I would like to contribute. Some guidance would surely help.

@forslund
Collaborator

@malevolent, @zeehio has begun the wiki entry on the issue. Other than that, there is only the old flite documentation. I try to keep an eye on the mimic channel on the Mycroft Slack, so if you've got questions I can try to answer them.

@adocampo

Well, I'll wait for the wiki before helping. I'll keep an eye on this thread.

@jenavarro

Hello, question: this thread has been open since March '16. Is there any update on supporting the Spanish language? It seems not to be ready yet: there is no documentation and a couple of issues are still open here.

@forslund
Collaborator

The thread was opened right after work started on mimic, and the work is slowly moving forward. @zeehio has done a couple of large chunks of work: he updated the model-builder script, added UTF-8 support, and has a pending PR for a Spanish tokenizer. In addition, he's working on a phonetic dictionary. Unfortunately he seems to have less time for mimic work these days, and I totally understand him.

Any help is appreciated. There are pending PRs that no one has had the opportunity to review. Directly related to this issue is PR #86. I think there's a possible memory leak, but I might be wrong, and an extra set of eyes (and a brain sharper than mine behind them) to confirm would be good.

@zeehio
Contributor

zeehio commented Feb 20, 2017

Hi,

Unfortunately I am trapped under a lot of PhD-related work. I don't think I will be able to commit time to mimic in the coming months, but I may assist if someone else wants to do the work.

I have made the fixes suggested by @forslund (hi, thanks for your understanding; I wish I could work more on this) to the last PR I submitted some months ago. Once that is merged, someone with Spanish knowledge could work on improving the token-to-words rules. See the code in #86 and write here if you are still interested (@javiercani @adocampo).

Converting words (that come out of the tokenizer) to phonemes can be done through the saga library. This is more memory efficient than building a whole lexicon for each Spanish dialect and covers several dialects.
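To illustrate why letter-to-sound rules can stand in for a per-dialect lexicon, here is a minimal sketch in Python. It is not the saga API or its rule set: the function name `spanish_g2p`, the `seseo` flag, and the SAMPA-like phoneme symbols are all assumptions made for this example, and the rules ignore stress, syllabification, and many real cases. The point is only that Spanish spelling is regular enough for a handful of rules, plus one dialect switch, to replace a large word list.

```python
# Toy rule-based grapheme-to-phoneme converter for Spanish.
# Hypothetical illustration only: not the saga library's rules or API.

DIGRAPHS = {"ch": "tS", "ll": "L", "rr": "rr", "qu": "k"}
SINGLES = {"j": "x", "v": "b", "ñ": "J",
           "á": "a", "é": "e", "í": "i", "ó": "o", "ú": "u", "ü": "u"}
FRONT_VOWELS = ("e", "i", "é", "í")


def spanish_g2p(word, seseo=False):
    """Return a list of phoneme symbols for a single Spanish word.

    seseo=True merges the "c/z" sound into "s", as in most Latin
    American varieties; seseo=False keeps the distinct "T" sound.
    """
    theta = "s" if seseo else "T"
    w = word.lower()
    phones = []
    i = 0
    while i < len(w):
        ch = w[i]
        two = w[i:i + 2]
        nxt = w[i + 1:i + 2]  # empty string at the end of the word
        if two in DIGRAPHS:
            phones.append(DIGRAPHS[two])
            i += 2
        elif two == "gu" and w[i + 2:i + 3] in FRONT_VOWELS:
            phones.append("g")  # the "u" is silent in "gue", "gui"
            i += 2
        elif ch == "c":
            phones.append(theta if nxt in FRONT_VOWELS else "k")
            i += 1
        elif ch == "z":
            phones.append(theta)
            i += 1
        elif ch == "g":
            phones.append("x" if nxt in FRONT_VOWELS else "g")
            i += 1
        elif ch == "h":
            i += 1  # silent
        else:
            phones.append(SINGLES.get(ch, ch))
            i += 1
    return phones


if __name__ == "__main__":
    for dialect_seseo in (False, True):
        print(dialect_seseo, spanish_g2p("cielo", seseo=dialect_seseo))
```

The same rule table serves both dialect groups here; only the `seseo` flag changes, whereas a lexicon-based approach would need a separate pronunciation entry per word and per dialect.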

More pieces are needed but our only choice is to work little by little (as our time allows us) on each of them.

@albertosgz

Hello @zeehio

I would like to help with the code too. I will have a month or so free, so I think there is something I can do.

@zeehio
Contributor

zeehio commented Mar 27, 2017

Spanish HTS voice for Festival: http://homepages.inf.ed.ac.uk/jyamagis/software/page54/page54.html

@forslund
Collaborator

Mycroft may, but mimic still hasn't got it (except for my own very poor attempt and zeehio's top-secret, state-of-the-art technology).

@zeehio
Contributor

zeehio commented May 15, 2017

Since my PR would use a GPL library for the phonetic transcription, mimic's license would need to change to GPL. While the code may need some cleanup, this is the major blocking issue for Spanish support right now. Pinging @penrods to find out how things stand on that end.

Oh, and the Spanish voice I have was the state of the art some years ago... It has room for improvement... but it's a start! :-)
