Training a Japanese model, pitch accent and IPA #186

Closed
NielsVandenEynde opened this issue Jan 10, 2024 · 1 comment

NielsVandenEynde commented Jan 10, 2024

First of all, thanks for this awesome research; voice cloning desperately needs to be open sourced.

I'm interested in training a Japanese model, and I have over a thousand hours of speech data.

However, I'm a bit concerned about having to convert my transcriptions to IPA. Japanese has pitch accent, and the pitch can change within a word. For example, 橋 and 箸 are both pronounced "hashi", but their pitch patterns differ. When text is converted to IPA, such as in this topic, this information is lost. Is there a way to train a model on just the "raw" text? Apart from that, I just need to train or find a Japanese BERT model, right? Is there anything else I should be aware of?

Thanks in advance
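
To make the concern above concrete, here is a minimal sketch of why a phoneme-only transcription is ambiguous. The `pitch_pattern` helper is hypothetical; it encodes the usual textbook description of Tokyo-dialect accent types (type 0 has no downstep, type n has a downstep after the n-th mora) and prints the high/low pattern for each mora plus one following particle. The accent types listed for 箸, 橋 and 端 are the standard dictionary values, not something taken from this repository.

```python
def pitch_pattern(n_moras, accent_type):
    """Return an H/L string for a word's moras plus one following particle.

    Assumes the textbook Tokyo-dialect rule:
      type 0 -> low first mora, then high with no downstep
      type 1 -> high first mora, low afterwards
      type n -> low first mora, high up to mora n, low afterwards
    """
    pattern = []
    for pos in range(1, n_moras + 2):  # the extra position is the particle
        if accent_type == 0:
            high = pos > 1
        elif accent_type == 1:
            high = pos == 1
        else:
            high = 1 < pos <= accent_type
        pattern.append("H" if high else "L")
    return "".join(pattern)


# Three words that share the phoneme string "hashi" (two moras each).
words = {"箸": 1, "橋": 2, "端": 0}
for word, accent_type in words.items():
    print(word, "hashi", pitch_pattern(2, accent_type))
```

All three entries share the same phonemes, so only the accent type (or an equivalent prosody marker in the input sequence) tells them apart.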

Akito-UzukiP pushed a commit to Akito-UzukiP/StyleTTS2 that referenced this issue Jan 13, 2024
* Fix multi-node training issue

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

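* Update and improve distributed training

While recently merging the V2 code, I found that the earlier multi-node changes were incorrect and still raised errors. This went unnoticed only because, with multiple GPUs on a single node, local_rank is the same as rank.
1. Fix DDP initialization in train_ms.py and bind .cuda to local_rank
2. Add the env variable LOCAL_RANK to the default_config.yml config file; otherwise a KeyError is raised by default
3. Add run_MnodesAndMgpus.sh and update the distributed-training documentation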
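
The rank versus local_rank distinction in this commit message can be sketched as follows. This is not the code from train_ms.py; `resolve_ranks` is a hypothetical helper, and the sketch assumes the process was started by a launcher such as torchrun, which sets `RANK`, `LOCAL_RANK` and `WORLD_SIZE` in the environment. The defaults avoid the KeyError the message mentions when a variable is missing.

```python
import os


def resolve_ranks(env=None):
    """Read distributed-training ranks from environment variables.

    RANK is the global process index across all nodes; LOCAL_RANK is the
    index of the process (and, by convention, the GPU) on its own node.
    Missing variables fall back to single-process defaults.
    """
    env = os.environ if env is None else env
    rank = int(env.get("RANK", 0))
    local_rank = int(env.get("LOCAL_RANK", 0))
    world_size = int(env.get("WORLD_SIZE", 1))
    return rank, local_rank, world_size


rank, local_rank, world_size = resolve_ranks()

# In a real PyTorch training script the values would then be used roughly
# like this (left as comments so the sketch runs without torch or a GPU):
#   torch.distributed.init_process_group(
#       backend="nccl", rank=rank, world_size=world_size)
#   torch.cuda.set_device(local_rank)   # device index is per node
#   model = DistributedDataParallel(
#       model.cuda(local_rank), device_ids=[local_rank])
```

On a single node the two values coincide, which is why using the global rank as a device index only fails once a second node is added.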

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

yl4579 (Owner) commented Mar 7, 2024

Sorry for the late reply; I have been very busy recently. For pitch in Japanese, you may refer to yl4579/StyleTTS#10 (comment). The pitch can easily be extracted from what OpenJTalk returns: yl4579/PL-BERT#6 (comment)
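
As a rough illustration of that suggestion (not the code from the linked comments), the sketch below reads accent information out of OpenJTalk-style full-context labels. It assumes the HTS-style layout in which the current phoneme sits between `-` and `+` and the `/A:a1+a2+a3` field gives the mora's position relative to the accent nucleus (`a1`) and its index within the accent phrase (`a2`). The helpers `parse_label` and `mark_pitch_fall` are hypothetical names, and the example labels are hand-written and abbreviated, not real OpenJTalk output. In practice the labels would presumably come from something like `pyopenjtalk.extract_fullcontext(text)`.

```python
import re

# Assumed label layout: ...-PHONEME+.../A:a1+a2+a3/...
_PHONEME = re.compile(r"-([^+]+)\+")
_A_FIELD = re.compile(r"/A:(-?\d+)\+(\d+)\+(\d+)")


def parse_label(label):
    """Return (phoneme, a1, a2); a1 and a2 are None for pause labels."""
    phoneme = _PHONEME.search(label).group(1)
    a_field = _A_FIELD.search(label)
    if a_field is None:  # pauses carry "xx" instead of numbers
        return phoneme, None, None
    return phoneme, int(a_field.group(1)), int(a_field.group(2))


def mark_pitch_fall(labels):
    """Insert "]" after the accent-nucleus mora when another mora follows."""
    parsed = [parse_label(label) for label in labels]
    tokens = []
    for i, (phoneme, a1, a2) in enumerate(parsed):
        tokens.append(phoneme)
        if a1 != 0 or i + 1 >= len(parsed):
            continue
        next_a2 = parsed[i + 1][2]
        # The fall is placed where the nucleus mora ends, i.e. where the
        # next phoneme already belongs to the following mora.
        if next_a2 is not None and next_a2 == a2 + 1:
            tokens.append("]")
    return tokens


# Hand-written, abbreviated labels for illustration only.
example_labels = [
    "xx^xx-sil+h=a/A:xx+xx+xx",
    "xx^sil-h+a=sh/A:0+1+2",
    "sil^h-a+sh=i/A:0+1+2",
    "h^a-sh+i=sil/A:1+2+1",
    "a^sh-i+sil=xx/A:1+2+1",
    "sh^i-sil+xx=xx/A:xx+xx+xx",
]
print(mark_pitch_fall(example_labels))
```

The resulting token sequence keeps the phonemes and adds an explicit pitch-fall symbol, so the accent information survives the conversion to a phoneme string.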

yl4579 closed this as completed Mar 7, 2024