[TTS]Cantonese FastSpeech2 Training, test=tts #2907

Merged
merged 26 commits into PaddlePaddle:develop on Feb 14, 2023

Contributor

WongLaw commented Feb 10, 2023

Cantonese FastSpeech2 Training, test=tts

WongLaw added the T2S label Feb 10, 2023
WongLaw added this to the r1.4.0 milestone Feb 10, 2023
WongLaw self-assigned this Feb 10, 2023
yt605155624 changed the title from "Cantonese FastSpeech2 Training, test=tts" to "[TTS]Cantonese FastSpeech2 Training, test=tts" Feb 13, 2023

### Training details can refer to the script of examples/aishell3/tts3.

## Pretrained Model(Waiting========)
Collaborator

The model should have finished training by now, right? I think the pretrained model can be uploaded.

└── speech_stats.npy # statistics used to normalize spectrogram when training fastspeech2
```
You can use the following scripts to synthesize for `${BIN_DIR}/../sentences.txt` using pretrained fastspeech2 and parallel wavegan models.
```bash
Collaborator

Update this part once the text frontend has been written.

examples/canton/tts3/path.sh (outdated, resolved)
str(fp), sr=config.fs,
mono=False) if "canton" in str(fp) else librosa.load(
str(fp), sr=config.fs)
if len(wav.shape) == 2 and "canton" in str(fp):
Collaborator

MFA requires the dataset to be placed directly in datasets/, but this code requires it to be in datasets/canton_all. Please make the two consistent; if this part is hard to change, change the MFA step instead.
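One reading of this comment is that the excerpt above branches on `"canton" in str(fp)`, so the same recording is handled differently depending on which directory it sits in. The following is a minimal sketch of that effect, not code from the PR: the file names are hypothetical, and `is_canton_by_flag` with its `dataset` argument is an assumed alternative for illustration only.

```python
from pathlib import Path


def is_canton_by_path(fp):
    """The check used in the excerpt: the result depends on where the file lives."""
    return "canton" in str(fp)


def is_canton_by_flag(dataset):
    """Hypothetical alternative: the caller names the dataset explicitly."""
    return dataset == "canton"


# Hypothetical file names, one per directory layout mentioned in the comment.
in_root = Path("datasets") / "utt_0001.wav"
in_subdir = Path("datasets") / "canton_all" / "utt_0001.wav"

print(is_canton_by_path(in_root))    # False: the layout MFA expects is not detected
print(is_canton_by_path(in_subdir))  # True: only the canton_all layout is detected
print(is_canton_by_flag("canton"))   # True: independent of the directory layout
```

With an explicit flag, the data could stay wherever MFA expects it without changing how the audio is loaded.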

examples/canton/tts3/README.md (outdated, resolved)
examples/canton/tts3/README.md (outdated, resolved)
examples/canton/tts3/README.md (outdated, resolved)
examples/canton/tts3/README.md (outdated, resolved)
examples/canton/tts3/README.md (resolved)
examples/canton/tts3/README.md (outdated, resolved)
examples/canton/tts3/local/train.sh (outdated, resolved)
Collaborator

yt605155624 commented Feb 13, 2023


# Only used for feats_type != raw

fmin: 80 # Minimum frequency of Mel basis.
Collaborator

Should fmin also be changed to 110? Otherwise I think the features will include noise.

Collaborator

Judging from the current synthesis quality, this is not a big problem. Changing it would no longer match the parameters of the pretrained vocoder and might hurt synthesis quality, so let's revisit this later.
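The constraint behind this reply is that the acoustic model and the pretrained vocoder need to agree on how mel features are extracted. A minimal sketch of such a consistency check follows, assuming both configs are available as plain dictionaries. The key names and all values are illustrative assumptions rather than the actual configs from this PR, and `feature_mismatches` is a hypothetical helper.

```python
# Feature-extraction keys that (by assumption) must agree between the acoustic
# model and the vocoder, so the predicted mel spectrograms match what the
# vocoder was trained on.
FEATURE_KEYS = ("fs", "n_fft", "n_shift", "win_length", "n_mels", "fmin", "fmax")


def feature_mismatches(acoustic_conf, vocoder_conf, keys=FEATURE_KEYS):
    """Return {key: (acoustic_value, vocoder_value)} for every key that differs."""
    mismatches = {}
    for key in keys:
        a_val = acoustic_conf.get(key)
        v_val = vocoder_conf.get(key)
        if a_val != v_val:
            mismatches[key] = (a_val, v_val)
    return mismatches


# Illustrative values only; not the actual configs from this PR.
vocoder = {"fs": 24000, "n_fft": 2048, "n_shift": 300, "win_length": 1200,
           "n_mels": 80, "fmin": 80, "fmax": 7600}
acoustic = dict(vocoder, fmin=110)  # the change proposed in the review

print(feature_mismatches(acoustic, vocoder))  # {'fmin': (110, 80)}
```

A non-empty result would suggest either keeping `fmin` as it is or retraining the vocoder with the new value.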

Collaborator

yt605155624 commented Feb 14, 2023

resolved

CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
fi

if [ ${stage} -le 4 ] && [ ${stop_stage} -ge 4 ]; then
Collaborator

Once the text frontend has been added, these later stages can be removed.

Collaborator

yt605155624 left a comment

LGTM

yt605155624 merged commit acfa057 into PaddlePaddle:develop Feb 14, 2023