PaddleSpeech Happy Open Source Event (2025 H1) #3997

Open
zxcd opened this issue Feb 27, 2025 · 4 comments

zxcd (Collaborator) commented Feb 27, 2025

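📣 PaddleSpeech Happy Open Source Event

This event encourages more developers to take part in the open-source development of the PaddlePaddle large-model suite, helping the community fix bugs or contribute features and building PaddlePaddle together.

Task goals

Because of version changes, the documentation has fallen behind the code!

  • Following the README runs end to end
  • The documentation matches the code
  • Writing errors in the documentation are corrected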

Task 1: Fix incorrect parameters in synthesize_e2e.sh for the synthesis vocoders

| No. | File to modify | Claimed by / Status / PR |
| --- | --- | --- |
| 1 | examples/csmsc/voc1/local/synthesize_e2e.sh | @ZJhorseloudly |
| 2 | examples/csmsc/voc3/local/synthesize_e2e.sh | @ZJhorseloudly |
| 3 | examples/csmsc/voc5/local/synthesize_e2e.sh | @ZJhorseloudly |

Task 2: Add the missing parameters to the scripts in the synthesis examples

| No. | Files to modify | Claimed by / Status / PR |
| --- | --- | --- |
| 4 | examples/aishell3/tts3/run.sh, examples/aishell3/tts3/README.md | @enkilee #3998 |
| 5 | examples/aishell3_vctk/ernie_sat/run.sh, examples/aishell3_vctk/ernie_sat/README.md | @rich04lin #4005 |
| 6 | examples/canton/tts3/run.sh, examples/canton/tts3/README.md | @Echo-Nie #4004 |
| 7 | examples/csmsc/tts0/run.sh, examples/csmsc/tts0/README.md | @Echo-Nie #4008, @rich04lin #4007 |
| 8 | examples/csmsc/tts2/run.sh, examples/csmsc/tts2/README.md | @Echo-Nie #4008, @rich04lin #4009 |
| 9 | examples/csmsc/tts3/run.sh, examples/csmsc/tts3/README.md | @Echo-Nie #4008 |
| 10 | examples/csmsc/tts3_rhy/run.sh, examples/csmsc/tts3_rhy/README.md | @Echo-Nie #4008 |
| 11 | examples/ljspeech/tts3/run.sh, examples/ljspeech/tts3/README.md | @Echo-Nie #4010 |
| 12 | examples/opencpop/svs1/run.sh, examples/opencpop/svs1/README.md | @Echo-Nie #4012 |
| 13 | examples/vctk/ernie_sat/run.sh, examples/vctk/ernie_sat/README.md | @Echo-Nie #4013 |
| 14 | examples/vctk/tts3/run.sh, examples/vctk/tts3/README.md | @Echo-Nie #4013 |

Task 3: Fix text errors (updated on an ongoing basis)

| No. | File to modify | Claimed by / Status / PR |
| --- | --- | --- |
| 15 | examples/csmsc/voc3/README.md | @Echo-Nie #4012, @rich04lin #4011 |

Task 1 example fix

Target: examples/*/voc*/local/synthesize_e2e.sh, for example examples/csmsc/voc1/local/synthesize_e2e.sh

The code in synthesize_e2e.sh is as follows:
python3 ${BIN_DIR}/../synthesize.py \
    --am=tacotron2_aishell3 \
    --am_config=${config_path} \
    --am_ckpt=${train_output_path}/checkpoints/${ckpt_name} \
    --am_stat=dump/train/speech_stats.npy \
    --voc=pwgan_aishell3 \
    --voc_config=pwg_aishell3_ckpt_0.5/default.yaml \
    --voc_ckpt=pwg_aishell3_ckpt_0.5/snapshot_iter_1000000.pdz \
    --voc_stat=pwg_aishell3_ckpt_0.5/feats_stats.npy \
    --test_metadata=dump/test/norm/metadata.jsonl \
    --output_dir=${train_output_path}/test \
    --phones_dict=dump/phone_id_map.txt \
    --speaker_dict=dump/speaker_id_map.txt \
    --voice-cloning=True

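Because this example trains the vocoder (voc) rather than the acoustic model (am), the arguments that use train_output_path should be the voc-related ones (--voc, --voc_config, and so on). The --am-related arguments should instead point to the files in the fastspeech2_nosil_baker_ckpt_0.4 folder, as described in examples/csmsc/voc1/README.md.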
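
As a rough sketch of the fix described above, the snippet below only prints a corrected argument list rather than running synthesis. The model names `fastspeech2_csmsc` and `pwgan_csmsc`, the file names inside `fastspeech2_nosil_baker_ckpt_0.4`, the `dump/train/feats_stats.npy` path, and the default values of the three positional arguments are assumptions for illustration; check them against examples/csmsc/voc1/README.md before copying anything.

```shell
#!/bin/bash
# Sketch only: prints the argument list instead of calling the synthesis script.
# Assumed (not stated in the issue): model names, file names inside the
# pretrained folder, the vocoder stats path, and the defaults below.

config_path=${1:-conf/default.yaml}
train_output_path=${2:-exp/default}
ckpt_name=${3:-snapshot_iter_5000.pdz}
am_dir=fastspeech2_nosil_baker_ckpt_0.4   # pretrained acoustic model folder

build_args() {
    # Acoustic model (am): fixed, pretrained files, independent of train_output_path.
    printf '%s\n' \
        "--am=fastspeech2_csmsc" \
        "--am_config=${am_dir}/default.yaml" \
        "--am_ckpt=${am_dir}/snapshot_iter_76000.pdz" \
        "--am_stat=${am_dir}/speech_stats.npy" \
        "--phones_dict=${am_dir}/phone_id_map.txt"
    # Vocoder (voc): the model trained in this example, so it uses train_output_path.
    printf '%s\n' \
        "--voc=pwgan_csmsc" \
        "--voc_config=${config_path}" \
        "--voc_ckpt=${train_output_path}/checkpoints/${ckpt_name}" \
        "--voc_stat=dump/train/feats_stats.npy" \
        "--output_dir=${train_output_path}/test_e2e"
}

build_args
```

The point of the sketch is the split: every path derived from `train_output_path` sits on the `--voc*` side, and every `--am*` path comes from the pretrained folder.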

Task 2 example fix

Targets: examples/*/*/run.sh and examples/*/*/README.md
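Some synthesize_e2e.sh and synthesize.sh scripts support inference with several models by changing stage, but this parameter is not exposed in the corresponding run.sh or README.md. The parameter and its description need to be added.
For example, examples/aishell3/tts3/local/synthesize_e2e.sh uses stage to select either pwgan or hifigan for inference.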

  • Change in run.sh:
if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
    # synthesize, vocoder is pwgan by default stage 0, stage 1 will use hifigan as vocoder
    CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
fi

if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
    # synthesize_e2e, vocoder is pwgan by default stage 0, stage 1 will use hifigan as vocoder
    CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
fi
  • Change in README.md:
`./local/synthesize.sh` calls `${BIN_DIR}/../synthesize.py`, which can synthesize waveform from `metadata.jsonl`. 

CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}

`--stage` selects the vocoder used during synthesis: `0` uses `pwgan` and `1` uses `hifigan`.
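
To make the stage-to-vocoder relationship concrete, here is a minimal sketch of how a script could accept an optional leading `--stage N` and map it to a vocoder name. The helper `vocoder_for_stage` and the hand-rolled option handling are hypothetical; the real local scripts may parse options differently.

```shell
#!/bin/bash
# Hypothetical helper: map a stage value to the vocoder it selects.
vocoder_for_stage() {
    case "$1" in
        0) echo "pwgan" ;;
        1) echo "hifigan" ;;
        *) echo "unsupported stage: $1" >&2; return 1 ;;
    esac
}

# Accept an optional leading "--stage N"; the remaining positional
# arguments (config path, output path, checkpoint name) shift down to $1..$3.
stage=0   # default: pwgan
if [ "${1:-}" = "--stage" ]; then
    stage=$2
    shift 2
fi

echo "stage ${stage} -> vocoder $(vocoder_for_stage "${stage}")"
```

Keeping the default at 0 means existing calls without `--stage` keep using pwgan.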

Task 3 example fix

Modify examples/csmsc/voc3/README.md

HiFiGAN checkpoint contains files listed below.
mb_melgan_csmsc_ckpt_0.1.1
├── default.yaml                    # default config used to train MultiBand MelGAN
├── feats_stats.npy                 # statistics used to normalize spectrogram when training MultiBand MelGAN
└── snapshot_iter_1000000.pdz       # generator parameters of MultiBand MelGAN

The README.md downloads the MultiBand MelGAN model, but the file list is labeled HiFiGAN.
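
One way to spot this kind of mismatch is a plain text search for a model name that should not appear in a given README. The helper name `find_model_mentions` is hypothetical, and every hit still needs a manual look, since a README can mention another model for a legitimate reason.

```shell
#!/bin/bash
# Hypothetical helper: list README lines (with line numbers) that mention
# a model name that is not the one the example trains.
find_model_mentions() {
    # $1: path to the README, $2: model name to search for (fixed string)
    grep -n -F -e "$2" "$1"
}

# Example call, assuming it is run from the repository root:
#   find_model_mentions examples/csmsc/voc3/README.md HiFiGAN
```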

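How to claim a task

Please claim tasks by posting a comment (【报名】 means "sign up"), for example:

【报名】:1、3、2-3
  • Separate multiple tasks with the Chinese enumeration comma (、); a run of consecutive tasks can be written with a hyphen, for example 1-2
  • PR format: start the PR title with 【PaddleSpeech No.xxx】 to indicate the task number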
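
The sign-up and PR title conventions above can be checked mechanically. The sketch below expands a task list such as `1、3、2-3` into individual task numbers and tests whether a PR title starts with the required prefix. Both function names are hypothetical, and the parser assumes the list part only (without the 【报名】 prefix) and no stray spaces.

```shell
#!/bin/bash
# Hypothetical helper: expand "1、3、2-3" into sorted, unique task numbers,
# one per line.
expand_tasks() {
    awk -v s="$1" 'BEGIN {
        n = split(s, items, "、")              # items are separated by "、"
        for (i = 1; i <= n; i++) {
            m = split(items[i], range, "-")    # "2-3" is a range, "3" a single task
            first = range[1] + 0
            last = (m == 2) ? range[2] + 0 : first
            for (t = first; t <= last; t++) print t
        }
    }' | sort -n -u
}

# Hypothetical helper: succeed if the title starts with 【PaddleSpeech No.<digit>...】.
valid_pr_title() {
    case "$1" in
        "【PaddleSpeech No."[0-9]*"】"*) return 0 ;;
        *) return 1 ;;
    esac
}

expand_tasks "1、3、2-3"
```

For instance, `expand_tasks "6-15"` yields ten task numbers, which matches the second sign-up comment in this thread.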

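Board

| Task area | Number of tasks | Submitted / Claimed | Submission rate | Completed | Completion rate |
| --- | --- | --- | --- | --- | --- |
| PaddleSpeech Happy Open Source Event | 15 | 12 / 15 | 80.0% | 3 | 20.0% |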
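
The two rates on the board follow directly from the counts: submissions divided by total tasks, and completed tasks divided by total tasks. A small sketch of that arithmetic, with a hypothetical helper name `rate`:

```shell
#!/bin/bash
# Hypothetical helper: print numerator/denominator as a percentage
# with one decimal place.
rate() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f%%\n", 100 * a / b }'
}

rate 12 15   # submission rate: 12 submissions out of 15 tasks
rate 3 15    # completion rate: 3 completed out of 15 tasks
```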

Statistics

In no particular order: @enkilee (1), @Echo-Nie (1), @rich04lin (1)

enkilee (Contributor) commented Mar 3, 2025

【报名】: 4

Echo-Nie (Contributor) commented Mar 14, 2025

【报名】: 6-15

rich04lin (Contributor) commented Mar 14, 2025

【报名】: 5

ZJhorseloudly commented

【报名】: 1-3
