PaddleSpeech Happy Open Source Event (2025 H1) #3997

zxcd · 2025-02-27T07:44:23Z

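📣 PaddleSpeech Happy Open Source Event

The event aims to encourage more developers to take part in the open-source development of the PaddlePaddle large-model suite, helping the community fix bugs or contribute features and build PaddlePaddle together.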

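Task goals

Because of version changes, the documentation no longer keeps up with the code. The goals are:

Following the README runs end to end
Documentation matches the code
Writing errors in the documentation are fixed

Task 1: Fix incorrect parameters in synthesize_e2e.sh for the vocoder synthesis examples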

No.	File(s) to modify	Claimed by / Status / PR No.
1	examples/csmsc/voc1/local/synthesize_e2e.sh	@ZJhorseloudly
2	examples/csmsc/voc3/local/synthesize_e2e.sh	@ZJhorseloudly
3	examples/csmsc/voc5/local/synthesize_e2e.sh	@ZJhorseloudly

Task 2: Add the missing parameters to the scripts in the synthesis examples

No.	File(s) to modify	Claimed by / Status / PR No.
4	examples/aishell3/tts3/run.sh examples/aishell3/tts3/README.md	@enkilee #3998
5	examples/aishell3_vctk/ernie_sat/run.sh examples/aishell3_vctk/ernie_sat/README.md	@rich04lin #4005
6	examples/canton/tts3/run.sh examples/canton/tts3/README.md	@Echo-Nie #4004
7	examples/csmsc/tts0/run.sh examples/csmsc/tts0/README.md	@Echo-Nie #4008 @rich04lin #4007
8	examples/csmsc/tts2/run.sh examples/csmsc/tts2/README.md	@Echo-Nie #4008 @rich04lin #4009
9	examples/csmsc/tts3/run.sh examples/csmsc/tts3/README.md	@Echo-Nie #4008
10	examples/csmsc/tts3_rhy/run.sh examples/csmsc/tts3_rhy/README.md	@Echo-Nie #4008
11	examples/ljspeech/tts3/run.sh examples/ljspeech/tts3/README.md	@Echo-Nie #4010
12	examples/opencpop/svs1/run.sh examples/opencpop/svs1/README.md	@Echo-Nie #4012
13	examples/vctk/ernie_sat/run.sh examples/vctk/ernie_sat/README.md	@Echo-Nie #4013
14	examples/vctk/tts3/run.sh examples/vctk/tts3/README.md	@Echo-Nie #4013

Task 3: Fix writing errors in the text (updated on an ongoing basis)

No.	File(s) to modify	Claimed by / Status / PR No.
15	examples/csmsc/voc3/README.md	@Echo-Nie #4012 @rich04lin #4011

Task 1 modification example

Target: examples/*/voc*/local/synthesize_e2e.sh, for example examples/csmsc/voc1/local/synthesize_e2e.sh

The code in synthesize_e2e.sh is as follows:
python3 ${BIN_DIR}/../synthesize.py \
    --am=tacotron2_aishell3 \
    --am_config=${config_path} \
    --am_ckpt=${train_output_path}/checkpoints/${ckpt_name} \
    --am_stat=dump/train/speech_stats.npy \
    --voc=pwgan_aishell3 \
    --voc_config=pwg_aishell3_ckpt_0.5/default.yaml \
    --voc_ckpt=pwg_aishell3_ckpt_0.5/snapshot_iter_1000000.pdz \
    --voc_stat=pwg_aishell3_ckpt_0.5/feats_stats.npy \
    --test_metadata=dump/test/norm/metadata.jsonl \
    --output_dir=${train_output_path}/test \
    --phones_dict=dump/phone_id_map.txt \
    --speaker_dict=dump/speaker_id_map.txt \
    --voice-cloning=True

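These recipes train the vocoder (voc), not the acoustic model (am). The arguments that contain train_output_path should therefore be the voc-related ones (--voc, --voc_config, and so on), while the --am-related arguments should point to the corresponding files in the fastspeech2_nosil_baker_ckpt_0.4 folder, as described in examples/csmsc/voc1/README.md.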
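The swap described above can be sketched as a dry run that only prints the argument list, so it can be inspected without a trained model. The flag names come from the snippet above. The model names (`fastspeech2_csmsc`, `pwgan_csmsc`), the file names inside `fastspeech2_nosil_baker_ckpt_0.4`, the path `dump/train/feats_stats.npy`, and the helper name `print_synth_args` are assumptions for illustration and should be checked against examples/csmsc/voc1/README.md.

```shell
#!/bin/sh
# Print the corrected synthesis arguments, one per line (dry run only).
# $1: vocoder config, $2: vocoder training output dir, $3: vocoder checkpoint name.
print_synth_args() {
    config_path=$1
    train_output_path=$2
    ckpt_name=$3
    # Pretrained acoustic model folder named in the issue.
    am_dir=fastspeech2_nosil_baker_ckpt_0.4

    # Acoustic model: fixed pretrained files (model and file names are assumptions).
    printf '%s\n' \
        "--am=fastspeech2_csmsc" \
        "--am_config=${am_dir}/default.yaml" \
        "--am_ckpt=${am_dir}/snapshot_iter_76000.pdz" \
        "--am_stat=${am_dir}/speech_stats.npy"

    # Vocoder: the model that was just trained, so these use the script arguments.
    printf '%s\n' \
        "--voc=pwgan_csmsc" \
        "--voc_config=${config_path}" \
        "--voc_ckpt=${train_output_path}/checkpoints/${ckpt_name}" \
        "--voc_stat=dump/train/feats_stats.npy"
}

# Example values are placeholders, not real checkpoints.
print_synth_args conf/default.yaml exp/default snapshot_iter_400000.pdz
```

The point of the sketch is the direction of the swap: everything derived from `train_output_path` lands on the `--voc*` side, and the `--am*` side no longer depends on the script arguments.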

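Task 2 modification example

Target: examples/*/*/run.sh, examples/*/*/README.md
Some synthesize_e2e.sh and synthesize.sh scripts support inference with several models by changing stage, but the parameter is not exposed in the corresponding run.sh and README.md. The parameter and its description need to be added to both.
For example, examples/aishell3/tts3/local/synthesize_e2e.sh uses stage to choose between pwgan and hifigan for inference.

Change in run.sh: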

if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
    # synthesize, vocoder is pwgan by default stage 0, stage 1 will use hifigan as vocoder
    CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
fi

if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
    # synthesize_e2e, vocoder is pwgan by default stage 0, stage 1 will use hifigan as vocoder
    CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
fi

Change in README.md:

`./local/synthesize.sh` calls `${BIN_DIR}/../synthesize.py`, which can synthesize waveform from `metadata.jsonl`. 

CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}

`--stage` selects the vocoder used during synthesis: `0` uses `pwgan` and `1` uses `hifigan`.
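The mapping from stage to vocoder can be sketched as a small helper. This is only an illustration of the documented behaviour (0 for pwgan, 1 for hifigan); the function name `vocoder_for_stage` is hypothetical, and the real local/synthesize.sh scripts may select the vocoder differently.

```shell
#!/bin/sh
# Hypothetical helper: map a --stage value to the vocoder described in the README text.
vocoder_for_stage() {
    case "$1" in
        0) echo pwgan ;;
        1) echo hifigan ;;
        *) echo "unsupported stage: $1" >&2; return 1 ;;
    esac
}

# Show the vocoder chosen for each supported stage.
for s in 0 1; do
    printf '%s\n' "--stage ${s} -> $(vocoder_for_stage "${s}")"
done
```

Rejecting any other value with a non-zero status makes a mistyped stage fail early instead of silently falling back to the default vocoder.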

Task 3 modification example

Modify examples/csmsc/voc3/README.md

HiFiGAN checkpoint contains files listed below.
mb_melgan_csmsc_ckpt_0.1.1
├── default.yaml                    # default config used to train MultiBand MelGAN
├── feats_stats.npy                 # statistics used to normalize spectrogram when training MultiBand MelGAN
└── snapshot_iter_1000000.pdz       # generator parameters of MultiBand MelGAN

This README.md downloads the MultiBand MelGAN model, but the heading of the file list says HiFiGAN.
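A mismatch of this kind can be spotted mechanically. The following is a rough sketch tailored to this one case, assuming the heading line starts with the model name and the checkpoint directory line starts with `mb_melgan_`; the helper name `check_readme` is hypothetical.

```shell
#!/bin/sh
# Hypothetical check: report a README whose checkpoint heading says HiFiGAN
# while the listed checkpoint directory is a MultiBand MelGAN one (mb_melgan_*).
check_readme() {
    readme=$1
    if grep -q '^HiFiGAN checkpoint contains' "$readme" &&
       grep -q '^mb_melgan_' "$readme"; then
        echo "mismatch: $readme"
        return 1
    fi
    return 0
}

# Demo on a temporary file holding the two lines quoted above.
tmp=$(mktemp)
printf '%s\n' \
    'HiFiGAN checkpoint contains files listed below.' \
    'mb_melgan_csmsc_ckpt_0.1.1' > "$tmp"
check_readme "$tmp" || echo "heading needs fixing"
rm -f "$tmp"
```

To scan a checkout, the same function could be called in a loop over examples/*/voc*/README.md; other model pairs would need their own patterns.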

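How to claim

Claim tasks by leaving a comment (报名 means "sign up"), for example:

【报名】：1、3、2-3

Separate multiple tasks with the Chinese enumeration comma (、); a run of consecutive tasks can be written with a hyphen, e.g. 1-2.
PR format: start the PR title with 【PaddleSpeech No.xxx】 to indicate the task number.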

Dashboard

Track	Tasks	Submissions / Claims	Submission rate	Completed	Completion rate
PaddleSpeech Happy Open Source Event	15	12 / 15	80.0%	3	20.0%

Statistics

In no particular order: @enkilee (1) @Echo-Nie (1) @rich04lin (1)

enkilee · 2025-03-03T03:48:37Z

【报名】: 4

Echo-Nie · 2025-03-14T12:28:46Z

【报名】: 6-15

rich04lin · 2025-03-14T13:05:18Z

【报名】: 5

ZJhorseloudly · 2025-03-26T02:58:47Z

【报名】: 1-3

luotao1 assigned luotao1 and zxcd Feb 27, 2025

luotao1 added this to Call for Contributions Feb 27, 2025

luotao1 moved this to In Progress in Call for Contributions Feb 27, 2025

luotao1 pinned this issue Feb 28, 2025

sunzhongkai588 mentioned this issue Mar 7, 2025

【HACKATHON Prep Camp】 PaddlePaddle Launch Program Training Camp (5th session) PaddlePaddle/Paddle#71491

Open

This was referenced Mar 26, 2025

task_1_voc1 #4035

Closed

【PaddleSpeech No.1-3】 #4036

Open
