After fine-tuning TTS, how do I use style_fs2 to change the speaking rate? I followed the style_fs2 docs and swapped in my own fastspeech2 model, but I keep getting OOM #2504

kq-cheng · 2022-10-08T07:52:56Z

python3 style_syn.py \
    --fastspeech2-config=/opt/ml/server/PaddleSpeechserver/demos/speech_web/speech_server/tmp_dir/finetune/trans/exp/default.yaml \
    --fastspeech2-checkpoint=/opt/ml/server/PaddleSpeechserver/demos/speech_web/speech_server/tmp_dir/finetune/trans/exp/checkpoints/snapshot_iter_120292.pdz \
    --fastspeech2-stat=download/fastspeech2_nosil_baker_ckpt_0.4/speech_stats.npy \
    --fastspeech2-pitch-stat=download/fastspeech2_nosil_baker_ckpt_0.4/pitch_stats.npy \
    --fastspeech2-energy-stat=download/fastspeech2_nosil_baker_ckpt_0.4/energy_stats.npy \
    --pwg-config=download/pwg_baker_ckpt_0.4/pwg_default.yaml \
    --pwg-checkpoint=download/pwg_baker_ckpt_0.4/pwg_snapshot_iter_400000.pdz \
    --pwg-stat=download/pwg_baker_ckpt_0.4/pwg_stats.npy \
    --text=/opt/ml/server/PaddleSpeechserver/demos/speech_web/speech_server/tmp_dir/finetune/trans/sentences.txt \
    --output-dir=output \
    --phones-dict=download/fastspeech2_nosil_baker_ckpt_0.4/phone_id_map.txt

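Here fastspeech2-config and fastspeech2-checkpoint point to my own fine-tuned model; everything else is unchanged. I don't know how to generate those .npy files. Are they what is causing the OOM? Enabling GPU memory auto-growth doesn't help, and switching to CPU also fails with a memory error.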

yt605155624 · 2022-10-08T08:02:17Z

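"Those .npy files" and phone_id_map.txt should come from the pretrained model you fine-tuned from, which here means the files in fastspeech2_aishell3_ckpt_1.1.0.zip.
The fine-tuned model is a multi-speaker model, whereas style_fs2 itself is written for a single-speaker model. A multi-speaker model takes an extra spk_id input and also needs speaker_id_map.txt. For details, compare synthesize_e2e.sh in csmsc/tts3 with the one in aishell3/tts3, and modify the code accordingly.
Also, if you only want to change the speaking rate, you can refer directly to #2383.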
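To make the speaker-map point concrete, here is a minimal sketch of reading a speaker_id_map.txt-style file to get the speaker count that a multi-speaker model needs. It assumes each line holds a speaker name and an integer id separated by whitespace. `load_speaker_map` is a hypothetical helper written for illustration, not a PaddleSpeech function, and the speaker names in the demo are made up.

```python
import os
import tempfile


def load_speaker_map(path):
    """Parse a speaker_id_map.txt-style file into {speaker_name: speaker_id}.

    Assumption: one "<speaker_name> <integer_id>" pair per line.
    """
    speaker_to_id = {}
    with open(path, "rt", encoding="utf-8") as f:
        for line in f:
            parts = line.strip().split()
            if not parts:
                continue  # skip blank lines
            speaker_to_id[parts[0]] = int(parts[1])
    return speaker_to_id


# Demo with a throwaway file; the speaker names are invented for this example.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "speaker_id_map.txt")
    with open(demo_path, "wt", encoding="utf-8") as f:
        f.write("speaker_a 0\nspeaker_b 1\n\nspeaker_c 2\n")
    speaker_map = load_speaker_map(demo_path)

# The number of entries is what a multi-speaker model would take as its speaker count.
spk_num = len(speaker_map)
print(spk_num, speaker_map["speaker_b"])
```

The speaker count has to match the one the checkpoint was trained with, which is why the map should come from the same pretrained package as the other files.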

kq-cheng · 2022-10-09T06:47:35Z

OK, I'll give it a try. Thanks.

Christophy · 2022-11-25T07:51:42Z

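Roughly like this, for the reference of anyone who comes later: adding speaker_id_map.txt and spk_id is enough. Without them the synthesized voice sounds very strange.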

with open("/home/test/speaker_id_map.txt", 'rt') as f:
    spk_id = [line.strip().split() for line in f.readlines()]
spk_num = len(spk_id)
odim = am_config.n_mels
model = FastSpeech2(
    idim=vocab_size,
    odim=odim,
    spk_num=spk_num,
    **am_config["model"],
)

am_inference = StyleFastSpeech2Inference(
    fastspeech2_normalizer,
    model,
    pwd + "/fastspeech2_mix_ckpt_1.2.0/pitch_stats.npy",
    pwd + "/fastspeech2_mix_ckpt_1.2.0/energy_stats.npy",
)
am_inference.eval()

spk_id = paddle.to_tensor(0)
mel = am_inference(
    part_phone_ids,
    durations=None,
    durations_scale=1.2,
    durations_bias=None,
    pitch=None,
    pitch_scale=1.3,
    pitch_bias=None,
    energy=None,
    energy_scale=1.3,
    energy_bias=None,
    robot=False,
    spk_id=spk_id,
)
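For readers wondering how durations_scale relates to speaking rate, the following is a rough pure-Python sketch of the idea, not PaddleSpeech's implementation. It assumes the scale multiplies the predicted per-phone frame counts before each phone is expanded into frames, so a value above 1 gives more frames and slower speech. `scale_durations` and `expand_by_duration` are hypothetical names, and the phone ids and durations are invented.

```python
def scale_durations(durations, durations_scale=1.0):
    """Multiply per-phone frame counts by a scale and round to whole frames."""
    return [int(round(d * durations_scale)) for d in durations]


def expand_by_duration(phone_ids, durations):
    """Repeat each phone id by its frame count, as a length regulator would."""
    frames = []
    for phone_id, n_frames in zip(phone_ids, durations):
        frames.extend([phone_id] * n_frames)
    return frames


# Invented example: three phones with predicted durations of 2, 4 and 6 frames.
phone_ids = [11, 22, 33]
predicted = [2, 4, 6]

normal = expand_by_duration(phone_ids, scale_durations(predicted, 1.0))
slower = expand_by_duration(phone_ids, scale_durations(predicted, 1.5))

# More frames for the same phones means the utterance takes longer to play.
print(len(normal), len(slower))
```

Under this assumption a scale below 1 shortens every phone and speeds the speech up, while pitch_scale and energy_scale act on other predicted features and leave the frame count unchanged.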

SG-XM · 2022-12-08T15:14:05Z

Hello, could you share your complete code for reference? I made the changes from the comment above and still get a GPU OOM (see #2722).

kq-cheng added the Question label Oct 8, 2022

yt605155624 self-assigned this Oct 8, 2022

yt605155624 added the T2S label Oct 8, 2022

kq-cheng closed this as completed Oct 9, 2022
