What does the trainset_preprocess_pipeline_print.py file do? #124

Closed
TinaChen95 opened this issue Apr 22, 2023 · 6 comments
Labels
documentation 📄文档说明 help wanted 🚸请求协助

Comments

@TinaChen95

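Hi! I'm new to signal processing and would like to ask: what does this file do?
It appears to split the audio into many short segments and remove long silences.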

Also, what does your model architecture look like, and are there any related materials I could study?
Thanks for your answer!

@fumiama added the documentation 📄文档说明 and help wanted 🚸请求协助 labels on Apr 22, 2023
@RVC-Boss
Member

It also does loudness normalization.

@RVC-Boss
Member

The material currently available: https://www.bilibili.com/video/BV1pm4y1z7Gm

@TinaChen95
Author

Thanks! Is it correct to understand it this way:

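  1. Run VAD on the audio using an RMS threshold, splitting it into multiple short segments
  2. Cut audio clips using a window length of 3.7 s with an overlap of 0.3 s
  3. Within each clip, normalize by amplitude, rescale to 0.8, and save two versions: one at the original sample rate and one at 16 kHz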
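
The three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the repository's actual code: the function names, the frame length, and the threshold are hypothetical; the hop is assumed to be window minus overlap (3.7 s - 0.3 s = 3.4 s); and "rescale to 0.8" is read as plain peak normalization. Resampling to 16 kHz and writing files are omitted.

```python
import math


def frame_rms(samples, frame_len):
    """Return the RMS of each consecutive, non-overlapping frame."""
    values = []
    for i in range(0, len(samples), frame_len):
        frame = samples[i:i + frame_len]
        values.append(math.sqrt(sum(x * x for x in frame) / len(frame)))
    return values


def split_on_silence(samples, frame_len, threshold):
    """Step 1 (assumed form): keep runs of frames whose RMS >= threshold.

    Returns a list of (start, end) sample index ranges.
    """
    segments, start = [], None
    for idx, rms in enumerate(frame_rms(samples, frame_len)):
        pos = idx * frame_len
        if rms >= threshold and start is None:
            start = pos                      # a voiced run begins
        elif rms < threshold and start is not None:
            segments.append((start, pos))    # the voiced run ends
            start = None
    if start is not None:
        segments.append((start, len(samples)))
    return segments


def window_segment(start, end, sr, win_s=3.7, overlap_s=0.3):
    """Step 2: cut [start, end) into win_s-second windows overlapping by overlap_s."""
    win = round(win_s * sr)
    hop = round((win_s - overlap_s) * sr)    # assumed hop: window minus overlap
    chunks, pos = [], start
    while pos < end:
        chunks.append((pos, min(pos + win, end)))
        if pos + win >= end:                 # last (possibly shorter) window
            break
        pos += hop
    return chunks


def peak_normalize(chunk, target=0.8):
    """Step 3 (assumed form): scale so the largest absolute sample equals target."""
    peak = max(abs(x) for x in chunk)
    if peak == 0:
        return list(chunk)                   # all-zero chunk: nothing to scale
    return [x / peak * target for x in chunk]
```

With a 1000 Hz sample rate and a voiced region spanning samples 500 to 8500, `window_segment(500, 8500, 1000)` yields three chunks whose neighbours share 300 samples (0.3 s), and the last chunk is simply shorter than 3.7 s.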

How does this VAD compare with webrtcvad in terms of performance?

@RVC-Boss
Member

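Your understanding of points 1, 2 and 3 is correct.

I haven't compared them, and it doesn't matter much. The main purpose is to remove silence, which trims the training set and speeds up training.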

@RVC-Boss
Member

Missing a bit of silence, or making one cut too many, won't have much impact either way.

@TinaChen95
Author

Got it, thanks for the explanation!

3 participants