Question about the effect of vc #10

Open
tobefans opened this issue Mar 21, 2022 · 1 comment

Comments

@tobefans

I ran the code once on VCTK, but the voice conversion quality was poor.
Is any data preprocessing needed, such as VAD? I often see the warning: "PraatWarning: There were no voiced segments found."

@dhchoi99
Owner

The authors used two major preprocessing steps in the original paper:

  1. Information perturbation using Parselmouth

To this end, we propose to perturb the information included in input waveform x by using three functions that are 1. formant shifting (fs), 2. pitch randomization (pr), and 3. random frequency shaping using a parametric equalizer (peq)

  2. Dataset filtering

The speakers of train-clean-360 were included to the training set only when the total length of speech samples exceeds 15 minutes.
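
A minimal sketch of the dataset filtering described in the quote above, in case it helps. It assumes you have already collected a duration in seconds for every utterance; the `(speaker_id, path, duration_seconds)` tuple layout and the function name `filter_speakers` are made up for illustration and are not part of this repository. Only the 15-minute threshold comes from the paper.

```python
from collections import defaultdict

# Threshold quoted from the paper: total speech per speaker must exceed 15 minutes.
MIN_TOTAL_SECONDS = 15 * 60


def filter_speakers(utterances, min_total_seconds=MIN_TOTAL_SECONDS):
    """Keep only utterances from speakers whose total duration exceeds the threshold.

    `utterances` is an iterable of (speaker_id, path, duration_seconds) tuples.
    This layout is a hypothetical one chosen for the example.
    """
    utterances = list(utterances)

    # Sum the duration of all utterances per speaker.
    totals = defaultdict(float)
    for speaker_id, _path, duration in utterances:
        totals[speaker_id] += duration

    # "Exceeds" is read as strictly greater than the threshold.
    kept_speakers = {s for s, total in totals.items() if total > min_total_seconds}
    return [u for u in utterances if u[0] in kept_speakers]
```

For example, a speaker with utterances of 600 s and 400 s is kept (1000 s in total), while a speaker with a single 120 s utterance is dropped.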

For step 2, I have not implemented any such filtering, so applying it might help.
For step 1, which is where the warning "PraatWarning: There were no voiced segments found." comes from, the problem is quite complex. During perturbation (with my implementation), many different Praat and Parselmouth errors popped up, and I couldn't pin down their exact causes.
As an example, some wav files definitely contained human voice, but still threw "PraatWarning: There were no voiced segments found." during perturbation :(
So I ignored the warnings and trained anyway, but it might help to remove the audio files that trigger them.
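
One possible way to find the files that trigger the warning is sketched below. It assumes that Parselmouth surfaces Praat warnings through Python's standard `warnings` module, which the `PraatWarning:` prefix in the message suggests. The helper `split_by_warning` and its `perturb` argument are hypothetical: `perturb` stands in for whatever function applies the perturbation to a single file.

```python
import warnings


def split_by_warning(paths, perturb, needle="no voiced segments"):
    """Run `perturb(path)` on every path and separate clean files from flagged ones.

    `perturb` is a hypothetical callable standing in for the perturbation
    function. A path is flagged if any warning issued during the call
    contains `needle` in its message.
    """
    clean, flagged = [], []
    for path in paths:
        with warnings.catch_warnings(record=True) as caught:
            # Record every warning, even ones Python would normally show only once.
            warnings.simplefilter("always")
            perturb(path)
        hit = any(needle in str(w.message) for w in caught)
        (flagged if hit else clean).append(path)
    return clean, flagged
```

Running this once over the dataset before training gives a list of flagged files that can be excluded or inspected by hand. Because the perturbation is random, a file may pass on one run and warn on another, so a single pass is only a rough screen.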
