zengchang233/xiaoicesing2

Source code for the paper XiaoiceSing2 (Interspeech 2023)

Demo page

Notice

I am currently busy job-hunting. I will update the other modules, including HiFi-WaveGAN, once I have made my final decision.

Implementation (in development)

  • FastSpeech2-based generator
  • discriminator group, including segment discriminators and detail discriminators
  • ConvFFT block

Dataset and preparation

  • Opencpop (Chinese)
  • Kiritan (Japanese)
  • CSD (Korean)
  • M4Singer (Chinese)
  • NUS48E

Kaldi style preparation

  • wav.scp
  • utt2spk
  • spk2utt
  • text

./run.sh --start-stage 1 --stop-stage 1 # extract mel-spectrograms, f0, energy, and their statistics
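
For readers unfamiliar with Kaldi-style data directories, the four files listed above can be sketched as follows. This is a minimal illustration of the generic Kaldi conventions (one entry per line, an ID followed by a space and the value), not this repository's own preparation script. The helper name write_kaldi_files, the utterance IDs, and the paths are hypothetical, and the exact content this project expects in text (for example lyrics, phonemes, or note information) is not specified here, so treat that field as a placeholder.

```python
from collections import defaultdict
from pathlib import Path


def write_kaldi_files(utterances, out_dir):
    """Write wav.scp, utt2spk, spk2utt and text in the generic Kaldi layout.

    `utterances` is a list of (utt_id, spk_id, wav_path, transcript) tuples.
    This helper is hypothetical and only illustrates the file formats.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Kaldi tools expect entries sorted by utterance ID.
    utts = sorted(utterances, key=lambda u: u[0])
    spk2utt = defaultdict(list)

    with open(out_dir / "wav.scp", "w", encoding="utf-8") as wav_f, \
         open(out_dir / "utt2spk", "w", encoding="utf-8") as u2s_f, \
         open(out_dir / "text", "w", encoding="utf-8") as txt_f:
        for utt_id, spk_id, wav_path, transcript in utts:
            wav_f.write(f"{utt_id} {wav_path}\n")    # utterance ID -> audio path
            u2s_f.write(f"{utt_id} {spk_id}\n")      # utterance ID -> speaker ID
            txt_f.write(f"{utt_id} {transcript}\n")  # utterance ID -> transcript
            spk2utt[spk_id].append(utt_id)

    # spk2utt is the inverse mapping: speaker ID -> all of its utterance IDs.
    with open(out_dir / "spk2utt", "w", encoding="utf-8") as s2u_f:
        for spk_id in sorted(spk2utt):
            s2u_f.write(f"{spk_id} {' '.join(spk2utt[spk_id])}\n")


if __name__ == "__main__":
    # Hypothetical example entries; replace with real dataset metadata.
    write_kaldi_files(
        [
            ("spk1_0001", "spk1", "/data/spk1/0001.wav", "placeholder lyrics a"),
            ("spk1_0002", "spk1", "/data/spk1/0002.wav", "placeholder lyrics b"),
        ],
        "data/train",
    )
```

Prefixing each utterance ID with its speaker ID, as in the hypothetical IDs above, is the usual Kaldi convention because it keeps utt2spk and spk2utt consistently sorted.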

Training

./run.sh --start-stage 2 --stop-stage 2

Real and generated mel-spectrograms (145,600 training steps)

Real (left), XiaoiceSing (middle), XiaoiceSing2 (right)

L2 loss curves for the mel-spectrogram

L2 loss before post-processing (left), L2 loss after post-processing (right)
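
For reference, an L2 loss on mel-spectrograms is commonly written as the mean squared error over all frames and mel bins. The form below is a generic definition, not taken from this repository. The symbols are assumptions: Y is the ground-truth mel-spectrogram, Ŷ is the prediction (taken either before or after post-processing), T is the number of frames, and M is the number of mel bins. The normalization used by the actual training code may differ.

```latex
\mathcal{L}_{2} = \frac{1}{T M} \sum_{t=1}^{T} \sum_{m=1}^{M} \left( \hat{Y}_{t,m} - Y_{t,m} \right)^{2}
```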

Inference

./run.sh --start-stage 3 --stop-stage 3
