GlOttal-flow LPC Filter (GOLF)

Source code for the paper "Singing Voice Synthesis Using Differentiable LPC and Glottal-Flow-Inspired Wavetables", accepted at ISMIR 2023.

Training

  1. Install the Python requirements.
pip install -r requirements.txt
  2. Download the MPop600 dataset. The dataset is available by request only; please contact its third author, Yi-Jhe Lee, to get the raw files.

  3. Resample the data to 24 kHz.

python scripts/resample_dir.py **/f1/ output_dir --sr 24000
  4. Generate F0 labels (stored as .pv files).
python scripts/wav2f0.py output_dir
  5. Train with the configuration file config.yaml used in the paper (available under ckpts/).
python main.py fit --config config.yaml --data.init_args.wav_dir output_dir

Evaluation

Objective Evaluation

python main.py test --config config.yaml --ckpt_path checkpoint.ckpt --data.init_args.duration 6 --data.init_args.overlap 0 --data.init_args.batch_size 16

Real-Time Factor

python test_rtf.py config.yaml checkpoint.ckpt test.wav
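
For readers unfamiliar with the metric, the real-time factor is conventionally the time spent synthesizing divided by the duration of the audio produced, so values below 1 mean faster than real time. The sketch below illustrates that ratio only; it is not the code in test_rtf.py. The function name `real_time_factor`, the `synthesize` callable, and the `runs` parameter are hypothetical, and the 24 kHz default is taken from the resampling step above.

```python
import time


def real_time_factor(synthesize, num_samples, sample_rate=24000, runs=3):
    """Average wall-clock time of `synthesize()` divided by the audio duration.

    `synthesize` is a hypothetical zero-argument callable that renders
    `num_samples` samples of audio; it stands in for a model forward pass.
    """
    audio_seconds = num_samples / sample_rate
    elapsed = []
    for _ in range(runs):
        start = time.perf_counter()
        synthesize()
        elapsed.append(time.perf_counter() - start)
    # Average over several runs to smooth out timing noise.
    return (sum(elapsed) / len(elapsed)) / audio_seconds
```

When timing a GPU model, the callable would also need to wait for the device to finish (for example by synchronizing) before the timer stops; otherwise the measured time only covers kernel launches.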

Notebooks

  • MOS: computes the mean opinion score (MOS) from the rating file exported by GO Listen.
  • time-domain l2 experiment: the notebook used to conduct the time-domain L2 loss ablation study in the paper.
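
As a rough illustration of what a MOS computation involves, the sketch below averages a list of listener ratings and reports a 95% confidence half-width using a normal approximation. It does not reproduce the MOS notebook and makes no assumption about the GO Listen file format: the ratings are taken to be already loaded as plain numbers, and the name `mos_with_ci` is hypothetical.

```python
import math
import statistics


def mos_with_ci(ratings, z=1.96):
    """Return (mean, half_width) for a list of numeric ratings.

    The half-width uses the normal approximation z * s / sqrt(n),
    where s is the sample standard deviation.
    """
    mean = statistics.fmean(ratings)
    if len(ratings) < 2:
        # A single rating has no spread to estimate an interval from.
        return mean, 0.0
    sem = statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mean, z * sem


# Example with made-up ratings on a 1-5 scale.
mos, half_width = mos_with_ci([4, 5, 3, 4, 4])
print(f"MOS = {mos:.2f} ± {half_width:.2f}")
```

With few ratings per system, a t-distribution interval would be wider and arguably more appropriate than the normal approximation used here.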

Pre-trained Checkpoints

  • Female (f1)
  • Male (m1)