Unofficial implementation of VISinger

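The pretrained model on the main branch was trained at 44100 Hz. Training has been stopped because it consumed too much compute.

The model also has severe rhythm problems. The dev branch fixes some of them, but the rhythm problems are still significant.

At present, synthesizing a long passage of singing in one pass gives very poor results, so synthesis has to be done segment by segment.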
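The segment-by-segment workaround can be sketched roughly as follows. This is only an illustration, not code from this repository: the `(label, duration)` note format, the function names, and the 10-second limit are all assumptions, and `fake_synthesize` is a stand-in that returns silence where the real model call would go.

```python
from typing import Callable, List, Sequence, Tuple

Note = Tuple[str, float]  # (label, duration in seconds); hypothetical score format


def split_into_segments(notes: Sequence[Note], max_seconds: float) -> List[List[Note]]:
    """Greedily group consecutive notes so each segment lasts at most max_seconds.

    A single note longer than max_seconds is kept whole in its own segment.
    """
    segments: List[List[Note]] = []
    current: List[Note] = []
    current_len = 0.0
    for note in notes:
        duration = note[1]
        # Close the running segment if adding this note would exceed the limit.
        if current and current_len + duration > max_seconds:
            segments.append(current)
            current, current_len = [], 0.0
        current.append(note)
        current_len += duration
    if current:
        segments.append(current)
    return segments


def synthesize_long(
    notes: Sequence[Note],
    synthesize: Callable[[Sequence[Note]], List[float]],
    max_seconds: float = 10.0,
) -> List[float]:
    """Synthesize each segment separately and concatenate the waveforms."""
    audio: List[float] = []
    for segment in split_into_segments(notes, max_seconds):
        audio.extend(synthesize(segment))
    return audio


def fake_synthesize(segment: Sequence[Note], sample_rate: int = 100) -> List[float]:
    """Stand-in for the real model: returns silence of the matching length."""
    n_samples = round(sum(duration for _, duration in segment) * sample_rate)
    return [0.0] * n_samples


score = [("a", 4.0), ("b", 4.0), ("c", 4.0), ("d", 12.0), ("e", 1.0)]
segments = split_into_segments(score, max_seconds=10.0)
audio = synthesize_long(score, fake_synthesize, max_seconds=10.0)
print([len(s) for s in segments], len(audio))
```

Splitting only at note boundaries keeps each note intact. In practice it is probably better to cut at rests or phrase ends, since plain concatenation can leave audible discontinuities at the joins.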

Paper: VISinger: Variational Inference with Adversarial Learning for End-to-End Singing Voice Synthesis

In this paper, we propose VISinger, a complete end-to-end high-quality singing voice synthesis (SVS) system that directly generates audio waveform from lyrics and musical score. Our approach is inspired by VITS, which adopts VAE-based posterior encoder augmented with normalizing flow-based prior encoder and adversarial decoder to realize complete end-to-end speech generation. VISinger follows the main architecture of VITS, but makes substantial improvements to the prior encoder based on the characteristics of singing. First, instead of using phoneme-level mean and variance of acoustic features, we introduce a length regulator and a frame prior network to get the frame-level mean and variance on acoustic features, modeling the rich acoustic variation in singing. Second, we further introduce an F0 predictor to guide the frame prior network, leading to stabler singing performance. Finally, to improve the singing rhythm, we modify the duration predictor to specifically predict the phoneme to note duration ratio, helped with singing note normalization. Experiments on a professional Mandarin singing corpus show that VISinger significantly outperforms FastSpeech+Neural-Vocoder two-stage approach and the oracle VITS; ablation study demonstrates the effectiveness of different contributions.
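Two ideas from the abstract, the phoneme-to-note duration ratio and the length regulator, can be illustrated with a minimal sketch. It is not taken from this repository or from the paper's code: the function names, the per-note normalization of ratios, and the rounding scheme are assumptions chosen for clarity. In the described system the expanded frame-level sequence would then be passed to the frame prior network, which is not modeled here.

```python
from typing import List, Sequence


def ratios_to_frames(note_frames: int, ratios: Sequence[float]) -> List[int]:
    """Split a note's frame count among its phonemes in proportion to `ratios`.

    Cumulative boundaries are rounded, so the counts always sum to note_frames.
    (Assumed scheme; the paper only states that a ratio is predicted.)
    """
    total = sum(ratios)
    if total <= 0:
        raise ValueError("ratios must sum to a positive value")
    frames: List[int] = []
    cumulative = 0.0
    previous_boundary = 0
    for index, ratio in enumerate(ratios):
        cumulative += ratio / total
        if index == len(ratios) - 1:
            boundary = note_frames  # last phoneme absorbs any rounding error
        else:
            boundary = round(cumulative * note_frames)
        frames.append(boundary - previous_boundary)
        previous_boundary = boundary
    return frames


def length_regulate(
    phoneme_features: Sequence[Sequence[float]],
    frames_per_phoneme: Sequence[int],
) -> List[List[float]]:
    """Repeat each phoneme-level feature vector once per frame it spans."""
    if len(phoneme_features) != len(frames_per_phoneme):
        raise ValueError("need exactly one frame count per phoneme")
    frame_level: List[List[float]] = []
    for features, count in zip(phoneme_features, frames_per_phoneme):
        # Copy the vector for every frame so frames do not share one list.
        frame_level.extend(list(features) for _ in range(count))
    return frame_level


# One note lasting 8 frames, sung with two phonemes (consonant, then vowel).
frames = ratios_to_frames(8, [0.25, 0.75])
frame_features = length_regulate([[0.1, 0.2], [0.3, 0.4]], frames)
print(frames, len(frame_features))
```

Predicting a ratio instead of an absolute duration ties each phoneme's length to the note it belongs to, so the phoneme durations inside a note always add up to the note length given by the score.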

Pre-requisites

1. Python >= 3.6
2. Clone this repository
3. Install Python requirements. Please refer to requirements.txt
    1. You may need to install espeak first: apt-get install espeak
4. Download datasets
    1. Download and extract the Opencpop dataset, then rename or create a link to the dataset folder: ln -s /path/to/opencpop
5. Build Monotonic Alignment Search and run preprocessing if you use your own datasets.

# Cython-version Monotonic Alignment Search
cd monotonic_align
python setup.py build_ext --inplace

Training Example

# Opencpop
python train.py -c configs/ljs_base.json -m ljs_base


innnky/VISinger

Folders and files

Latest commit

History

Repository files navigation

Unofficial implementation of VISinger

Pre-requisites

Training Exmaple

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages