Skip to content

kdrkdrkdr/JK-VITS

Repository files navigation

JK-VITS: Bilingual-TTS (Japanese and Korean)

This Repository can speak Japanese even if you train with Korean dataset, and can speak Korean even if you train with Japanese dataset.
By transcribing pronunciation from Japanese to Korean and Korean to Japanese, the unstable voice produced when using the existing multilingual ipa cleaners has been improved.

Pre-requisites

  • A Windows/Linux system with a minimum of 16GB RAM.
  • A GPU with at least 12GB of VRAM.
  • Python >= 3.8
  • Anaconda installed.
  • PyTorch installed.
  • CUDA 11.7 installed.

Pytorch install command:

pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117

CUDA 11.7 Install: https://developer.nvidia.com/cuda-11-7-0-download-archive

CUDNN 11.x Install: https://developer.nvidia.com/rdp/cudnn-archive


Installation

  1. Create an Anaconda environment:
conda create -n jk python=3.8
  1. Activate the environment:
conda activate jk
  1. Clone this repository to your local machine:
git clone https://github.com/kdrkdrkdr/JK-VITS.git
  1. Navigate to the cloned directory:
cd JK-VITS
  1. Install the necessary dependencies:
pip install -r requirements.txt
pip install -U pyopenjtalk==0.2.0 --no-build-isolation

Preparing Dataset Example

  • Place the audio files as follows. .wav files are okay. The sample rate of the audio must be 44100 Hz.

  • Set configs.

    • If you train with japanese dataset, refer configs/ja.json
    • If you train with korean dataset, refer configs/ko.json
    • Make a config file by referring to these two files.
  • Write Transcripts.

    • If you train with japanese dataset / reference
    path/to/XXX.wav|[JA]こんいちわ。[JA]
    
    • If you train with korean dataset / reference
    path/to/XXX.wav|[KO]안녕하세요.[KO]
    
  • Preprocessing (g2p) for your own datasets. Preprocessed phonemes for your dataset.

python preprocess.py --filelists filelists/train.txt filelists/val.txt

Training Exmaple

python train.py -c configs/ko.json -m ko

Inference Exmaple

See inference.ipynb

Also, You can listen korean samples and japanese samples.


References

For more information, please refer to the following repositories: