**Maintained by [justinjohn0306](https://github.com/justinjohn0306)**

## Before training

This notebook saves the last 3 generations of models to Google Drive. Each generation is larger than 1 GB, so you need at least 3 GB of free space in Google Drive. If you do not have that much free space, consider creating another Google Account.
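The space requirement can be checked before training starts. The following is a minimal sketch that uses only the standard library; the helper names and the 3 GiB default are illustrative. Whether the reported figure matches your actual Google Drive quota depends on how the mount reports free space, so treat the result as a hint rather than a guarantee.

```python
import shutil

GIB = 1024 ** 3  # bytes in one GiB


def free_gib(path: str) -> float:
    """Return the free space, in GiB, of the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / GIB


def has_room_for_checkpoints(path: str, required_gib: float = 3.0) -> bool:
    """Return True if `path` has at least `required_gib` GiB free.

    The 3 GiB default mirrors the "3 generations of >1GB each" note above.
    """
    return free_gib(path) >= required_gib
```

After mounting Drive, a call such as `has_room_for_checkpoints("/content/drive/MyDrive")` would report whether the folder appears to have enough room.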

Training requires more than 10 GB of VRAM (a T4 should be enough). Inference needs much less VRAM.
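To compare the assigned GPU against the 10 GB figure programmatically, one option is to parse the CSV output of `nvidia-smi`. This is a sketch: the function names and the threshold constant are illustrative, and `nvidia-smi` reports memory in MiB, so the comparison is only approximate.

```python
import subprocess

# Ask nvidia-smi for one "name, total memory in MiB" line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.total",
    "--format=csv,noheader,nounits",
]

MIN_TRAINING_MIB = 10 * 1024  # the ">10GB" guideline above, expressed in MiB


def parse_gpu_memory(csv_text: str) -> list:
    """Parse lines such as 'Tesla T4, 15360' into (name, total_mib) tuples."""
    gpus = []
    for line in csv_text.strip().splitlines():
        name, mib = line.rsplit(",", 1)
        gpus.append((name.strip(), int(mib)))
    return gpus


def query_gpus() -> list:
    """Run nvidia-smi and return the parsed GPU list (requires an NVIDIA driver)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(result.stdout)


def enough_for_training(gpus: list, min_mib: int = MIN_TRAINING_MIB) -> bool:
    """Return True if any listed GPU has more than `min_mib` MiB of memory."""
    return any(total > min_mib for _, total in gpus)
```

In a Colab runtime with a GPU attached, `enough_for_training(query_gpus())` gives a quick yes/no answer.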

## Installation

In [None]:
#@title Check GPU
!nvidia-smi

In [None]:
#@title Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

In [None]:
#@title Install dependencies
#@markdown pip may fail to resolve some dependencies and print an ERROR; this can be ignored.
!python -m pip install -U pip wheel
%pip install -U ipython 

#@markdown Branch (for development)
BRANCH = "none" #@param {"type": "string"}
if BRANCH == "none":
    %pip install -U so-vits-svc-fork
else:
    %pip install -U git+https://github.com/34j/so-vits-svc-fork.git@{BRANCH}

## Training

In [None]:
#@title Make dataset directory
%cd /content

!mkdir -p "dataset_raw"

In [None]:
#@title Create the dataset directory on your Google Drive
#@markdown Upload your dataset zip file to the ``so-vits-svc-fork/dataset`` folder on your Google Drive.

#@markdown Example:

#@markdown With ``test.zip`` as your singer/speaker dataset: 👇

#@markdown ``MyDrive/so-vits-svc-fork/dataset/test.zip``

import os

create_dir = True #@param {type: "boolean"}

if create_dir:
    # Create the project and dataset folders; folders that already exist are left untouched.
    os.makedirs('/content/drive/MyDrive/so-vits-svc-fork/dataset/', exist_ok=True)

print('Done!')

In [None]:
#@title Copy your dataset
#@markdown **This cell assumes you have uploaded your dataset zip to the `so-vits-svc-fork/dataset/` directory in your Google Drive.**
DATASET_NAME = "test.zip" #@param {type: "string"}
!unzip "/content/drive/MyDrive/so-vits-svc-fork/dataset/{DATASET_NAME}" -d /content/dataset_raw/

In [None]:
#@title Download dataset (Tsukuyomi-chan JVS)
#@markdown You can download this dataset if you don't have your own dataset.
#@markdown Make sure you agree to the license when using this dataset.
#@markdown https://tyc.rei-yumesaki.net/material/corpus/#toc6
# !wget https://tyc.rei-yumesaki.net/files/sozai-tyc-corpus1.zip
# !unzip sozai-tyc-corpus1.zip
# !mv "/content/つくよみちゃんコーパス Vol.1 声優統計コーパス（JVSコーパス準拠）/おまけ：WAV（+12dB増幅＆高音域削減）/WAV（+12dB増幅＆高音域削減）" "dataset_raw/tsukuyomi"
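Before preprocessing, it can help to confirm what ended up in `dataset_raw`. The sketch below assumes a layout of one subfolder per speaker containing WAV files (`dataset_raw/<speaker>/*.wav`), which is inferred from the `dataset_raw/tsukuyomi` example above rather than stated by the tool; the function name is illustrative.

```python
from pathlib import Path


def count_wavs_per_speaker(root: str) -> dict:
    """Map each speaker folder under `root` to its number of .wav files.

    Assumed layout: <root>/<speaker>/**/*.wav. Files placed directly in
    `root` belong to no speaker folder and are not counted.
    """
    root_path = Path(root)
    if not root_path.is_dir():
        return {}
    counts = {}
    for speaker_dir in sorted(p for p in root_path.iterdir() if p.is_dir()):
        counts[speaker_dir.name] = sum(
            1
            for f in speaker_dir.rglob("*")
            if f.is_file() and f.suffix.lower() == ".wav"
        )
    return counts
```

Calling `count_wavs_per_speaker("/content/dataset_raw")` after unzipping should list every speaker with a non-zero count; an empty result or a zero count suggests the zip was structured differently.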

# **Preprocessing**

In [None]:
#@markdown ## Resample audio
!svc pre-resample

In [None]:
#@markdown ## Generate config and filelists
!svc pre-config -t so-vits-svc-4.0v1

In [None]:
#@markdown ## Copy the config file to your Google Drive

!cp configs/44k/config.json drive/MyDrive/so-vits-svc-fork

In [None]:
#@markdown ## Extract HuBERT features and F0

F0_METHOD = "crepe" #@param ["crepe", "crepe-tiny", "parselmouth", "dio", "harvest"]
!svc pre-hubert -fm {F0_METHOD}

# **Training**

In [None]:
#@markdown ## Start training
%load_ext tensorboard
%tensorboard --logdir drive/MyDrive/so-vits-svc-fork/logs/44k
!svc train --model-path drive/MyDrive/so-vits-svc-fork/logs/44k
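Training writes its checkpoints to `drive/MyDrive/so-vits-svc-fork/logs/44k`. To see which generator checkpoint is the most recent, a small helper can sort them by step number. This is a sketch that assumes generator files are named `G_<step>.pth`, matching the `G_166400.pth` file used in the pretrained-model section of this notebook; the function names are illustrative.

```python
import re
from pathlib import Path

# Assumed naming scheme: G_<step>.pth, for example G_166400.pth.
_GENERATOR_CKPT = re.compile(r"G_(\d+)\.pth$")


def generator_checkpoints(log_dir: str) -> list:
    """Return generator checkpoints in `log_dir`, sorted by ascending step."""
    found = []
    for path in Path(log_dir).glob("G_*.pth"):
        match = _GENERATOR_CKPT.match(path.name)
        if match:  # skips names such as G_riri_220.pth
            found.append((int(match.group(1)), path))
    # Sort numerically so that step 800 comes before step 1600.
    return [path for _, path in sorted(found, key=lambda item: item[0])]


def latest_checkpoint(log_dir: str):
    """Return the highest-step generator checkpoint, or None if there is none."""
    checkpoints = generator_checkpoints(log_dir)
    return checkpoints[-1] if checkpoints else None
```

For example, `latest_checkpoint("drive/MyDrive/so-vits-svc-fork/logs/44k")` returns a `Path` that can be inspected or backed up.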

## Training Cluster model

In [None]:
#@markdown ## Start training cluster model

!svc train-cluster --output-path drive/MyDrive/so-vits-svc-fork/logs/44k/kmeans.pt

# **Inference**

In [None]:
#@markdown ## Get the author's voice as a source
import random
NAME = str(random.randint(1, 100))
!wget -N "https://github.com/34j/34j/raw/main/jvs-parallel100/{NAME}.wav"
from IPython.display import Audio, display
display(Audio(f"{NAME}.wav"))

In [None]:
#@markdown ## Use trained model
!svc infer {NAME}.wav -m drive/MyDrive/so-vits-svc-fork/logs/44k/ -c drive/MyDrive/so-vits-svc-fork/logs/44k/config.json
display(Audio(f"{NAME}.out.wav", autoplay=True))

In [None]:
#@markdown ## Use trained model (with cluster)
!svc infer {NAME}.wav -s speaker -r 0.1 -m drive/MyDrive/so-vits-svc-fork/logs/44k/ -c drive/MyDrive/so-vits-svc-fork/logs/44k/config.json -k drive/MyDrive/so-vits-svc-fork/logs/44k/kmeans.pt
display(Audio(f"{NAME}.out.wav", autoplay=True))

### Pretrained models

In [None]:
#@markdown ## https://huggingface.co/TachibanaKimika/so-vits-svc-4.0-models/tree/main
!wget -N "https://huggingface.co/TachibanaKimika/so-vits-svc-4.0-models/resolve/main/riri/G_riri_220.pth"
!wget -N "https://huggingface.co/TachibanaKimika/so-vits-svc-4.0-models/resolve/main/riri/config.json"

In [None]:
!svc infer {NAME}.wav -c config.json -m G_riri_220.pth
display(Audio(f"{NAME}.out.wav", autoplay=True))

In [None]:
#@markdown ## https://huggingface.co/therealvul/so-vits-svc-4.0/tree/main
!wget -N "https://huggingface.co/therealvul/so-vits-svc-4.0/resolve/main/Pinkie%20(speaking%20sep)/G_166400.pth"
!wget -N "https://huggingface.co/therealvul/so-vits-svc-4.0/resolve/main/Pinkie%20(speaking%20sep)/config.json"
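The download URLs above follow the pattern `https://huggingface.co/<repo>/resolve/main/<path>`, with spaces in the path percent-encoded as `%20`. A sketch of building such a URL for another file in a repository follows; the helper name is illustrative, and keeping parentheses unescaped simply mirrors the URLs used in this notebook.

```python
from urllib.parse import quote


def hf_resolve_url(repo_id: str, file_path: str, revision: str = "main") -> str:
    """Build a direct-download URL in the form used by the wget cells above."""
    # Keep "/" as the path separator and leave parentheses as they are;
    # spaces become %20.
    encoded = quote(file_path, safe="/()")
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{encoded}"
```

For instance, `hf_resolve_url("therealvul/so-vits-svc-4.0", "Pinkie (speaking sep)/config.json")` reproduces the second URL in the cell above.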

In [None]:
!svc infer {NAME}.wav --speaker "Pinkie {neutral}" -c config.json -m G_166400.pth
display(Audio(f"{NAME}.out.wav", autoplay=True))
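The `--speaker` value has to match a speaker name the model was trained with, such as `"Pinkie {neutral}"` above. The sketch below lists candidate names from a downloaded `config.json`. It assumes the config stores speakers as a name-to-id mapping under a top-level `"spk"` key; that key is an assumption about the file layout, so check the file itself if the list comes back empty.

```python
import json


def list_speakers(config_path: str) -> list:
    """Return the speaker names found in a model config, sorted alphabetically.

    Assumption: speakers live under a top-level "spk" key as {name: id}.
    Returns an empty list when that key is absent.
    """
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)
    return sorted(config.get("spk", {}))
```

Running `list_speakers("config.json")` after the download cell shows which strings are valid for `--speaker`.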