## Before training

This notebook keeps the last 3 generations of models on Google Drive. One generation of models is larger than 1 GB, so you need at least 3 GB of free space on Google Drive. If you do not have that much free space, consider creating another Google account.
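
The free-space requirement above can be checked before training starts. The following is a minimal sketch using only the standard library; the helper name `has_enough_space` is hypothetical, and whether the mounted Drive folder reports its quota accurately through this call is an assumption.

```python
import shutil


def has_enough_space(path: str, required_gb: float = 3.0) -> bool:
    """Return True if the filesystem containing `path` has at least `required_gb` GB free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024**3


# In Colab, after mounting Drive, `path` would typically be "/content/drive/MyDrive"
# (assumption). "." is used here so that the sketch runs anywhere.
print(has_enough_space(".", required_gb=3.0))
```

If this prints `False` for the Drive path, free up space or switch accounts before running the training cells.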

Training requires more than 10 GB of VRAM (a T4 should be enough). Inference needs much less VRAM.
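
As a rough programmatic version of the VRAM check, the sketch below parses the output of `nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits`, which prints one number (in MiB) per GPU. The function names and the use of 10 GiB as the threshold are assumptions made for illustration, based on the figure quoted above.

```python
import subprocess


def parse_total_vram_mib(smi_output: str) -> list[int]:
    """Parse one integer (MiB) per non-empty line of nvidia-smi output."""
    return [int(line.strip()) for line in smi_output.splitlines() if line.strip()]


def query_total_vram_mib() -> list[int]:
    """Ask nvidia-smi for the total memory of each GPU (requires an NVIDIA driver)."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_total_vram_mib(result.stdout)


def enough_for_training(vram_mib: list[int], required_gib: float = 10.0) -> bool:
    """Return True if at least one GPU has `required_gib` GiB or more."""
    return any(mib >= required_gib * 1024 for mib in vram_mib)
```

On a GPU runtime, `enough_for_training(query_total_vram_mib())` gives a quick yes/no answer; on a CPU-only runtime, `query_total_vram_mib()` raises because `nvidia-smi` is missing or fails.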

NOTE: THIS IS NOT THE OFFICIAL COLAB. THIS IS THE [Meldone](https://github.com/Meldoner/so-vits-colab) VERSION OF THE COLAB.

## Installation

In [None]:
#@title Check GPU
!nvidia-smi

In [None]:
#@title Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

In [None]:
#@title Install dependencies
#@markdown pip may fail to resolve dependencies and print an ERROR; this can be ignored.
!python -m pip install -U pip wheel
%pip install -U ipython

#@markdown Branch (for development)
BRANCH = "none" #@param {"type": "string"}
if BRANCH == "none":
    %pip install -U so-vits-svc-fork
else:
    %pip install -U git+https://github.com/34j/so-vits-svc-fork.git@{BRANCH}

## Training

In [None]:
#@title Copy your dataset
#@markdown **We assume that your dataset is in your Google Drive's `so-vits-svc-fork/dataset/(speaker_name)` directory.**
!mkdir -p "dataset_raw"
DATASET_NAME = "kiritan" #@param {type: "string"}
!cp -R /content/drive/MyDrive/so-vits-svc-fork/dataset/{DATASET_NAME}/ -t "dataset_raw/"
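
After copying, it can help to confirm that the dataset landed where the preprocessing steps expect it. The sketch below counts `.wav` files per speaker folder. It assumes the layout `dataset_raw/(speaker_name)/` described above and assumes the audio files use a lowercase `.wav` extension; the helper name `count_wavs` is hypothetical.

```python
from pathlib import Path


def count_wavs(dataset_root: str) -> dict[str, int]:
    """Map each speaker directory under `dataset_root` to its number of .wav files."""
    root = Path(dataset_root)
    counts: dict[str, int] = {}
    if not root.is_dir():
        return counts  # nothing copied yet
    for speaker_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # rglob also finds files in nested subfolders of the speaker directory
        counts[speaker_dir.name] = sum(
            1 for f in speaker_dir.rglob("*.wav") if f.is_file()
        )
    return counts


print(count_wavs("dataset_raw"))
```

An empty dictionary, or a speaker with a count of 0, usually means that `DATASET_NAME` or the Drive path is wrong.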

In [None]:
#@title Automatic preprocessing
!svc pre-resample

In [None]:
!svc pre-config

In [None]:
#@title Customize your config
import json

# Read the config generated by `svc pre-config`
with open('configs/44k/config.json', 'r') as f:
    config = json.load(f)

log_interval = 200 #@param {type:"integer"}
eval_interval = 800 #@param {type:"integer"}
epochs = 10000 #@param {type:"integer"}
batch_size = 16 #@param {type:"integer"}
keep_ckpts = 3 #@param {type:"integer"}
ckpt_name_by_step = True #@param {type:"boolean"}

# Copy the form values above into the "train" section (no eval() needed)
config['train'].update({
    'log_interval': log_interval,
    'eval_interval': eval_interval,
    'epochs': epochs,
    'batch_size': batch_size,
    'keep_ckpts': keep_ckpts,
    'ckpt_name_by_step': ckpt_name_by_step,
})

# Write the updated config back to the same file
with open('configs/44k/config.json', 'w') as f:
    json.dump(config, f, indent=2)

In [None]:
#@title Copy config file to Drive
!cp configs/44k/config.json drive/MyDrive/so-vits-svc-fork

In [None]:
F0_METHOD = "crepe" #@param ["crepe", "crepe-tiny", "parselmouth", "dio", "harvest"]
!svc pre-hubert -fm {F0_METHOD}

In [None]:
#@title Train
%load_ext tensorboard
%tensorboard --logdir drive/MyDrive/so-vits-svc-fork/logs/44k
!svc train --model-path drive/MyDrive/so-vits-svc-fork/logs/44k
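
Since training writes checkpoints into the `logs/44k` folder on Drive, it can be useful to see which generator checkpoint is the newest. The sketch below assumes checkpoints are named `G_<step>.pth`, an assumption based on pretrained file names such as `G_166400.pth` used later in this notebook; the helper name `latest_generator_checkpoint` is hypothetical.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed naming scheme: G_<step>.pth, e.g. G_166400.pth
_CKPT_PATTERN = re.compile(r"^G_(\d+)\.pth$")


def latest_generator_checkpoint(model_dir: str) -> Optional[Path]:
    """Return the G_<step>.pth file with the highest step, or None if there is none."""
    directory = Path(model_dir)
    if not directory.is_dir():
        return None
    best_step, best_path = -1, None
    for path in directory.iterdir():
        match = _CKPT_PATTERN.match(path.name)
        if match and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), path
    return best_path


print(latest_generator_checkpoint("drive/MyDrive/so-vits-svc-fork/logs/44k"))
```

Steps are compared numerically, so `G_10000.pth` ranks above `G_800.pth` even though it sorts lower as a string.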

## Training the cluster model

In [None]:
!svc train-cluster --output-path drive/MyDrive/so-vits-svc-fork/logs/44k/kmeans.pt

## Inference

In [None]:
#@title Get the author's voice as a source
import random
NAME = str(random.randint(1, 100))
!wget -N "https://github.com/34j/34j/raw/main/jvs-parallel100/{NAME}.wav"
from IPython.display import Audio, display
display(Audio(f"{NAME}.wav"))

In [None]:
#@title Use trained model
#@markdown **Put your .wav file in `so-vits-svc-fork/audio` directory**
from IPython.display import Audio, display
NAME = 'your audio file name' #@param {type: "string"}
!svc infer drive/MyDrive/so-vits-svc-fork/audio/{NAME}.wav -m drive/MyDrive/so-vits-svc-fork/logs/44k/ -c drive/MyDrive/so-vits-svc-fork/logs/44k/config.json
display(Audio(f"drive/MyDrive/so-vits-svc-fork/audio/{NAME}.out.wav", autoplay=True))
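
The cell above expects `{NAME}.wav` to exist in the `audio` folder and plays back `{NAME}.out.wav` afterwards. A small check like the following sketch can catch a wrong `NAME` before inference runs; the helper names are hypothetical, and the `.out.wav` suffix is taken from the cell above rather than from the tool's documentation.

```python
from pathlib import Path


def inference_paths(audio_dir: str, name: str) -> tuple[Path, Path]:
    """Return (source, output) paths for a given audio name without extension."""
    source = Path(audio_dir) / f"{name}.wav"
    output = Path(audio_dir) / f"{name}.out.wav"
    return source, output


def require_source(audio_dir: str, name: str) -> Path:
    """Raise FileNotFoundError early if the source .wav file is missing."""
    source, _ = inference_paths(audio_dir, name)
    if not source.is_file():
        raise FileNotFoundError(f"Expected input audio at {source}")
    return source
```

For example, `require_source("drive/MyDrive/so-vits-svc-fork/audio", NAME)` can be called at the top of an inference cell.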

In [None]:
#@title Use trained model (with cluster)
#@markdown **Put your .wav file in `so-vits-svc-fork/audio` directory**
from IPython.display import Audio, display
NAME = 'your audio file name' #@param {type: "string"}
!svc infer drive/MyDrive/so-vits-svc-fork/audio/{NAME}.wav -s speaker -r 0.1 -m drive/MyDrive/so-vits-svc-fork/logs/44k/ -c drive/MyDrive/so-vits-svc-fork/logs/44k/config.json -k drive/MyDrive/so-vits-svc-fork/logs/44k/kmeans.pt
display(Audio(f"drive/MyDrive/so-vits-svc-fork/audio/{NAME}.out.wav", autoplay=True))

### Pretrained models

In [None]:
#@title https://huggingface.co/TachibanaKimika/so-vits-svc-4.0-models/tree/main
!wget -N "https://huggingface.co/TachibanaKimika/so-vits-svc-4.0-models/resolve/main/riri/G_riri_220.pth"
!wget -N "https://huggingface.co/TachibanaKimika/so-vits-svc-4.0-models/resolve/main/riri/config.json"

In [None]:
from IPython.display import Audio, display
!svc infer {NAME}.wav -c config.json -m G_riri_220.pth
display(Audio(f"{NAME}.out.wav", autoplay=True))

In [None]:
#@title https://huggingface.co/therealvul/so-vits-svc-4.0/tree/main
!wget -N "https://huggingface.co/therealvul/so-vits-svc-4.0/resolve/main/Pinkie%20(speaking%20sep)/G_166400.pth"
!wget -N "https://huggingface.co/therealvul/so-vits-svc-4.0/resolve/main/Pinkie%20(speaking%20sep)/config.json"

In [None]:
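from IPython.display import Audio, display
# Double braces make IPython pass the literal text {neutral} instead of expanding it as a Python variable
!svc infer {NAME}.wav --speaker "Pinkie {{neutral}}" -c config.json -m G_166400.pth
display(Audio(f"{NAME}.out.wav", autoplay=True))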