# <font color="ffc800"> **[Piper](https://github.com/rhasspy/piper) training notebook.**
## ![Piper logo](https://contribute.rhasspy.org/img/logo.png)

---

- Notebook made by [rmcpantoja](http://github.com/rmcpantoja)
- Collaborator: [Xx_Nessu_xX](http://github.com/Xx_Nessu_xX)

---

# Notes:

- <font color="orange">**Text in orange marks something important.**

# Credits:

* [Feanix-Fyre fork](https://github.com/Feanix-Fyre/piper) with some improvements.
* [Tacotron2 NVIDIA training notebook](https://github.com/justinjohn0306/FakeYou-Tacotron2-Notebook) - Dataset duration snippet.
* [🐸TTS](https://github.com/coqui-ai/TTS) - Resampler and XTTS formatter demo.

# <font color="ffc800">🔧 ***First steps.*** 🔧

In [None]:
#@markdown ## <font color="ffc800"> **Google Colab Anti-Disconnect.** 🔌
#@markdown ---
#@markdown #### Avoid automatic disconnection. Still, it will disconnect after <font color="orange">**6 to 12 hours**</font>.

import IPython
js_code = '''
function ClickConnect(){
console.log("Working");
document.querySelector("colab-toolbar-button#connect").click()
}
setInterval(ClickConnect,60000)
'''
display(IPython.display.Javascript(js_code))

In [None]:
#@markdown ## <font color="ffc800"> **Check GPU type.** 👁️
#@markdown ---
#@markdown #### A more capable GPU can lead to faster training. By default, you will have a <font color="orange">**Tesla T4**</font>.
!nvidia-smi

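If the GPU model is needed programmatically (for example, to choose a batch size), the same information can be read from Python. This is a minimal sketch, assuming `nvidia-smi` is on the PATH; `parse_gpu_list` and `list_gpus` are hypothetical helper names, not part of Piper.

```python
import shutil
import subprocess


def parse_gpu_list(csv_text):
    """Parse `name, memory` CSV rows printed by nvidia-smi into dicts."""
    gpus = []
    for line in csv_text.strip().splitlines():
        name, memory = [part.strip() for part in line.split(",", 1)]
        gpus.append({"name": name, "memory": memory})
    return gpus


def list_gpus():
    """Return the visible GPUs, or an empty list when nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    return parse_gpu_list(result.stdout) if result.returncode == 0 else []


print(list_gpus())
```

An empty list means no GPU was detected, in which case training on this runtime would be very slow.
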
In [None]:
!pip install resemble-enhance --upgrade
print("Done installing resemble-enhance!")

In [None]:
#@markdown # <font color="ffc800"> **Install resemble-enhance.** 📦
#@markdown ---
#@markdown #### This step installs resemble-enhance and its dependencies (this may take a while). The install command already ran in the previous cell, so it is left commented out here.

#!pip install resemble-enhance --upgrade
#print("Done install resemble-enhance!")

#@markdown # <font color="ffc800"> **1. Extract dataset.** 📥
#@markdown ---
#@markdown #### Important: the audio files must be in <font color="orange">**WAV format (16000 or 22050 Hz, 16-bit, mono) and, for convenience, numbered. Example:**

#@markdown * <font color="orange">**1.wav**</font>
#@markdown * <font color="orange">**2.wav**</font>
#@markdown * <font color="orange">**3.wav**</font>
#@markdown * <font color="orange">**.....**</font>

#@markdown ---
import os
import wave
import zipfile
import datetime

#%cd /content
#if not os.path.exists("/content/drive/MyDrive/TTS_UZB"):
#    os.makedirs("/content/drive/MyDrive/TTS_UZB")
#    os.makedirs("/content/drive/MyDrive/TTS_UZB/prepwavs")
#    os.makedirs("/content/drive/MyDrive/TTS_UZB/prepwavs/wavs")
%cd /content/drive/MyDrive/TTS_UZB/
#@markdown ### Audio dataset path to unzip:
# Path to the ZIP archive with the audio
zip_path = "/kaggle/input/uzaudio/prepwavs.zip"  # @param {type:"string"}
zip_path = zip_path.strip()

# Unzip the audio files
if zip_path:
    if os.path.exists(zip_path) and zipfile.is_zipfile(zip_path):
        print("Unzipping audio content...")
        with zipfile.ZipFile(zip_path, 'r') as zip_ref:
            zip_ref.extractall("/kaggle/working/")
    else:
        raise Exception("The path provided is not correct or not a zip file. Please provide a valid path.")
else:
    raise Exception("You must provide a path to the zip file.")

# Convert the audio files with resemble-enhance
#!resemble-enhance --parallel_mode /content/drive/MyDrive/TTS_UZB/prepwavs/prepwavs /content/drive/MyDrive/TTS_UZB/prepwavs/wavs

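The format requirements above (16000 or 22050 Hz, 16-bit, mono) can be verified before any processing starts. This is a minimal sketch using only the standard `wave` module; `check_wav_format` and `ALLOWED_RATES` are illustrative names, not part of Piper or resemble-enhance.

```python
import wave

ALLOWED_RATES = (16000, 22050)


def check_wav_format(path):
    """Return a list of problems found in one WAV file (empty if it is fine)."""
    problems = []
    with wave.open(path, "rb") as wav_file:
        if wav_file.getframerate() not in ALLOWED_RATES:
            problems.append(f"sample rate {wav_file.getframerate()} Hz")
        if wav_file.getsampwidth() != 2:  # 2 bytes per sample = 16-bit
            problems.append(f"{wav_file.getsampwidth() * 8}-bit samples")
        if wav_file.getnchannels() != 1:
            problems.append(f"{wav_file.getnchannels()} channels")
    return problems
```

Calling it on each extracted file and printing the non-empty results gives a quick list of files that need resampling or downmixing.
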
In [None]:
import os
import shutil
import zipfile

# Change to the directory with the processed files before archiving
%cd /content/drive/MyDrive/TTS_UZB/prepwavs/wavs

# Archive the folder with the processed audio files
new_zip_path = "/content/drive/MyDrive/TTS_UZB/enhanced_wavs.zip"
with zipfile.ZipFile(new_zip_path, 'w') as zip_f:
    for root, dirs, files in os.walk("."):
        for file in files:
            zip_f.write(os.path.join(root, file))

# Copy the archive to Google Drive
destination_path = "/content/drive/MyDrive/enhanced_wavs.zip"
shutil.copy(new_zip_path, destination_path)

print(f"Enhanced dataset has been archived and copied to {destination_path}.")

In [None]:
import os
import shutil
from math import ceil

# Source directory the files are copied from
source_dir = "/kaggle/input/uzaudio/prepwavs"
# Base target directory where the new subfolders are created
base_target_dir = "/kaggle/working/"

# Create the target folders prepwavs1 through prepwavs4 (num_folders)
num_folders = 4
for i in range(1, num_folders + 1):
    target_dir = os.path.join(base_target_dir, f"prepwavs{i}")
    if not os.path.exists(target_dir):
        os.makedirs(target_dir)

# List all files in the source folder
files = [f for f in os.listdir(source_dir) if os.path.isfile(os.path.join(source_dir, f))]

# Work out how many files go in each folder
files_per_folder = ceil(len(files) / num_folders)

# Distribute the files among the folders
for index, file_name in enumerate(files):
    # Pick the target folder for the current file
    target_folder_index = index // files_per_folder + 1  # +1 so numbering starts at prepwavs1, not prepwavs0
    target_dir = os.path.join(base_target_dir, f"prepwavs{target_folder_index}")

    # Path to the current file in the source folder
    file_path = os.path.join(source_dir, file_name)
    # Path to the file in the target folder
    target_path = os.path.join(target_dir, file_name)

    # Copy the file
    shutil.copy(file_path, target_path)

print("Files have been successfully distributed among the folders.")


In [None]:
!mkdir /kaggle/working/wavs

In [None]:
!ls

In [None]:
!resemble-enhance --device cpu --nfe 128 --lambd 0.5  --solver rk4 /content/sample_data/aud /content/sample_data/aud/out

In [None]:
!sudo apt-get update && sudo apt-get install git-lfs
!git lfs install



In [None]:
import subprocess
import threading
import os
import shutil
from math import ceil
def run_command(command):
    subprocess.run(command, shell=True)

# Commands to run in parallel
commands = [
    "resemble-enhance --parallel_mode /kaggle/working/prepwavs1 /kaggle/working/wavs",
    "resemble-enhance --parallel_mode /kaggle/working/prepwavs2 /kaggle/working/wavs",
    "resemble-enhance --parallel_mode /kaggle/working/prepwavs3 /kaggle/working/wavs",
    "resemble-enhance --parallel_mode /kaggle/working/prepwavs4 /kaggle/working/wavs"
]

# Start each command in its own thread
threads = []
for cmd in commands:
    thread = threading.Thread(target=run_command, args=(cmd,))
    threads.append(thread)
    thread.start()

# Wait for all threads to finish
for thread in threads:
    thread.join()

print("All commands have been executed.")

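The thread-based runner in this cell discards exit codes, so a batch that crashed looks the same as one that succeeded. The sketch below shows one way to collect the exit codes with `concurrent.futures`. `run_all` is a hypothetical helper, and the demo commands are harmless stand-ins rather than real `resemble-enhance` invocations.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_all(commands, max_workers=4):
    """Run argument-list commands concurrently and return their exit codes in order."""
    def run_one(command):
        return subprocess.run(command, check=False).returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, commands))


# Stand-in commands: one succeeds, one exits with code 3.
demo_commands = [
    [sys.executable, "-c", "raise SystemExit(0)"],
    [sys.executable, "-c", "raise SystemExit(3)"],
]
exit_codes = run_all(demo_commands)
failed = [cmd for cmd, code in zip(demo_commands, exit_codes) if code != 0]
print(f"{len(failed)} of {len(demo_commands)} commands failed")
```

Passing commands as argument lists instead of shell strings also avoids quoting problems with paths that contain spaces.
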

In [None]:
!zip -r /kaggle/working/wavs/wavs_archive.zip /kaggle/working/wavs/*


In [None]:
!ls
!cp /kaggle/working/wavs/wavs_archive.zip /kaggle/working/

In [None]:
!ls

In [1]:
!mkdir /kaggle/temp/
%cd /kaggle/temp/
!mkdir content
!ls

/kaggle/temp
content


In [2]:
%cd /kaggle/temp/content
!ls

/kaggle/temp/content


In [3]:
#@markdown # <font color="ffc800"> **Install software.** 📦
#@markdown ---
#@markdown #### In this cell, the synthesizer and the dependencies needed for training are installed (this may take a while).
# Create a virtual environment
!python -m venv synthesizer_env

# How the virtual environment is activated depends on the OS
# For Unix or macOS:
# Note: each `!` line runs in its own subshell, so this activation does not persist to later lines.
!source synthesizer_env/bin/activate
# clone:

!git clone -q https://github.com/rmcpantoja/piper
%cd /kaggle/temp/content/piper/src/python
!wget -q "https://raw.githubusercontent.com/coqui-ai/TTS/dev/TTS/bin/resample.py"
#!pip install -q -r requirements.txt
# Version specifiers are quoted so the shell does not treat ">" as an output redirect
!pip install -q "cython>=0.29.0" "piper-phonemize==1.1.0" "librosa>=0.9.2" "numpy==1.24" "onnxruntime>=1.11.0" "pytorch-lightning==1.7.7" "torch==1.13.0+cu117" --extra-index-url https://download.pytorch.org/whl/cu117
!pip install -q torchtext==0.14.0 torchvision==0.14.0
# fixing recent compatibility issues:
!pip install -q torchaudio==0.13.0 torchmetrics==0.11.4 faster_whisper
!bash build_monotonic_align.sh
# Useful vars:
use_whisper = False
print("Done!")

/kaggle/temp/content/piper/src/python
DEPRECATION: pytorch-lightning 1.7.7 has a non-standard dependency specifier torch>=1.9.*. pip 24.0 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of pytorch-lightning or contact the author to suggest that they release a version with a conforming dependency specifiers. Discussion can be found at https://github.com/pypa/pip/issues/12063
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
cudf 23.8.0 requires cubinlinker, which is not installed.
cudf 23.8.0 requires cupy-cuda11x>=12.0.0, which is not installed.
cudf 23.8.0 requires ptxcompiler, which is not installed.
cuml 23.8.0 requires cupy-cuda11x>=12.0.0, which is not installed.
dask-cudf 23.8.0 requires cupy-cuda11x>=12.0.0, which is not installed.
keras-nlp 0.8.1 requires keras-core, which is not installed

# <font color="ffc800"> 🤖 ***Training.*** 🤖

In [4]:
#@markdown # <font color="ffc800"> **1. Extract dataset.** 📥
#@markdown ---
#@markdown #### Important: the audio files must be in <font color="orange">**WAV format (16000 or 22050 Hz, 16-bit, mono) and, for convenience, numbered. Example:**

#@markdown * <font color="orange">**1.wav**</font>
#@markdown * <font color="orange">**2.wav**</font>
#@markdown * <font color="orange">**3.wav**</font>
#@markdown * <font color="orange">**.....**</font>

#@markdown ---
import os
import wave
import zipfile
import datetime

def get_dataset_duration(wav_path):
    """Return the number of entries in wav_path and their total audio duration."""
    totalduration = 0
    for file_name in os.listdir(wav_path):
        # Join with wav_path so the check works from any current directory
        file_path = os.path.join(wav_path, file_name)
        if not os.path.isfile(file_path):
            continue
        with wave.open(file_path, "rb") as wave_file:
            frames = wave_file.getnframes()
            rate = wave_file.getframerate()
            duration = frames / float(rate)
            totalduration += duration
    wav_count = len(os.listdir(wav_path))
    duration_str = str(datetime.timedelta(seconds=round(totalduration, 0)))
    return wav_count, duration_str

if not os.path.exists("/kaggle/temp/content/dataset"):
    os.makedirs("/kaggle/temp/content/dataset")
    os.makedirs("/kaggle/temp/content/dataset/wavs")
%cd /kaggle/temp/content
#@markdown ### Audio dataset path to unzip:


!find /kaggle/input/uzvoice3/wavs/home/alx/Загрузки/wavs3/wavs -type f -name '*.wav' -print0 | xargs -0 cp -t /kaggle/temp/content/dataset/wavs/

!cp /kaggle/input/uzvoice3/metadata.txt /kaggle/temp/content/dataset
!mv /kaggle/temp/content/dataset/metadata.txt /kaggle/temp/content/dataset/metadata.csv
%cd /kaggle/temp/content/dataset/wavs
audio_count, dataset_dur = get_dataset_duration("/kaggle/temp/content/dataset/wavs")
print(f"Opened dataset with {audio_count} wavs with duration {dataset_dur}.")
%cd ..
#@markdown ---

/kaggle/temp/content
/kaggle/temp/content/dataset/wavs
Opened dataset with 10000 wavs with duration 10:16:53.
/kaggle/temp/content/dataset


In [5]:
!ls /kaggle/temp/content/dataset

metadata.csv  wavs


In [5]:
!rm /kaggle/temp/content/dataset/metadata.csv
!cp /kaggle/input/metada/metadata.txt /kaggle/temp/content/dataset
!mv /kaggle/temp/content/dataset/metadata.txt /kaggle/temp/content/dataset/metadata.csv

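Before preprocessing, it can help to confirm that every transcript row points at an existing audio file. This is a minimal sketch, assuming the single-speaker LJSpeech-style layout in which each `metadata.csv` line is `id|text` and the audio lives at `wavs/<id>` or `wavs/<id>.wav`. `find_missing_wavs` is a hypothetical helper, not a Piper function.

```python
import csv
import os


def find_missing_wavs(dataset_dir):
    """Return the ids from metadata.csv that have no matching file in wavs/."""
    wav_dir = os.path.join(dataset_dir, "wavs")
    metadata_path = os.path.join(dataset_dir, "metadata.csv")
    missing = []
    with open(metadata_path, encoding="utf-8", newline="") as metadata_file:
        reader = csv.reader(metadata_file, delimiter="|", quoting=csv.QUOTE_NONE)
        for row in reader:
            if not row:
                continue  # skip blank lines
            utt_id = row[0].strip()
            # Assumption: the id may be written with or without the .wav suffix
            candidates = (utt_id, f"{utt_id}.wav")
            if not any(os.path.isfile(os.path.join(wav_dir, c)) for c in candidates):
                missing.append(utt_id)
    return missing
```

For this notebook the call would be `find_missing_wavs("/kaggle/temp/content/dataset")`. An empty result means every row has audio.
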

In [6]:
#@markdown # <font color="ffc800"> **3. Preprocess dataset.** 🔄
#@markdown ---
import os
use_whisper = False
#@markdown ### First of all, select the language of your dataset.
language = "English (U.S.)" #@param ["ألعَرَبِي", "Català", "čeština", "Dansk", "Deutsch", "Ελληνικά", "English (British)", "English (U.S.)", "Español (Castellano)", "Español (Latinoamericano)", "Suomi", "Français", "Magyar", "Icelandic", "Italiano", "ქართული", "қазақша", "Lëtzebuergesch", "नेपाली", "Nederlands", "Norsk", "Polski", "Português (Brasil)", "Português (Portugal)", "Română", "Русский", "Српски", "Svenska", "Kiswahili", "Türkçe", "украї́нська", "Tiếng Việt", "简体中文"]
#@markdown ---
# language definition:
languages = {
    "ألعَرَبِي": "ar",
    "Català": "ca",
    "čeština": "cs",
    "Dansk": "da",
    "Deutsch": "de",
    "Ελληνικά": "el",
    "English (British)": "en",
    "English (U.S.)": "en-us",
    "Español (Castellano)": "es",
    "Español (Latinoamericano)": "es-419",
    "Suomi": "fi",
    "Français": "fr",
    "Magyar": "hu",
    "Icelandic": "is",
    "Italiano": "it",
    "ქართული": "ka",
    "қазақша": "kk",
    "Lëtzebuergesch": "lb",
    "नेपाली": "ne",
    "Nederlands": "nl",
    "Norsk": "nb",
    "Polski": "pl",
    "Português (Brasil)": "pt-br",
    "Português (Portugal)": "pt-pt",
    "Română": "ro",
    "Русский": "ru",
    "Српски": "sr",
    "Svenska": "sv",
    "Kiswahili": "sw",
    "Türkçe": "tr",
    "украї́нська": "uk",
    "Tiếng Việt": "vi",
    "简体中文": "zh"
}

def _get_language(code):
    return languages[code]

final_language = _get_language(language)
#@markdown ### Choose a name for your model:
model_name = "uzbekfemalehq22k" #@param {type:"string"}
#@markdown ---
# output:
#@markdown ### Choose the working folder: (recommended to save to Drive)

#@markdown The working folder will be used in preprocessing, but also in training the model.
!mkdir -p /kaggle/working/content/drive/MyDrive/colab/piper
output_path = "/kaggle/working/content/drive/MyDrive/colab/piper" #@param {type:"string"}
output_dir = output_path+"/"+model_name
if not os.path.exists(output_dir):
  os.makedirs(output_dir)
#@markdown ---
#@markdown ### Choose dataset format:
dataset_format = "ljspeech" #@param ["ljspeech", "mycroft"]
#@markdown ---
#@markdown ### Is this a single speaker dataset? Otherwise, uncheck:
single_speaker = True #@param {type:"boolean"}
if single_speaker:
  force_sp = " --single-speaker"
else:
  force_sp = ""
#@markdown ---
#@markdown ### Select the sample rate of the dataset:
sample_rate = "22050" #@param ["16000", "22050"]
#@markdown ---
# creating paths:
if not os.path.exists("/kaggle/temp/content/audio_cache"):
    os.makedirs("/kaggle/temp/content/audio_cache")
%cd /kaggle/temp/content/piper/src/python
#@markdown ### Do you want to train using this sample rate, but your audios don't have it?
#@markdown The resampler helps you do it quickly!
resample = False #@param {type:"boolean"}
if resample:
  !python resample.py --input_dir "/kaggle/temp/content/dataset/wavs" --output_dir "/kaggle/temp/content/dataset/wavs_resampled" --output_sr {sample_rate} --file_ext "wav"
  !for file in /kaggle/temp/content/dataset/wavs_resampled/*; do mv "$file" /kaggle/temp/content/dataset/wavs/; done
#@markdown ---
# check transcription:
if use_whisper:
    print("Transcript file hasn't been uploaded. Transcribing these audios using Whisper...")
    make_dataset("/kaggle/temp/content/dataset/wavs", final_language[:2])
    print("Transcription done! Pre-processing...")
!python -m piper_train.preprocess \
  --language {final_language} \
  --input-dir "/kaggle/temp/content/dataset" \
  --cache-dir "/kaggle/temp/content/audio_cache" \
  --output-dir "{output_dir}" \
  --dataset-name "{model_name}" \
  --dataset-format {dataset_format} \
  --sample-rate {sample_rate} \
  {force_sp}

/kaggle/temp/content/piper/src/python
2024-03-20 22:09:12.291418604 [W:onnxruntime:, graph.cc:3593 CleanUnusedInitializersAndNodeArgs] Removing initializer '131'. It is not used by any node and should be removed from the model.
2024-03-20 22:09:12.297696968 [W:onnxruntime:, graph.cc:3593 CleanUnusedInitializersAndNodeArgs] Removing initializer '136'. It is not used by any node and should be removed from the model.
2024-03-20 22:09:12.297752745 [W:onnxruntime:, graph.cc:3593 CleanUnusedInitializersAndNodeArgs] Removing initializer '139'. It is not used by any node and should be removed from the model.
2024-03-20 22:09:12.297771796 [W:onnxruntime:, graph.cc:3593 CleanUnusedInitializersAndNodeArgs] Removing initializer '140'. It is not used by any node and should be removed from the model.
2024-03-20 22:09:12.297788329 [W:onnxruntime:, graph.cc:3593 CleanUnusedInitializersAndNodeArgs] Removing initializer '134'. It is not used by any node and

In [None]:
!cat /kaggle/temp/content/piper/notebooks/pretrained_models.json


In [9]:
!pip install ipywidgets

Collecting ipywidgets
  Downloading ipywidgets-8.1.2-py3-none-any.whl (139 kB)
Collecting jupyterlab-widgets~=3.0.10
  Downloading jupyterlab_widgets-3.0.10-py3-none-any.whl (215 kB)
Collecting widgetsnbextension~=4.0.10
  Downloading widgetsnbextension-4.0.10-py3-none-any.whl (2.3 MB)
Installing collected packages: widgetsnbextension, jupyterlab-widgets, ipywidgets
Successfully installed ipywidgets-8.1.2 jupyterlab-widgets-3.0.10 widgetsnbextension-4.0.10

[notice] A new release of pip is available: 23.0.1

In [8]:
#@markdown # <font color="ffc800"> **4. Settings.** 🧰
#@markdown ---
import json
import ipywidgets as widgets
from IPython.display import display
import os
#@markdown ### <font color="orange">**Select the action to train this dataset: (READ CAREFULLY)**

#@markdown * The option to <font color="orange">continue a training</font> is self-explanatory. If you previously trained a model on free Colab, ran out of time, and want to train it some more, this is the option for you. Just use the same settings you used when you first trained the model.
#@markdown * The option to <font color="orange">convert a single-speaker model to a multi-speaker model</font> requires that you have preprocessed a dataset containing text and audio from every speaker you want the model to support.
#@markdown * The <font color="orange">finetune</font> option trains on your dataset starting from a pretrained model. It is ideal for very small datasets (more than five minutes of audio is recommended).
#@markdown * The <font color="orange">train from scratch</font> option builds features such as the dictionary and speech form from scratch, so it may take longer to converge. At least 8 hours of audio covering a wide range of phonemes is recommended.

action = "finetune" #@param ["Continue training", "convert single-speaker to multi-speaker model", "finetune", "train from scratch"]
#@markdown ---
if action == "Continue training":
    if os.path.exists(f"{output_dir}/lightning_logs/version_0/checkpoints/last.ckpt"):
        ft_command = f'--resume_from_checkpoint "{output_dir}/lightning_logs/version_0/checkpoints/last.ckpt" '
        print(f"\033[93mContinuing {model_name}'s training at: {output_dir}/lightning_logs/version_0/checkpoints/last.ckpt")
    else:
        raise Exception("Training cannot be continued as there is no checkpoint to continue at.")
elif action == "finetune":
    if os.path.exists(f"{output_dir}/lightning_logs/version_0/checkpoints/last.ckpt"):
        raise Exception("Oh no! You have already trained this model before, you cannot choose this option since your progress will be lost, and then your previous time will not count. Please select the option to continue a training.")
    else:
        ft_command = '--resume_from_checkpoint "/kaggle/temp/content/pretrained.ckpt" '
elif action == "convert single-speaker to multi-speaker model":
    if not single_speaker:
        ft_command = '--resume_from_single_speaker_checkpoint "/kaggle/temp/content/pretrained.ckpt" '
    else:
        raise Exception("This dataset is not a multi-speaker dataset!")
else:
    ft_command = ""
if action == "convert single-speaker to multi-speaker model" or action == "finetune":
    def download_model(btn):
        model_url = "https://huggingface.co/datasets/rhasspy/piper-checkpoints/resolve/main/en/en_US/lessac/medium/epoch%3D2164-step%3D1355540.ckpt"  # Replace this with your own link
        print("\033[93mDownloading pretrained model...")
        !wget -q "{model_url}" -O "/kaggle/temp/content/pretrained.ckpt"

        if os.path.exists("/kaggle/temp/content/pretrained.ckpt"):
            print("\033[93mModel downloaded!")
        else:
            raise Exception("Couldn't download the pretrained model!")
    download_model(None)  # Call the function directly, without a button press

else:
    print("\033[93mWarning: this model will be trained from scratch. You need at least 8 hours of data for everything to work decently. Good luck!")
#@markdown ### Choose batch size based on this dataset:
batch_size = 12 #@param {type:"integer"}
#@markdown ---

#@markdown ### Choose the quality for this model:

#@markdown * x-low - 16 kHz audio, 5-7M params
#@markdown * medium - 22.05 kHz audio, 15-20M params
#@markdown * high - 22.05 kHz audio, 28-32M params
quality = "medium" #@param ["high", "x-low", "medium"]
#@markdown ---
#@markdown ### For how many epochs to save training checkpoints?
#@markdown The larger your dataset, the smaller this saving interval should be, since each epoch takes longer to complete.
checkpoint_epochs = 5 #@param {type:"integer"}
#@markdown ---
#@markdown ### Interval to save best k models:
#@markdown Set to 0 if you want to disable saving multiple models. If this is the case, check the checkbox below. If set to 1, models will be saved with the file name epoch=xx-step=xx.ckpt, so you will need to empty Drive's trash every so often.
num_ckpt = 0 #@param {type:"integer"}
#@markdown ---
#@markdown ### Save latest model:
#@markdown This checkbox must be checked if you want to save a single model (last.ckpt). A single model is saved only if num_ckpt is 0. In that case the epoch interval for saving is ignored, since last.ckpt is saved every epoch, and you won't have to worry about storage. If num_ckpt is 1, last.ckpt is saved along with another model (model_vVersion.ckpt, which follows the epoch interval you set), so you will have to empty the trash often.

#@markdown **This option is not recommended for extremely small datasets: with last.ckpt saved every epoch and epochs finishing very quickly, the trainer may not be able to write the complete model, which would result in a corrupt last.ckpt.**
save_last = True # @param {type:"boolean"}
#@markdown ---
#@markdown ### Step interval to generate model samples:
log_every_n_steps = 1000 #@param {type:"integer"}
#@markdown ---
#@markdown ### Training epochs:
max_epochs = 10000 #@param {type:"integer"}
#@markdown ---

Downloading pretrained model...
Model downloaded!

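The quality list in the settings cell ties each preset to a sample rate, so it is easy to pick a preset that disagrees with the rate chosen during preprocessing. The sketch below is a small consistency check. `QUALITY_SAMPLE_RATES` and `check_quality` are illustrative names, with the rates copied from that list. Whether Piper itself rejects a mismatch is not verified here.

```python
QUALITY_SAMPLE_RATES = {"x-low": 16000, "medium": 22050, "high": 22050}


def check_quality(quality, sample_rate):
    """Raise ValueError when the preset and the dataset sample rate disagree."""
    expected = QUALITY_SAMPLE_RATES[quality]
    if int(sample_rate) != expected:
        raise ValueError(
            f"quality '{quality}' expects {expected} Hz audio, got {sample_rate} Hz"
        )
    return expected


# The settings used in this notebook: medium quality, 22050 Hz dataset.
check_quality("medium", "22050")
```

Running the check right after the settings cell surfaces a mismatch before any GPU time is spent.
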

In [9]:
#@markdown # <font color="orange"> **5. Run the TensorBoard extension.** 📈
#@markdown ---
#@markdown TensorBoard visualizes the model's results, such as audio samples and losses, while it is being trained.

%load_ext tensorboard
%tensorboard --logdir {output_dir}

In [11]:
!pip install numpy --upgrade


Collecting numpy
  Downloading numpy-1.26.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (61 kB)
Downloading numpy-1.26.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.2 MB)
DEPRECATION: pytorch-lightning 1.7.7 has a non-standard dependency specifier torch>=1.9.*. pip 24.0 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of pytorch-lightning or contact the author to suggest that they release a version with a conforming dependency specifiers. Discussion can be found at https://github.com/pypa/pip/issues/12063
Installing collected packages: numpy
  Attempting uninstall: numpy
    Found existing installation

In [None]:
#@markdown # <font color="ffc800"> **6. Train.** 🏋️‍♂️
#@markdown ---
#@markdown ### Run this cell to train your final model!
import os
os.environ['PYTORCH_CUDA_ALLOC_CONF'] = "caching_allocator"

#@markdown ---
#@markdown ### <font color="orange">**Disable validation?**

#@markdown Unchecking this checkbox trains on the full dataset, without holding out any audio files as a validation set. As a result, no audio samples can be generated in TensorBoard during training. Disabling validation is recommended for extremely small datasets.
validation = True #@param {type:"boolean"}
if validation:
    validation_split = 0.01
    num_test_examples = 1
else:
    validation_split = 0
    num_test_examples = 0
if not save_last:
    save_last_command = ""
else:
    save_last_command = "--save_last True "
get_ipython().system(f'''
python -m piper_train \
--dataset-dir "{output_dir}" \
--accelerator 'gpu' \
--devices 2 \
--batch-size {batch_size} \
--validation-split {validation_split} \
--num-test-examples {num_test_examples} \
--quality {quality} \
--checkpoint-epochs {checkpoint_epochs} \
--num_ckpt {num_ckpt} \
{save_last_command}\
--log_every_n_steps {log_every_n_steps} \
--max_epochs {max_epochs} \
{ft_command}\
--precision 32 \
--strategy ddp
''')

  rank_zero_deprecation(
  ckpt_path = ckpt_path or self.resume_from_checkpoint
  rank_zero_warn(
2024-03-20 22:29:41.607567: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-03-20 22:29:41.607666: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-03-20 22:29:41.610249: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
  rank_zero_warn(
Epoch: 2164. Steps: 1355540
Epoch: 2164. Steps: 1355540
Epoch: 2164. Steps: 1355540
Epoch: 2164. Steps: 1355540
  rank_zero_warn(
  rank_zero_warn(
grad.sizes() = [1, 9, 96], strides() = [36192, 96, 1]
bucket_view.sizes() = [1, 9, 96], strides() = [

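To get a feel for how long an epoch is, the split and batch settings can be turned into rough numbers. This is a back-of-the-envelope sketch, assuming the validation share is rounded down and that DDP gives each device its own batch per step. Piper's actual split and Lightning's step accounting may differ slightly, and `estimate_epoch` is a hypothetical helper.

```python
import math


def estimate_epoch(num_utterances, validation_split, batch_size, devices):
    """Return (train_count, val_count, steps_per_epoch) as rough estimates."""
    val_count = int(num_utterances * validation_split)
    train_count = num_utterances - val_count
    # Assumption: under DDP every device processes one batch per step.
    steps = math.ceil(train_count / (batch_size * devices))
    return train_count, val_count, steps


# Values from this notebook: 10000 wavs, 1% validation, batch size 12, 2 GPUs.
print(estimate_epoch(10000, 0.01, 12, 2))
```

Multiplying the step estimate by the observed seconds per step gives an approximate epoch duration, which helps when choosing `checkpoint_epochs`.
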
In [37]:
!ls /kaggle/working/content/drive/MyDrive/colab/piper/uzbekfemalehq22k/lightning_logs/version_4/checkpoints

last.ckpt

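The settings cell looks for `last.ckpt` under `version_0`, while the listing above finds it under `version_4`, presumably because each run writes to a new `version_N` folder. The sketch below locates the newest checkpoint across versions. `find_latest_checkpoint` is a hypothetical helper, and it assumes the `lightning_logs/version_N/checkpoints/last.ckpt` layout seen in this notebook.

```python
import glob
import os
import re


def find_latest_checkpoint(output_dir):
    """Return last.ckpt from the highest-numbered version folder, or None."""
    pattern = os.path.join(
        output_dir, "lightning_logs", "version_*", "checkpoints", "last.ckpt"
    )
    best_path, best_version = None, -1
    for path in glob.glob(pattern):
        # The version folder is two levels above the checkpoint file.
        version_dir = os.path.basename(os.path.dirname(os.path.dirname(path)))
        match = re.fullmatch(r"version_(\d+)", version_dir)
        if match and int(match.group(1)) > best_version:
            best_version = int(match.group(1))
            best_path = path
    return best_path
```

The returned path could then be passed to `--resume_from_checkpoint` in place of the hard-coded `version_0` path when continuing a training run.
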

#  <font color="orange">**Have you finished training and want to test the model?**

* If you want to run this model in any software that integrates Piper, or in the Piper app itself, export your model using the [model exporter notebook](https://colab.research.google.com/github/rmcpantoja/piper/blob/master/notebooks/piper_model_exporter.ipynb)!
* Wait! I want to test this right now before exporting it to the supported format for Piper. Test your generated last.ckpt with [this notebook](https://colab.research.google.com/github/rmcpantoja/piper/blob/master/notebooks/piper_inference_(ckpt).ipynb)!