### TAO remote client (TTS finetune)

Transfer learning is the process of transferring learned features from one application to another. It is a commonly used training technique in which a model trained on one task is retrained for a different task. The Train Adapt Optimize (TAO) Toolkit is a simple, easy-to-use, Python-based AI toolkit for taking purpose-built AI models and customizing them with your own data.

![image](https://developer.nvidia.com/sites/default/files/akamai/TAO/tlt-tao-toolkit-bring-your-own-model-diagram.png)


### The workflow in a nutshell

- Get sample datasets (or bring your own)
- Create source and target datasets
- Provide speech data
- Create a spectrogram generator model experiment
- Get a pretrained model (PTM) from NGC
- Actions
  - Dataset convert
  - Pitch stats to compute fmin, fmax, pitch_avg, pitch_std
  - Finetune model
  - Infer to produce data for HiFiGAN
- Process inferred data to be compatible with HiFiGAN
- Create a vocoder model experiment
- Upload dataset to service
- Get a PTM for the vocoder
- Finetune the vocoder
- Run inference on sample sentences using both FastPitch and HiFiGAN

### Table of contents

1. [Install TAO remote client](#head-1)
1. [Set the remote service base URL](#head-2)
1. [Access the shared volume](#head-3)
1. [Create the datasets](#head-4)
1. [List datasets](#head-5)
1. [Provide specs for source dataset convert](#head-5-1)
1. [Run source dataset convert](#head-5-2)
1. [Create a model experiment](#head-6)
1. [Find fastpitch pretrained model](#head-7)
1. [Customize model metadata](#head-8)
1. [Provide dataset convert specs to merge manifests](#head-9)
1. [Run dataset convert to merge manifests](#head-10)
1. [Provide pitch stats specs](#head-11)
1. [Customize pitch stats specs](#head-12)
1. [Run pitch stats](#head-13)
1. [Visualize images](#head-14)
1. [Provide finetune specs](#head-15)
1. [Customize fast pitch specs](#head-16)
1. [Run finetune](#head-17)
1. [Provide infer specs](#head-18)
1. [Customize infer specs](#head-19)
1. [Run infer](#head-20)
1. [Create vocoder dataset](#head-21)
1. [Create vocoder model](#head-22)
1. [Find hifigan pretrained model](#head-23)
1. [Customize model metadata](#head-24)
1. [Provide vocoder finetune specs from default](#head-25)
1. [Customize vocoder finetune specs](#head-26)
1. [Run vocoder finetune](#head-27)
1. [Track logs](#head-28)
1. [Inference from raw sentences](#head-29)
1. [Provide model infer specs from default](#head-30)
1. [Customize infer specs](#head-31)
1. [Run infer](#head-32)
1. [Track logs](#head-33)
1. [Create dataset](#head-34)
1. [Provide vocoder infer specs from default](#head-35)
1. [Run vocoder infer](#head-36)
1. [Track logs](#head-37)
1. [Display audio](#head-38)
1. [Delete experiment](#head-39)
1. [Delete datasets](#head-40)
1. [Unmount shared volume](#head-41)
1. [Uninstall TAO Remote Client](#head-42)

### Requirements
See the [server requirements](https://docs.nvidia.com/tao/tao-toolkit/text/tao_toolkit_api/api_setup.html#) before you start.

In [None]:
import os
import glob
import subprocess
import getpass
import uuid
import json

### FIXME

1. Assign the ip_address and port_number in FIXME 1 and FIXME 2 ([info](https://docs.nvidia.com/tao/tao-toolkit/text/tao_toolkit_api/api_rest_api.html))
2. Set NGC API key in FIXME 3
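
A leftover placeholder in any of these values may only surface later as a connection or login error. The following is a minimal sketch of a pre-flight check, assuming the URL layout `http://<ip_address>:<port_number>/<namespace>/api/v1` that this notebook exports as `BASE_URL`. The helper name `build_base_url` is hypothetical and is not part of `tao-client`.

```python
def build_base_url(node_addr, node_port, namespace="default"):
    """Compose the service base URL, rejecting unfilled FIXME placeholders.

    Hypothetical helper for illustration only; not part of tao-client.
    """
    for name, value in (("node_addr", node_addr), ("node_port", node_port)):
        text = str(value).strip()
        # Placeholders in this notebook look like "<ip_address>".
        if not text or text.startswith("<"):
            raise ValueError(f"{name} still holds a placeholder: {value!r}")
    port = str(node_port).strip()
    if not port.isdigit():
        raise ValueError(f"node_port must be numeric, got {node_port!r}")
    return f"http://{str(node_addr).strip()}:{port}/{namespace}/api/v1"


# Example values taken from the FIXME comments in this notebook.
print(build_base_url("10.137.149.22", "32334"))
```

Calling it with `"<ip_address>"` raises a `ValueError` instead of producing an unusable URL.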

In [None]:
namespace = 'default'

### Install TAO remote client <a class="anchor" id="head-1"></a>

In [None]:
# Skip this step if you have already installed the TAO-Client wheel.
! pip3 install nvidia-tao-client

In [None]:
# View the version of the TAO-Client
! tao-client --version

### Set the remote service base URL <a class="anchor" id="head-2"></a>

In [None]:
# Define the node_addr and port number
node_addr = "<ip_address>" # FIXME1 example: 10.137.149.22
node_port = "<port_number>" # FIXME2 example: 32334
# In host machine, node ip_address and port number can be obtained as follows,
# ip_address: hostname -i
# port_number: kubectl get service ingress-nginx-controller -o jsonpath='{.spec.ports[0].nodePort}'
%env BASE_URL=http://{node_addr}:{node_port}/{namespace}/api/v1

In [None]:
# FIXME: Set the ngc_api_key variable
ngc_api_key = "<ngc_api_key>" # FIXME3 example: zZYtczM5amdtdDcwNjk0cnA2bGU2bXQ3bnQ6NmQ4NjNhMDItMTdmZS00Y2QxLWI2ZjktNmE5M2YxZTc0OGyM

# Exchange NGC_API_KEY for JWT
identity = json.loads(subprocess.getoutput(f'tao-client login --ngc-api-key {ngc_api_key}'))

%env USER={identity['user_id']}
%env TOKEN={identity['token']}

### Access the shared volume <a class="anchor" id="head-3"></a>

In [None]:
# Get PVC ID
pvc_id = subprocess.getoutput(f'kubectl get pvc tao-toolkit-api-pvc -n {namespace} -o jsonpath="{{.spec.volumeName}}"')
print(pvc_id)

In [None]:
# Get NFS server info
provisioner = json.loads(subprocess.getoutput(f'helm get values nfs-subdir-external-provisioner -o json'))
nfs_server = provisioner['nfs']['server']
nfs_path = provisioner['nfs']['path']
print(nfs_server, nfs_path)

In [None]:
user = getpass.getuser()
home = os.path.expanduser('~')

! echo "Password for {user}"
password = getpass.getpass()

In [None]:
# Mount shared volume 
! mkdir -p ~/shared

command = "apt-get -y install nfs-common >> /dev/null"
! echo {password} | sudo -S -k {command}

command = f"mount -t nfs {nfs_server}:{nfs_path}/{namespace}-tao-toolkit-api-pvc-{pvc_id} ~/shared"
! echo {password} | sudo -S -k {command} && echo DONE
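
Once the volume is mounted, the remaining cells repeatedly build paths of the form `~/shared/users/<user>/datasets/<id>/...` and `~/shared/users/<user>/models/<id>/...`. A minimal sketch of a helper that centralizes this layout is shown here; the name `workspace_path` and its parameters are hypothetical. Note that `USER` in this notebook is the TAO user ID exported after login, not the Unix account name.

```python
import os
from pathlib import Path
from typing import Optional


def workspace_path(kind: str, item_id: str, *parts: str,
                   user: Optional[str] = None,
                   root: Optional[Path] = None) -> Path:
    """Return <root>/users/<user>/<kind>/<item_id>/<parts...>.

    Hypothetical helper; the directory layout is the one used by the
    shell commands in this notebook.
    """
    if kind not in ("datasets", "models"):
        raise ValueError(f"kind must be 'datasets' or 'models', got {kind!r}")
    user = user if user is not None else os.environ["USER"]
    root = root if root is not None else Path.home() / "shared"
    return (root / "users" / user / kind / item_id).joinpath(*parts)


# Explicit user and root so the example does not depend on the environment.
print(workspace_path("datasets", "1234", "specs", "convert.json",
                     user="alice", root=Path("/tmp/shared")))
```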

### Create the datasets <a class="anchor" id="head-4"></a>

For the rest of this notebook, it is assumed that you have:

 - Pretrained FastPitch and HiFiGAN models that were trained on `LJSpeech` sampled at 22kHz
 
If you are not using a TTS model trained on `LJSpeech` at the correct sampling rate, please ensure that you have the original data, including the wav files and a .json manifest file. If your TTS model was not trained at 22kHz, make sure you set the correct sampling rate and FFT parameters.

For the rest of the notebook, we will be using a toy dataset consisting of 5 minutes of audio. This dataset is for demo purposes only. For a good-quality model, we recommend at least 30 minutes of audio. We recommend using the [NVIDIA Custom Voice Recorder](https://developer.nvidia.com/riva-voice-recorder-early-access) tool to generate a good dataset for finetuning.
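
Both conditions above (a consistent sampling rate and enough audio) can be checked locally before anything is uploaded. The following is a minimal sketch that uses only the standard-library `wave` module, which reads uncompressed PCM WAV files. The function name `summarize_wavs` is hypothetical.

```python
import wave
from pathlib import Path


def summarize_wavs(wav_dir):
    """Return (set of sample rates, total duration in minutes) for *.wav files.

    Hypothetical helper; only handles PCM WAV files readable by `wave`.
    """
    rates = set()
    total_seconds = 0.0
    for wav_path in sorted(Path(wav_dir).glob("*.wav")):
        with wave.open(str(wav_path), "rb") as wav_file:
            rate = wav_file.getframerate()
            rates.add(rate)
            # Duration of one file = number of frames / frames per second.
            total_seconds += wav_file.getnframes() / rate
    return rates, total_seconds / 60.0
```

A result such as `({22050}, 31.4)` would indicate a single sampling rate and a little over 30 minutes of audio; more than one rate in the set means some files need resampling.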

Let's first download the original LJSpeech dataset, then the toy dataset. Using the API, we then create these datasets and upload them to the service in the required format. Note that for the LJSpeech source data, we need to run the convert action to create manifest files from `metadata.csv`.

The first step downloads audio-to-text file lists from NVIDIA for LJSpeech and generates the manifest files. If you use your own dataset, you have to generate three files yourself (`ljs_audio_text_train_filelist.txt`, `ljs_audio_text_val_filelist.txt`, `ljs_audio_text_test_filelist.txt`) and place them inside the ljspeech directory created below. These files correspond to your train / val / test split. In each text file, the number of rows should equal the number of samples in that split, and each row should look like:

```
DUMMY/<file_name>.wav|<text_of_the_audio>
```

An example row is:

```
DUMMY/LJ045-0096.wav|Mrs. De Mohrenschildt thought that Oswald,
```
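
If you generate these file lists yourself, a small formatter and validator helps catch malformed rows early. The following is a minimal sketch based only on the row format shown above; the function names `format_row` and `parse_row` are hypothetical.

```python
def format_row(file_name, text):
    """Build one file-list row: DUMMY/<file_name>.wav|<text_of_the_audio>."""
    return f"DUMMY/{file_name}.wav|{text}"


def parse_row(row):
    """Split a file-list row into (file_name, text), validating its shape."""
    path, separator, text = row.rstrip("\n").partition("|")
    if not separator:
        raise ValueError(f"missing '|' separator: {row!r}")
    if not (path.startswith("DUMMY/") and path.endswith(".wav")):
        raise ValueError(f"unexpected audio path: {path!r}")
    # Strip the DUMMY/ prefix and the .wav suffix to recover the file name.
    return path[len("DUMMY/"):-len(".wav")], text


print(parse_row("DUMMY/LJ045-0096.wav|Mrs. De Mohrenschildt thought that Oswald,"))
```

Only the first `|` is treated as the separator, so any later `|` characters stay in the transcript.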

In [None]:
source_dataset_id = subprocess.getoutput("tao-client spectro-gen dataset-create  --dataset_type speech --dataset_format ljspeech")
print(source_dataset_id)

In [None]:
! mkdir -p ~/shared/users/{os.environ['USER']}/datasets/{source_dataset_id}/ljspeech
! curl https://data.keithito.com/data/speech/LJSpeech-1.1.tar.bz2 | tar -xvj -C ~/shared/users/{os.environ['USER']}/datasets/{source_dataset_id}/ljspeech/ --strip-components 1
! chmod 777 ~/shared/users/{os.environ['USER']}/datasets/{source_dataset_id}/ljspeech
! chmod 777 ~/shared/users/{os.environ['USER']}/datasets/{source_dataset_id}/ljspeech/wavs

In [None]:
target_dataset_id = subprocess.getoutput("tao-client spectro-gen dataset-create  --dataset_type speech --dataset_format custom")
print(target_dataset_id)

In [None]:
! curl https://nemo-public.s3.us-east-2.amazonaws.com/6097_5_mins.tar.gz | tar -xvz -C ~/shared/users/{os.environ['USER']}/datasets/{target_dataset_id}/ --strip-components 1

In [None]:
auxillary_dataset_id = subprocess.getoutput("tao-client spectro-gen dataset-create  --dataset_type speech --dataset_format auxillary")
print(auxillary_dataset_id)

In [None]:
# Download the auxiliary files needed for training.
!wget -q -P ~/shared/users/{os.environ['USER']}/datasets/{auxillary_dataset_id}/ https://github.com/NVIDIA/NeMo/raw/v1.9.0/scripts/tts_dataset_files/cmudict-0.7b_nv22.01
!wget -q -P ~/shared/users/{os.environ['USER']}/datasets/{auxillary_dataset_id}/ https://github.com/NVIDIA/NeMo/raw/v1.9.0/scripts/tts_dataset_files/heteronyms-030921
!wget -q -P ~/shared/users/{os.environ['USER']}/datasets/{auxillary_dataset_id}/  https://github.com/NVIDIA/NeMo/raw/v1.9.0//nemo_text_processing/text_normalization/en/data/whitelist/lj_speech.tsv

### List datasets <a class="anchor" id="head-5"></a>

In [None]:
pattern = os.path.join(home, 'shared', 'users', os.environ['USER'], 'datasets', '*', 'metadata.json')

datasets = []
for metadata_path in glob.glob(pattern):
    with open(metadata_path, 'r') as metadata_file:
        datasets.append(json.load(metadata_file))

print(json.dumps(datasets, indent=2))

### Provide source dataset convert specs <a class="anchor" id="head-5-1"></a>


- First, we generate a manifest for the "source" LJSpeech dataset by running the convert action on the ljspeech-format dataset
- Then, we merge the LJSpeech manifest with the target dataset by running the model dataset_convert action
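
What the service does internally during the merge is not shown in this notebook. The sketch below only illustrates the general idea of concatenating line-delimited JSON manifests (one JSON object per line, the format read later in this notebook) and tagging each entry with a speaker index. The field name `speaker` and the example fields `audio_filepath` and `text` are assumptions, as is the function name `merge_manifests`.

```python
import json


def merge_manifests(manifests):
    """Concatenate JSON-lines manifests, tagging entries with a speaker index.

    `manifests` is a list of iterables of JSON strings; entries from the
    first manifest get speaker 0, the second speaker 1, and so on.
    Hypothetical illustration, not the TAO implementation.
    """
    merged = []
    for speaker_id, lines in enumerate(manifests):
        for line in lines:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            entry = json.loads(line)
            entry["speaker"] = speaker_id  # assumed field name
            merged.append(json.dumps(entry))
    return merged


source = ['{"audio_filepath": "wavs/LJ001-0001.wav", "text": "source text"}']
target = ['{"audio_filepath": "audio/0.wav", "text": "target text"}']
for merged_line in merge_manifests([source, target]):
    print(merged_line)
```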

In [None]:
! tao-client spectro-gen dataset-convert-defaults --id {source_dataset_id} --action convert | tee ~/shared/users/{os.environ['USER']}/datasets/{source_dataset_id}/specs/convert.json

### Run source dataset convert <a class="anchor" id="head-5-2"></a>

In [None]:
source_convert_job_id = subprocess.getoutput("tao-client spectro-gen dataset-convert --action convert --id " + source_dataset_id)
print(source_convert_job_id)

In [None]:
def my_tail(logs_dir, log_file):
    %env LOG_FILE={logs_dir}/{log_file}
    ! mkdir -p {logs_dir}
    ! [ ! -f "$LOG_FILE" ] && touch $LOG_FILE && chmod 666 $LOG_FILE
    ! tail -f -n +1 $LOG_FILE | while read LINE; do echo "$LINE"; [[ "$LINE" == "EOF" ]] && pkill -P $$ tail; done
    
# Check status (the file won't exist until the backend Toolkit container is running -- can take several minutes)
logs_dir = os.path.join(home, 'shared', 'users', os.environ['USER'], 'datasets', source_dataset_id, 'logs')
log_file = f"{source_convert_job_id}.txt"

my_tail(logs_dir, log_file)
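
`my_tail` relies on shell tools (`tail`, `pkill`) and stops when it sees a line equal to `EOF`. Where those tools are unavailable, the same polling logic can be sketched in plain Python. The function name `follow_until_eof` and its parameters are hypothetical; the `EOF` sentinel is the one used by `my_tail` above.

```python
import time


def follow_until_eof(path, sentinel="EOF", poll_seconds=0.5, timeout_seconds=None):
    """Yield complete lines from `path` as they appear, until `sentinel`.

    Waits for the file to exist, then polls for new lines. Raises
    TimeoutError if `timeout_seconds` elapses first.
    """
    deadline = None if timeout_seconds is None else time.monotonic() + timeout_seconds

    def wait():
        if deadline is not None and time.monotonic() > deadline:
            raise TimeoutError(f"no {sentinel!r} line seen in {path}")
        time.sleep(poll_seconds)

    # The log file may not exist until the backend container is running.
    while True:
        try:
            handle = open(path, "r")
            break
        except FileNotFoundError:
            wait()

    with handle:
        pending = ""
        while True:
            chunk = handle.readline()
            if not chunk:
                wait()  # nothing new yet
                continue
            pending += chunk
            if not pending.endswith("\n"):
                continue  # partial line; wait for the rest
            line, pending = pending.rstrip("\n"), ""
            if line == sentinel:
                return
            yield line
```

Typical use would be `for line in follow_until_eof(os.path.join(logs_dir, log_file)): print(line)`.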

### Create a model experiment <a class="anchor" id="head-6"></a>

In [None]:
spectro_gen_model_id = subprocess.getoutput("tao-client spectro-gen model-create --network_arch spectro_gen --encryption_key tlt_encode")
print(spectro_gen_model_id)

### Find fastpitch pretrained model <a class="anchor" id="head-7"></a>

In [None]:
pattern = os.path.join(home, 'shared', 'users', '*', 'models', '*', 'metadata.json')

fastpitch_ptm_id = None
for metadata_path in glob.glob(pattern):
  with open(metadata_path, 'r') as metadata_file:
    metadata = json.load(metadata_file)
    ngc_path = metadata.get("ngc_path")
    if ngc_path and "fastpitch:1.8.1" in ngc_path:
      fastpitch_ptm_id = metadata["id"]
      break

print(fastpitch_ptm_id)
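
The same metadata scan is repeated later for the HiFiGAN model, so it can be factored into a function. The following is a minimal sketch that mirrors the loop above; the function name `find_model_id` is hypothetical, and skipping unreadable metadata files is an added assumption rather than behavior of the original cell.

```python
import glob
import json


def find_model_id(pattern, ngc_tag):
    """Return the id of the first model whose ngc_path contains `ngc_tag`.

    Scans metadata.json files matching `pattern`; returns None if no
    model matches. Unreadable or malformed files are skipped.
    """
    for metadata_path in sorted(glob.glob(pattern)):
        try:
            with open(metadata_path, "r") as metadata_file:
                metadata = json.load(metadata_file)
        except (OSError, json.JSONDecodeError):
            continue
        ngc_path = metadata.get("ngc_path") or ""
        if ngc_tag in ngc_path:
            return metadata.get("id")
    return None
```

With this helper, the cell above reduces to `fastpitch_ptm_id = find_model_id(pattern, "fastpitch:1.8.1")`.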

### Customize model metadata <a class="anchor" id="head-8"></a>

In [None]:
metadata_path = os.path.join(home, 'shared', 'users', os.environ['USER'], "models", spectro_gen_model_id, "metadata.json")

with open(metadata_path , "r") as metadata_file:
    metadata = json.load(metadata_file)

metadata["train_datasets"] = [source_dataset_id, target_dataset_id]
metadata["eval_dataset"] = target_dataset_id
metadata["inference_dataset"] = target_dataset_id
metadata["ptm"] = fastpitch_ptm_id

with open(metadata_path, "w") as metadata_file:
    json.dump(metadata, metadata_file, indent=2)

print(json.dumps(metadata, indent=2))
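
The read, modify, write pattern above recurs for every metadata and spec file in this notebook. A minimal sketch of a reusable helper follows; the name `update_json` is hypothetical, and writing through a temporary file before `os.replace` is a design choice added here, not something the original cells do.

```python
import json
import os


def update_json(path, updates):
    """Apply top-level `updates` to the JSON object stored at `path`.

    Writes to a temporary sibling file first, then swaps it in with
    os.replace so a failed write does not leave a truncated file.
    Returns the updated dictionary.
    """
    with open(path, "r") as json_file:
        data = json.load(json_file)
    data.update(updates)
    tmp_path = f"{path}.tmp"
    with open(tmp_path, "w") as json_file:
        json.dump(data, json_file, indent=2)
    os.replace(tmp_path, path)
    return data
```

For example, `update_json(metadata_path, {"ptm": fastpitch_ptm_id})` would set a single key and leave the others untouched.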

### Provide dataset convert specs to merge manifests <a class="anchor" id="head-9"></a>

In [None]:
! tao-client spectro-gen model-dataset-convert-defaults --id {spectro_gen_model_id} | tee ~/shared/users/{os.environ['USER']}/models/{spectro_gen_model_id}/specs/dataset_convert.json

### Run target dataset convert to merge manifests <a class="anchor" id="head-10"></a>

In [None]:
target_convert_job_id = subprocess.getoutput("tao-client spectro-gen model-dataset-convert --id " + spectro_gen_model_id)
print(target_convert_job_id)

In [None]:
# Check status (the file won't exist until the backend Toolkit container is running -- can take several minutes)
logs_dir = os.path.join(home, 'shared', 'users', os.environ['USER'], 'models', spectro_gen_model_id, 'logs')
log_file = f"{target_convert_job_id}.txt"

my_tail(logs_dir, log_file)

### Provide pitch stats specs <a class="anchor" id="head-11"></a>

In [None]:
! tao-client spectro-gen dataset-pitch-stats-defaults --id {target_dataset_id} | tee ~/shared/users/{os.environ['USER']}/datasets/{target_dataset_id}/specs/pitch_stats.json

### Customize pitch stats specs <a class="anchor" id="head-12"></a>

In [None]:
specs_path = os.path.join(home, 'shared', 'users', os.environ['USER'], "datasets", target_dataset_id, "specs", "pitch_stats.json")

with open(specs_path, "r") as specs_file:
    specs = json.load(specs_file)

specs["pitch_fmin"] = 65
specs["pitch_fmax"] = 2094

with open(specs_path, "w") as specs_file:
    json.dump(specs, specs_file, indent=2)

print(json.dumps(specs, indent=2))

### Run pitch stats <a class="anchor" id="head-13"></a>

In [None]:
pitch_stats_job_id = subprocess.getoutput("tao-client spectro-gen dataset-pitch-stats --id " + target_dataset_id)
print(pitch_stats_job_id)

In [None]:
# Check status (the file won't exist until the backend Toolkit container is running -- can take several minutes)
logs_dir = os.path.join(home, 'shared', 'users', os.environ['USER'], 'datasets', target_dataset_id, 'logs')
log_file = f"{pitch_stats_job_id}.txt"

my_tail(logs_dir, log_file)
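
The pitch stats action reports values such as `pitch_avg` and `pitch_std`. How the service computes them is not shown in this notebook; the sketch below only illustrates one common way to summarize a pitch contour, assuming the fundamental frequency (f0) has already been extracted per frame and that unvoiced frames are marked with 0. The function name `pitch_statistics` and the keys `f0_min` and `f0_max` are hypothetical, and the population standard deviation is an assumption.

```python
import math


def pitch_statistics(f0_hz):
    """Summarize an f0 contour in Hz, ignoring unvoiced frames (values <= 0).

    Hypothetical illustration; uses the population standard deviation.
    """
    voiced = [f for f in f0_hz if f > 0]
    if not voiced:
        raise ValueError("no voiced frames in the contour")
    mean = sum(voiced) / len(voiced)
    variance = sum((f - mean) ** 2 for f in voiced) / len(voiced)
    return {
        "f0_min": min(voiced),
        "f0_max": max(voiced),
        "pitch_avg": mean,
        "pitch_std": math.sqrt(variance),
    }


# Three voiced frames (100, 120, 110 Hz) and two unvoiced frames.
print(pitch_statistics([0.0, 100.0, 0.0, 120.0, 110.0]))
```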

### Visualize images <a class="anchor" id="head-14"></a>

In [None]:
! pip3 install matplotlib==3.3.3

import matplotlib.pyplot as plt

%matplotlib inline

from math import ceil

valid_image_ext = ['.jpg', '.png', '.jpeg', '.ppm']

def visualize_images(image_dir, num_cols=2, num_images=10):
    """Visualize images in the notebook.
    
    Args:
        image_dir (str): Path to the directory containing images.
        num_cols (int): Number of columns.
        num_images (int): Number of images.

    """
    output_path = os.path.join(image_dir)
    num_rows = int(ceil(float(num_images) / float(num_cols)))
    # squeeze=False keeps axarr two-dimensional even for a single row or column
    f, axarr = plt.subplots(num_rows, num_cols, figsize=[240,90], squeeze=False)
    f.tight_layout()
    a = [os.path.join(output_path, image) for image in os.listdir(output_path) 
         if os.path.splitext(image)[1].lower() in valid_image_ext]
    for idx, img_path in enumerate(a[:num_images]):
        col_id = idx % num_cols
        row_id = idx // num_cols
        img = plt.imread(img_path)
        axarr[row_id, col_id].imshow(img)

path = os.path.join(home, 'shared', 'users', os.environ['USER'], 'datasets', target_dataset_id, pitch_stats_job_id)
visualize_images(path, num_cols=5, num_images=10)

### Provide finetune specs <a class="anchor" id="head-15"></a>

In [None]:
! tao-client spectro-gen model-finetune-defaults --id {spectro_gen_model_id} | tee ~/shared/users/{os.environ['USER']}/models/{spectro_gen_model_id}/specs/finetune.json

### Customize fast pitch specs <a class="anchor" id="head-16"></a>

In [None]:
specs_path = os.path.join(home, 'shared', 'users', os.environ['USER'], "models", spectro_gen_model_id, "specs", "finetune.json")

with open(specs_path, "r") as specs_file:
    specs = json.load(specs_file)

# Apply changes from pitch_stats job
specs["n_speakers"] = 2
specs["pitch_fmin"] = 65
specs["pitch_fmax"] = 2094
specs["pitch_avg"] = 117.27540199742586
specs["pitch_std"] = 22.1851002822779
specs["trainer"] = {"max_epochs":2}
specs["train_ds"]["dataloader_params"]["batch_size"] = 4

with open(specs_path, "w") as specs_file:
    json.dump(specs, specs_file, indent=2)

print(json.dumps(specs, indent=2))
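
Note that `specs["trainer"] = {"max_epochs":2}` replaces the whole `trainer` entry, while the `batch_size` assignment changes one nested key in place. If the default spec carries other `trainer` keys that you want to keep (this notebook does not show whether it does), a recursive merge is one option. The following is a minimal sketch; the name `deep_update` is hypothetical, and the spec contents in the example are made up for illustration.

```python
def deep_update(base, overrides):
    """Recursively merge `overrides` into `base` and return `base`.

    Nested dictionaries are merged key by key; any other value in
    `overrides` replaces the corresponding value in `base`.
    """
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            deep_update(base[key], value)
        else:
            base[key] = value
    return base


# Made-up default spec, for illustration only.
example_specs = {
    "trainer": {"max_epochs": 100, "log_every_n_steps": 10},
    "train_ds": {"dataloader_params": {"batch_size": 32, "num_workers": 4}},
}
deep_update(example_specs, {
    "trainer": {"max_epochs": 2},
    "train_ds": {"dataloader_params": {"batch_size": 4}},
})
print(example_specs)
```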

### Run finetune <a class="anchor" id="head-17"></a>

In [None]:
spectro_gen_finetune_job_id = subprocess.getoutput("tao-client spectro-gen model-finetune --id " + spectro_gen_model_id)
print(spectro_gen_finetune_job_id)

In [None]:
# Check status (the file won't exist until the backend Toolkit container is running -- can take several minutes)
logs_dir = os.path.join(home, 'shared', 'users', os.environ['USER'], 'models', spectro_gen_model_id, 'logs')
log_file = f"{spectro_gen_finetune_job_id}.txt"

my_tail(logs_dir, log_file)

### Provide infer specs <a class="anchor" id="head-18"></a>

In [None]:
! tao-client spectro-gen model-infer-defaults --id {spectro_gen_model_id} | tee ~/shared/users/{os.environ['USER']}/models/{spectro_gen_model_id}/specs/infer.json

### Customize infer specs <a class="anchor" id="head-19"></a>

In [None]:
specs_path = os.path.join(home, 'shared', 'users', os.environ['USER'], "models", spectro_gen_model_id, "specs", "infer.json")

with open(specs_path, "r") as specs_file:
    specs = json.load(specs_file)

specs["mode"] = "infer_hifigan_ft"
specs["speaker"] = 1

with open(specs_path, "w") as specs_file:
    json.dump(specs, specs_file, indent=2)

print(json.dumps(specs, indent=2))

### Run infer <a class="anchor" id="head-20"></a>

In [None]:
spectro_gen_infer_job_id = subprocess.getoutput(f"tao-client spectro-gen model-infer --id {spectro_gen_model_id} --job {spectro_gen_finetune_job_id}")
print(spectro_gen_infer_job_id)

In [None]:
# Check status (the file won't exist until the backend Toolkit container is running -- can take several minutes)
log_file = f"{spectro_gen_infer_job_id}.txt"
my_tail(logs_dir, log_file)

### Create vocoder dataset <a class="anchor" id="head-21"></a>

In [None]:
mel1_dataset_id = subprocess.getoutput("tao-client vocoder dataset-create  --dataset_type mel_spectrogram --dataset_format hifigan")
print(mel1_dataset_id)

In [None]:
# copy data
! mkdir -p ~/shared/users/{os.environ['USER']}/datasets/{mel1_dataset_id}/mel_spectrogram
! cp ~/shared/users/{os.environ['USER']}/models/{spectro_gen_model_id}/{spectro_gen_infer_job_id}/*.npy ~/shared/users/{os.environ['USER']}/datasets/{mel1_dataset_id}/mel_spectrogram/

In [None]:
# create data manifest
manifest_in = os.path.join(home, 'shared', 'users', os.environ['USER'], "datasets", target_dataset_id, "manifest.json")
manifest_out = os.path.join(home, 'shared', 'users', os.environ['USER'], "datasets", mel1_dataset_id, "manifest.json")

with open(manifest_in, "r") as infile:
    lines = infile.readlines()

with open(manifest_out,"w+") as outfile:
    for cnt, line in enumerate(lines):
        line_dict = json.loads(line)
        line_dict["mel_filepath"] = f"mel_spectrogram/{cnt}.npy"
        outfile.write(json.dumps(line_dict) + "\n")
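
The manifest written above assumes that the n-th manifest line corresponds to `mel_spectrogram/<n>.npy`. If the inference job produced fewer files than there are manifest lines, vocoder training would reference files that do not exist. A minimal sketch of a check follows; the function name `missing_mel_files` is hypothetical.

```python
import json
from pathlib import Path


def missing_mel_files(manifest_path, dataset_dir):
    """Return (line_number, mel_filepath) pairs whose file does not exist.

    `mel_filepath` entries are resolved relative to `dataset_dir`.
    """
    missing = []
    with open(manifest_path, "r") as manifest_file:
        for line_number, line in enumerate(manifest_file, start=1):
            if not line.strip():
                continue  # ignore blank lines
            entry = json.loads(line)
            mel_path = Path(dataset_dir) / entry["mel_filepath"]
            if not mel_path.is_file():
                missing.append((line_number, entry["mel_filepath"]))
    return missing
```

An empty list from `missing_mel_files(manifest_out, os.path.dirname(manifest_out))` means every manifest entry has a matching spectrogram file.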

### Create vocoder model <a class="anchor" id="head-22"></a>

In [None]:
vocoder_model_id = subprocess.getoutput("tao-client vocoder model-create --network_arch vocoder --encryption_key tlt_encode")
print(vocoder_model_id)

### Find hifigan pretrained model <a class="anchor" id="head-23"></a>

In [None]:
pattern = os.path.join(home, 'shared', 'users', '*', 'models', '*', 'metadata.json')

hifigan_ptm_id = None
for metadata_path in glob.glob(pattern):
  with open(metadata_path, 'r') as metadata_file:
    metadata = json.load(metadata_file)
    ngc_path = metadata.get("ngc_path")
    if ngc_path and "hifigan:1.0.0rc1" in ngc_path:
      hifigan_ptm_id = metadata["id"]
      break

print(hifigan_ptm_id)

### Customize model metadata <a class="anchor" id="head-24"></a>

In [None]:
metadata_path = os.path.join(home, 'shared', 'users', os.environ['USER'], "models", vocoder_model_id, "metadata.json")

with open(metadata_path , "r") as metadata_file:
    metadata = json.load(metadata_file)

metadata["train_datasets"] = [mel1_dataset_id]
metadata["eval_dataset"] = mel1_dataset_id
metadata["ptm"] = hifigan_ptm_id

with open(metadata_path, "w") as metadata_file:
    json.dump(metadata, metadata_file, indent=2)

print(json.dumps(metadata, indent=2))

### Provide vocoder finetune specs from default <a class="anchor" id="head-25"></a>

In [None]:
! tao-client vocoder model-finetune-defaults --id {vocoder_model_id} | tee ~/shared/users/{os.environ['USER']}/models/{vocoder_model_id}/specs/finetune.json

### Customize vocoder finetune specs <a class="anchor" id="head-26"></a>

In [None]:
specs_path = os.path.join(home, 'shared', 'users', os.environ['USER'], "models", vocoder_model_id, "specs", "finetune.json")

with open(specs_path, "r") as specs_file:
    specs = json.load(specs_file)

specs["trainer"] = {"max_epochs": 2}
specs["training_ds"]["dataloader_params"]["batch_size"] = 4

with open(specs_path, "w") as specs_file:
    json.dump(specs, specs_file, indent=2)

print(json.dumps(specs, indent=2))

### Run vocoder finetune <a class="anchor" id="head-27"></a>

In [None]:
vocoder_finetune_job_id = subprocess.getoutput(f"tao-client vocoder model-finetune --id {vocoder_model_id}")
print(vocoder_finetune_job_id)

### Track logs <a class="anchor" id="head-28"></a>

In [None]:
# Check status (the file won't exist until the backend Toolkit container is running -- can take several minutes)
logs_dir = os.path.join(home, 'shared', 'users', os.environ['USER'], 'models', vocoder_model_id, 'logs')
log_file = f"{vocoder_finetune_job_id}.txt"

my_tail(logs_dir, log_file)

### Inference from raw sentences <a class="anchor" id="head-29"></a>
- Run spectro_gen inference on a few sentences
- Then feed the resulting spectrograms to the vocoder for inference

In [None]:
sentences = ["by the end of no such thing the audience , like beatrice , has a watchful affection for the monster .",
             "director rob marshall went out gunning to make a great one .",
             "uneasy mishmash of styles and genres ."   
            ]

### Provide model infer specs from default <a class="anchor" id="head-30"></a>

In [None]:
! tao-client spectro-gen model-infer-defaults --id {spectro_gen_model_id} | tee ~/shared/users/{os.environ['USER']}/models/{spectro_gen_model_id}/specs/infer.json

### Customize infer specs <a class="anchor" id="head-31"></a>

In [None]:
specs_path = os.path.join(home, 'shared', 'users', os.environ['USER'], "models", spectro_gen_model_id, "specs", "infer.json")

with open(specs_path, "r") as specs_file:
    specs = json.load(specs_file)

specs["mode"] = "infer"
specs["input_batch"] = sentences

with open(specs_path, "w") as specs_file:
    json.dump(specs, specs_file, indent=2)

print(json.dumps(specs, indent=2))

### Run infer <a class="anchor" id="head-32"></a>

In [None]:
spectro_gen_infer_job_id = subprocess.getoutput(f"tao-client spectro-gen model-infer --id {spectro_gen_model_id} --job {spectro_gen_finetune_job_id}")
print(spectro_gen_infer_job_id)

### Track logs <a class="anchor" id="head-33"></a>

In [None]:
# Check status (the file won't exist until the backend Toolkit container is running -- can take several minutes)
logs_dir = os.path.join(home, 'shared', 'users', os.environ['USER'], 'models', spectro_gen_model_id, 'logs')
log_file = f"{spectro_gen_infer_job_id}.txt"

my_tail(logs_dir, log_file)

### Create dataset <a class="anchor" id="head-34"></a>

In [None]:
mel2_dataset_id = subprocess.getoutput("tao-client vocoder dataset-create --dataset_type mel_spectrogram --dataset_format raw")
print(mel2_dataset_id)

In [None]:
# add inference dataset to vocoder model metadata
metadata_path = os.path.join(home, 'shared', 'users', os.environ['USER'], "models", vocoder_model_id, "metadata.json")

with open(metadata_path , "r") as metadata_file:
    metadata = json.load(metadata_file)

metadata["inference_dataset"] = mel2_dataset_id

with open(metadata_path, "w") as metadata_file:
    json.dump(metadata, metadata_file, indent=2)

print(json.dumps(metadata, indent=2))

In [None]:
# Copy data
! mkdir -p ~/shared/users/{os.environ['USER']}/datasets/{mel2_dataset_id}/mel_spectrogram
! cp ~/shared/users/{os.environ['USER']}/models/{spectro_gen_model_id}/{spectro_gen_infer_job_id}/*.npy ~/shared/users/{os.environ['USER']}/datasets/{mel2_dataset_id}/mel_spectrogram/

### Provide vocoder infer specs from default <a class="anchor" id="head-35"></a>

In [None]:
! tao-client vocoder model-infer-defaults --id {vocoder_model_id} | tee ~/shared/users/{os.environ['USER']}/models/{vocoder_model_id}/specs/infer.json

### Run vocoder infer <a class="anchor" id="head-36"></a>

In [None]:
vocoder_infer_job_id = subprocess.getoutput(f"tao-client vocoder model-infer --id {vocoder_model_id} --job {vocoder_finetune_job_id}")
print(vocoder_infer_job_id)

### Track logs <a class="anchor" id="head-37"></a>

In [None]:
# Check status (the file won't exist until the backend Toolkit container is running -- can take several minutes)
logs_dir = os.path.join(home, 'shared', 'users', os.environ['USER'], 'models', vocoder_model_id, 'logs')
log_file = f"{vocoder_infer_job_id}.txt"

my_tail(logs_dir, log_file)

### Display audio <a class="anchor" id="head-38"></a>

In [None]:
import IPython.display as ipd
ipd.Audio(f"{home}/shared/users/{os.environ['USER']}/models/{vocoder_model_id}/{vocoder_infer_job_id}/0.wav")
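
The cell above plays only `0.wav`. Assuming the job writes one file per input sentence named `<index>.wav` (an assumption based on that file name), the outputs can be collected in numeric order so that `10.wav` sorts after `2.wav`. The function name `numbered_wavs` is hypothetical.

```python
from pathlib import Path


def numbered_wavs(output_dir):
    """Return *.wav files with purely numeric names, sorted by that number."""
    wav_paths = [p for p in Path(output_dir).glob("*.wav") if p.stem.isdigit()]
    return sorted(wav_paths, key=lambda p: int(p.stem))
```

In the notebook, each result could then be played with `ipd.display(ipd.Audio(str(path)))` inside a loop over `numbered_wavs(...)`.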

### Delete experiment <a class="anchor" id="head-39"></a>

In [None]:
! rm -rf ~/shared/users/{os.environ['USER']}/models/{spectro_gen_model_id}
! rm -rf ~/shared/users/{os.environ['USER']}/models/{vocoder_model_id}
! echo DONE

### Delete datasets <a class="anchor" id="head-40"></a>

In [None]:
! rm -rf ~/shared/users/{os.environ['USER']}/datasets/{source_dataset_id}
! rm -rf ~/shared/users/{os.environ['USER']}/datasets/{target_dataset_id}
! rm -rf ~/shared/users/{os.environ['USER']}/datasets/{auxillary_dataset_id}
! rm -rf ~/shared/users/{os.environ['USER']}/datasets/{mel1_dataset_id}
! rm -rf ~/shared/users/{os.environ['USER']}/datasets/{mel2_dataset_id}
! echo DONE

### Unmount shared volume <a class="anchor" id="head-41"></a>

In [None]:
command = "umount ~/shared"
! echo {password} | sudo -S -k {command} && echo DONE

### Uninstall TAO Remote Client <a class="anchor" id="head-42"></a>

In [None]:
! pip3 uninstall -y nvidia-tao-client