# TTS Deploy

This tutorial explains how to generate a TTS RMIR from an acoustic model and a vocoder, for both data center and embedded machines. The acoustic model and vocoder need to be stored as .riva files. RMIR (Riva Model Intermediate Representation) is an intermediate file that contains all the artifacts (models, files, configurations, and user settings) required to deploy a Riva service.  

In this tutorial we will use the pretrained [Fastpitch.riva](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/speechsynthesis_en_us_fastpitch_ipa) and [HifiGan.riva](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/speechsynthesis_en_us_hifigan_ipa) models. These can be replaced with any custom acoustic model or vocoder .riva files. [`nemo2riva`](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/model-overview.html#export-models-with-nemo2riva) can be used to generate .riva files from NeMo checkpoints.  

We will also deploy the generated RMIR using [riva_quickstart](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/quick-start-guide.html).

### Set configs and params
Set the following config parameters:  
`acoustic_model`: Full path to the acoustic model .riva file from [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/speechsynthesis_en_us_fastpitch_ipa). This can be replaced with a custom acoustic model .riva checkpoint.  
`vocoder`: Full path to the vocoder .riva file from [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/speechsynthesis_en_us_hifigan_ipa). This can be replaced with a custom vocoder .riva checkpoint.  
`out_dir`: Directory in which to store the generated RMIR. The RMIR will be placed in `${out_dir}/rmir/FastPitch_HifiGan.rmir`.  
`voice`: Voice name of the model.  
`key`: Encryption key used in nemo2riva. The same key is used to deploy the RMIR generated in this tutorial.  
`use_ipa`: Set to "y" or "Y" if the model uses IPA phones, "no" if the model uses arpabet.  
`lang`: Model language.  
`sample_rate`: Sample rate of the generated audio.  
`machine_type`: Type of machine the tutorial is being run on. Acceptable values are `arm` and `amd`.  
`target_machine`: Type of machine the RMIR will be deployed on. Acceptable values are `arm` and `amd`.  
`riva_model_files`: Full path to the Riva model repo containing the auxiliary files (dictionaries and abbreviations). Change this when using a custom model repo.  
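
Before running the remaining cells, it can help to sanity-check the values described above. The following is a minimal sketch, not part of the Riva tooling: `validate_settings` and `ALLOWED_MACHINES` are hypothetical names, and rejecting any `use_ipa` value other than "y", "Y", or "no" is an assumption that is stricter than the notebook itself.

```python
# Hypothetical helper (not a Riva API): collect problems with the settings above.
ALLOWED_MACHINES = {"arm", "amd"}
ALLOWED_USE_IPA = {"y", "Y", "no"}  # assumption: stricter than the notebook's own check


def validate_settings(machine_type, target_machine, use_ipa, sample_rate):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    for label, value in (("machine_type", machine_type), ("target_machine", target_machine)):
        if value not in ALLOWED_MACHINES:
            problems.append(f"{label} must be one of {sorted(ALLOWED_MACHINES)}, got {value!r}")
    if use_ipa not in ALLOWED_USE_IPA:
        problems.append(f"use_ipa must be one of {sorted(ALLOWED_USE_IPA)}, got {use_ipa!r}")
    if not isinstance(sample_rate, int) or sample_rate <= 0:
        problems.append(f"sample_rate must be a positive integer, got {sample_rate!r}")
    return problems
```

Calling `validate_settings(machine_type, target_machine, use_ipa, sample_rate)` after the config cell and stopping if the returned list is non-empty catches typos before any container is started.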


In [None]:
import pathlib
import warnings

In [None]:
acoustic_model = pathlib.Path.cwd() / "speechsynthesis_en_us_fastpitch_ipa_vdeployable_v1.0/FastPitch_44k_EnglishUS_IPA.riva" ##acoustic_model .riva location
vocoder = pathlib.Path.cwd() / "speechsynthesis_en_us_hifigan_ipa_vdeployable_v1.0/HifiGAN_44k_EnglishUS_IPA.riva" ##vocoder .riva location
out_dir = pathlib.Path("out/") ##Output directory to store the generated RMIR. The RMIR will be placed in ${out_dir}/rmir/FastPitch_HifiGan.rmir
voice = "test" ##Voice name     
key = "tlt_encode" ##Encryption key used during nemo2riva
use_ipa = "y" ##"y" or "Y" if the model uses IPA (as the pretrained IPA models above do), "no" for arpabet.
lang = "en-US" ##Language
sample_rate = 44100 ##Sample rate of the audios
machine_type = "amd" ##Machine running this tutorial: "amd" for x86_64, "arm" for aarch64.
target_machine = "arm" ##Machine the RMIR will be deployed on: "arm" for aarch64, "amd" for x86_64.
riva_model_files = pathlib.Path.cwd() / "speechsynthesis_en_us_auxiliary_files_vdeployable_v1.3" ##Riva model repo path. For a custom model repo, change this to its full path.

rmir_dir = out_dir / "rmir"

## Riva NGC, servicemaker image config.
riva_ngc_org = "nvidia"
riva_ngc_team = "riva"
NGC_TARGET = f"{riva_ngc_org}/{riva_ngc_team}"
riva_ngc_image_version = "2.8.0"
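if machine_type=="arm":
    riva_init_image = f"nvcr.io/{NGC_TARGET}/riva-speech:{riva_ngc_image_version}-servicemaker-l4t-aarch64"
elif machine_type=="amd":
    riva_init_image = f"nvcr.io/{NGC_TARGET}/riva-speech:{riva_ngc_image_version}-servicemaker"
else:
    # Fail early instead of leaving riva_init_image undefined for later cells.
    raise ValueError(f"machine_type must be 'arm' or 'amd', got {machine_type!r}")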

### Download models
We will download the pretrained [Fastpitch](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/speechsynthesis_en_us_fastpitch_ipa) and [HifiGan](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/speechsynthesis_en_us_hifigan_ipa) models from NGC. If you are using custom models, skip this download and point `acoustic_model` and `vocoder` at your own .riva files instead.

In [None]:
!ngc registry model download-version "nvidia/tao/speechsynthesis_en_us_fastpitch_ipa:deployable_v1.0"
!ngc registry model download-version "nvidia/tao/speechsynthesis_en_us_hifigan_ipa:deployable_v1.0"

Download the auxiliary TTS deployable files from NGC. These include the following files:  
- Arpabet dict
- IPA dict
- Abbreviation dict

In [None]:
!ngc registry model download-version "nvidia/tao/speechsynthesis_en_us_auxiliary_files:deployable_v1.3"

Get the directory paths and file names of the acoustic model and vocoder.

In [None]:
synt_dir = acoustic_model.parent
voc_dir = vocoder.parent

synt_name = acoustic_model.name
voc_name = vocoder.name

Create output directories.

In [None]:
# Create out_dir and out_dir/rmir in one step; existing directories are left untouched.
rmir_dir.mkdir(parents=True, exist_ok=True)

Stop any `riva_rmir_gen` container that is already running, then start the Riva ServiceMaker container again with the acoustic model and vocoder directories mounted.

In [None]:
##Run the riva servicemaker.
!docker stop riva_rmir_gen &> /dev/null
!set -x && docker run -td --gpus all --rm -v {str(riva_model_files)}:/riva_repo -v {str(synt_dir)}/:/synt -v {str(voc_dir)}:/voc \
            -v {str(rmir_dir.resolve())}:/data --name riva_rmir_gen --entrypoint="/bin/bash" {riva_init_image}

<div class="alert-warning">
    Using the <b>--force</b> flag with <b>riva-build</b> replaces any existing RMIR.
</div>

In [None]:
warnings.warn("Using --force in riva-build will replace any existing RMIR.")
riva_build=f"""riva-build speech_synthesis --force --voice_name={voice}  --language_code={lang} \
                --sample_rate={sample_rate} /data/FastPitch_HifiGan.rmir:{key} /synt/{synt_name}:{key} \
                /voc/{voc_name}:{key}  --abbreviations_file=/riva_repo/abbr.txt"""

In [None]:
if target_machine=="arm":
    # The leading space keeps these flags separate from the preceding --abbreviations_file argument.
    riva_build += """ --max_batch_size 1 --denoiser.max_batch_size 1 --preprocessor.max_batch_size 1 \
                --encoderFastPitch.max_batch_size 1 --chunkerFastPitch.max_batch_size 1 --hifigan.max_batch_size 1"""

In [None]:
if use_ipa == "Y" or use_ipa=="y":
    riva_build+=" --phone_set=ipa --arpabet_file=/riva_repo/ipa_cmudict-0.7b_nv22.08.txt"
else:
    riva_build+=" --arpabet_file=/riva_repo/cmudict-0.7b_nv22.08"
print(riva_build)
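
Building the command by string concatenation makes it easy to lose a separating space between flags. As an alternative, here is a minimal sketch that collects the same arguments in a list and joins them once. `build_riva_build_command` is a hypothetical helper name; the flags and container paths are the ones used in the cells above, and `use_ipa` is taken as a boolean here for simplicity.

```python
import shlex


def build_riva_build_command(voice, lang, sample_rate, key, synt_name, voc_name,
                             use_ipa=False, target_machine="amd"):
    """Return the riva-build command line as one shell-safe string."""
    args = [
        "riva-build", "speech_synthesis", "--force",
        f"--voice_name={voice}",
        f"--language_code={lang}",
        f"--sample_rate={sample_rate}",
        f"/data/FastPitch_HifiGan.rmir:{key}",
        f"/synt/{synt_name}:{key}",
        f"/voc/{voc_name}:{key}",
        "--abbreviations_file=/riva_repo/abbr.txt",
    ]
    if target_machine == "arm":
        # Same batch-size limits as the arm branch above, one flag per component.
        for prefix in ("", "denoiser.", "preprocessor.", "encoderFastPitch.",
                       "chunkerFastPitch.", "hifigan."):
            args += [f"--{prefix}max_batch_size", "1"]
    if use_ipa:
        args += ["--phone_set=ipa",
                 "--arpabet_file=/riva_repo/ipa_cmudict-0.7b_nv22.08.txt"]
    else:
        args.append("--arpabet_file=/riva_repo/cmudict-0.7b_nv22.08")
    # shlex.join quotes any argument that needs it and separates all arguments with spaces.
    return shlex.join(args)
```

For example, `build_riva_build_command(voice, lang, sample_rate, key, synt_name, voc_name, use_ipa=use_ipa in ("y", "Y"), target_machine=target_machine)` yields a string that can be used in place of `riva_build`.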

Execute the riva build command and stop the riva_servicemaker container.

In [None]:
!docker exec  riva_rmir_gen {riva_build}
!docker stop riva_rmir_gen
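
Because the build runs inside a container, a failure can go unnoticed in the notebook output. The sketch below checks that the RMIR file was actually written before moving on to deployment. `find_rmir` is a hypothetical helper; the default file name is the one passed to `riva-build` above.

```python
from pathlib import Path


def find_rmir(rmir_dir, name="FastPitch_HifiGan.rmir"):
    """Return the path of the generated RMIR, or raise if it is missing or empty."""
    path = Path(rmir_dir) / name
    if not path.is_file() or path.stat().st_size == 0:
        raise FileNotFoundError(f"Expected a non-empty RMIR at {path}; check the riva-build output.")
    return path
```

In this notebook it would be called as `find_rmir(rmir_dir)` right after the container has been stopped.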

## Deploy.

So far in this tutorial, we have learned how to generate RMIR files from .riva files. A `FastPitch_HifiGan.rmir` file should now be present in the `${out_dir}/rmir` location we defined earlier.  

The RMIR file generated in this tutorial can be deployed using [riva_quickstart](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/quick-start-guide.html).

### Steps to deploy the RMIR
- Download the riva_quickstart.
- Open `config.sh` and update the following params:  
    - set `service_enabled_asr` to `false`.  
    - set `service_enabled_nlp` to `false`.  
    - set `service_enabled_tts` to `true`.  
    - set `riva_model_loc` to the location of your `out_dir`.  
    - set `use_existing_rmirs` to `true`.  
- Run `riva_init.sh`.  
- Run `riva_start.sh`.  


The following steps require the Riva QuickStart Resource from NGC. Download it, then set the path to the uncompressed directory here:

In [None]:
RIVA_DIR = "<Path to the uncompressed folder downloaded from quickstart(include the folder name)>"

Next, we modify the `config.sh` file to enable the relevant Riva services (TTS in this case for FastPitch and HiFi-GAN), and provide the encryption key and path to the model repository (riva_model_loc) generated in the previous step.

For example, the RMIR generated above is stored in `${out_dir}/rmir`, so set `riva_model_loc` to the absolute path of `out_dir`; `riva_init.sh` looks for RMIR files in `$riva_model_loc/rmir`.


### config.sh snippet  
    # Enable or Disable Riva Services 
    service_enabled_asr=false                                                      ## MAKE CHANGES HERE  
    service_enabled_nlp=false                                                      ## MAKE CHANGES HERE  
    service_enabled_tts=true                                                     ## MAKE CHANGES HERE  

    # Specify one or more GPUs to use
    # specifying more than one GPU is currently an experimental feature, and may result in undefined behaviours.
    gpus_to_use="device=0"

    # Specify the encryption key to use to deploy models
    MODEL_DEPLOY_KEY="tlt_encode"                                                  ## MAKE CHANGES HERE

    # Locations to use for storing models artifacts
    #
    # If an absolute path is specified, the data will be written to that location
    # Otherwise, a docker volume will be used (default).
    #
    # riva_init.sh will create a `rmir` and `models` directory in the volume or
    # path specified. 
    #
    # RMIR ($riva_model_loc/rmir)
    # Riva uses an intermediate representation (RMIR) for models
    # that are ready to deploy but not yet fully optimized for deployment. Pretrained
    # versions can be obtained from NGC (by specifying NGC models below) and will be
    # downloaded to $riva_model_loc/rmir by `riva_init.sh`
    # 
    # Custom models produced by NeMo or TAO and prepared using riva-build
    # may also be copied manually to this location $(riva_model_loc/rmir).
    #
    # Models ($riva_model_loc/models)
    # During the riva_init process, the RMIR files in $riva_model_loc/rmir
    # are inspected and optimized for deployment. The optimized versions are
    # stored in $riva_model_loc/models. The riva server exclusively uses these
    # optimized versions.
    riva_model_loc="<add path>"                              ## MAKE CHANGES HERE (Replace with MODEL_LOC)    

    # The default RMIRs are downloaded from NGC by default in the above $riva_rmir_loc directory
    # If you'd like to skip the download from NGC and use the existing RMIRs in the $riva_rmir_loc
    # then set the below $use_existing_rmirs flag to true. You can also deploy your set of custom
    # RMIRs by keeping them in the riva_rmir_loc dir and use this quickstart script with the
    # below flag to deploy them all together.
    use_existing_rmirs=false                                ## MAKE CHANGES HERE (Set to true)
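
The edits marked above can also be applied programmatically. The following is a minimal sketch under a simplifying assumption: each setting appears as a plain `name=value` assignment on its own line, as in the snippet. `set_config_values` is a hypothetical helper, not part of the Riva Quick Start scripts.

```python
import re


def set_config_values(config_text, updates):
    """Return config_text with each `name=value` assignment replaced per `updates`.

    Trailing comments on the same line are preserved. Raises KeyError if a
    name is not found, so that a typo does not go unnoticed.
    """
    for name, value in updates.items():
        # Match the assignment at the start of a line; the old value is either
        # a double-quoted string or a run of non-whitespace characters.
        pattern = re.compile(
            rf'^([ \t]*){re.escape(name)}=(?:"[^"]*"|\S*)', flags=re.MULTILINE
        )
        config_text, count = pattern.subn(
            lambda m, n=name, v=value: f"{m.group(1)}{n}={v}", config_text
        )
        if count == 0:
            raise KeyError(f"{name} not found in config")
    return config_text
```

To use it, read `config.sh` from the Quick Start directory, pass a dictionary such as `{"service_enabled_asr": "false", "service_enabled_nlp": "false", "service_enabled_tts": "true", "use_existing_rmirs": "true"}` plus a quoted `riva_model_loc` path, and write the returned text back to the file. Keeping a backup copy of the original `config.sh` is advisable.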


In [None]:
# Ensure you have permission to execute these scripts
! cd $RIVA_DIR && chmod +x ./riva_init.sh && chmod +x ./riva_start.sh

In [None]:
# Run Riva Init. This will fetch the containers/models
# YOU CAN SKIP THIS STEP IF YOU DID RIVA DEPLOY
! cd $RIVA_DIR && ./riva_init.sh config.sh

In [None]:
# Run Riva Start. This will deploy your model(s).
! cd $RIVA_DIR && ./riva_start.sh config.sh

## Run Inference
Once the Riva server is up and running with your models, you can send inference requests querying the server.

To send gRPC requests, install the Riva Python API bindings for the client.

In [None]:
# Install client API bindings
! pip install nvidia-riva-client

### Connect to the Riva server and run inference
Now, we can query the Riva server; let’s get started. The following cell queries the Riva server (using gRPC) to yield a result.

In [None]:
import riva.client
import IPython.display as ipd
import numpy as np

server = "localhost:50051"                # location of riva server
auth = riva.client.Auth(uri=server)
tts_service = riva.client.SpeechSynthesisService(auth)


text = "Is it recognize speech or wreck a nice beach?"
language_code = "en-US"                   # currently required to be "en-US"
sample_rate_hz = 22050                    # the desired sample rate of the returned audio
voice_name = "new_speaker.new_voice"      # voice to synthesize with; replace with a voice name available on your server
data_type = np.int16                      # For RIVA version < 1.10.0 please set this to np.float32

resp = tts_service.synthesize(text, voice_name=voice_name, language_code=language_code, sample_rate_hz=sample_rate_hz)
audio = resp.audio
meta = resp.meta
processed_text = meta.processed_text
predicted_durations = meta.predicted_durations

audio_samples = np.frombuffer(resp.audio, dtype=data_type)
print(processed_text)
ipd.Audio(audio_samples, rate=sample_rate_hz)
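
To keep the result, the raw audio bytes can be written to a WAV file. This is a minimal sketch using only the standard-library `wave` module; it assumes the response holds mono 16-bit linear PCM, which matches the `np.int16` data type used above. `save_pcm16_wav` and the output file name are hypothetical.

```python
import wave


def save_pcm16_wav(path, pcm_bytes, sample_rate_hz, channels=1):
    """Write raw 16-bit PCM bytes to a WAV file at the given sample rate."""
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate_hz)
        wav.writeframes(pcm_bytes)
```

In the cell above, this would be called as `save_pcm16_wav("output.wav", resp.audio, sample_rate_hz)`.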