# Gai/Gen: Text-to-Speech (TTS)

## 1. Note

The examples below were tested in the following environment:
-   Ubuntu 22.04
-   Python 3.10
-   CUDA Toolkit 11.8
-   openai 1.6.1
-   TTS 0.22.0
-   deepspeed 0.12.6


## 2. Create Virtual Environment and Install Dependencies

We will create a separate virtual environment for this to avoid conflicts between the dependencies that each underlying model requires.

```sh
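# Install system packages
sudo apt update -y && sudo apt install ffmpeg git git-lfs -y
# Create and activate a dedicated conda environment
conda create -n TTS python=3.10.10 -y
conda activate TTS
# Install gai-gen with the TTS extras (quoted so the shell does not expand the brackets)
pip install "gai-gen[TTS]"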
```
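After installation, it can help to compare the installed package versions with the tested ones listed in the Note section. The following is a minimal sketch using only the standard library; the helper names `installed_versions` and `report` are illustrative and not part of gai-gen.

```python
import sys
from importlib import metadata

# Versions copied from the tested environment listed in the Note section.
EXPECTED = {"openai": "1.6.1", "TTS": "0.22.0", "deepspeed": "0.12.6"}


def installed_versions(packages):
    """Return {package: installed version, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def report(expected):
    """Build one line for the interpreter and one line per expected package."""
    lines = [f"python {sys.version_info.major}.{sys.version_info.minor}"]
    for name, got in installed_versions(expected).items():
        want = expected[name]
        status = "ok" if got == want else "differs"
        lines.append(f"{name}: installed={got} tested={want} ({status})")
    return lines


if __name__ == "__main__":
    print("\n".join(report(EXPECTED)))
```

A version marked `differs` is not necessarily a problem; it only means the examples were not tested with that version.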

## 3. Examples

## 3.1 OpenAI Text-to-Speech

```python
from gai.gen import Gaigen
from IPython.display import Audio

print("GENERATING:")
gen = Gaigen.GetInstance().load('openai-tts-1')
response = gen.create(
  voice="alloy",
  input="The definition of insanity is doing the same thing over and over and expecting different results."
)
# Play the generated audio inline in the notebook
Audio(response, rate=24000)
```

Output:

```
GENERATING:
```

The following demo uses Coqui AI's xTTS model. Create and run the script `xtts_download.py` below to download the model:

```python
# xtts_download.py
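import os

# Accept the Coqui terms of service non-interactively; must be set before downloading
os.environ["COQUI_TOS_AGREED"] = "1"

from TTS.utils.manage import ModelManager

print("Downloading...")
# Expand "~" explicitly so the path resolves to the home directory
mm = ModelManager(output_prefix=os.path.expanduser("~/gai/models/tts"))
model_name = "tts_models/multilingual/multi-dataset/xtts_v2"
mm.download_model(model_name)
print("Downloaded")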
```
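To confirm that the download produced files, the target directory can be inspected. This is a rough sketch under one assumption: that the model ends up in a directory whose name contains `xtts_v2` somewhere below `~/gai/models/tts`. The exact layout is decided by the TTS library, so the search is recursive rather than tied to a fixed path. The helper names are illustrative.

```python
from pathlib import Path


def find_model_dirs(root, marker="xtts_v2"):
    """Return directories under `root` whose name contains `marker` (assumed naming)."""
    root = Path(root).expanduser()
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob(f"*{marker}*") if p.is_dir())


def summarize(model_dir):
    """Return [(relative file name, size in bytes)] for every file in `model_dir`."""
    model_dir = Path(model_dir)
    return sorted(
        (str(f.relative_to(model_dir)), f.stat().st_size)
        for f in model_dir.rglob("*")
        if f.is_file()
    )


if __name__ == "__main__":
    dirs = find_model_dirs("~/gai/models/tts")
    if not dirs:
        print("No xtts_v2 directory found; the download may have gone elsewhere.")
    for d in dirs:
        print(d)
        for name, size in summarize(d):
            print(f"  {name}: {size / 1e6:.1f} MB")
```

An empty result does not prove the download failed; it may only mean the files were written to a different location than assumed here.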

Note that the first time the model is loaded, DeepSpeed has to compile its `transformer_inference` extension, so the first load takes a while.
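Whether the slow first-time build has already happened can be checked by looking for a compiled extension in the PyTorch extensions cache. The sketch below assumes the default cache location `~/.cache/torch_extensions` on Linux (or the directory named by the `TORCH_EXTENSIONS_DIR` environment variable) and assumes the build leaves a `.so` file inside a `transformer_inference` directory. The helper names are illustrative.

```python
import os
from pathlib import Path


def extensions_root():
    """Return TORCH_EXTENSIONS_DIR if set, else the assumed default cache directory."""
    override = os.environ.get("TORCH_EXTENSIONS_DIR")
    return Path(override) if override else Path.home() / ".cache" / "torch_extensions"


def built_extensions(root, name="transformer_inference"):
    """Return directories called `name` under `root` that contain a compiled .so file."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(
        d for d in root.rglob(name) if d.is_dir() and any(d.glob("*.so"))
    )


if __name__ == "__main__":
    found = built_extensions(extensions_root())
    if found:
        print("Compiled extension found in:")
        for d in found:
            print(f"  {d}")
    else:
        print("No compiled extension found; expect a build on the next model load.")
```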

## 3.2 Coqui xTTS Text-to-Speech

```python
from gai.gen import Gaigen
from IPython.display import Audio

print("GENERATING:")
gen = Gaigen.GetInstance().load('xtts-2')
response = gen.create(
  voice="Vjollca Johnnie",
  input="The definition of insanity is doing the same thing over and over and expecting different results."
)
# Play the generated audio inline in the notebook
Audio(response, rate=24000)
```

Output:

```
GENERATING:

[2023-12-26 14:27:50,365] [INFO] [real_accelerator.py:161:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2023-12-26 14:27:51,681] [INFO] [logging.py:96:log_dist] [Rank -1] DeepSpeed info: version=0.12.6, git-hash=unknown, git-branch=unknown
[2023-12-26 14:27:51,684] [INFO] [logging.py:96:log_dist] [Rank -1] quantize_bits = 8 mlp_extra_grouping = False, quantize_groups = 1

Using /home/roylai/.cache/torch_extensions/py310_cu121 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /home/roylai/.cache/torch_extensions/py310_cu121/transformer_inference/build.ninja...
Building extension module transformer_inference...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)

[1/11] /usr/local/cuda-11.8/bin/nvcc  -DTORCH_EXTENSION_NAME=transformer_inference -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/roylai/miniconda/envs/TTS-test/lib/python3.10/site-packages/deepspeed/ops/csrc/transformer/inference/includes -I/home/roylai/miniconda/envs/TTS-test/lib/python3.10/site-packages/deepspeed/ops/csrc/includes -isystem /home/roylai/miniconda/envs/TTS-test/lib/python3.10/site-packages/torch/include -isystem /home/roylai/miniconda/envs/TTS-test/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/roylai/miniconda/envs/TTS-test/lib/python3.10/site-packages/torch/include/TH -isystem /home/roylai/miniconda/envs/TTS-test/lib/python3.10/site-packages/torch/include/THC -isystem /usr/local/cuda-11.8/include -isystem /home/roylai/miniconda/envs/TTS-test/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVE

Loading extension module transformer_inference...

XTTS Loaded.

------------------------------------------------------
Free memory : 4.624023 (GigaBytes)
Total memory: 7.999573 (GigaBytes)
Requested memory: 0.335938 (GigaBytes)
Setting maximum total tokens (input + output) to 1024
WorkSpace: 0x79d000000
------------------------------------------------------

  return F.conv1d(input, weight, bias, self.stride,
```
