<a href="https://colab.research.google.com/github/nguyenviettung7691/TTS-Runner/blob/main/tts.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# gTTS (Google Text-to-Speech)
[Documentation](https://gtts.readthedocs.io/)

In [None]:
text = "In Japanese folklore and mythology, thunder and lightning are often personified as deities or kami. However, there isn't a specific deity named Fusuikazuchi in traditional Japanese mythology.  If we consider the term \"Fusuikazuchi\" in a symbolic or metaphorical sense, it could represent the unseen or latent power of thunder and lightning. It suggests that the forces of thunder and lightning may lie dormant or hidden until they are unleashed during a thunderstorm. It could also be interpreted as the potential for destruction or transformation that exists within the calm before the storm." #@param {type:"string"}
lang = "en" #@param ["en", "vi", "ja"]
slow = False #@param {type:"boolean"}
!pip install gtts
!pip install pydub

from gtts import gTTS, gTTSError
from pydub import AudioSegment
from IPython.display import Audio

# Convert text to speech with gTTS and save the result as an MP3 file.
# gTTS has no voice parameter; the accent is chosen through `tld`
# (the Google Translate domain used for the request).
def text_to_speech(text, filename, lang='en', tld='com', slow=False):
    tts = gTTS(text, lang=lang, tld=tld, slow=slow, lang_check=False)
    tts.save(filename)

    return filename

# Example usage: pass the form values defined at the top of this cell
filename = "output.mp3"
output_file = text_to_speech(text, filename, lang=lang, slow=slow)

# Play the generated audio
Audio(output_file)
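
gTTS does not expose named voices; the closest control is the `tld` argument already passed to `gTTS` above, which selects a regional Google Translate domain and with it an accent. The sketch below shows one way to map an accent label to a `lang`/`tld` pair. The `ACCENTS` table and the `accent_kwargs` helper are hypothetical names introduced for illustration, and the domain values are my reading of the "Localized accents" table in the gTTS documentation, so they should be checked against it.

```python
# Hypothetical accent table: label -> (lang, tld).
# The tld values are assumptions based on the gTTS "Localized accents" docs.
ACCENTS = {
    "en-us": ("en", "us"),
    "en-uk": ("en", "co.uk"),
    "en-au": ("en", "com.au"),
    "en-in": ("en", "co.in"),
}


def accent_kwargs(accent, default=("en", "com")):
    """Return the lang/tld keyword arguments for an accent label.

    Unknown labels fall back to `default` instead of raising.
    """
    lang, tld = ACCENTS.get(accent.lower(), default)
    return {"lang": lang, "tld": tld}
```

The result can be unpacked into the constructor, for example `gTTS(text, slow=False, **accent_kwargs("en-uk"))`.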



# SpeechT5 (Microsoft)
[Hugging Face](https://huggingface.co/microsoft/speecht5_tts)

In [None]:
#@title SpeechT5 text-to-speech
text = "In the context of mythology, thunder and lightning are often personified as deities or kami. They are revered for their awe-inspiring and sometimes fearsome qualities. The imagery and symbolism of thunder and lightning can vary across different myths and cultural interpretations." #@param {type:"string"}
# The following pip packages need to be installed:
!pip install git+https://github.com/huggingface/transformers sentencepiece datasets soundfile

from transformers import SpeechT5Processor, SpeechT5ForTextToSpeech, SpeechT5HifiGan
from datasets import load_dataset
import torch
import soundfile as sf
from IPython.display import Audio

processor = SpeechT5Processor.from_pretrained("microsoft/speecht5_tts")
model = SpeechT5ForTextToSpeech.from_pretrained("microsoft/speecht5_tts")
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

inputs = processor(text=text, return_tensors="pt")

# load xvector containing speaker's voice characteristics from a dataset
embeddings_dataset = load_dataset("Matthijs/cmu-arctic-xvectors", split="validation")
speaker_embeddings = torch.tensor(embeddings_dataset[7306]["xvector"]).unsqueeze(0)

speech = model.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)

sf.write("speech.wav", speech.numpy(), samplerate=16000)
Audio("speech.wav")

Collecting git+https://github.com/huggingface/transformers
  Cloning https://github.com/huggingface/transformers to /tmp/pip-req-build-8sfww9dx
  Running command git clone --filter=blob:none --quiet https://github.com/huggingface/transformers /tmp/pip-req-build-8sfww9dx
  Resolved https://github.com/huggingface/transformers to commit abaca9f9432a84cfaa95531de4c72334f38a42f2
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done




# FastSpeech 2 (facebook/fastspeech2-en-ljspeech)
[Hugging Face](https://huggingface.co/facebook/fastspeech2-en-ljspeech)

In [None]:
#@title FastSpeech 2 text-to-speech
text = "In some interpretations, Tsuchi-Ikazuchi is portrayed as a figure with a powerful and imposing presence. They may be depicted with symbols associated with both earthquakes and thunder, such as cracked earth or lightning bolts. The deity's appearance can convey a sense of strength and authority, reflecting their association with these natural phenomena." #@param {type:"string"}
!pip install fairseq
!pip install g2p_en
from fairseq.checkpoint_utils import load_model_ensemble_and_task_from_hf_hub
from fairseq.models.text_to_speech.hub_interface import TTSHubInterface
import IPython.display as ipd

models, cfg, task = load_model_ensemble_and_task_from_hf_hub(
    "facebook/fastspeech2-en-ljspeech",
    arg_overrides={"vocoder": "hifigan", "fp16": False}
)
model = models[0]
TTSHubInterface.update_cfg_with_data_cfg(cfg, task.data_cfg)
generator = task.build_generator([model], cfg)

sample = TTSHubInterface.get_model_input(task, text)
wav, rate = TTSHubInterface.get_prediction(task, model, generator, sample)

ipd.Audio(wav, rate=rate)





# Bark (suno/bark)

*   [Hugging Face](https://huggingface.co/suno/bark)
*   [Speaker Library](https://suno-ai.notion.site/8b8e8749ed514b0cbf3f699013548683?v=bc67cff786b04b50b3ceb756fd05f68c)
*   [Long Form Generation](https://github.com/suno-ai/bark/blob/main/notebooks/long_form_generation.ipynb)
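
Bark generates only a short clip per call, so the long-form approach linked above amounts to splitting the text into sentence-sized pieces, generating each piece separately, and joining the audio. The sketch below covers the splitting step only. `split_into_chunks` and its `max_chars` budget are hypothetical, and the regex-based sentence split is a simplification assumed here, not the method used in the linked notebook.

```python
import re


def split_into_chunks(text, max_chars=200):
    """Group sentences into chunks of at most `max_chars` characters.

    A single sentence longer than `max_chars` is kept whole as its own chunk.
    """
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the budget: close the chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be passed to `generate_audio` with the same `history_prompt` so the voice stays consistent, and the resulting arrays joined with `numpy.concatenate` before playback.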


In [7]:
#@title Bark text-to-speech
text_prompt = "As a deity of agriculture and fertility, Harayamatsumi-no-kami is revered for their role in promoting bountiful harvests, ensuring the prosperity of farming communities, and maintaining the balance between humans and the natural world." #@param {type:"string"}
history_prompt = "v2/en_speaker_9" #@param {type:"string"}
!pip install git+https://github.com/suno-ai/bark.git
from bark import SAMPLE_RATE, generate_audio, preload_models
from IPython.display import Audio

# download and load all models
preload_models()

# generate audio from text
# text_prompt = """
#      Hello, my name is Suno. And, uh — and I like pizza. [laughs]
#      But I also have other interests such as playing tic tac toe.
# """

audio_array = generate_audio(text_prompt, history_prompt=history_prompt)

# play the generated audio in the notebook
Audio(audio_array, rate=SAMPLE_RATE)


Collecting git+https://github.com/suno-ai/bark.git
  Cloning https://github.com/suno-ai/bark.git to /tmp/pip-req-build-d1289t1o
  Running command git clone --filter=blob:none --quiet https://github.com/suno-ai/bark.git /tmp/pip-req-build-d1289t1o
  Resolved https://github.com/suno-ai/bark.git to commit 599fed040e52c89e0b3580e02e2684b2c9100701
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... done


