<a href="https://colab.research.google.com/github/AnetaKovacheva/text-to-speech/blob/main/Text_to_Speech_with_Google_TTS.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Convert Text-to-Speech with Google's TTS

In this notebook, I explore `gTTS`, a Python interface to Google's text-to-speech engine. It converts text to speech and writes the result to an MP3 file. I show a very simple example, inspired by an [article](https://towardsdatascience.com/easy-text-to-speech-with-python-bfb34250036e) in Towards Data Science.

Text-to-speech technology is software that takes text as an input and produces audible speech as an output. In other words, it goes from text to speech, making TTS one of the more aptly named technologies of the digital revolution ([ref](https://www.readspeaker.ai/blog/tts-technology/)).


The `gTTS` library has to be installed first, since it is not preinstalled in Colab.

In [2]:
!pip install gTTS

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting gTTS
  Downloading gTTS-2.2.4-py3-none-any.whl (26 kB)
Installing collected packages: gTTS
Successfully installed gTTS-2.2.4


### Imports

In [18]:
from IPython.display import Audio
from IPython.display import display

from gtts import gTTS

## 1. Get text data

I use a short and a longer text, both in English. As of August 2022, the `gTTS` [documentation](https://gtts.readthedocs.io/en/latest/module.html#module-gtts.tts) lists localized accents for English, French, Spanish, Portuguese, and Mandarin. The full set of supported languages is larger and can be listed with `gtts.lang.tts_langs()`.

In [20]:
short_text = "In a study recently published in the journal Sports Medicine, it was found that when participants went for a short stroll after a meal, even for as little as two to five minutes, their blood sugar levels rose and fell more gradually."
short_text

'In a study recently published in the journal Sports Medicine, it was found that when participants went for a short stroll after a meal, even for as little as two to five minutes, their blood sugar levels rose and fell more gradually.'

In [23]:
# Read the file with an explicit encoding and close it when done
with open("china_drought.txt", "r", encoding="utf-8") as f:
    long_text = f.read().replace("\n", " ")
long_text

'China has declared its first drought emergency of the year amid fears that a period of exceptionally hot and dry weather will lead to significant shortages of water.   The hottest, driest summer since Chinese records began 61 years ago has wilted crops and left reservoirs at half of their normal water level.   Factories in Sichuan province were shut down last week to save power for homes as air-conditioning demand surged, with temperatures as high as 45 degrees Celsius.  Meanwhile, authorities say an estimated 1 million people in rural areas will face water shortages.  The coming 10 days is a “key period of damage resistance” for southern China’s rice crop, said Agriculture Minister Tang Renjian, according to the newspaper Global Times.  Authorities will take emergency steps to “ensure the autumn grain harvest,” which is 75% of China’s annual total, Tang said Friday, according to the report.  Drought conditions across a swath of China from the densely populated east across central far
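The output above contains runs of several spaces, because each newline in the file, including those on blank lines, was replaced by a space. A minimal sketch of one way to tidy this is shown below. The helper name `normalize_whitespace` and the `sample` string are illustrative assumptions, not part of the original notebook.

```python
def normalize_whitespace(text):
    """Collapse every run of whitespace (spaces, tabs, newlines) into one space."""
    # str.split() with no arguments splits on any whitespace run and drops
    # empty pieces, so joining with a single space removes the extra gaps.
    return " ".join(text.split())


# Stand-in for the contents of a text file with blank lines between paragraphs
sample = "First paragraph of the article.\n\n\nSecond paragraph of the article."
print(normalize_whitespace(sample))
```

Applied to the notebook's data, this would be `long_text = normalize_whitespace(long_text)`. Whether the extra spaces affect the generated speech at all is not something tested here; the cleanup mainly makes the text easier to inspect.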

## 2. Define text and speech settings, and produce speech

The `gTTS` class expects the language in which it will read the text, given as an IETF language tag. The default is 'en'. Here the language is nonetheless stored in a global variable, so its value can be changed easily.


In [5]:
language = "en"

The lines below use `gTTS` to convert both texts into speech. If `slow` is set to `True`, the engine reads the text more slowly.

In [24]:
speech_short = gTTS(text = short_text, lang = language, slow = False)
speech_long = gTTS(text = long_text, lang = language, slow = False)

The `save` method writes the synthesized speech to *mp3* files.

In [25]:
speech_short.save("short_text.mp3")
speech_long.save("long_text.mp3")
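The two steps above (creating a `gTTS` object and calling `save`) could be wrapped in one function. The sketch below is only one possible arrangement: `text_to_mp3` is a hypothetical helper name, and the empty-text check is an added assumption rather than something the notebook does. Saving needs `gTTS` to be installed and an internet connection.

```python
from pathlib import Path


def text_to_mp3(text, out_path, lang="en", slow=False):
    """Convert `text` to speech and save it as an MP3 file (hypothetical helper)."""
    # Reject empty input early instead of sending a pointless request.
    if not text or not text.strip():
        raise ValueError("text must contain something to speak")

    # Imported lazily so the check above works even where gTTS is missing.
    from gtts import gTTS

    out_path = Path(out_path)
    gTTS(text=text, lang=lang, slow=slow).save(str(out_path))
    return out_path


# Example calls (left commented out because they need network access):
# text_to_mp3(short_text, "short_text.mp3")
# text_to_mp3(long_text, "long_text.mp3", slow=True)
```

Returning the path makes it easy to pass the result straight to a playback function.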


In [26]:
sound_file_short = "short_text.mp3"
sound_file_long = "long_text.mp3"

In [27]:
def play_audio(sound_file):
    # Embed an audio player for the file in the cell output and start playback
    wn = Audio(sound_file, autoplay = True)
    display(wn)

In [28]:
play_audio(sound_file_short)

In [29]:
play_audio(sound_file_long)