# Introduction

The first step of this program is to convert audio (speech) to text using the Speech to Text API provided by IBM Watson.

The next step is to translate the converted text into different languages using the Language Translator API, which also comes from IBM Watson.

Before anything else, install the ibm_watson package using the pip package installer.

In [1]:
!pip install ibm_watson



We need API keys to access IBM Cloud; without authentication we cannot use these APIs. The API endpoints depend on the location of the instance and are different for each service provided by IBM Cloud.
It is better to store both the key and the endpoint in variables.
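
One way to keep the keys out of the notebook itself is to read them from environment variables. The sketch below is only an illustration: the helper `load_credentials` and the variable names `<PREFIX>_APIKEY` and `<PREFIX>_URL` are assumptions made for this example, not names defined by IBM.

```python
import os


def load_credentials(prefix, environ=None):
    """Return (apikey, url) read from <PREFIX>_APIKEY and <PREFIX>_URL.

    The variable names are an assumption for this sketch, not an IBM convention.
    """
    env = os.environ if environ is None else environ
    try:
        return env[f"{prefix}_APIKEY"], env[f"{prefix}_URL"]
    except KeyError as missing:
        # missing.args[0] holds the name of the variable that was not found
        raise RuntimeError(f"Set the environment variable {missing.args[0]} first") from None
```

With the variables set, a call such as `iam_apikey_s2t, instance_url_s2t = load_credentials("S2T")` would fill the two variables used in the next section.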

# Convert Audio to Text

We need to import SpeechToTextV1 from ibm_watson and IAMAuthenticator from ibm_cloud_sdk_core for IBM Cloud authentication.

In [2]:
from ibm_watson import SpeechToTextV1
import json
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator 

In [3]:
# This is the url endpoint to access the instance of Speech to Text API, from IBM Watson Cloud. 
instance_url_s2t = "https://api.eu-gb.speech-to-text.watson.cloud.ibm.com/instances/0ccbd1c6-a12e-4132-888d-f68011b8a909"
# This is the API key to access the Speech to Text API
iam_apikey_s2t = "xUIbMu9Js_xIWc58LgrfkMarAm4W1zBncukijuVBcOR3"

Create a SpeechToTextV1 object with the API key, then set the API endpoint on it.

In [4]:
authenticator = IAMAuthenticator(iam_apikey_s2t)
s2t = SpeechToTextV1(authenticator=authenticator)
s2t.set_service_url(instance_url_s2t)
s2t

<ibm_watson.speech_to_text_v1_adapter.SpeechToTextV1Adapter at 0x7f812c42bb70>

Now we need to get the audio/speech file. I am using gdown to download the file from my Google Drive.

In [5]:
!pip install gdown
!gdown https://drive.google.com/uc?id=1lpBuaAkwEX8fhLJAK6jzWi5yZaTF6n0S

Downloading...
From: https://drive.google.com/uc?id=1lpBuaAkwEX8fhLJAK6jzWi5yZaTF6n0S
To: /content/audio_sample.mp3
100% 170k/170k [00:00<00:00, 64.0MB/s]


In [6]:
filename = '../content/audio_sample.mp3'

We will create a binary file object (named wav in the code, although the file is an MP3) by calling open with the mode set to 'rb'. Then the method **recognize** will return the recognized text. Here it is called with two parameters, *audio* and *content_type*.

In [7]:
with open(filename, mode='rb') as wav:
  response = s2t.recognize(audio=wav, content_type='audio/mp3')

In [8]:
response

<ibm_cloud_sdk_core.detailed_response.DetailedResponse at 0x7f812c3e2b00>

This response has a result attribute, which contains a dictionary that includes the transcription.

In [9]:
response.result

{'result_index': 0,
 'results': [{'alternatives': [{'confidence': 0.87,
     'transcript': 'this is a python program to convert audio to speech and in different language '}],
   'final': True}]}

Flatten the nested dictionary into a pandas DataFrame with json_normalize, one row per alternative. The tabular view is helpful when longer audio returns several results.

In [10]:
# json_normalize is available at the top level of pandas since version 1.0
from pandas import json_normalize
json_normalize(response.result['results'], "alternatives")

  


| | transcript | confidence |
|---|---|---|
| 0 | this is a python program to convert audio to s... | 0.87 |


In [11]:
recognized_text = response.result['results'][0]["alternatives"][0]["transcript"]
print(recognized_text)

this is a python program to convert audio to speech and in different language 
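
For longer recordings, `results` can hold more than one entry, so indexing `[0]` would return only the first part. Below is a minimal sketch for joining the top alternative of each entry, assuming the dictionary layout shown in the output above. The helper name `join_transcripts` is made up for illustration, and the second entry in `sample` is invented to show the joining.

```python
def join_transcripts(result):
    """Concatenate the first alternative of every entry in result['results']."""
    pieces = []
    for entry in result.get("results", []):
        alternatives = entry.get("alternatives", [])
        if alternatives:
            # The first alternative is used; surrounding spaces are trimmed
            pieces.append(alternatives[0]["transcript"].strip())
    return " ".join(pieces)


# Illustrative data only: same shape as response.result, contents invented
sample = {
    "result_index": 0,
    "results": [
        {"alternatives": [{"confidence": 0.87, "transcript": "this is a python program "}], "final": True},
        {"alternatives": [{"confidence": 0.90, "transcript": "to convert audio to speech "}], "final": True},
    ],
}
print(join_transcripts(sample))
```

In the notebook, `join_transcripts(response.result)` could take the place of the indexing shown above.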


# Translate to Different Language

Import LanguageTranslatorV3 from ibm_watson, and get an API key and URL from the instance's service credentials.

In [12]:
from ibm_watson import LanguageTranslatorV3

In [13]:
# This is the url endpoint to access the instance of Language Translator API, from IBM Watson Cloud. 
instance_url_lt = "https://api.eu-gb.language-translator.watson.cloud.ibm.com/instances/af85a29b-f71e-4161-90ff-35fb096479c3"
# This is the API key to access the Language Translator API
iam_apikey_lt = "yLqJCfZ8e5Kly5A3Vv4WNi5acXPDMN34CBcE8BP_25Y8"

This API needs a version date to be passed in the format "YYYY-MM-DD".
The version used here is 2018-05-01.

In [14]:
version = '2018-05-01'
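
Since the service expects the `YYYY-MM-DD` format, a quick local check can catch a typo before any request is made. This is only a sketch using the standard library. `check_version_format` is a hypothetical helper, not part of ibm_watson, and it validates the format only, not whether the service accepts that particular date.

```python
from datetime import datetime


def check_version_format(version):
    """Return version unchanged if it is a valid YYYY-MM-DD date, else raise ValueError."""
    parsed = datetime.strptime(version, "%Y-%m-%d")
    # strptime also accepts unpadded values such as '2018-5-1', so compare round-trip
    if parsed.strftime("%Y-%m-%d") != version:
        raise ValueError(f"expected YYYY-MM-DD, got {version!r}")
    return version
```

It could be used as `version = check_version_format('2018-05-01')`.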

Now create the Language Translator object.

In [15]:
authenticator = IAMAuthenticator(apikey=iam_apikey_lt)
language_translator = LanguageTranslatorV3(version=version, authenticator=authenticator)
language_translator.set_service_url(instance_url_lt)
language_translator

<ibm_watson.language_translator_v3.LanguageTranslatorV3 at 0x7f812bfb1d30>

Let's see which languages the service can identify.

In [16]:
json_normalize(language_translator.list_identifiable_languages().get_result(), "languages")

| | language | name |
|---|---|---|
| 0 | af | Afrikaans |
| 1 | ar | Arabic |
| 2 | az | Azerbaijani |
| 3 | ba | Bashkir |
| 4 | be | Belarusian |
| ... | ... | ... |
| 71 | uk | Ukrainian |
| 72 | ur | Urdu |
| 73 | vi | Vietnamese |
| 74 | zh | Simplified Chinese |


The **translate** method is called here with two parameters: the string to be translated and a model ID naming the source and target languages (for example, 'en-es' will translate English text into Spanish).
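
The model IDs used in the cells that follow have a source-target shape. Here is a small sketch of building them, assuming two-letter language codes like those in the table above. `build_model_id` is a hypothetical helper, not an ibm_watson function.

```python
def build_model_id(source, target):
    """Join two language codes into a 'source-target' model ID such as 'en-es'."""
    source, target = source.strip().lower(), target.strip().lower()
    if not source or not target:
        raise ValueError("both language codes are required")
    return f"{source}-{target}"


print(build_model_id("en", "es"))  # en-es
print(build_model_id("EN", "bn"))  # en-bn
```

The helper only formats the string; whether a model exists for a given pair depends on the service.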

**English => Spanish**

In [17]:
translating_to_spanish = language_translator.translate(text=recognized_text, model_id='en-es')

It returns a response object; calling get_result() on it gives a dictionary.

In [18]:
translating_to_spanish.get_result()

{'character_count': 78,
 'translations': [{'translation': 'este es un programa de python para convertir audio a voz y en diferentes idiomas '}],
 'word_count': 14}

In [19]:
translated_spanish_text = translating_to_spanish.get_result()['translations'][0]['translation']
translated_spanish_text

'este es un programa de python para convertir audio a voz y en diferentes idiomas '

**English => Bengali**

In [20]:
translating_to_bengali = language_translator.translate(text=recognized_text, model_id='en-bn')
translated_bengali_text = translating_to_bengali.get_result()['translations'][0]['translation']
translated_bengali_text

'এটি একটি পাইথন প্রোগ্রাম, যা অডিও ভাষাকে বক্তৃতা এবং বিভিন্ন ভাষায় রূপান্তর করা. '
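
The two translations above repeat the same steps, so they could be wrapped in one function. Below is a minimal sketch, assuming `translator` is an object like `language_translator` above whose `translate(text=..., model_id=...)` call returns a response with `get_result()`. The name `translate_text` is made up here.

```python
def translate_text(translator, text, model_id):
    """Translate text with the given model ID and return the first translation."""
    result = translator.translate(text=text, model_id=model_id).get_result()
    translations = result.get("translations", [])
    if not translations:
        raise ValueError(f"no translation returned for model {model_id!r}")
    return translations[0]["translation"]
```

For example, `translate_text(language_translator, recognized_text, 'en-es')` would stand in for cells 17 to 19.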

I learned this Python program while doing a Data Science certification course from [cognitiveclass.ai](https://glados.cognitiveclass.ai/).

**References**

1. Speech to Text [Documentation](https://cloud.ibm.com/apidocs/speech-to-text?code=python) from IBM Cloud.
2. Language Translator [Documentation](https://cloud.ibm.com/apidocs/language-translator?code=python) from IBM Cloud.


