# Automatic Transcription: A tool to convert an audio signal into text and translate it into different languages

A demo project showing how to use IBM Watson to convert audio to text and then translate that text from English into other languages.


<h2 id="ref1">Speech-to-text conversion using IBM Watson</h2>

We will need the following packages: ibm-watson, wget, and ibm-cloud-sdk-core.

In [32]:
import pandas as pd
# json_normalize is available at the top level of pandas from version 1.0 onward
from pandas import json_normalize

In [1]:
!pip install ibm_watson wget



<p>For more information on the API use this <a href="https://cloud.ibm.com/apidocs/speech-to-text?code=python">link</a>.</p>

In [3]:
from ibm_watson import SpeechToTextV1 
import json
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

The service endpoint depends on the location of the service instance, so check the service credentials to find out which URL to use. We store it in the variable <code>url</code>.
default_url = 'https://stream.watsonplatform.net/speech-to-text/api'

In [6]:
url = "https://stream.watsonplatform.net/speech-to-text/api"

We need an API key for accessing the API.

In [None]:
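# Read the key inside a context manager so the file is closed afterwards,
# and strip the trailing newline that text editors usually append
with open("C:/Users/Sheetal/Desktop/api_key.txt", "r") as file:
    apikey = file.read().strip()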
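As an alternative to a hard-coded file path, the key could be read from an environment variable first, with the text file as a fallback. The sketch below is only one way to do this; the helper name `load_api_key` and the variable name `WATSON_STT_APIKEY` are hypothetical and not part of the IBM SDK.

```python
import os
from pathlib import Path


def load_api_key(env_var, fallback_path=None):
    """Return an API key from an environment variable, else from a text file.

    Hypothetical helper for illustration; not part of ibm_watson.
    """
    key = os.environ.get(env_var, "").strip()
    if key:
        return key
    if fallback_path is not None:
        # strip() removes the trailing newline most editors add to the file
        key = Path(fallback_path).read_text(encoding="utf-8").strip()
        if key:
            return key
    raise RuntimeError(f"No API key found in {env_var} or {fallback_path}")
```

It could then be used as `apikey = load_api_key("WATSON_STT_APIKEY", "C:/Users/Sheetal/Desktop/api_key.txt")`, which keeps the key out of the notebook output.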

Next, create a speech-to-text adapter object:

In [9]:
authenticator = IAMAuthenticator(apikey)
s2t = SpeechToTextV1(authenticator=authenticator)
s2t.set_service_url(url)
s2t

<ibm_watson.speech_to_text_v1_adapter.SpeechToTextV1Adapter at 0x1a7c388fd08>

In [26]:
s2t.set_disable_ssl_verification(False)

<p>We next select an audio file. I am downloading my audio file from this <a href="https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/PY0101EN/labs/PolynomialRegressionandPipelines.mp3">link</a>.</p>

In [22]:
audiofile="C:/Users/Sheetal/Desktop/PolynomialRegressionandPipelines.mp3"

<p>We open the audio file with <code>open</code> to create the file object <code>wav</code> (the name is arbitrary; the file used here is an MP3). We set the <code>mode</code> to "rb", which is like read mode but opens the file in binary mode. We then use the method <code>recognize</code> to return the recognized text. The parameter <code>audio</code> is the file object <code>wav</code>, and the parameter <code>content_type</code> is the format of the audio file.</p>

In [27]:
from ibm_watson import ApiException

In [30]:
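try:
    # Open in binary mode; content_type must match the actual audio format
    with open(audiofile, "rb") as wav:
        response = s2t.recognize(audio=wav, content_type='audio/mp3')
except ApiException as ex:
    # On failure, response is not assigned, so later cells will not run
    print("Method failed with status code " + str(ex.code) + ": " + ex.message)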
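Because `content_type` has to match the audio format, it can be derived from the file extension instead of being typed by hand. This is a minimal sketch: the helper `guess_content_type` is hypothetical, and the mapping covers only a few common formats, so check the Speech to Text documentation for the full list your instance accepts.

```python
from pathlib import Path

# Partial, illustrative mapping from file extension to content_type string
CONTENT_TYPES = {
    ".mp3": "audio/mp3",
    ".wav": "audio/wav",
    ".flac": "audio/flac",
    ".ogg": "audio/ogg",
}


def guess_content_type(path):
    """Return the content_type string for an audio file, based on its extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in CONTENT_TYPES:
        raise ValueError(f"Unsupported audio extension: {suffix!r}")
    return CONTENT_TYPES[suffix]
```

With this, the call in the cell above could pass `content_type=guess_content_type(audiofile)`.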



In [31]:
response.result

{'results': [{'alternatives': [{'confidence': 0.93,
     'transcript': 'in this video we will cover polynomial regression and pipelines '}],
   'final': True},
  {'alternatives': [{'confidence': 0.9,
     'transcript': "what do we do when a linear model is not the best fit for our data let's look into another type of regression model the polynomial regression we transform our data into a polynomial then use linear regression to fit the parameters that we will discuss pipelines pipelines are way to simplify your code "}],
   'final': True},
  {'alternatives': [{'confidence': 0.95,
     'transcript': "polynomial regression is a special case of the general linear regression this method is beneficial for describing curvilinear relationships what is a curvilinear relationship it's what you get by squaring or setting higher order terms of the predictor variables in the model transforming the data the model can be quadratic which means the predictor variable in the model is squared we use a b

In [63]:
trial_text=response.result['results'][0]['alternatives'][0]['transcript']
trial_text

'in this video we will cover polynomial regression and pipelines '
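The same indexing can be applied to every entry in `results`. The sketch below assumes, as the cell above does, that the first alternative is the one to keep; the helper name `top_transcripts` is hypothetical, and the sample dictionary is an abbreviated copy of the output shown earlier with one made-up empty entry added to exercise the edge case.

```python
def top_transcripts(result):
    """Return the first alternative's transcript for each result, whitespace-trimmed."""
    transcripts = []
    for item in result.get("results", []):
        alternatives = item.get("alternatives") or []
        if alternatives:
            transcripts.append(alternatives[0]["transcript"].strip())
    return transcripts


# Abbreviated sample in the shape of response.result
sample = {
    "results": [
        {"alternatives": [{"confidence": 0.93,
                           "transcript": "in this video we will cover polynomial regression and pipelines "}],
         "final": True},
        # Made-up entry with no alternatives, to show that it is skipped
        {"alternatives": [], "final": True},
    ]
}

# Joining the trimmed transcripts avoids doubled spaces between segments
full_text = " ".join(top_transcripts(sample))
print(full_text)
```

Trimming each transcript before joining matters here because the service returns every transcript with a trailing space.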

We use pandas <code>json_normalize</code> to transform the JSON object into a pandas DataFrame with one row per alternative.

In [60]:
df=json_normalize(response.result['results'],"alternatives")
df

| | confidence | transcript |
|---|---|---|
| 0 | 0.93 | in this video we will cover polynomial regress... |
| 1 | 0.9 | what do we do when a linear model is not the b... |
| 2 | 0.95 | polynomial regression is a special case of the... |
| 3 | 0.95 | the model can be cubic which means the predict... |
| 4 | 0.91 | there also exists higher order polynomial regr... |
| 5 | 0.89 | let's look at an example from our data we gene... |
| 6 | 0.92 | in python we do this by using the poly fit fun... |
| 7 | 0.9 | negative one point five five seven X. one cute... |
| 8 | 0.9 | consider the feature shown here applying the m... |
| 9 | 0.89 | pipeline sequentially perform a series of tran... |


In [59]:
series=df['transcript']
audio_text=' '.join([str(elem) for elem in series])
audio_text

"in this video we will cover polynomial regression and pipelines  what do we do when a linear model is not the best fit for our data let's look into another type of regression model the polynomial regression we transform our data into a polynomial then use linear regression to fit the parameters that we will discuss pipelines pipelines are way to simplify your code  polynomial regression is a special case of the general linear regression this method is beneficial for describing curvilinear relationships what is a curvilinear relationship it's what you get by squaring or setting higher order terms of the predictor variables in the model transforming the data the model can be quadratic which means the predictor variable in the model is squared we use a bracket to indicated as an exponent this is the second order polynomial regression with a figure representing the function  the model can be cubic which means the predictor variable is cute this is the third order polynomial regression we 

<h2 id="ref1">Language Translator</h2>

<p>First we import <code>LanguageTranslatorV3</code> from <code>ibm_watson</code>. For more information on the API, click <a href="https://cloud.ibm.com/apidocs/language-translator/language-translator">here</a>.</p>

In [48]:
from ibm_watson import LanguageTranslatorV3

In [49]:
url_lt='https://gateway.watsonplatform.net/language-translator/api'

In [None]:
# Read the key inside a context manager and strip the trailing newline
with open("C:/Users/Sheetal/Desktop/apikey_lt.txt", "r") as file:
    apikey_lt = file.read().strip()

In [53]:
version_lt='2018-05-01'

In [54]:
authenticator = IAMAuthenticator(apikey_lt)
language_translator = LanguageTranslatorV3(version=version_lt,authenticator=authenticator)
language_translator.set_service_url(url_lt)
language_translator

<ibm_watson.language_translator_v3.LanguageTranslatorV3 at 0x1a7c4ddd8c8>

In [72]:
language_list=json_normalize(language_translator.list_identifiable_languages().get_result(), "languages")
language_list.head()

| | language | name |
|---|---|---|
| 0 | af | Afrikaans |
| 1 | ar | Arabic |
| 2 | az | Azerbaijani |
| 3 | ba | Bashkir |
| 4 | be | Belarusian |


Before translating the text extracted from the audio, let us work through an example to understand the Language Translator API.

In [68]:
spanish_text='Esta es una frase de prueba para probar el traductor'
es_response = language_translator.translate(text=spanish_text, model_id='es-en')
english_translation=es_response.get_result()
english_translation['translations'][0]['translation']

'This is a test sentence for testing the translator'
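The `model_id` values used in this notebook ('es-en', 'en-es', 'en-hi') follow a source-target pattern built from the language codes. A small helper can assemble the string; `build_model_id` is a hypothetical name, and the sketch only formats the identifier. It does not check that the service actually offers a model for that pair.

```python
def build_model_id(source, target):
    """Build a 'source-target' model identifier from two language codes."""
    source = source.strip().lower()
    target = target.strip().lower()
    if not source or not target:
        raise ValueError("Both source and target language codes are required")
    if source == target:
        raise ValueError("Source and target languages must differ")
    return f"{source}-{target}"


print(build_model_id("es", "en"))
```

For example, `build_model_id("en", "hi")` gives the identifier used later for the Hindi translation.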

In [61]:
translation_response = language_translator.translate(text=trial_text, model_id='en-es')
translation_response

<ibm_cloud_sdk_core.detailed_response.DetailedResponse at 0x1a7c4ddd548>

In [62]:
translation=translation_response.get_result()
translation

{'translations': [{'translation': 'en este video cubriremos la regresión polinómica y los oleoductos '}],
 'word_count': 10,
 'character_count': 64}

In [64]:
spanish_translation = translation['translations'][0]['translation']
spanish_translation

'en este video cubriremos la regresión polinómica y los oleoductos '

In [73]:
audio_response=language_translator.translate(text=audio_text,model_id='en-hi')
translation_of_text=audio_response.get_result()
translated_text=translation_of_text['translations'][0]['translation']
translated_text

'इस विडियो में हम बहुमीय रीप्रेशन और पाइपलाइनों को कवर करने के लिए क्या करते हैं जब एक रैखिक मॉडल को एक अन्य प्रकार की रीजेंट मॉडल में परिवर्तित करने के लिए सबसे अच्छा उपयुक्त नहीं है । क्या आप इस मॉडल को परिवर्तित करने के लिए प्रयोग कर रहे हैं । इस क्रिया में अधिक विविधता होती है जब एक अच्छे योग्य और तीसरे क्रम के द्वारा प्राप्त किए जाने के क्रम में अधिक विविधता होती है जब हम यह देख सकते हैं कि जब हम बहुपद के क्रम में परिवर्तन के क्रम में परिवर्तन करते हैं, जब हम यह मान लेते हैं कि सभी मामलों में सही मान लेने से, यदि आप सभी मामलों में सही मान लेते हैं तो पैरामीटर हमेशा रैखिक रूप से रैखिक होते हैं जैसे कि हम एक उदाहरण के रूप में हमारे डेटा का उत्पादन करते हैं इस उदाहरण वेफ लिए हम एक तीसरे क्रम का प्रयोग कर सकते हैं । इस उदाहरण में हम एक तीसरे क्रम का प्रयोग कर सकते हैं । इस उदाहरण के लिए हम एक तीसरा क्रम का प्रयोग कर सकते हैं । किसी भी प्रकार की दो त्रिआयामी द्वितीय क्रम के लिये अभिव्यक्ति की प्रक्रिया कुछ जटिल हो सकती है । इस तरीका को लागू करने से हम डेटा को बदल देते हैं जो हमारे मूल 
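For long transcripts it may be necessary to split the text before translating, since a single request can only carry a limited amount of text. The sketch below splits on whitespace so that each chunk stays under a byte budget; the helper `chunk_text` is hypothetical, and the budget you pass in is an assumption to check against the service's documented request limit.

```python
def chunk_text(text, max_bytes):
    """Split text on whitespace into chunks of at most max_bytes UTF-8 bytes."""
    chunks, current, size = [], [], 0
    for word in text.split():
        word_size = len(word.encode("utf-8"))
        if word_size > max_bytes:
            raise ValueError("A single word exceeds max_bytes")
        # One extra byte for the joining space when the chunk is not empty
        extra = word_size + (1 if current else 0)
        if size + extra > max_bytes:
            chunks.append(" ".join(current))
            current, size = [word], word_size
        else:
            current.append(word)
            size += extra
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each chunk could then be passed to `language_translator.translate(text=chunk, model_id='en-hi')` and the results joined. Note that chunks are translated independently, so context across a chunk boundary is lost, and since the transcript has no punctuation the split points are arbitrary.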