# Watson Speech to Text
Author: Ayush Kothule

This Jupyter Notebook uses the Python library `ibm_watson` to convert audio files into text.

### Requirements:
1. `speech_to_text_api_url`: the API URL of your IBM Watson Speech to Text service
2. `speech_to_text_api_key`: the API key of your IBM Watson Speech to Text service
3. A sample audio file, downloaded from https://raw.githubusercontent.com/akothule/Data-Science-Fundamentals/main/samples/ClimateChangeInGreatBarrierReef.flac, which we'll transcribe into text with the IBM Watson service
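
Instead of pasting the API key into a notebook cell, the two values listed above could be read from environment variables. The following is a minimal sketch under that assumption; the variable names `SPEECH_TO_TEXT_API_URL` and `SPEECH_TO_TEXT_API_KEY` and the helper `load_credentials` are hypothetical and not part of the `ibm_watson` library.

```python
import os


def load_credentials(env=os.environ):
    """Return (url, key) read from environment variables.

    The variable names below are assumptions made for this sketch.
    """
    url = env.get("SPEECH_TO_TEXT_API_URL", "")
    key = env.get("SPEECH_TO_TEXT_API_KEY", "")
    if not url or not key:
        # fail early instead of sending an empty key to the service
        raise ValueError("Speech to Text URL and API key must both be set")
    return url, key
```

If both variables are set, `speech_to_text_api_url, speech_to_text_api_key = load_credentials()` yields the same two names used in the rest of the notebook.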

In [44]:
speech_to_text_api_url = "https://api.us-south.speech-to-text.watson.cloud.ibm.com/"
speech_to_text_api_key = ""


In [45]:
# install the ibm_watson SDK (and the wget Python package)
!pip install ibm_watson wget

# download the sample audio file
!wget -O climate_change_in_great_barrier_reef.flac "https://raw.githubusercontent.com/akothule/Data-Science-Fundamentals/main/samples/ClimateChangeInGreatBarrierReef.flac"

# store the audio file name
audio_file_name = "climate_change_in_great_barrier_reef.flac"

--2021-07-06 19:09:45--  https://raw.githubusercontent.com/akothule/Data-Science-Fundamentals/main/samples/ClimateChangeInGreatBarrierReef.flac
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.109.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1409291 (1.3M) [audio/flac]
Saving to: ‘climate_change_in_great_barrier_reef.flac’


2021-07-06 19:09:46 (194 MB/s) - ‘climate_change_in_great_barrier_reef.flac’ saved [1409291/1409291]



In [46]:
# import the Watson Speech to Text client and the IAM authenticator
from ibm_watson import SpeechToTextV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

authenticator = IAMAuthenticator(speech_to_text_api_key)
speech_to_text = SpeechToTextV1(
    authenticator=authenticator
)

speech_to_text.set_service_url(speech_to_text_api_url)

In [47]:
# open the audio file and call the recognize API of the Speech to Text service
with open(audio_file_name, mode="rb") as audio_file:
    response = speech_to_text.recognize(audio=audio_file, content_type='audio/flac')

# display the JSON result returned by the service
response.result

{'result_index': 0,
 'results': [{'final': True,
   'alternatives': [{'transcript': 'climate change in the Great Barrier Reef the oceans will start to get warmer and acidify destroying the coral coral will also start to bleach the climate change will cause more extreme storms we can destroy the coral ',
     'confidence': 0.88}]}]}
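
The result above contains a single entry in `results`, but a response of this shape can hold several entries, each with its own list of `alternatives`. As a rough sketch, assuming the dictionary layout shown in the output above, a helper could collect the first alternative of every entry; the function name `best_transcripts` and the sample data are made up for illustration.

```python
def best_transcripts(result):
    """Collect (transcript, confidence) for the first alternative of each result."""
    rows = []
    for item in result.get("results", []):
        alternatives = item.get("alternatives", [])
        if not alternatives:
            continue  # skip entries without any alternative
        best = alternatives[0]
        # strip the trailing space seen in the transcript above
        rows.append((best["transcript"].strip(), best.get("confidence")))
    return rows


# made-up data in the same shape as response.result
sample_result = {
    "result_index": 0,
    "results": [
        {"final": True, "alternatives": [{"transcript": "hello world ", "confidence": 0.9}]},
        {"final": True, "alternatives": [{"transcript": "second part ", "confidence": 0.8}]},
    ],
}

full_text = " ".join(text for text, _ in best_transcripts(sample_result))
```

With the real response, `best_transcripts(response.result)` would take the place of the sample dictionary.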

In [52]:
# response.result contains the JSON object returned by the Speech to Text service.
# We'll flatten it with pandas: one row per alternative, with its transcript and confidence
from pandas import json_normalize

alternatives_df = json_normalize(response.result['results'], "alternatives")

recognized_text = response.result['results'][0]["alternatives"][0]["transcript"]
confidence = response.result['results'][0]["alternatives"][0]["confidence"]
# print the recognized text and its confidence
print("Recognized Text from Audio:", recognized_text)
print("Confidence Level:", confidence * 100, "%")


Recognized Text from Audio: climate change in the Great Barrier Reef the oceans will start to get warmer and acidify destroying the coral coral will also start to bleach the climate change will cause more extreme storms we can destroy the coral 
Confidence Level: 88.0 %
