# Speech API demo notebook
This notebook demonstrates:
* how to send speech to the Speech API and get it converted to text
* how to translate text with the Translator API
* how to send text and get a spoken version of it back
* how to play the audio from Python

Below are the packages that must be installed in your Python environment.  
You can install them either from here or from your terminal window after [activating your environment](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#activating-an-environment).

In [None]:
!pip install azure-cognitiveservices-speech
!pip install websocket-client
!pip install playsound

Import the [Azure Cognitive Services Speech SDK](https://docs.microsoft.com/en-us/azure/cognitive-services/Speech/home)

In [1]:
import azure.cognitiveservices.speech as speechsdk

Set your Speech API and Translator API keys. *(For production, you may prefer to store them in environment variables.)*

In [2]:
speech_key, service_region = "YourSubscriptionKey", "YourServiceRegion" ## "westeurope","westus" etc.

translator_subscriptionKey = "YourSubscriptionKey"
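
As the note above suggests, the keys can be read from environment variables instead of being hard-coded. The following is a minimal sketch: the variable names `SPEECH_KEY`, `SPEECH_REGION`, and `TRANSLATOR_KEY`, and the helper `read_setting`, are assumptions made for illustration and are not names the services require.

```python
import os

def read_setting(name, default=None, env=None):
    """Return an environment variable's value, or `default` if it is unset or empty."""
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    return value if value else default

# Hypothetical variable names; fall back to the placeholders used above.
speech_key = read_setting("SPEECH_KEY", "YourSubscriptionKey")
service_region = read_setting("SPEECH_REGION", "YourServiceRegion")
translator_subscriptionKey = read_setting("TRANSLATOR_KEY", "YourSubscriptionKey")
```

Passing a plain dictionary as `env` makes the helper easy to try out without touching the real environment.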

## Speech To Text Part:

The language is specified in BCP-47 format. Language codes can be found [here](https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/language-support). 

In [3]:
# STT settings
speech_language="tr-TR" #Turkish
#speech_language="en-US"
#speech_language="ar-EG" #Egyptian Arabic
#speech_language="ar-SA" #Saudi Arabic #not supported yet

The voice codes can be found [here](https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/language-support#standard-voices).

In [4]:
# TTS settings
voice_language="Microsoft Server Speech Text to Speech Voice (tr-TR, SedaRUS)" #Turkish
#voice_language="Microsoft Server Speech Text to Speech Voice (en-US, Guy24KRUS)" #English
#voice_language="Microsoft Server Speech Text to Speech Voice (ar-EG, Hoda)"#Egyptian Arabic
#voice_language="Microsoft Server Speech Text to Speech Voice (ar-SA, Naayf)"#Saudi Arabic


The following code snippet shows how speech can be recognized from audio input from the default microphone (make sure the audio settings are correct), and how to interpret the results.

In [5]:
speech_config = speechsdk.SpeechConfig(subscription=speech_key, 
                                       region=service_region,
                                       speech_recognition_language=speech_language)

speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config)

**While the line below is running, speak or play the sentences you want transcribed to text.**

In [8]:
result = speech_recognizer.recognize_once()

Interpret the result returned by the API:

In [9]:
if result.reason == speechsdk.ResultReason.RecognizedSpeech:
    print("Recognized: {}".format(result.text))
    input_text = result.text
elif result.reason == speechsdk.ResultReason.NoMatch:
    print("No speech could be recognized: {}".format(result.no_match_details))
elif result.reason == speechsdk.ResultReason.Canceled:
    cancellation_details = result.cancellation_details
    print("Speech Recognition canceled: {}".format(cancellation_details.reason))
    if cancellation_details.reason == speechsdk.CancellationReason.Error:
        print("Error details: {}".format(cancellation_details.error_details))

Recognized: Bu akşam bir yerlere mi gitsek diyorum


## Translate The Text

In [10]:
import os, requests, uuid, json

Finalize the remaining Translator API parameters: the **subscription key** (already set in the key definition part), the **API URL**, the **source and target languages** *(if you do not specify the source language, the service detects it automatically)*, and the **header information**:

In [11]:
translator_base_url = 'https://api.cognitive.microsofttranslator.com/translate?api-version=3.0'
translator_params = '&to=en' #translate to english 
translator_constructed_url = translator_base_url + translator_params
translator_headers = {
    'Ocp-Apim-Subscription-Key': translator_subscriptionKey,
    'Content-type': 'application/json',
    'X-ClientTraceId': str(uuid.uuid4())
}
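
The query string can also be assembled with `urllib.parse.urlencode`, which makes it straightforward to add the optional `from` parameter when you do not want to rely on language detection. This is a minimal sketch; `build_translate_url` is a hypothetical helper written for this notebook, not part of any SDK.

```python
from urllib.parse import urlencode

TRANSLATE_ENDPOINT = 'https://api.cognitive.microsofttranslator.com/translate'

def build_translate_url(to_lang, from_lang=None):
    """Build the Translator v3.0 URL; omit `from_lang` to let the service detect it."""
    params = [('api-version', '3.0')]
    if from_lang:
        params.append(('from', from_lang))
    params.append(('to', to_lang))
    return TRANSLATE_ENDPOINT + '?' + urlencode(params)

print(build_translate_url('en'))                  # source language detected by the service
print(build_translate_url('en', from_lang='tr'))  # source language fixed to Turkish
```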

Construct the Translator REST API query and send it:

In [12]:
body = [{ 'text' : input_text }]

translator_request = requests.post(translator_constructed_url, headers=translator_headers, json=body)

translated_input = translator_request.json()
print(translated_input)

[{'detectedLanguage': {'language': 'tr', 'score': 1.0}, 'translations': [{'text': "I'm saying we should go somewhere tonight.", 'to': 'en'}]}]
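
The response is a list with one entry per input text, so the translated string has to be picked out of the nested structure. The sketch below assumes a successful response with the shape printed above; `first_translation` is a hypothetical helper name, and error responses are not handled.

```python
def first_translation(response_json):
    """Return (detected_language, translated_text) for the first input text."""
    item = response_json[0]
    # 'detectedLanguage' is only present when the source language was not specified.
    detected = item.get('detectedLanguage', {}).get('language')
    text = item['translations'][0]['text']
    return detected, text

# The response printed above, reused as sample data.
sample = [{'detectedLanguage': {'language': 'tr', 'score': 1.0},
           'translations': [{'text': "I'm saying we should go somewhere tonight.", 'to': 'en'}]}]
print(first_translation(sample))
```

In the notebook itself you would call `first_translation(translated_input)`.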


## Text To Speech Part:

<span style="color:red">**If you want to use the input you gave via the microphone, do not run the two cells below:** </span>

<span style="color:blue">**If you want to give another input to the TTS, run the two cells below:** </span>

<span style="color:red">**------------------------------------------------------------------------------------------------------**</span>

In [None]:
try: input = raw_input
except NameError: pass

In [None]:
input_text = input("What would you like to convert to speech: ")

<span style="color:red">**------------------------------------------------------------------------------------------------------**</span>

Import the necessary packages to build the required XML format for the REST query.

In [13]:
import os, requests, time
from xml.etree import ElementTree

Here we use the [TTS REST API](https://docs.microsoft.com/en-us/azure/cognitive-services/Speech/API-Reference-REST/BingVoiceOutput) with Python.  
`speech_key` is your unique key from the Azure portal.  
The text-to-speech REST API requires an access token for authentication, so the code proceeds in three steps:  
first it exchanges your Speech Service subscription key for an access token using the issueToken endpoint,  
then it builds the required headers for TTS and sets the XML structure of the REST call,  
and finally it calls the text-to-speech API.

After calling the API we will write the audio in the response to a file.

In [14]:
fetch_token_url = "https://"+service_region+".api.cognitive.microsoft.com/sts/v1.0/issueToken"
token_headers = {'Ocp-Apim-Subscription-Key': speech_key}
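
The token exchange is a POST request with an empty body, and it is worth failing early if that request is rejected, for example because of a wrong key or region. The following sketch uses only the standard library `urllib.request` rather than `requests`, so that it runs without extra packages. `build_token_request` and `fetch_token` are hypothetical helper names, and the 10-second timeout is an arbitrary choice.

```python
import urllib.request

def build_token_request(region, key):
    """Prepare (but do not send) the POST request for the issueToken endpoint."""
    url = "https://" + region + ".api.cognitive.microsoft.com/sts/v1.0/issueToken"
    return urllib.request.Request(
        url, data=b"", headers={'Ocp-Apim-Subscription-Key': key}, method="POST")

def fetch_token(region, key, timeout=10):
    """Send the request and return the token text; urlopen raises HTTPError on 4xx/5xx."""
    request = build_token_request(region, key)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read().decode("utf-8")

# Not called here because it needs a real key, for example:
# access_token = fetch_token(service_region, speech_key)
```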

In [15]:
access_token = str(requests.post(fetch_token_url, headers=token_headers).text)

constructed_url = "https://"+service_region+".tts.speech.microsoft.com/cognitiveservices/v1"

headers = {
            'Authorization': 'Bearer ' + access_token,
            'Content-Type': 'application/ssml+xml',
            'X-Microsoft-OutputFormat': 'riff-24khz-16bit-mono-pcm',
            'User-Agent': 'TurkcellSSTTSSDemo'  # Application name: required, must be fewer than 255 characters.
        }
xml_body = ElementTree.Element('speak', version='1.0')
xml_body.set('{http://www.w3.org/XML/1998/namespace}lang', speech_language)
voice = ElementTree.SubElement(xml_body, 'voice')
voice.set('{http://www.w3.org/XML/1998/namespace}lang', speech_language)
voice.set('name', voice_language)


voice.text = input_text

body = ElementTree.tostring(xml_body)
response = requests.post(constructed_url, headers=headers, data=body)

timestr = time.strftime("%Y%m%d-%H%M")

if response.status_code == 200:
    audio_file = 'sample-' + timestr + '.wav'
    # Write the binary audio returned by the service to a .wav file.
    with open(audio_file, 'wb') as audio:
        audio.write(response.content)
    print("\nStatus code: " + str(response.status_code) + "\nYour TTS is ready for playback.\n")
else:
    print("\nStatus code: " + str(response.status_code) + "\nSomething went wrong. Check your subscription key and headers.\n")
   


Status code: 200
Your TTS is ready for playback.
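
To see what is actually sent to the service, the SSML construction from the cell above can be wrapped in a small function and its output printed. This is a sketch of the same `ElementTree` calls; `build_ssml` and `XML_LANG` are hypothetical names, and the sample text was chosen to show that special characters are escaped during serialization.

```python
from xml.etree import ElementTree

XML_LANG = '{http://www.w3.org/XML/1998/namespace}lang'

def build_ssml(text, language, voice_name):
    """Return the SSML request body (bytes) for one voice and one piece of text."""
    speak = ElementTree.Element('speak', version='1.0')
    speak.set(XML_LANG, language)
    voice = ElementTree.SubElement(speak, 'voice')
    voice.set(XML_LANG, language)
    voice.set('name', voice_name)
    voice.text = text  # ElementTree escapes &, < and > when serializing
    return ElementTree.tostring(speak)

ssml = build_ssml("Tom & Jerry", "en-US",
                  "Microsoft Server Speech Text to Speech Voice (en-US, Guy24KRUS)")
print(ssml.decode('utf-8'))
```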



Below you can play the audio file that was returned by the service and saved to disk.

In [16]:
from playsound import playsound
playsound(audio_file)
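
Before playback it can be useful to confirm that the saved file matches the requested `riff-24khz-16bit-mono-pcm` format. The sketch below reads the WAV header with the standard library `wave` module; `wav_info` is a hypothetical helper, and it assumes the file has a well-formed header. It is demonstrated on a generated one-second silent file so that it runs without calling the service.

```python
import os
import tempfile
import wave

def wav_info(path):
    """Return (channels, sample_rate_hz, duration_seconds) of a PCM WAV file."""
    with wave.open(path, 'rb') as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return wav.getnchannels(), rate, frames / float(rate)

# Demo on a generated silent file (a stand-in for `audio_file`).
demo_path = os.path.join(tempfile.gettempdir(), 'silence-demo.wav')
with wave.open(demo_path, 'wb') as out:
    out.setnchannels(1)       # mono
    out.setsampwidth(2)       # 16-bit samples
    out.setframerate(24000)   # 24 kHz
    out.writeframes(b'\x00\x00' * 24000)  # one second of silence

info = wav_info(demo_path)
print(info)
```

In the notebook you would call `wav_info(audio_file)` on the file written above.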