# Cognitive Services Speech APIs

## Speech Service

### Text to speech (speech synthesis)

The Text to Speech API uses REST to convert plain text or SSML into an audio stream, with a choice of voices and languages. SSML tags additionally let you adjust audio characteristics such as pronunciation, volume, and pitch.

**For this code to run, you need to create a valid subscription key.
If you want to create a test key for free, have a look [here](https://azure.microsoft.com/en-us/try/cognitive-services/my-apis/).**

In [None]:
import requests
import json
import IPython.display as ipd

api_key = "" # PASTE SPEECH SERVICE API KEY HERE

url = "https://northeurope.tts.speech.microsoft.com/cognitiveservices/v1"
token_url = "https://northeurope.api.cognitive.microsoft.com/sts/v1.0/issueToken"

headers = {'Ocp-Apim-Subscription-Key': api_key}
response = requests.post(token_url, headers=headers)
token = response.text

print("Token: " + token)
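The token request above does not check whether the call succeeded, so an invalid key would leave an error body in `token` without any warning. A minimal sketch of a more defensive helper follows. The names `fetch_token` and `bearer_header`, the `region` parameter, and the injectable `post` argument are illustrative assumptions and not part of the Speech Service API.

```python
def fetch_token(api_key, region="northeurope", post=None):
    """Exchange a subscription key for a short-lived access token.

    `post` is a hypothetical hook so the function can be exercised without
    network access; by default it falls back to requests.post.
    """
    if not api_key:
        raise ValueError("api_key is empty; paste your Speech Service key first")
    if post is None:
        import requests  # imported lazily, only needed for real calls
        post = requests.post

    token_url = "https://{}.api.cognitive.microsoft.com/sts/v1.0/issueToken".format(region)
    response = post(token_url, headers={"Ocp-Apim-Subscription-Key": api_key})
    response.raise_for_status()  # fail loudly on HTTP errors such as 401
    return response.text.strip()


def bearer_header(token):
    """Build the Authorization header in the 'Bearer <token>' form."""
    return {"Authorization": "Bearer " + token}
```

In the notebook this could be used as `token = fetch_token(api_key)`, with `bearer_header(token)` merged into the request headers of the later cells.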

In [None]:
headers = {'Authorization': 'Bearer ' + token,  # the access token is sent as a Bearer token
           'Content-Type': 'application/ssml+xml',
           'User-Agent': 'Test',
           'X-Microsoft-OutputFormat': 'riff-16khz-16bit-mono-pcm'}

data = "<speak version='1.0' xmlns='http://www.w3.org/2001/10/synthesis' xml:lang='de-DE'> \
<voice name='Microsoft Server Speech Text to Speech Voice (de-DE, HeddaRUS)'> \
    Hallo beisammen, das ist der neue Service fuer Sprachausgabe!  \
</voice></speak>"

response = requests.post(url, headers=headers, data=data)
audio_data = response.content

print(response.headers)

In [None]:
with open("test.wav", "wb") as f: 
    f.write(audio_data)
    
ipd.Audio('test.wav')
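The SSML payload above is assembled by hand as a string literal, which breaks as soon as the text contains characters such as `&` or `<`. As a rough sketch, the same document can be built with the standard library so that the text is escaped and the SSML `prosody` attributes (for example `rate`, `pitch`, `volume`) can be set. The helper name `build_ssml` and its parameters are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_ssml(text, voice, lang="de-DE", **prosody):
    """Return an SSML document as a string (hypothetical helper).

    Keyword arguments such as rate="slow" or pitch="+2st" are written as
    attributes of a <prosody> element wrapped around the text.
    """
    # Serialize the SSML namespace as the default namespace (no prefix).
    # Note: this registration is global to ElementTree.
    ET.register_namespace("", SSML_NS)

    speak = ET.Element("{%s}speak" % SSML_NS,
                       {"version": "1.0", "{%s}lang" % XML_NS: lang})
    voice_el = ET.SubElement(speak, "{%s}voice" % SSML_NS, {"name": voice})

    target = voice_el
    if prosody:
        attrs = {key: str(value) for key, value in prosody.items()}
        target = ET.SubElement(voice_el, "{%s}prosody" % SSML_NS, attrs)

    target.text = text  # ElementTree escapes &, < and > on serialization
    return ET.tostring(speak, encoding="unicode")
```

A call such as `build_ssml("Hallo beisammen!", "Microsoft Server Speech Text to Speech Voice (de-DE, HeddaRUS)", rate="slow")` would produce a string that can be passed as `data` (encoded as UTF-8) to the same `requests.post` call.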

### Speech to text (speech recognition)

The Microsoft speech recognition API transcribes audio streams into text that your application can display to the user or act upon as command input. Developers can add speech recognition to their apps in two ways: REST APIs or WebSocket-based client libraries.

In [None]:
import requests
import json

api_key = "" # PASTE SPEECH SERVICE API KEY HERE

token_url = "https://northeurope.api.cognitive.microsoft.com/sts/v1.0/issueToken"
headers = {'Ocp-Apim-Subscription-Key': api_key}
response = requests.post(token_url, headers=headers)
token = response.text

print("Token: " + token)

url = "https://northeurope.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1"

headers = {'Authorization': 'Bearer ' + token,
           'Accept': 'application/json',
           'Ocp-Apim-Subscription-Key': api_key,
           'Content-Type': 'audio/wav; codec=audio/pcm; samplerate=16000'}

params = {'language': 'de-DE', 'format': 'detailed'}


# Read the WAV file written by the earlier cell as raw bytes
with open("test.wav", 'rb') as f:
    data = f.read()

response = requests.post(url, headers=headers, params=params, data=data)
print(response)
print(json.dumps(response.json(), indent=2))
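With `format=detailed`, the JSON printed above typically carries a `RecognitionStatus` field and an `NBest` list of candidate transcriptions with confidence scores. Below is a small sketch for pulling out the most likely transcript. The field names are assumed from the service's detailed output format and should be checked against the response you actually receive. `best_transcript` is a hypothetical helper name, and the sample payload is hand-written, not real service output.

```python
def best_transcript(result):
    """Return (display_text, confidence) of the top candidate, or None."""
    if result.get("RecognitionStatus") != "Success":
        return None
    candidates = result.get("NBest") or []
    if not candidates:
        return None
    # Pick the candidate with the highest confidence score
    top = max(candidates, key=lambda c: c.get("Confidence", 0.0))
    return top.get("Display", ""), top.get("Confidence")


# Hand-written example payload, only meant to illustrate the assumed shape
sample = {
    "RecognitionStatus": "Success",
    "NBest": [
        {"Confidence": 0.41, "Display": "Hallo zusammen."},
        {"Confidence": 0.93, "Display": "Hallo beisammen."},
    ],
}

print(best_transcript(sample))
```

In the notebook, `best_transcript(response.json())` would be applied to the response from the cell above.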

## Bing Speech API

Use the new Speech Service APIs and SDKs instead of the Bing Speech API. :)