
Azure Speech to Text

This MLHub package provides a quick demonstration of the pre-built Speech to Text model provided through Azure's Cognitive Services. The service accepts an audio signal and returns its transcription as text.

An Azure subscription is required. A free Azure subscription allowing up to 5,000 transactions per month is available. Once it is set up, visit the Azure portal and choose Create a resource, then under AI and Machine Learning select Speech Services. Once the resource is created, you can access the web API subscription key from the portal by visiting the resource and choosing the Keys link. The demo will prompt for the key.

Please note that this is closed-source software, which limits your freedoms and has no guarantee of ongoing availability.

Visit the GitHub repository for more details.

The Python code is based on the Azure Speech Services Quick Start for Python.
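
For readers curious what a quick-start style recognition call looks like, here is a minimal sketch. It is not the code shipped in this package. It assumes the Azure Speech SDK for Python (the `azure-cognitiveservices-speech` package) is installed and that a working microphone is available. The function names `transcribe_once` and `format_recognized` are hypothetical helpers introduced only for illustration.

```python
import textwrap


def transcribe_once(key: str, region: str) -> str:
    """Listen on the default microphone and return one recognized utterance.

    Assumption: the azure-cognitiveservices-speech package is installed.
    It is imported lazily so this module still loads without it.
    """
    import azure.cognitiveservices.speech as speechsdk

    # Configure the service with the subscription key and region
    # obtained from the Keys link of the Speech Services resource.
    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config)

    # Block until a single utterance has been captured and transcribed.
    result = recognizer.recognize_once()

    if result.reason == speechsdk.ResultReason.RecognizedSpeech:
        return result.text
    if result.reason == speechsdk.ResultReason.NoMatch:
        return ""  # audio was captured but no speech was recognized
    raise RuntimeError(f"Recognition did not complete: {result.reason}")


def format_recognized(text: str, width: int = 70) -> str:
    """Wrap a transcript in a quoted style similar to the demo output.

    Hypothetical helper: the layout imitates the sample session,
    not the package's actual formatting code.
    """
    return textwrap.fill(
        text,
        width=width,
        initial_indent="> Recognized: ",
        subsequent_indent="> ",
    )
```

Calling `print(format_recognized(transcribe_once(key, region)))` with your own key and region would listen for one utterance and print it in the quoted style. Keeping the SDK call separate from the formatting makes the formatting easy to check without a microphone or network access.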


  • To install mlhub:
$ pip3 install mlhub
  • To install and run the pre-built model:
$ ml install   azspeech2txt
$ ml configure azspeech2txt
$ ml do        azspeech2txt

Interactive Use

$ ml do azspeech2txt 
Speech to Text

Welcome to a demo of the pre-built models for Speech to Text provided
through Azure's Cognitive Services. This cloud service accepts audio
and then converts that into text which it returns locally.

The following file has been found and is assumed to contain
an Azure Speech Services subscription key and region. We will load 
the file and use this information.


Say something...

> Recognized: Welcome to a demo of the prebuilt models for speech to
> text provided through azure's cognitive services. This cloud service 
> accepts audio and then converts that into text, which it returns locally.

Thank you for exploring the 'azspeech2txt' model.

As you can see, I read the first paragraph from the screen and the Azure Speech to Text service transcribed it accurately, differing from the original only in capitalization, punctuation, and hyphenation ("prebuilt" for "pre-built"). The service is well suited, for example, to use as a dictation tool.
