How to use the Speech Services Batch Transcription API from Python

Download and install the API client library

To run the sample, you first need to generate the Python client library for the Speech Services REST API from its Swagger specification.

Follow these steps for the installation:

  1. Go to https://editor.swagger.io.
  2. Click File, then click Import URL.
  3. Enter the Swagger URL for the Speech Services API: https://raw.githubusercontent.com/Azure/azure-rest-api-specs/main/specification/cognitiveservices/data-plane/Speech/SpeechToText/stable/v3.1/speechtotext.json.
  4. Click Generate Client and select Python.
  5. Save the client library.
  6. Extract the downloaded python-client-generated.zip somewhere in your file system.
  7. Install the extracted python-client module in your Python environment using pip: pip install path/to/package/python-client.
  8. The installed package has the name swagger_client. You can check that the installation worked using the command python -c "import swagger_client".
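
Before moving on, it can help to sanity-check the generated package. The snippet below is a minimal, hedged check: it only imports the package and lists what it exposes, because the exact API and model class names depend on the Swagger Codegen version that produced the client.

    # Hedged check: confirm the generated package imports and see which API and
    # model classes your Swagger Codegen version produced (names can vary).
    import swagger_client

    print([name for name in dir(swagger_client) if not name.startswith("_")])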

Install other dependencies

The sample uses the requests library. You can install it with the command

pip install requests
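
In the sample, requests is used for plain HTTP work, for example downloading transcription result files once a transcription has finished. As a minimal, hedged illustration (the result-file URL below is a placeholder; the real one is returned by the service in the transcription's files listing):

    # Hedged sketch: download one transcription result file with requests.
    # RESULT_FILE_URL is a placeholder; the real URL comes from the service's
    # files listing after the transcription has succeeded.
    import requests

    RESULT_FILE_URL = "https://example.blob.core.windows.net/results/contenturl_0.json"

    response = requests.get(RESULT_FILE_URL, timeout=30)
    response.raise_for_status()
    print(response.json())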

Run the sample code

The sample code itself is main.py and can be run using Python 3.7 or higher. You will need to adapt the following information to run the sample (a hedged configuration sketch appears at the end of this section):

  1. Your Cognitive Services subscription key and region.

    Some notes:

    • You can get the subscription key from the "Keys and Endpoint" tab on your Cognitive Services or Speech resource in the Azure Portal.
    • Batch transcription is only available for paid subscriptions; free subscriptions are not supported.
    • Refer to the Speech service regions documentation for a complete list of region identifiers in the expected format.
  2. The URI of an audio recording in blob storage. Refer to the Azure Storage documentation for information on how to authorize access to blob storage; a hedged SAS example follows this list.

  3. (Optional) The model ID of an adapted model, if you want to use a custom model.

  4. (Optional) The URI of a container with audio files, if you want to transcribe all of them with a single request.
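
One way to obtain a readable recording URI is a shared access signature (SAS) URL. The sketch below is a hedged example using the azure-storage-blob package (installed separately with pip install azure-storage-blob); the account, container, blob, and key values are placeholders, and the other authorization methods described in the Azure Storage documentation work just as well.

    # Hedged sketch: build a read-only SAS URL for a single recording blob.
    # All account/container/blob/key values are placeholders.
    from datetime import datetime, timedelta

    from azure.storage.blob import BlobSasPermissions, generate_blob_sas

    account_name = "yourstorageaccount"
    container_name = "recordings"
    blob_name = "sample.wav"
    account_key = "YOUR_STORAGE_ACCOUNT_KEY"

    sas_token = generate_blob_sas(
        account_name=account_name,
        container_name=container_name,
        blob_name=blob_name,
        account_key=account_key,
        permission=BlobSasPermissions(read=True),
        expiry=datetime.utcnow() + timedelta(hours=2),
    )

    recording_uri = (
        f"https://{account_name}.blob.core.windows.net/"
        f"{container_name}/{blob_name}?{sas_token}"
    )
    print(recording_uri)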

You can use a development environment like Visual Studio Code to edit, debug, and execute the sample.
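
To make these inputs concrete, here is a minimal, hedged sketch. The sample itself goes through the generated swagger_client, but the same values map directly onto the underlying Batch Transcription v3.1 REST API, shown here with requests; the subscription key, region, blob URI, model ID, and display name are all placeholders you need to replace.

    # Hedged sketch: create a batch transcription via the v3.1 REST API and poll
    # until it finishes. All values below are placeholders.
    import time

    import requests

    SUBSCRIPTION_KEY = "YOUR_SUBSCRIPTION_KEY"
    SERVICE_REGION = "westus"  # your region identifier
    RECORDINGS_BLOB_URI = "https://example.blob.core.windows.net/recordings/sample.wav?SAS_TOKEN"

    endpoint = f"https://{SERVICE_REGION}.api.cognitive.microsoft.com/speechtotext/v3.1/transcriptions"
    headers = {
        "Ocp-Apim-Subscription-Key": SUBSCRIPTION_KEY,
        "Content-Type": "application/json",
    }
    body = {
        "displayName": "Sample batch transcription",
        "locale": "en-US",
        "contentUrls": [RECORDINGS_BLOB_URI],
        # Optional: use an adapted model instead of the base model, e.g.
        # "model": {"self": f"https://{SERVICE_REGION}.api.cognitive.microsoft.com/speechtotext/v3.1/models/YOUR_MODEL_ID"},
        # Optional: transcribe a whole container instead of single files, e.g.
        # "contentContainerUrl": "https://example.blob.core.windows.net/recordings?SAS_TOKEN",
        "properties": {"wordLevelTimestampsEnabled": True},
    }

    # Create the transcription, then poll its status until it finishes.
    create_response = requests.post(endpoint, headers=headers, json=body, timeout=30)
    create_response.raise_for_status()
    transcription_url = create_response.json()["self"]

    while True:
        status = requests.get(transcription_url, headers=headers, timeout=30).json()["status"]
        if status in ("Succeeded", "Failed"):
            break
        time.sleep(10)

    print("Transcription finished with status:", status)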