brennv/assemblyai-python-sdk

assemblyai

Transcribe audio into text. Recognize made-up words and boost accuracy using custom language models.

Getting started

Install the package with pip and get an API token from https://assemblyai.com

pip install assemblyai

Quickstart

Start transcribing:

import assemblyai

aai = assemblyai.Client(token='your-secret-api-token')

transcript = aai.transcribe('https://example.com/sample.wav')

Poll until the transcript is complete. Transcription takes about half the duration of the audio.

while transcript.status != 'completed':
    transcript = transcript.get()

text = transcript.text
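
The loop above requests a fresh transcript as fast as it can and never exits if the status never becomes 'completed'. A minimal sketch of a gentler polling helper follows. It assumes only what the examples in this README show: a `status` attribute and a `get()` method that returns a refreshed object. The name `wait_for` and its `interval`, `timeout`, and `sleep` parameters are hypothetical and not part of the assemblyai package.

```python
import time


def wait_for(resource, done_status, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll resource.get() until resource.status equals done_status.

    Hypothetical helper, not part of the assemblyai package. It relies only
    on a `status` attribute and a `get()` method returning a refreshed object.
    """
    deadline = time.monotonic() + timeout
    while resource.status != done_status:
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"status is still {resource.status!r} after {timeout} seconds"
            )
        sleep(interval)  # pause so the loop does not flood the API with requests
        resource = resource.get()
    return resource
```

With this helper, the quickstart could read `transcript = wait_for(aai.transcribe('https://example.com/sample.wav'), 'completed')`. The same call with `'trained'` would cover model training.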

Custom models

The quickstart example transcribes audio using a generic English model.

To keep accuracy high on audio with unusual vocabulary, create a custom model.

For this example, we create a model from a list of words and sentences found on a Wikipedia page.

Create the custom model.

import assemblyai
import wikipedia

aai = assemblyai.Client(token='your-secret-api-token')

phrases = wikipedia.page("List of Pokemon characters").content.split('. ')

model = aai.train(phrases)
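
Splitting page text on '. ' can leave empty strings, stray whitespace, and repeated sentences in `phrases`. A minimal sketch of a cleanup step that could run before training is shown below. The `clean_phrases` function and its `min_words` parameter are hypothetical, and whether the API benefits from this filtering is an assumption.

```python
def clean_phrases(text, min_words=2):
    """Split text into sentence-like phrases, dropping blanks and duplicates."""
    seen = set()
    phrases = []
    # Treat newlines as sentence breaks, then split the same way as above.
    for raw in text.replace('\n', '. ').split('. '):
        # Collapse runs of whitespace and trim leftover periods.
        phrase = ' '.join(raw.split()).strip('.')
        key = phrase.lower()
        if len(phrase.split()) < min_words or key in seen:
            continue  # skip fragments that are too short or already seen
        seen.add(key)
        phrases.append(phrase)
    return phrases


# Made-up sample text for illustration only.
sample = (
    "Pikachu is an Electric type.\n\n"
    "Pikachu is an Electric type. Bulbasaur evolves into Ivysaur. "
)
print(clean_phrases(sample))
```

The cleaned list could then be passed to `aai.train(...)` in place of the raw split.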

Check that the model has finished training. Models take about six minutes to train.

while model.status != 'trained':
    model = model.get()

Reference the model when creating a transcript.

transcript = aai.transcribe('https://example.com/pokemon.wav', model=model)

Model and Transcript attributes

Existing models and transcripts can be retrieved by ID.

model = aai.model.get(id=<id>)
transcript = aai.transcript.get(id=<id>)

To inspect additional attributes, use props():

>>> model.props()
['headers',
 'id',
 'status',
 'name',
 'phrases',
 'closed_domain',
 'warning',
 'dict']

>>> transcript.props()
['headers',
 'id',
 'audio_url',
 'model',
 'status',
 'warning',
 'text',
 'text_raw',
 'confidence',
 'segments',
 'speaker_count',
 'dict']

The dict attribute contains the raw API response:

model.dict
transcript.dict
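
Because `dict` is described as the raw API response, it can be reduced to a compact summary for logging. The sketch below uses a made-up sample whose keys are borrowed from the props() listing above. The actual response shape may differ, and `summarize` is a hypothetical helper.

```python
import json

# Hypothetical sample shaped like transcript.dict; the keys come from the
# props() listing above and the values are made up.
raw = {'id': 123, 'status': 'completed', 'text': 'hello world', 'confidence': 0.93}


def summarize(raw, keys=('id', 'status', 'confidence')):
    """Return a compact JSON string containing only the selected keys."""
    return json.dumps({k: raw.get(k) for k in keys}, sort_keys=True)


print(summarize(raw))
```

Keys missing from the response appear as `null` rather than raising an error, which keeps the summary usable while a transcript is still in progress.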

For additional background see: https://docs.assemblyai.com
