Ara 🦜

Overview

Ara is a script and API for transcribing ✍️ and diarizing 📓 audio. The typical use case is transcribing interviews, podcasts, and any other recording where multiple people speak. The output is 'easy' to read (if you like .txt files) and formatted so that the speaker of each segment is clear.

It uses Whisper to transcribe the audio into text, then Pyannote to diarize the speakers. Finally, it matches the segments from the two models and writes the output to a file or returns it through the API.
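The matching step can be illustrated with a small sketch. This is not the code from the repo: it assumes that the transcription yields timed text segments and the diarization yields timed speaker turns, and it labels each text segment with the speaker whose turn overlaps it the most. The names `TextSegment`, `SpeakerTurn`, `assign_speakers` and `format_transcript` are hypothetical, as is the output layout.

```python
from dataclasses import dataclass


@dataclass
class TextSegment:
    """A transcribed chunk of audio (hypothetical shape, times in seconds)."""
    start: float
    end: float
    text: str


@dataclass
class SpeakerTurn:
    """A diarized speaker turn (hypothetical shape, times in seconds)."""
    start: float
    end: float
    speaker: str


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns, unknown="UNKNOWN"):
    """Pair each text segment with the speaker whose turn overlaps it most.

    Segments that overlap no turn at all get the `unknown` label.
    """
    labelled = []
    for seg in segments:
        best_speaker, best_overlap = unknown, 0.0
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > best_overlap:
                best_speaker, best_overlap = turn.speaker, shared
        labelled.append((best_speaker, seg))
    return labelled


def format_transcript(labelled):
    """Render one line per segment with its time span and speaker label."""
    return "\n".join(
        f"[{seg.start:.1f}s - {seg.end:.1f}s] {speaker}: {seg.text.strip()}"
        for speaker, seg in labelled
    )
```

Maximum overlap is only one possible rule. It is simple and tolerates small timing disagreements between the two models, but a segment that spans two speakers is still assigned entirely to one of them.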

Usage

Script

Call the script like so:

python script.py -i input.wav -o output.txt -l English

FastAPI

main.py defines a basic FastAPI app with an endpoint for transcription. Start the server:

uvicorn main:app --reload

Then query the endpoint:

curl 127.0.0.1:8000/transcribe/sample_data.interview.wav

The API is useful when interacting with Ara through Docker or when deploying the code.
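The same request can be made from Python. The following is a minimal sketch using only the standard library. It assumes the server is running locally on uvicorn's default port, as in the curl example, and it makes no assumption about the response format: the body is parsed as JSON when possible and returned as plain text otherwise. The helper names `transcribe_url` and `request_transcript` are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def transcribe_url(filename, base_url="http://127.0.0.1:8000"):
    """Build the /transcribe/<filename> URL, percent-encoding the file name."""
    return f"{base_url.rstrip('/')}/transcribe/{quote(filename, safe='')}"


def request_transcript(filename, base_url="http://127.0.0.1:8000", timeout=600):
    """Send a GET request to the transcription endpoint and return the result.

    The response schema is not documented here, so the body is returned as
    parsed JSON if it is valid JSON, and as raw text otherwise.
    """
    with urlopen(transcribe_url(filename, base_url), timeout=timeout) as response:
        body = response.read().decode("utf-8")
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return body
```

Calling `request_transcript("sample_data.interview.wav")` mirrors the curl command above. The generous timeout is a guess, on the assumption that transcribing a long recording can take a while.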

The repo comes with a Dockerfile, which makes it easier to deploy in a container. Build the Docker image, then run it like so:

sudo docker run -p 80:80 --gpus all <IMAGE NAME>
