Ara 🦜

Overview

Ara is a script and API for transcribing ✍️ and diarizing 📓 audio. The typical use case is transcribing interviews, podcasts, and any other recording where multiple people are speaking. The output is 'easy' to read (if you like .txt files), formatted so that the speaker of each segment is clear.

It uses Whisper to transcribe the audio into text, then Pyannote to diarize the different speakers. Finally, it matches the segments from the two models and writes the output to a file or returns it through the API.
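The matching step can be pictured with a small sketch. This is not taken from Ara's code: it assumes both models yield segments with start and end times in seconds, and it pairs each transcribed segment with the speaker turn it overlaps most. The names `Segment`, `overlap` and `assign_speakers`, the `SPEAKER_00` style labels and the output line format are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds
    end: float    # seconds
    label: str    # transcribed text, or a speaker name for a diarization turn


def overlap(a: Segment, b: Segment) -> float:
    """Length in seconds of the time shared by two segments (0 if disjoint)."""
    return max(0.0, min(a.end, b.end) - max(a.start, b.start))


def assign_speakers(transcript, turns, unknown="UNKNOWN"):
    """Pair each transcript segment with the speaker turn it overlaps most."""
    lines = []
    for seg in transcript:
        # Pick the turn sharing the most time with this segment (None if no turns).
        best = max(turns, key=lambda t: overlap(seg, t), default=None)
        if best is None or overlap(seg, best) == 0.0:
            speaker = unknown
        else:
            speaker = best.label
        lines.append(f"[{seg.start:.1f}-{seg.end:.1f}] {speaker}: {seg.label}")
    return lines


# Hypothetical data standing in for the output of the two models.
transcript = [
    Segment(0.0, 2.5, "Hello, thanks for joining."),
    Segment(2.5, 4.0, "Happy to be here."),
]
turns = [
    Segment(0.0, 2.4, "SPEAKER_00"),
    Segment(2.4, 4.2, "SPEAKER_01"),
]
for line in assign_speakers(transcript, turns):
    print(line)
```

Choosing the turn with the largest overlap tolerates small boundary disagreements between the two models; segments that overlap no turn at all are labelled as unknown rather than guessed.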

Usage

Script

Call the script like so:

python script.py -i input.wav -o output.txt -l English
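To transcribe several recordings in one go, the same command can be driven from Python. The sketch below uses only the `-i`, `-o` and `-l` flags shown above; the helper names `build_command` and `transcribe_folder`, and the choice to write each transcript next to its audio file with a `.txt` extension, are assumptions made for illustration.

```python
import subprocess
import sys
from pathlib import Path


def build_command(audio, language="English", script="script.py"):
    """Build the argument list for one run of the script on a single file."""
    audio = Path(audio)
    output = audio.with_suffix(".txt")  # assumed convention: same name, .txt extension
    return [sys.executable, script, "-i", str(audio), "-o", str(output), "-l", language]


def transcribe_folder(folder, language="English"):
    """Run the script once for every .wav file in a folder, in name order."""
    for audio in sorted(Path(folder).glob("*.wav")):
        # check=True stops the batch if one transcription fails.
        subprocess.run(build_command(audio, language), check=True)
```

For example, `transcribe_folder("interviews")` would process every `.wav` file in a hypothetical `interviews` directory.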

FastAPI

main.py defines a basic FastAPI app with an endpoint for transcription. Start the server:

uvicorn main:app --reload

Then query it:

curl 127.0.0.1:8000/transcribe/sample_data.interview.wav
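The same request can be made from Python using only the standard library. This is a sketch under a few assumptions: the server is running locally on port 8000 as started above, the endpoint answers a plain GET request (as the curl call suggests), and the response body can be decoded as UTF-8 text. The response format is not documented here, so the body is returned as-is. The function names and the timeout value are hypothetical.

```python
from urllib.parse import quote
from urllib.request import urlopen


def transcribe_url(audio_path, host="127.0.0.1", port=8000):
    """Build the URL for the transcription endpoint, escaping the file name."""
    return f"http://{host}:{port}/transcribe/{quote(audio_path)}"


def transcribe(audio_path, timeout=600):
    """Request a transcription and return the raw response body as text."""
    # Transcription can be slow, so allow a generous timeout (an assumed value).
    with urlopen(transcribe_url(audio_path), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the server running, `print(transcribe("sample_data.interview.wav"))` would issue the same request as the curl command.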

This can be useful for interacting with it through Docker, or deploying the code.

The repo comes with a Dockerfile, which makes it easier to deploy in a containerised way. Build the Docker image, then run it like so:

sudo docker run -p 80:80 --gpus all <CONTAINER NAME>

EdoardoPona/Ara
