plaggy/fast-whisper-server: ASR + diarization model server with speculative decoding

A blog post detailing the inner workings: https://huggingface.co/blog/asr-diarization

Use with a prebuilt image:

docker run --gpus all -p 7860:7860 --env-file .env ghcr.io/plaggy/asrdiarization-server:latest

and parametrize via .env:

ASR_MODEL=
DIARIZATION_MODEL=
ASSISTANT_MODEL=
HF_TOKEN=
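Before starting the container it can help to check which of these variables are still blank. The following is a minimal sketch, not part of this repository: `blank_env_keys` is a hypothetical helper, the sample value is a placeholder, and which variables the server actually requires is not stated here, so the function only reports blanks and leaves the decision to you.

```python
# Variable names taken from the .env template above.
EXPECTED_KEYS = ("ASR_MODEL", "DIARIZATION_MODEL", "ASSISTANT_MODEL", "HF_TOKEN")


def blank_env_keys(env_text, expected=EXPECTED_KEYS):
    """Return the expected keys that are absent or have an empty value.

    Hypothetical helper for illustration; it does not know which keys
    the server treats as mandatory.
    """
    values = {}
    for line in env_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return [key for key in expected if not values.get(key)]


# Placeholder content; in practice, read the text from your .env file.
sample = "ASR_MODEL=<asr-model-id>\nDIARIZATION_MODEL=\nHF_TOKEN=\n"
print(blank_env_keys(sample))
```

To check a real file, pass `open(".env").read()` instead of `sample`.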

Or build your own image.

Once deployed, send your audio with inference parameters like this:

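import json

import aiohttp
import requests

# synchronous call
def sync_post():
    data = {"parameters": json.dumps({"batch_size": 1, "assisted": "true"})}
    # open the audio in a context manager so the file handle is always closed
    with open("<path/to/audio>", "rb") as audio:
        resp = requests.post("<ENDPOINT_URL>", files={"file": audio}, data=data)
    print(resp.json())

# asynchronous call (run it with asyncio.run(async_post()))
async def async_post():
    with open("<path/to/audio>", "rb") as audio:
        # aiohttp sends a dict containing a file object as multipart form data
        data = {
            "file": audio,
            "parameters": json.dumps({"batch_size": 30}),
        }
        async with aiohttp.ClientSession() as session:
            # the context manager releases the connection once the body is read
            async with session.post("<ENDPOINT_URL>", data=data) as response:
                print(await response.json())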
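
Both calls above send the inference parameters as one JSON string in the `parameters` form field. A minimal sketch of building that field in one place follows. `build_form_data` is a hypothetical helper, not part of the server, and the only parameter names taken from this README are `batch_size` and `assisted`; which other names the server accepts is not covered here.

```python
import json


def build_form_data(**parameters):
    """Serialize inference parameters into the single `parameters` form field.

    Hypothetical helper for illustration; it does not validate the
    parameter names against what the server accepts.
    """
    # json.dumps raises TypeError if a value is not JSON-serializable,
    # so mistakes surface before any request is sent.
    return {"parameters": json.dumps(parameters)}


form = build_form_data(batch_size=1, assisted="true")
print(form["parameters"])
```

The returned dict can be passed as `data=` to `requests.post`, or merged with the `"file"` entry for the `aiohttp` call.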

