
ASR + diarization model server with speculative decoding

plaggy/fast-whisper-server

A blog post describing the inner workings in detail: https://huggingface.co/blog/asr-diarization

Use with a prebuilt image:

docker run --gpus all -p 7860:7860 --env-file .env ghcr.io/plaggy/asrdiarization-server:latest

and parametrize via .env:

ASR_MODEL=
DIARIZATION_MODEL=
ASSISTANT_MODEL=
HF_TOKEN=
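
Before starting the container, it can help to check which of these variables are still empty. The following is a minimal sketch, not part of the repository: the helper names `parse_env` and `unset_keys` are hypothetical, the parser handles only plain `KEY=VALUE` lines, and whether the server needs all four variables set is not stated here, so the script only reports what is unset.

```python
from pathlib import Path

# The four variable names listed in the .env template above.
EXPECTED_KEYS = ("ASR_MODEL", "DIARIZATION_MODEL", "ASSISTANT_MODEL", "HF_TOKEN")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def unset_keys(text: str) -> list[str]:
    """Return the expected keys that are missing or have an empty value."""
    values = parse_env(text)
    return [key for key in EXPECTED_KEYS if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")  # assumed location: the current working directory
    if env_path.exists():
        print("unset:", unset_keys(env_path.read_text()))
    else:
        print("no .env file found")
```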

Or build your own image.

Once deployed, send your audio with inference parameters like this:

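import asyncio
import json

import aiohttp
import requests

# synchronous call
def sync_post():
    # Open the audio in a context manager so the file handle is always closed.
    with open("<path/to/audio>", "rb") as f:
        files = {"file": f}
        # Inference parameters travel as one JSON-encoded form field.
        data = {"parameters": json.dumps({"batch_size": 1, "assisted": "true"})}
        resp = requests.post("<ENDPOINT_URL>", files=files, data=data)
    print(resp.json())

# asynchronous call
async def async_post():
    with open("<path/to/audio>", "rb") as f:
        # aiohttp encodes a dict containing a file object as multipart form data.
        data = {
            "file": f,
            "parameters": json.dumps({"batch_size": 30}),
        }
        async with aiohttp.ClientSession() as session:
            async with session.post("<ENDPOINT_URL>", data=data) as response:
                print(await response.json())

# Run the coroutine with: asyncio.run(async_post())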
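
Both calls send the inference parameters as a single JSON-encoded form field named `parameters`. A small helper can keep that encoding in one place. This is only a sketch: `build_form_fields` is a hypothetical name, not part of the server or its client code, and it does not validate which parameters the server accepts.

```python
import json


def build_form_fields(**parameters) -> dict[str, str]:
    """Wrap inference parameters in the JSON-encoded "parameters" form field."""
    return {"parameters": json.dumps(parameters)}


# Same parameters as the synchronous example above.
fields = build_form_fields(batch_size=1, assisted="true")
print(fields)
```

The returned dict can be passed as `data=` to `requests.post`, or merged with the `"file"` entry for the `aiohttp` call.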
