MattSegal/parakeet-server

Parakeet ASR Server

Local speech-to-text API using FluidAudio's Parakeet CoreML models. Achieves ~100-150x realtime transcription on Apple Silicon via the Neural Engine.

Requirements

  • macOS 14.0+ (Sonoma)
  • Apple Silicon (M1/M2/M3/M4)
  • Xcode Command Line Tools (xcode-select --install)
  • Python 3.11+ with uv

Setup

1. Build the transcribe binary

./scripts/build-transcribe.sh

This clones FluidAudio, adds the TranscribeCLI wrapper, and builds a release binary at bin/transcribe. The script is idempotent, so it is safe to re-run.

2. Download models (first run only)

The first transcription downloads ~1-2GB of CoreML models from HuggingFace. Do this once manually to avoid a slow first request:

./bin/transcribe data/test-2.wav

3. Install Python dependencies

uv sync

4. Run the server

uv run uvicorn server:app --host 0.0.0.0 --port 8765 --workers 2

Usage

Health check

curl http://localhost:8765/health
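
For scripted checks (for example, waiting for the server to come back after a restart), the same request can be made from Python. This is a minimal sketch using only the standard library. It assumes only that /health answers with HTTP 200 when the server is up; the response body is not inspected, and the helper names `is_healthy` and `wait_until_healthy` are hypothetical, not part of this repository.

```python
import http.client
import time
import urllib.error
import urllib.request

# Default URL matches the port used in the "Run the server" step.
HEALTH_URL = "http://localhost:8765/health"


def is_healthy(url: str = HEALTH_URL, timeout: float = 2.0) -> bool:
    """Return True if a GET to the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, non-2xx status, or malformed reply.
        return False


def wait_until_healthy(url: str = HEALTH_URL, attempts: int = 10, delay: float = 0.5) -> bool:
    """Poll the URL until it is healthy or the attempts run out."""
    for _ in range(attempts):
        if is_healthy(url):
            return True
        time.sleep(delay)
    return False
```

Calling `wait_until_healthy()` right after starting or restarting the server gives a script a simple way to hold off on sending audio until the API is reachable.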

Transcribe audio

curl -X POST http://localhost:8765/transcribe -F "file=@data/test-1.wav"
curl -X POST http://localhost:8765/transcribe -F "file=@data/test-2.wav"

Response:

{
  "text": "Full transcription text",
  "segments": [
    { "start": 0.0, "end": 2.5, "text": "First sentence." },
    { "start": 2.5, "end": 5.0, "text": "Second sentence." }
  ],
  "confidence": 0.98,
  "rtfx": 155.0,
  "processing_time": 0.08
}
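
As an illustration of consuming this response, the sketch below turns the segments array into timestamped lines. It relies only on the fields shown in the example above; the helper names `format_timestamp` and `format_segments` are hypothetical, and the sample payload is copied from the example rather than taken from a real run.

```python
import json

# Sample payload copied from the response example above.
SAMPLE = """
{
  "text": "Full transcription text",
  "segments": [
    { "start": 0.0, "end": 2.5, "text": "First sentence." },
    { "start": 2.5, "end": 5.0, "text": "Second sentence." }
  ],
  "confidence": 0.98,
  "rtfx": 155.0,
  "processing_time": 0.08
}
"""


def format_timestamp(seconds: float) -> str:
    """Format seconds as MM:SS.cc, rounding to the nearest centisecond."""
    total_cs = round(seconds * 100)
    minutes, cs = divmod(total_cs, 6000)
    return f"{minutes:02d}:{cs // 100:02d}.{cs % 100:02d}"


def format_segments(response: dict) -> list[str]:
    """Render each segment as '[start - end] text'."""
    return [
        f"[{format_timestamp(s['start'])} - {format_timestamp(s['end'])}] {s['text']}"
        for s in response["segments"]
    ]


if __name__ == "__main__":
    for line in format_segments(json.loads(SAMPLE)):
        print(line)
```

Working in whole centiseconds avoids values such as 59.999 being rendered as "00:60.00".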

Redeploying

After pulling changes:

# If pyproject.toml changed
uv sync

# If scripts/TranscribeCLI.swift changed
./scripts/build-transcribe.sh

# Restart to pick up changes
./scripts/service.sh restart

For most changes (for example, edits to server.py), a restart alone is enough.

Running as a launchd service

./scripts/service.sh install   # Generate plist, install, and start
./scripts/service.sh status    # Check if running + health check
./scripts/service.sh stop
./scripts/service.sh start
./scripts/service.sh restart
./scripts/service.sh uninstall # Stop and remove

API

| Endpoint | Method | Description |
| --- | --- | --- |
| /health | GET | Health check |
| /transcribe | POST | Transcribe uploaded audio (wav, mp3, m4a, flac) |
| /docs | GET | Swagger UI |

Architecture

HTTP Request → FastAPI (server.py) → subprocess → bin/transcribe → CoreML Neural Engine

The server shells out to the Swift binary for each transcription. The binary loads FluidAudio's Parakeet TDT v3 model and runs inference on the Apple Neural Engine.
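
The shell-out step can be sketched roughly as follows. This is not the code in server.py: it assumes, without confirmation from the source, that the binary takes the audio path as its last argument and prints a JSON document on stdout. `run_transcribe` and `FAKE_COMMAND` are hypothetical names; the stand-in command exists only so the sketch runs without the real binary.

```python
import json
import subprocess
import sys

# Stand-in for bin/transcribe so the sketch runs anywhere (hypothetical).
# It echoes the path it was given inside a small JSON document.
FAKE_COMMAND = [
    sys.executable,
    "-c",
    "import json, sys; print(json.dumps({'text': 'hello', 'file': sys.argv[1]}))",
]


def run_transcribe(command: list[str], audio_path: str, timeout: float = 300.0) -> dict:
    """Run one transcription subprocess and parse its stdout as JSON.

    Assumption: the command prints JSON on stdout and exits 0 on success.
    """
    completed = subprocess.run(
        [*command, audio_path],
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,
    )
    if completed.returncode != 0:
        raise RuntimeError(
            f"transcribe exited with {completed.returncode}: {completed.stderr.strip()}"
        )
    return json.loads(completed.stdout)


if __name__ == "__main__":
    print(run_transcribe(FAKE_COMMAND, "data/test-1.wav"))
```

With the real binary the call would look like `run_transcribe(["./bin/transcribe"], "data/test-1.wav")`, provided the output-format assumption holds. Starting one process per request keeps the Python side stateless, but it means any model loading happens on every call.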
