v0.2.0

fcwu released this 03 May 07:54


90ad02e

Release v0.2.0

Notable changes since v0.1.2:

CLI

aureka download: pre-fetch all model weights (Kokoro + faster-whisper)
aureka benchmark: measure ASR/TTS/LLM speed and write a Markdown report
aureka type --no-streaming: opt out of streaming
aureka ui: settings UI (via pywebview)
aureka type/speak: new --speed option

ASR

VAD-segmented streaming via silero-vad: partial transcripts arrive while
you are still talking
In refine/translate modes, partials go to stderr only; only the final
refined text is inserted at the cursor
Phase events (transcribing / finalizing / refining) so the client can
show what the daemon is doing
faster-whisper model is now configurable via [asr] model in config.toml

TTS / Daemon

POST /speak endpoint: shared warm Kokoro pipeline across clients
POST /reload: re-read config without restarting the daemon
aureka speak is now daemon-aware and streams over HTTP when the daemon
is up
System tray menu (pystray) with daemon controls
Cross-platform launch-at-login (macOS launchd, Windows Task Scheduler)
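
A minimal sketch of how a client might address the two new endpoints, using only Python's standard library. The notes give neither the daemon's address nor the request schema, so BASE_URL and the "text" field are assumptions made for illustration; only the POST method and the /speak and /reload paths come from this release. The requests are built but not sent.

```python
import json
import urllib.request

# Assumed for illustration only: the notes state neither the daemon
# address nor the /speak payload schema.
BASE_URL = "http://127.0.0.1:8000"


def speak_request(text: str) -> urllib.request.Request:
    """Build a POST /speak request; the "text" field is hypothetical."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/speak",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def reload_request() -> urllib.request.Request:
    """Build a POST /reload request with an empty body."""
    return urllib.request.Request(f"{BASE_URL}/reload", data=b"", method="POST")


for req in (speak_request("hello"), reload_request()):
    print(req.get_method(), req.full_url)
```

Passing either request to urllib.request.urlopen() would send it, assuming a daemon is listening at the chosen address.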

LLM refine

Stricter system prompt + few-shot examples to suppress reasoning prelude
Detect truncated-mid-thinking responses and fall back to raw transcript
max_tokens / thinking_budget knobs in [llm] config
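
The truncation fallback can be pictured roughly as follows. This is a sketch under a stated assumption, not aureka's actual check: it supposes the model wraps its reasoning in <think>...</think> tags, which these notes do not specify, and refine_or_fallback() is a hypothetical helper.

```python
# Assumption: the model wraps its reasoning in <think>...</think> tags.
OPEN_TAG, CLOSE_TAG = "<think>", "</think>"


def refine_or_fallback(response: str, raw_transcript: str) -> str:
    """Return the refined text, or the raw transcript if the response is unusable."""
    if OPEN_TAG in response and CLOSE_TAG not in response:
        # Output stopped mid-reasoning (for example, max_tokens was reached).
        return raw_transcript
    # Drop a completed reasoning block, if any, and keep what follows it.
    refined = response.split(CLOSE_TAG, 1)[-1].strip()
    return refined or raw_transcript


print(refine_or_fallback("<think>fix the typo</think>Hello there.", "helo there"))
print(refine_or_fallback("<think>the user said", "helo there"))
```

The first call returns the refined text; the second returns the raw transcript because the reasoning block never closes.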

BREAKING

TheWhisper backend removed (the PyPI package was a placeholder, and the
code path never actually loaded). [asr-thewhisper] extra dropped.
Default ASR model changed from large-v3 to medium. Override with
[asr] model = "large-v3" if you want the previous accuracy.
aureka.models.MODEL_REGISTRY (a dict) replaced by model_registry() (a
function).
aureka.device.resolve_asr_backend removed.
