v0.4.0 — Voice chat (push-to-talk) + instant startup

@Gheop Gheop released this 10 Apr 15:27
· 177 commits to main since this release

Highlights

  • 🎤 Voice chat (push-to-talk) — talk to your cats instead of typing. Hold the mic button next to the chat entry, or simply hold Space while the entry is focused (like `/voice` in Claude Code). Release to auto-transcribe and send.
  • 🔒 100% local & private — STT runs on-device via faster-whisper. Nothing is sent to any cloud for transcription.
  • GPU auto-detection — CUDA float16 if available, CPU int8 otherwise.
  • 🎚️ Whisper model picker in Settings — 7 models from `tiny` (39 MB, CPU) to `large-v3` (1.5 GB, GPU), annotated with size and recommended device. Changes take effect immediately.
  • 🚀 Model preload at startup — if the selected model is cached, it loads in the background during cat spawn so the first recording is instant.
  • 🔑 Token refresh feedback — animated braille spinner + localized message during Claude OAuth refresh (no more silent `...`).
  • 🏃 Instant click on startup — fixed a 4–5 s freeze where clicking a cat did nothing during the first seconds. All heavy per-cat setup (sprite loading, anim offsets, pixel-scan, chat backend creation) now runs in background threads.
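
For readers curious how the device rule and the background preload might fit together, here is a minimal sketch. It is not CatAI's actual code: `pick_device` and `BackgroundPreloader` are hypothetical names, and the loader is any callable you supply. The only parts taken from the notes above are the rule itself (CUDA float16 if available, CPU int8 otherwise) and the idea of loading in a background thread so the UI stays responsive.

```python
import threading
from typing import Callable, Optional, Tuple


def pick_device(cuda_device_count: int) -> Tuple[str, str]:
    """Return (device, compute_type): CUDA float16 if a GPU exists, else CPU int8."""
    if cuda_device_count > 0:
        return ("cuda", "float16")
    return ("cpu", "int8")


class BackgroundPreloader:
    """Hypothetical helper: runs a loader in a daemon thread and hands back its result."""

    def __init__(self, loader: Callable[[], object]) -> None:
        self._loader = loader
        self._result: Optional[object] = None
        self._error: Optional[Exception] = None
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        # Returns immediately; the load happens off the calling thread.
        self._thread.start()

    def _run(self) -> None:
        try:
            self._result = self._loader()
        except Exception as exc:  # keep the error so get() can re-raise it
            self._error = exc

    def get(self, timeout: Optional[float] = None) -> object:
        # Blocks only if the load has not finished yet.
        self._thread.join(timeout)
        if self._thread.is_alive():
            raise TimeoutError("model is still loading")
        if self._error is not None:
            raise self._error
        return self._result
```

With the voice extra installed, the loader could plausibly be something like `lambda: WhisperModel("tiny", device=device, compute_type=compute_type)` from faster-whisper, with the GPU count taken from `ctranslate2.get_cuda_device_count()`. Treat that wiring as an assumption, not a description of what the app does internally.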

Install

```bash
pip install catai-linux

# Optional: adds ~100 MB of deps (faster-whisper + CTranslate2) for voice chat
pip install "catai-linux[voice]"
```

Then enable voice with the `--voice` CLI flag or the Settings checkbox.