Summary
Allow configuring a custom Speech-to-Text endpoint (URL + model + auth header) so users can route voice transcription to a self-hosted Whisper / Faster-Whisper / whisper.cpp server instead of OpenAI / ChatGPT.
Motivation
Voice transcription is currently hardcoded to OpenAI / ChatGPT endpoints:
- apps/ios/Sources/Litter/Models/VoiceTranscriptionManager.swift: https://chatgpt.com/backend-api/transcribe and https://api.openai.com/v1/audio/transcriptions.
- apps/android/app/src/main/java/com/litter/android/state/VoiceTranscriptionManager.kt: the same two endpoints, with the model field set to gpt-4o-mini-transcribe.
For users who:
- Run their own Whisper/whisper.cpp/Faster-Whisper server on their LAN or VPN,
- Are subject to data-residency / privacy constraints,
- Don't want voice audio leaving their network,
- Or already pay for a different STT vendor (Deepgram, AssemblyAI, etc. with OpenAI-compatible endpoints),
…there is no way to redirect transcription without forking the apps.
Concrete ask
Per-server (or global) STT config with at least:
- stt.endpoint: full URL of the transcription endpoint.
- stt.model: model field sent in the multipart form (defaults to gpt-4o-mini-transcribe for OpenAI-compatible servers, configurable).
- stt.auth_header: optional, e.g. Authorization: Bearer <token> for self-hosted servers behind an auth proxy. Empty/None for open LAN deployments.
- (Optional) stt.disabled: for users who want to suppress voice transcription entirely.
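To make the proposed keys concrete, here is a minimal Kotlin sketch of how they might be modeled and resolved. Everything in it is hypothetical: SttConfig, ResolvedStt, resolve, and the builtInUrl parameter are illustrative names, not existing Litter code, and the fallback behavior (blank values fall back to the built-in endpoint and default model) is an assumption about how the setting could work.

```kotlin
// Hypothetical config shape; fields mirror the proposed stt.* keys.
data class SttConfig(
    val endpoint: String? = null,    // stt.endpoint
    val model: String? = null,       // stt.model
    val authHeader: String? = null,  // stt.auth_header, e.g. "Authorization: Bearer <token>"
    val disabled: Boolean = false,   // stt.disabled
)

// What the transcription path would need after defaults are applied.
data class ResolvedStt(
    val url: String,
    val model: String,
    val header: Pair<String, String>?, // null means no auth header is sent
)

// Default model named in the issue.
const val DEFAULT_STT_MODEL = "gpt-4o-mini-transcribe"

// Returns null when transcription is disabled. builtInUrl stands in for
// whichever hardcoded endpoint the app would otherwise have used.
fun resolve(config: SttConfig, builtInUrl: String): ResolvedStt? {
    if (config.disabled) return null
    // Split "Name: value" on the first colon only, so values containing
    // colons stay intact. Malformed headers are ignored.
    val header = config.authHeader
        ?.takeIf { it.isNotBlank() }
        ?.split(":", limit = 2)
        ?.takeIf { it.size == 2 }
        ?.let { it[0].trim() to it[1].trim() }
    return ResolvedStt(
        url = config.endpoint?.takeIf { it.isNotBlank() } ?: builtInUrl,
        model = config.model?.takeIf { it.isNotBlank() } ?: DEFAULT_STT_MODEL,
        header = header,
    )
}
```

With this shape, an empty config leaves today's behavior untouched, and a user on a LAN only has to set endpoint (plus model, if their server expects a different name).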
When set, the existing transcribe(wav:authMethod:token:) path uses these instead of the hardcoded ChatGPT/OpenAI URLs. The OpenAI multipart contract (form fields file, model, plus standard transcription params) is the same as what whisper.cpp's server example and many self-hosted Whisper wrappers already implement, so for OpenAI-compatible servers no protocol change is needed — only URL/model/auth.
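As an illustration of why only URL, model, and auth need to change, the following Kotlin sketch builds the multipart body with the file and model fields and posts it to an arbitrary URL. It is not the existing VoiceTranscriptionManager code: buildTranscriptionBody and postTranscription are hypothetical helpers, the filename audio.wav and the boundary format are arbitrary choices, and the sketch uses only java.net.HttpURLConnection from the standard library rather than whatever HTTP client the apps actually use.

```kotlin
import java.io.ByteArrayOutputStream
import java.net.HttpURLConnection
import java.net.URL

// Builds a multipart/form-data body carrying the "model" text field and
// the "file" field with the WAV bytes.
fun buildTranscriptionBody(boundary: String, model: String, wav: ByteArray): ByteArray {
    val out = ByteArrayOutputStream()
    fun text(s: String) = out.write(s.toByteArray(Charsets.UTF_8))

    text("--$boundary\r\n")
    text("Content-Disposition: form-data; name=\"model\"\r\n\r\n")
    text("$model\r\n")

    text("--$boundary\r\n")
    text("Content-Disposition: form-data; name=\"file\"; filename=\"audio.wav\"\r\n")
    text("Content-Type: audio/wav\r\n\r\n")
    out.write(wav)
    text("\r\n--$boundary--\r\n") // closing boundary
    return out.toByteArray()
}

// Posts the body to the configured endpoint and returns the raw response
// text. authHeader is an optional (name, value) pair.
fun postTranscription(
    url: String,
    model: String,
    wav: ByteArray,
    authHeader: Pair<String, String>?,
): String {
    val boundary = "----stt-" + System.nanoTime()
    val conn = URL(url).openConnection() as HttpURLConnection
    conn.requestMethod = "POST"
    conn.doOutput = true
    conn.setRequestProperty("Content-Type", "multipart/form-data; boundary=$boundary")
    authHeader?.let { (name, value) -> conn.setRequestProperty(name, value) }
    conn.outputStream.use { it.write(buildTranscriptionBody(boundary, model, wav)) }
    return conn.inputStream.bufferedReader().use { it.readText() }
}
```

Nothing in the body depends on which server receives it, so pointing url at a self-hosted OpenAI-compatible server would leave the request shape unchanged.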
For genuinely non-OpenAI-shaped APIs (e.g. raw whisper.cpp), a small adapter layer per provider could come later; the immediate win is OpenAI-compatible endpoint redirection.
Why this is worth solving in Litter
Alternatives considered
- MITM proxy on the network rewriting chatgpt.com/api.openai.com to a local server: fragile, needs a custom CA on the device, and breaks the moment Litter pins certs.
- Fork the apps: defeats the point of using the upstream client.
Happy to test on iOS/Android beta channels once landed.