iOS Platform Support - Keyboard Extension #967
-
It will not be part of this project. However, if you want some source code to start from, I have a full Swift project built out that is already partially working. I didn't use transcribe-rs and ended up building native Swift bindings to the models I wanted.
-
Thanks for the quick reply! Good to know on the repo structure. I'd love to take a look at your Swift project if you're open to sharing it, even partially. The architecture decisions alone would be really valuable, specifically:
Happy to collaborate or continue independently, just want to avoid re-discovering the same dead ends.
-
Let me put up the repo and I'll edit this reply.
-
Thanks a lot for sharing this. It's really generous of you, especially given how detailed your answers were on the architecture constraints. The two-process design, the GPU limitation in background threads, and the Moonshine/ONNX choice all saved me from going down dead ends.

I've cloned the repo and read through the full codebase. The architecture is solid: the DictationBridge state machine, the session keepalive, and the audio pipeline with sample rate conversion make it a great reference, even if you call it slop.

Quick question on licensing: the repo doesn't have a license file. Would you be open to adding one (MIT or similar)? I'd like to either fork this as a starting point or rebuild from scratch using the same architecture, and I want to make sure I'm doing it right. Either way, I'd credit this repo as the original reference.

Also curious: you mentioned you'd pick different (bigger) models in hindsight, since the memory limit doesn't apply to the containing app. Have you looked at Parakeet TDT via ONNX, or are you sticking with Moonshine's newer streaming models (medium-streaming looks interesting at 245M params, lower WER than Whisper Large v3)?
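For anyone following along, here is a minimal sketch of what the sample rate conversion step in such an audio pipeline can look like. It is not taken from the repo discussed above: the function name `resampleLinear` is hypothetical, and the 48 kHz source and 16 kHz target rates are assumptions (many speech models expect 16 kHz mono input). It uses plain linear interpolation with no low-pass filter, so downsampling can alias; a real app would more likely use `AVAudioConverter`.

```swift
/// Hypothetical helper: resamples mono Float samples by linear interpolation.
/// Assumption: no anti-aliasing filter is applied, so this is illustrative only.
func resampleLinear(_ input: [Float], from sourceRate: Double, to targetRate: Double) -> [Float] {
    guard !input.isEmpty, sourceRate > 0, targetRate > 0 else { return [] }
    if sourceRate == targetRate { return input }

    // How many source samples one output sample spans.
    let ratio = sourceRate / targetRate
    let outputCount = Int((Double(input.count) / ratio).rounded(.down))

    var output = [Float]()
    output.reserveCapacity(outputCount)

    for i in 0..<outputCount {
        // Fractional position of this output sample in the source buffer.
        let position = Double(i) * ratio
        let lower = Int(position)
        let upper = min(lower + 1, input.count - 1)
        let fraction = Float(position - Double(lower))
        // Blend the two neighbouring source samples.
        output.append(input[lower] + (input[upper] - input[lower]) * fraction)
    }
    return output
}

// Example: six samples captured at an assumed 48 kHz, converted to 16 kHz.
let converted = resampleLinear([0, 1, 2, 3, 4, 5], from: 48_000, to: 16_000)
```

With a 3:1 ratio the sketch keeps every third source sample; for non-integer ratios it interpolates between neighbours.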
-
Hi! I'm interested in contributing iOS support for Handy as a custom keyboard extension.
The idea is essentially what Wispr Flow does on iOS, but fully offline and open-source: a system-wide keyboard that you install once, enable in Settings, and then use from any app. You tap the mic button on the keyboard, speak, and text appears directly in the input field. No app-switching, no clipboard gymnastics.
This means building an iOS keyboard extension (UIInputViewController) with Full Access enabled for microphone access, powered by transcribe-rs for on-device transcription.

Prior mobile discussion for context:
transcribe-rs

The codebase already has some Tauri 2.x mobile markers (mobile_entry_point in lib.rs, platform-conditional deps in Cargo.toml), but a keyboard extension is a fundamentally different architecture from the desktop app.

Repo structure:
Given how different this is from the desktop app (native Swift/SwiftUI, keyboard extension lifecycle, no Tauri webview), would you prefer:
transcribe-rs as a library

Any thoughts on feasibility or interest?
Thanks!