Incompatibility with various Android keyboards; wrap vosk-android in IME service (especially for standalone use/accessibility) #32

Open
drew-sinha opened this issue Mar 25, 2023 · 3 comments

Comments

@drew-sinha
Contributor

drew-sinha commented Mar 25, 2023

As of 5e02806*, the vosk service is fully functional and compatible with AnySoftKeyboard, but incompatible with OpenBoard and FlorisBoard**. Both of the latter rely on the InputMethodManager framework rather than calling the speech recognition service, and they do not identify the vosk service as an input method.

Given that there is no open-source STT alternative to Google, etc. at the time of posting (3/2023), relying on the speech recognition service is appropriate***. The vosk service works fine as a (de facto) plugin for AnySoftKeyboard. However, adding an IME service on top of vosk-android, so that it can also be used standalone, would be nice for accessibility (i.e. for those whose disabilities make voice the sensible choice of input method). This isn't unreasonable given that Google already does this with its speech services.

I don't have significant experience with IMM/IMEs, but I think this should be fairly straightforward: add an intermediate-level component on top of the vosk-recognition-service and declare it in the manifest as its own service. The keyboard can then decide which service to latch onto for STT.
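To make the manifest idea concrete, here is a rough sketch of how an IME service could sit next to the existing recognition service. This is an assumption about the shape of the change, not what vosk-android-service currently ships: the class names `.VoskInputMethodService` and `.VoskRecognitionService`, the label string, and the `@xml/method` resource are all hypothetical. Only the intent actions, the `BIND_INPUT_METHOD` permission, and the `android.view.im` metadata key come from the standard Android framework.

```xml
<!-- Fragment of AndroidManifest.xml, inside <application>. All class and resource names are hypothetical. -->

<!-- Existing style of entry point: keyboards that call the speech recognition service bind here. -->
<service
    android:name=".VoskRecognitionService"
    android:exported="true">
    <intent-filter>
        <action android:name="android.speech.RecognitionService" />
    </intent-filter>
</service>

<!-- Proposed addition: the same recognizer exposed as an input method,
     so the system lists it alongside other IMEs. -->
<service
    android:name=".VoskInputMethodService"
    android:label="@string/voice_ime_label"
    android:permission="android.permission.BIND_INPUT_METHOD"
    android:exported="true">
    <intent-filter>
        <action android:name="android.view.InputMethod" />
    </intent-filter>
    <!-- Points at an <input-method> resource; a subtype declared there with
         android:imeSubtypeMode="voice" would mark it as a voice IME. -->
    <meta-data
        android:name="android.view.im"
        android:resource="@xml/method" />
</service>
```

With both services declared, a keyboard that uses the recognition service keeps working as before, while a keyboard that switches input methods for voice would presumably be able to find the second entry. Whether OpenBoard and FlorisBoard pick it up this way would need testing.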

*additional configuration:
Build Configuration: Gradle Toolkit command-line, debug w/universal apk (compilesdk 33)
Gradle toolkit version: 7.6 (defaults despite the build kts depending on 7.2.2; no api level spec'd); builds against OpenJDK-14
Device: Pixel 3
Device OS: Android 12
Additional Device Apps: AnySoftKeyboard (v1.11.7137/F-droid; UTD)

**There is nothing special per se about these two keyboards. I chose them as the major open alternatives I've seen on Reddit and F-Droid. Of note, I haven't tested konele, but I would be surprised if it weren't compatible given @ccoreilly's efforts with localstt.

***I am hesitant to say that the IMM/IME route is better than using the speech recognition service directly, or to comment on any IME designer's preference for one or the other. Per the above, I don't think the two are incompatible per se, and they can be seen as serving separate use cases. Any thoughts would be appreciated. Tagging some people who may have some useful input: @patrickgold, @dslul, @ewheelerinc, @ildar, @Kaljurand, @Felicis. CC: @Stypox

Edit: misspelled kaljurand, added stypox.

@Kaljurand

I think two general principles make sense:

The second principle is a bit problematic in current Android, where it is not easy for the end user to install multiple apps at once and use them in combination (locating them in the app store, assigning permissions, etc.). So it unfortunately makes sense to bundle several independent services into a single app.

@nshmyrev
Contributor

For the IME side of this, there is also ElishaAz/Sayboard#25

@aboveagency

The feature we're discussing will add much-needed functionality for the growing number of people who want to use degoogled phones. For the sake of user experience and adoption, I would suggest packaging the IME and recognition service functionality into one app, unless the services would interfere with each other.

In this way, the related FOSS keyboards (FlorisBoard, AnySoft) could point to this one app/project and drive more attention to it.

I'd also say that Sayboard (https://github.com/ElishaAz/Sayboard) may be the correct project to bundle all of the services together along with a voice keyboard.

The maintainer has suggested the app is supposed to be a companion voice IME (ElishaAz/Sayboard#4), so it's heading in that direction already.

From the user side, it makes sense to download one application that serves as an IME, Recognition Service, and a standalone voice keyboard with an apt name like Sayboard.

Looking forward to seeing the collaboration.
