
Implementing Speech-To-Text Service for other Apps #100

Closed
wants to merge 3 commits

Conversation

@nebkrid (Contributor) commented Nov 20, 2022

Implemented export of the Speech-To-Text functionality to other apps, which can invoke it via "startActivityForResult" with an "Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)".

The extra "RecognizerIntent.EXTRA_PROMPT" is also supported.
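For reference, a minimal sketch of how a client app might invoke the exported recognizer through the standard Android API (this client-side code is not part of the PR; the class name and request code are assumptions, while the `RecognizerIntent` constants are real Android API):

```java
import android.app.Activity;
import android.content.Intent;
import android.speech.RecognizerIntent;
import java.util.ArrayList;

public class ClientActivity extends Activity {
    private static final int STT_REQUEST_CODE = 42; // arbitrary request code

    void startSpeechRecognition() {
        final Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
        // optional prompt shown by the recognizer UI; this PR adds support for it
        intent.putExtra(RecognizerIntent.EXTRA_PROMPT, "Say something");
        startActivityForResult(intent, STT_REQUEST_CODE);
    }

    @Override
    protected void onActivityResult(int requestCode, int resultCode, Intent data) {
        super.onActivityResult(requestCode, resultCode, data);
        if (requestCode == STT_REQUEST_CODE && resultCode == RESULT_OK && data != null) {
            // recognized utterances, best match first
            final ArrayList<String> results =
                    data.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS);
            // ... use results ...
        }
    }
}
```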

@Stypox (Owner) commented Nov 21, 2022

Thank you!

  • There are some checkstyle issues; in theory, Android Studio should have reported the errors to you when building the app.
  • I don't think you needed to create a skill, since skills interpret user sentences and react by providing some interesting output, whereas what you are doing here is just recognizing speech. But I get that it was simpler to connect to the already-implemented Vosk STT this way. For now this is OK, but once everything works well this part should be moved to a separate activity.
  • Were you able to test whether these changes work? Can you suggest a specific app or keyboard that would allow the RecognizerIntent.ACTION_RECOGNIZE_SPEECH intent to be tested?

@nebkrid (Contributor, Author) commented Nov 23, 2022

Thanks for your feedback.

  • checkstyle: indeed I turned it off, because I had the same issue with the unchanged code, as described here. When I checked the config, the SuppressionSingleFilter was already added (of course, at that moment I realised who had asked the question ;) ). So it seems the solution does not work in general. Do you perhaps have any news that is not posted there? Manually turning it on and off would be really annoying...
  • Yes, I agree that this should be moved to a separate activity in the future. Then it may even become possible to avoid reloading the voice model every time.
  • Test apps: yes and no (at first). I first tested it with my own app, which was what triggered me to look for this feature. It already worked with the Google speech recognition, and since it also worked with my changes I assumed it would work universally. However, when you asked for an example I tried some more apps, and the results were mixed. I now understand that there are two result mechanisms that need to be served, and I have now implemented the second one. With today's changes it also works with other apps: I tested with openHAB (but I don't know whether you can test that without an openHAB server running) and with Automate (using an "App decision?" block with the "android.speech.action.RECOGNIZE_SPEECH" action; something like this should also work for Tasker without an explicit Tasker plugin being available, I guess).
    Generally, there are more extras that can be passed with the intent. I think I will implement the easy ones (where I can expect them to work as described) within the next days, too. At least there shouldn't currently be a breaking one any more.
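The two result mechanisms mentioned above can be sketched roughly as follows on the recognizer side (a hypothetical sketch, not the PR's actual code; the `RecognizerIntent` and `PendingIntent` APIs are real Android API, the method and class names are assumptions):

```java
import android.app.Activity;
import android.app.PendingIntent;
import android.content.Intent;
import android.os.Bundle;
import android.speech.RecognizerIntent;
import java.util.ArrayList;

public final class SpeechResultSender {
    static void returnResults(final Activity activity, final ArrayList<String> utterances)
            throws PendingIntent.CanceledException {
        final Intent resultIntent = new Intent();
        resultIntent.putStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS, utterances);

        // Mechanism 1: the classic activity result, delivered to the caller's
        // onActivityResult()
        activity.setResult(Activity.RESULT_OK, resultIntent);

        // Mechanism 2: some apps instead pass a PendingIntent that has to be
        // fired with the results attached
        final PendingIntent pendingIntent = activity.getIntent()
                .getParcelableExtra(RecognizerIntent.EXTRA_RESULTS_PENDINGINTENT);
        if (pendingIntent != null) {
            final Bundle bundle = activity.getIntent()
                    .getBundleExtra(RecognizerIntent.EXTRA_RESULTS_PENDINGINTENT_BUNDLE);
            if (bundle != null) {
                // the caller may ask for this bundle to be echoed back
                resultIntent.putExtras(bundle);
            }
            pendingIntent.send(activity, Activity.RESULT_OK, resultIntent);
        }

        activity.finish();
    }
}
```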

@nebkrid (Contributor, Author) commented Dec 8, 2022

Hi,
I manually reactivated checkstyle and made the changes (there were still 3 issues left that I couldn't solve; I don't know whether they even show up in your configuration).
Additionally, as working app examples: Google Maps and eBay use android.speech.action.RECOGNIZE_SPEECH in their search fields, and Dicio works with both. (When first requested, the Android system shows a popup to choose between Google speech input and Dicio, as for other default apps.)

PS: I left two "TODO" annotations. They are not really relevant, since none of the apps I tested use these extra parameters (and even if they were used, they are only an additional hint, not required for the speech recognition). However, Android designed them to be available, so if there is a way to pass these extras to the Vosk speech engine, it may be helpful to add it. I therefore left them as a reminder so that this does not have to be researched again. If they clutter the code, I can remove them.

@Stypox (Owner) commented Dec 13, 2022

@nebkrid thank you for the research! I really appreciated it. I opened #109 based on your implementation, but instead of creating a skill like you did, I created a separate stt activity that can popup on top of apps. I would like you to take a look at #109 and tell me whether it's fine, if you have some time. Thanks :-)

@nebkrid (Contributor, Author) commented Dec 13, 2022

Thanks for your feedback; indeed, the separate activity is much nicer. Luckily I already had time today to look at it. I added two small things, and if I correctly figured out how to collaborate on GitHub, these should now show up in your #109 (otherwise please let me know how best to merge this :) ).
