How to use other (English for example) models? #3

Closed
kcpr opened this issue Apr 28, 2021 · 8 comments

@kcpr

kcpr commented Apr 28, 2021

I downloaded the Vosk models from https://alphacephei.com/vosk/models (vosk-model-small-en-us-0.15, to be exact) and added them to /app/src/main/assets/sync. I also wanted to add DeepSpeech models, but I couldn't find a .scorer smaller than 900 MB, which seems like too much (the APK came to about 1 GB). Are those models required? Where can I get a smaller scorer? I experimented for quite a while (using only the Vosk models, using DeepSpeech without a scorer, and so on), and the app I built kept crashing some time after being started. Do the folders need to have specific names, for example?

@kcpr
Author

kcpr commented Apr 28, 2021

Ok, I think I figured it out. The DeepSpeech models are not required. I had to rename the Vosk model folder to "vosk-catala".

I also discovered that the Android settings let you choose which recognizer to use, so I believe other folder names may work too, but using them would require selecting them in the Android settings, since "Kaldi/Vosk Recognizer" is selected by default. Some further changes might be required.
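The folder layout described above can be sketched as a small Python helper that copies an unpacked Vosk model into the project's assets under the name the PoC expects. Only the `assets/sync` location and the "vosk-catala" folder name come from this thread; the function name, the `project_root` argument, and the choice to overwrite an existing folder are assumptions made for illustration.

```python
import shutil
from pathlib import Path

# Folder name the PoC expects, as reported in this thread.
EXPECTED_MODEL_DIR = "vosk-catala"


def install_vosk_model(unpacked_model: Path, project_root: Path) -> Path:
    """Copy an unpacked Vosk model into app/src/main/assets/sync/vosk-catala.

    `unpacked_model` is the folder extracted from the downloaded archive,
    for example vosk-model-small-en-us-0.15 (hypothetical local path).
    """
    if not unpacked_model.is_dir():
        raise FileNotFoundError(f"model folder not found: {unpacked_model}")

    sync_dir = project_root / "app" / "src" / "main" / "assets" / "sync"
    target = sync_dir / EXPECTED_MODEL_DIR

    # Assumption: any model already in place should be replaced, not merged.
    if target.exists():
        shutil.rmtree(target)

    sync_dir.mkdir(parents=True, exist_ok=True)
    shutil.copytree(unpacked_model, target)
    return target
```

For example, `install_vosk_model(Path("vosk-model-small-en-us-0.15"), Path("LocalSTT"))` would place the model where the build picks it up, assuming the repository was cloned into a folder named `LocalSTT`.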

@ccoreilly
Owner

ccoreilly commented Apr 29, 2021

Hi kcpr! Great that you managed. The code is just a PoC which is why some stuff like the folder name is hard-coded. I currently have no time to turn this into a fully-fledged application but anyone willing to do it is welcome!

@kcpr
Author

kcpr commented May 4, 2021

Hi. Ok, I understand. I wasn't aware that you treat it as a PoC. Thank you for the answer.

@ippocratis

> Ok, I think I figured it out. The DeepSpeech models are not required. I had to change the folder name of Vosk models to "vosk-catala".
>
> I also discovered that in Android settings it's possible to choose which models to use, so I believe the other folder names may be possible, but using them would require choosing them in Android setting as "Kaldi/Vosk Recognizer" is selected by default. Maybe some more changes would be required.

Can you share your English-trained LocalSTT APK?

@kcpr
Author

kcpr commented Jul 29, 2021

Hi.

To be honest, I don't think I had any APKs prepared for sharing. I'm including one APK I generated while backing up my phone and two I built today, which use models I found earlier. Sadly, I haven't managed to confirm that any of them work...

I changed ROMs in the meantime and hadn't used the app until now, and right now nothing I've tested seems to work on my phone: just an "Error loading recognizer" message appears and the application claims to be "Loading...". The reason could be that I don't have Google Play Services installed, or that it's Android 11. I also tested it on my friend's Samsung phone with Android 11; the error message does not appear there, but it also never gets past "Loading...". Also, on his phone I couldn't find an option in the settings to change the model used (maybe that Samsung ROM just lacks this option, which isn't really nice in my opinion, as AOSP does have it, heh).

The other issue is that I think it's not easy to find relatively small English Vosk models. Apart from the one I used to build the included APKs (and I'm not sure it works at all), the only one I have saved is about 3.2 GB. I believe the resulting APK would be too big for a regular Android application.
If you manage to find some other models, though, you can link them here and I can try building an APK with them.
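Since APK size is the concern here, one rough check before building is to sum the file sizes of each model folder under `assets/sync`. The following is a minimal Python sketch under that assumption; the function names are made up for illustration, and APK packaging may compress some of the files, so the numbers are only an estimate of what the model adds.

```python
from pathlib import Path
from typing import Dict


def dir_size_bytes(folder: Path) -> int:
    """Total size in bytes of all regular files below `folder`."""
    return sum(p.stat().st_size for p in folder.rglob("*") if p.is_file())


def size_report(assets_sync: Path) -> Dict[str, float]:
    """Map each model folder name to its size in MB (1 MB = 10**6 bytes).

    Files lying directly in `assets_sync` are ignored; only subfolders
    (one per model, by assumption) are reported.
    """
    return {
        d.name: dir_size_bytes(d) / 1e6
        for d in sorted(assets_sync.iterdir())
        if d.is_dir()
    }
```

Calling `size_report(Path("app/src/main/assets/sync"))` from the project root would show at a glance whether a model is in the tens of megabytes or in the gigabyte range.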

The "app-release-unsigned" and "app-debug" APKs were created in the same build, and "app-release-unsigned" may fail to install. If it does, just use "app-debug".

I can't attach them to the comment because the files are too big: https://ufile.io/f/9qo0c .

@ippocratis

ippocratis commented Jul 29, 2021

> Hi.
>
> To be honest, I haven't had APKs prepared for sending I think. I'm including one APK I've generated while backing up my phone and two I build today which uses models I've found before. Sadly I haven't managed to confirm that any of them works...
>
> I changed a ROM in a meantime, hasn't used the app until then and right now nothing from what I've tested has seemed to work on my phone: just an "Error loading recognizer" message appears and the application claims to be "Loading...". The reason could be that I do not have Google Play Services installed I think or because it's Android 11. I also tested it on my friend's Samsung phone with Android 11 and the error message does not appear, but it's also just hasn't stopped "Loading...". Also, on his phone I couldn't really find an option in settings to change used model (maybe that Samsung ROM just lacks this option, which isn't really nice in my opinion as an AOSP does have it, heh).
>
> The other issue is that I think it's not really easy to find relatively small Vosk English models. Except the one (which I used to build the included APKs, but I'm not sure if it really works at all) I have had saved only some 3.2 GB one. I believe the generated APK would be rather too big for a regular Android application.
> If You mange to find some other models though, You can link them here for example and I can try building an APK using them.
>
> The "app-release-unsigned" and "app-debug" were created in the same build and "app-release-unsigned" may fail to install. Just use "app-debug" then.
>
> I cannot include it as attachment to comment as files are too big: https://ufile.io/f/vw453 .

Thanks a ton for sharing.
LocalSTT itself won't work as a voice input service.
As you said, the pop-up window stays at "Loading...".
To make it usable, use Kõnele: in its "speak and swipe keyboard" and "voice search panel" settings, LocalSTT registers as "DeepSpeech Recognizer" and "Kaldi/Vosk Recognizer".
I tested your eng/pl APK and it works this way with AnySoft and in the browser, actually everywhere. Kaldi/Vosk is quite accurate.

A short video showing it in action:
https://t.me/microG/33453

The "official" English Vosk model you refer to in your opening message is quite small (50 MB); the installed APK takes 230 MB on disk.
The DeepSpeech models are big.

@kcpr
Author

kcpr commented Jul 29, 2021

Oh, I'm glad it works! You're welcome!

> Localsst ittself wont work as a voice input service.
> As you said the pop up window stays at ''loading''
> To make it usable use kōnele and on its ''speak and swipe keyboard'' and ''voice search panel'' setting locallsst registers as ''deepspeech recogniser'' and ''kaldi/vosk recogniser''

Hmm, thank you for the information! I may give it a try then. It seems quite odd to me though, because I don't recall having such problems before. Maybe I just forgot. ^^ Thank you for the video too.

> The ''official'' eng vosk model you are refering to in your opening message is quite small (50MB) the installed apk size on disk is 230mb
> the deepspeech models are big

You're probably right. I just compared the sizes of the models I downloaded earlier: I have a "deepspeech-model-en-us" model, which is about 100 MB, and "vosk-english-us", which is 3.2 GB. ;) I also have "model-en-us", which seems to be a Vosk model too and is about 70 MB. I haven't checked it online though, sorry. And thanks again! ;)

@kcpr
Author

kcpr commented Aug 23, 2021

Hi @ippocratis ,

I can confirm that Kõnele works with LocalSTT for me! Thank you very much!

Right now I'd like to have better models for the languages I use, but it looks promising in general.
I also really appreciate that the app allows quick whitespace insertion and basic text navigation, and that it's possible to quickly switch between it and a regular keyboard (for anyone interested).

Thank you again!
