
Memory leakage, and wish for cancel() to discard the current PartialResult #49

Closed
huynguyen82 opened this issue Apr 24, 2020 · 11 comments

@huynguyen82

Hi Nickolay,
I am now testing with a model. The sizes of the AM, LM, and i-vector extractor are 19 MB, 11 MB, and 9 MB. There are 2 issues:

  1. The memory after loading and initializing the model is about 140 MB, but it keeps increasing after each recording and recognition; it can reach up to 500 MB. I used the Profiler for measuring. It seems similar to "memory leakage on Android? #14", I think, but I did not find a solution there.
  2. Is there any way to discard the current result when calling recognizer.cancel()? When people press a cancel button, they want to ignore or delete what was spoken before.
@nshmyrev
Collaborator

> The memory after loading and initializing the model is about 140 MB, but it keeps increasing after each recording and recognition; it can reach up to 500 MB. I used the Profiler for measuring. It seems similar to #14, I think, but I did not find a solution there.

Ok, I will check some time later when I have time. A bit busy now.

> Is there any way to discard the current result when calling recognizer.cancel()? When people press a cancel button, they want to ignore or delete what was spoken before.

Is it just about the UI? Shouldn't resultView.setText(""); work if you put it in an appropriate place?

@huynguyen82
Author

huynguyen82 commented Apr 24, 2020

> Is it just about the UI? Shouldn't resultView.setText(""); work if you put it in an appropriate place?

Assume that we speak "Hello" and press a button to call recognizer.cancel(). Afterwards, we press the "Recording microphone" button to call recognizer.startListening(), speak "my name is A", and wait for the final result: the output will be "Hello my name is A". It continued from the previous cancellation point. So what I mean is: how can we remove "Hello" when calling cancel() at the previous step?
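The reported behavior can be modeled with a small self-contained sketch. ToyRecognizer below is a hypothetical stand-in, not the real Vosk/Kaldi API: if cancel() only stops the audio thread without draining the decoder state, the next utterance is appended to the previous one.

```java
// Toy model of the reported cancel() bug; ToyRecognizer is a
// hypothetical stand-in for the real recognizer, not the Vosk API.
class ToyRecognizer {
    private final StringBuilder hyp = new StringBuilder();

    // Feed decoded words into the running hypothesis.
    void acceptWaveform(String words) {
        if (hyp.length() > 0) hyp.append(' ');
        hyp.append(words);
    }

    // Return the accumulated hypothesis AND reset decoder state,
    // mirroring how a final Result() call ends the utterance.
    String result() {
        String r = hyp.toString();
        hyp.setLength(0);
        return r;
    }
}

public class CancelBugDemo {
    public static void main(String[] args) {
        ToyRecognizer rec = new ToyRecognizer();

        rec.acceptWaveform("hello");
        // Buggy cancel(): stops the audio thread but never drains state.

        rec.acceptWaveform("my name is A");
        // The old words leak into the new utterance:
        System.out.println(rec.result()); // prints "hello my name is A"
    }
}
```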

@nshmyrev
Collaborator

Ok, I got your problem. Can you verify that this patch works:

diff --git a/android/src/main/java/org/kaldi/SpeechRecognizer.java b/android/src/main/java/org/kaldi/SpeechRecognizer.java
index d667281..dbb244e 100644
--- a/android/src/main/java/org/kaldi/SpeechRecognizer.java
+++ b/android/src/main/java/org/kaldi/SpeechRecognizer.java
@@ -149,7 +149,7 @@ public class SpeechRecognizer {
         boolean result = stopRecognizerThread();
         if (result) {
             Log.i(TAG, "Stop recognition");
-            mainHandler.post(new ResultEvent(recognizer.FinalResult(), true));
+            mainHandler.post(new ResultEvent(recognizer.Result(), true));
         }
         return result;
     }
@@ -162,6 +162,7 @@ public class SpeechRecognizer {
      */
     public boolean cancel() {
         boolean result = stopRecognizerThread();
+        recognizer.Result(); // Reset recognizer state
         if (result) {
             Log.i(TAG, "Cancel recognition");
         }
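The effect of this patch can be illustrated with a self-contained toy sketch (PatchedRecognizer is hypothetical, not the real org.kaldi.SpeechRecognizer): calling Result() inside cancel() drains the pending hypothesis, so the next startListening() begins from a clean state.

```java
// Sketch of the patched cancel() semantics; PatchedRecognizer is a
// hypothetical stand-in for the real org.kaldi classes.
class PatchedRecognizer {
    private final StringBuilder hyp = new StringBuilder();

    void acceptWaveform(String words) {
        if (hyp.length() > 0) hyp.append(' ');
        hyp.append(words);
    }

    // Like the Result() call in the diff: return the hypothesis and reset state.
    String result() {
        String r = hyp.toString();
        hyp.setLength(0);
        return r;
    }

    // Patched cancel(): discard whatever was recognized so far.
    void cancel() {
        result(); // drain and ignore, as the diff above does
    }
}

public class CancelFixDemo {
    public static void main(String[] args) {
        PatchedRecognizer rec = new PatchedRecognizer();

        rec.acceptWaveform("hello");
        rec.cancel(); // user presses the cancel button

        rec.acceptWaveform("my name is A");
        System.out.println(rec.result()); // prints "my name is A"
    }
}
```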

@huynguyen82
Author

Thank you a lot!
The cancelling problem is solved. I hope you will get back to the memory leakage issue soon.

@huynguyen82
Author

Hi Nickolay,
I replaced the lookahead model (HCLr.fst and Gr.fst) with HCLG.fst, and the memory leakage problem was solved: RAM is stable and does not increase during recognition. But if we use HCLG.fst with a large LM, the memory footprint will be large. Can we apply LM rescoring? I think that would be the solution for an Android model aiming for good enough WER at a reasonable model size.

@nshmyrev
Collaborator

> I replaced the lookahead model (HCLr.fst and Gr.fst) with HCLG.fst, and the memory leakage problem was solved: RAM is stable and does not increase during recognition. But if we use HCLG.fst with a large LM, the memory footprint will be large. Can we apply LM rescoring? I think that would be the solution for an Android model aiming for good enough WER at a reasonable model size.

Ok, I pushed stopListening change, thanks for testing.

It doesn't really matter which one you use, since HCL and G are expanded to HCLG on the fly.

What model are you using so that you see 500 MB? Is it a default model or your custom one?

@nshmyrev
Collaborator

What is the maximum memory usage you see with a default model?

@huynguyen82
Author

huynguyen82 commented Apr 27, 2020

> What model are you using so that you see 500 MB? Is it a default model or your custom one?

I trained a model with a size of 67 MB on disk, about 160 MB after loading into RAM. If HCLr and Gr are used, after 1 hour of decoding it can reach 500 MB. But this does not happen with the same model using HCLG.

What is the maximum memory usage you see with a default model?

I did not check this.

@huynguyen82
Author

huynguyen82 commented Apr 27, 2020

My model configuration:
Input: MFCC (40) + i-vector (40)
Model: 3 CNN layers (input height = 40, output height = 40, 64 filters) + 5 TDNN-F layers (hidden dim 512, bottleneck dim 128)
Output layer size: about 4000

@kas84

kas84 commented May 31, 2020

I am still seeing the memory leak with the default model (I have not tried any other model yet).
Here I attach two screen captures of the app after 2 hours of running.
[screenshot: 2020-05-31 19:41:57]
[screenshot: 2020-05-31 19:42:08]

The app had produced around 20 results. I mean, I wasn't talking for two hours; the app was just listening to silence.

Edit: running it for 4 hours and 20 minutes got me to 1 GB of RAM usage.
[screenshot: 2020-05-31 21:54:59]

@nshmyrev
Collaborator

nshmyrev commented Jun 1, 2020

This should be fixed now with vosk 0.3.8.
