Investigate long loading time of LM #2384

Closed
lissyx opened this issue Sep 26, 2019 · 6 comments · Fixed by #2385


lissyx commented Sep 26, 2019

On some architectures (RPi, Android) there is a several-second delay when enabling the language model, but only on the first run. The LM should already be mmap()'d, so we should investigate what is happening; this cripples the user experience.

@lissyx lissyx added the bug label Sep 26, 2019
@lissyx lissyx added this to To do in Deep Speech v0.6.0 via automation Sep 26, 2019
@lissyx lissyx self-assigned this Sep 26, 2019

lissyx commented Sep 26, 2019

pi@raspberrypi:~/ds $ echo "1" | sudo tee /proc/sys/vm/drop_caches 
1
pi@raspberrypi:~/ds $ time ./deepspeech --model models/output_graph.tflite --alphabet models/alphabet.txt --lm models/lm.binary --trie models/trie --audio audio -t
TensorFlow: v1.14.0-16-g3b4ce37
DeepSpeech: v0.6.0-alpha.7-0-gf67818e
INFO: Initialized TensorFlow Lite runtime.
Running on directory audio
> audio/4507-16021-0012.wav
why should one halt on the way
cpu_time_overall=2.32796 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000
> audio/2830-3980-0043.wav
experienced proof less
cpu_time_overall=1.60453 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000
> audio/8455-210777-0068.wav
your power is sufficient i said
cpu_time_overall=2.15120 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000

real    0m49.527s
user    0m5.887s
sys     0m7.327s
pi@raspberrypi:~/ds $ time ./deepspeech --model models/output_graph.tflite --alphabet models/alphabet.txt --lm models/lm.binary --trie models/trie --audio audio -t
TensorFlow: v1.14.0-16-g3b4ce37
DeepSpeech: v0.6.0-alpha.7-0-gf67818e
INFO: Initialized TensorFlow Lite runtime.
Running on directory audio
> audio/4507-16021-0012.wav
why should one halt on the way
cpu_time_overall=2.14526 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000
> audio/2830-3980-0043.wav
experienced proof less
cpu_time_overall=1.60353 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000
> audio/8455-210777-0068.wav
your power is sufficient i said
cpu_time_overall=2.15414 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000

real    0m6.823s
user    0m5.893s
sys     0m0.931s


lissyx commented Sep 26, 2019

There is a 6-second difference just from loading the LM on cold caches:

pi@raspberrypi:~/ds $ echo "1" | sudo tee /proc/sys/vm/drop_caches
1
pi@raspberrypi:~/ds $ time ./deepspeech --model models/output_graph.tflite --alphabet models/alphabet.txt --lm models/lm.binary --trie models/trie --audio audio -t
TensorFlow: v1.14.0-16-g3b4ce374f5
DeepSpeech: v0.6.0-alpha.7-8-g513c8e9
INFO: Initialized TensorFlow Lite runtime.
setup_time=6.82576
lm_time=6.8248
trie_time=0.000511
Running on directory audio
> audio/4507-16021-0012.wav
why should one halt on the way
cpu_time_overall=2.29800 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000
> audio/2830-3980-0043.wav
experienced proof less
cpu_time_overall=1.58624 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000
> audio/8455-210777-0068.wav
your power is sufficient i said
cpu_time_overall=2.12006 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000

real    0m49.456s
user    0m5.859s
sys     0m7.298s
pi@raspberrypi:~/ds $ time ./deepspeech --model models/output_graph.tflite --alphabet models/alphabet.txt --lm models/lm.binary --trie models/trie --audio audio -t
TensorFlow: v1.14.0-16-g3b4ce374f5
DeepSpeech: v0.6.0-alpha.7-8-g513c8e9
INFO: Initialized TensorFlow Lite runtime.
setup_time=0.784094
lm_time=0.783789
trie_time=0.000177
Running on directory audio
> audio/4507-16021-0012.wav
why should one halt on the way
cpu_time_overall=2.11127 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000
> audio/2830-3980-0043.wav
experienced proof less
cpu_time_overall=1.57743 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000
> audio/8455-210777-0068.wav
your power is sufficient i said
cpu_time_overall=2.12058 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000

real    0m6.734s
user    0m5.812s
sys     0m0.922s
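
For context, the setup_time / lm_time / trie_time figures above come from instrumenting the client with wall-clock timers around decoder setup. A minimal sketch of that kind of measurement (the load_language_model / load_trie names below are placeholders, not the actual DeepSpeech client code):

// Hypothetical timing sketch: wall-clock measurement around LM and trie
// loading, printed in the same style as the setup_time/lm_time/trie_time
// lines above. load_language_model()/load_trie() are placeholders.
#include <chrono>
#include <cstdio>

static double seconds_between(std::chrono::steady_clock::time_point a,
                              std::chrono::steady_clock::time_point b) {
  return std::chrono::duration<double>(b - a).count();
}

int main() {
  const auto t0 = std::chrono::steady_clock::now();
  // load_language_model();   // placeholder: LM load step
  const auto t1 = std::chrono::steady_clock::now();
  // load_trie();             // placeholder: trie load step
  const auto t2 = std::chrono::steady_clock::now();

  printf("setup_time=%g\n", seconds_between(t0, t2));
  printf("lm_time=%g\n", seconds_between(t0, t1));
  printf("trie_time=%g\n", seconds_between(t1, t2));
  return 0;
}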


lissyx commented Sep 26, 2019

Using LoadMethod::LAZY:

pi@raspberrypi:~/ds $ echo "1" | sudo tee /proc/sys/vm/drop_caches 
1
pi@raspberrypi:~/ds $ time ./deepspeech --model models/output_graph.tflite --alphabet models/alphabet.txt --lm models/lm.binary --trie models/trie --audio audio -t
TensorFlow: v1.14.0-16-g3b4ce374f5
DeepSpeech: v0.6.0-alpha.7-8-g513c8e9
INFO: Initialized TensorFlow Lite runtime.
create_model_time=0.00818
setup_time=0.002471
lm_time=0.001803
trie_time=0.000473
Running on directory audio
> audio/4507-16021-0012.wav
why should one halt on the way
cpu_time_overall=3.43854 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000
> audio/2830-3980-0043.wav
experienced proof less
cpu_time_overall=1.63052 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000
> audio/8455-210777-0068.wav
your power is sufficient i said
cpu_time_overall=2.31658 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000

real    0m18.207s
user    0m6.506s
sys     0m0.968s
pi@raspberrypi:~/ds $ time ./deepspeech --model models/output_graph.tflite --alphabet models/alphabet.txt --lm models/lm.binary --trie models/trie --audio audio -t
TensorFlow: v1.14.0-16-g3b4ce374f5
DeepSpeech: v0.6.0-alpha.7-8-g513c8e9
INFO: Initialized TensorFlow Lite runtime.
create_model_time=0.004335
setup_time=0.000772
lm_time=0.000382
trie_time=0.000247
Running on directory audio
> audio/4507-16021-0012.wav
why should one halt on the way
cpu_time_overall=2.17790 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000
> audio/2830-3980-0043.wav
experienced proof less
cpu_time_overall=1.58182 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000
> audio/8455-210777-0068.wav
your power is sufficient i said
cpu_time_overall=2.13153 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000

real    0m5.988s
user    0m5.857s
sys     0m0.130s
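
For reference, a minimal sketch of what switching KenLM to a lazy mmap() load looks like, assuming the scorer loads the binary LM through lm::ngram::LoadVirtual (an illustration only, not the exact patch in #2385):

// Sketch: ask KenLM to mmap() the model lazily so pages are faulted in
// on demand, instead of reading/populating the whole file at load time.
#include "lm/model.hh"  // lm::ngram::Config, lm::ngram::LoadVirtual
#include <memory>

std::unique_ptr<lm::base::Model> load_lm_lazily(const char *lm_path) {
  lm::ngram::Config config;
  config.load_method = util::LoadMethod::LAZY;  // instead of the eager default
  return std::unique_ptr<lm::base::Model>(lm::ngram::LoadVirtual(lm_path, config));
}

The trade-off is visible in the timings above: the up-front LM load cost disappears (lm_time drops from ~6.8 s to ~0.002 s), but pages are pulled in during the first inferences instead, which is why cpu_time_overall on the first file rises and the cold-cache wall time is ~18 s rather than ~49 s.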

lissyx pushed a commit to lissyx/STT that referenced this issue Sep 26, 2019

lissyx commented Sep 26, 2019

@reuben Tested on a Pixel 2 after rebooting, using SpeechModule (part of androidspeech): the UI no longer blocks for several seconds when starting inference.

lissyx pushed a commit to lissyx/STT that referenced this issue Sep 26, 2019
Deep Speech v0.6.0 automation moved this from To do to Done Sep 26, 2019

lissyx commented Sep 27, 2019

FTR, also checked with htop on an RPi4 with 4 GB of RAM:

  • prior to the change, memory usage reported by htop was ~45%;
  • after the change, it reports only ~8%.


lock bot commented Oct 27, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Oct 27, 2019