Investigate long loading time of LM #2384

Closed
lissyx opened this issue Sep 26, 2019 · 6 comments · Fixed by #2385


lissyx commented Sep 26, 2019

On some architectures (RPi, Android) there is a several-second delay when enabling the language model, but only on the first run. The LM should already be mmap()'d, so we should investigate what is happening; this cripples the user experience.

@lissyx lissyx added the bug label Sep 26, 2019
@lissyx lissyx added this to To do in Deep Speech v0.6.0 via automation Sep 26, 2019
@lissyx lissyx self-assigned this Sep 26, 2019

lissyx commented Sep 26, 2019

pi@raspberrypi:~/ds $ echo "1" | sudo tee /proc/sys/vm/drop_caches 
1
pi@raspberrypi:~/ds $ time ./deepspeech --model models/output_graph.tflite --alphabet models/alphabet.txt --lm models/lm.binary --trie models/trie --audio audio -t
TensorFlow: v1.14.0-16-g3b4ce37
DeepSpeech: v0.6.0-alpha.7-0-gf67818e
INFO: Initialized TensorFlow Lite runtime.
Running on directory audio
> audio/4507-16021-0012.wav
why should one halt on the way
cpu_time_overall=2.32796 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000
> audio/2830-3980-0043.wav
experienced proof less
cpu_time_overall=1.60453 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000
> audio/8455-210777-0068.wav
your power is sufficient i said
cpu_time_overall=2.15120 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000

real    0m49.527s
user    0m5.887s
sys     0m7.327s
pi@raspberrypi:~/ds $ time ./deepspeech --model models/output_graph.tflite --alphabet models/alphabet.txt --lm models/lm.binary --trie models/trie --audio audio -t
TensorFlow: v1.14.0-16-g3b4ce37
DeepSpeech: v0.6.0-alpha.7-0-gf67818e
INFO: Initialized TensorFlow Lite runtime.
Running on directory audio
> audio/4507-16021-0012.wav
why should one halt on the way
cpu_time_overall=2.14526 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000
> audio/2830-3980-0043.wav
experienced proof less
cpu_time_overall=1.60353 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000
> audio/8455-210777-0068.wav
your power is sufficient i said
cpu_time_overall=2.15414 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000

real    0m6.823s
user    0m5.893s
sys     0m0.931s


lissyx commented Sep 26, 2019

There is a 6-second difference just from loading the LM on cold caches:

pi@raspberrypi:~/ds $ echo "1" | sudo tee /proc/sys/vm/drop_caches
1
pi@raspberrypi:~/ds $ time ./deepspeech --model models/output_graph.tflite --alphabet models/alphabet.txt --lm models/lm.binary --trie models/trie --audio audio -t
TensorFlow: v1.14.0-16-g3b4ce374f5
DeepSpeech: v0.6.0-alpha.7-8-g513c8e9
INFO: Initialized TensorFlow Lite runtime.
setup_time=6.82576
lm_time=6.8248
trie_time=0.000511
Running on directory audio
> audio/4507-16021-0012.wav
why should one halt on the way
cpu_time_overall=2.29800 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000
> audio/2830-3980-0043.wav
experienced proof less
cpu_time_overall=1.58624 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000
> audio/8455-210777-0068.wav
your power is sufficient i said
cpu_time_overall=2.12006 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000

real    0m49.456s
user    0m5.859s
sys     0m7.298s
pi@raspberrypi:~/ds $ time ./deepspeech --model models/output_graph.tflite --alphabet models/alphabet.txt --lm models/lm.binary --trie models/trie --audio audio -t
TensorFlow: v1.14.0-16-g3b4ce374f5
DeepSpeech: v0.6.0-alpha.7-8-g513c8e9
INFO: Initialized TensorFlow Lite runtime.
setup_time=0.784094
lm_time=0.783789
trie_time=0.000177
Running on directory audio
> audio/4507-16021-0012.wav
why should one halt on the way
cpu_time_overall=2.11127 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000
> audio/2830-3980-0043.wav
experienced proof less
cpu_time_overall=1.57743 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000
> audio/8455-210777-0068.wav
your power is sufficient i said
cpu_time_overall=2.12058 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000

real    0m6.734s
user    0m5.812s
sys     0m0.922s
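
For context, the setup_time / lm_time / trie_time figures above come from instrumenting the client with wall-clock timers around decoder setup. A minimal sketch of that kind of measurement (the load_language_model / load_trie names below are placeholders, not the actual DeepSpeech client code):

// Hypothetical timing sketch: wall-clock measurement around LM and trie
// loading, printed in the same style as the setup_time/lm_time/trie_time
// lines above. load_language_model()/load_trie() are placeholders.
#include <chrono>
#include <cstdio>

static double seconds_between(std::chrono::steady_clock::time_point a,
                              std::chrono::steady_clock::time_point b) {
  return std::chrono::duration<double>(b - a).count();
}

int main() {
  const auto t0 = std::chrono::steady_clock::now();
  // load_language_model();   // placeholder: LM load step
  const auto t1 = std::chrono::steady_clock::now();
  // load_trie();             // placeholder: trie load step
  const auto t2 = std::chrono::steady_clock::now();

  printf("setup_time=%g\n", seconds_between(t0, t2));
  printf("lm_time=%g\n", seconds_between(t0, t1));
  printf("trie_time=%g\n", seconds_between(t1, t2));
  return 0;
}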


lissyx commented Sep 26, 2019

Using LoadMethod::LAZY:

pi@raspberrypi:~/ds $ echo "1" | sudo tee /proc/sys/vm/drop_caches 
1
pi@raspberrypi:~/ds $ time ./deepspeech --model models/output_graph.tflite --alphabet models/alphabet.txt --lm models/lm.binary --trie models/trie --audio audio -t
TensorFlow: v1.14.0-16-g3b4ce374f5
DeepSpeech: v0.6.0-alpha.7-8-g513c8e9
INFO: Initialized TensorFlow Lite runtime.
create_model_time=0.00818
setup_time=0.002471
lm_time=0.001803
trie_time=0.000473
Running on directory audio
> audio/4507-16021-0012.wav
why should one halt on the way
cpu_time_overall=3.43854 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000
> audio/2830-3980-0043.wav
experienced proof less
cpu_time_overall=1.63052 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000
> audio/8455-210777-0068.wav
your power is sufficient i said
cpu_time_overall=2.31658 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000

real    0m18.207s
user    0m6.506s
sys     0m0.968s
pi@raspberrypi:~/ds $ time ./deepspeech --model models/output_graph.tflite --alphabet models/alphabet.txt --lm models/lm.binary --trie models/trie --audio audio -t
TensorFlow: v1.14.0-16-g3b4ce374f5
DeepSpeech: v0.6.0-alpha.7-8-g513c8e9
INFO: Initialized TensorFlow Lite runtime.
create_model_time=0.004335
setup_time=0.000772
lm_time=0.000382
trie_time=0.000247
Running on directory audio
> audio/4507-16021-0012.wav
why should one halt on the way
cpu_time_overall=2.17790 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000
> audio/2830-3980-0043.wav
experienced proof less
cpu_time_overall=1.58182 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000
> audio/8455-210777-0068.wav
your power is sufficient i said
cpu_time_overall=2.13153 cpu_time_decoding=0.00000 cpu_time_decodeall=0.00000

real    0m5.988s
user    0m5.857s
sys     0m0.130s
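
For reference, a minimal sketch of what switching KenLM to a lazy mmap() load looks like, assuming the scorer loads the binary LM through lm::ngram::LoadVirtual (an illustration only, not the exact patch in #2385):

// Sketch: ask KenLM to mmap() the model lazily so pages are faulted in
// on demand, instead of reading/populating the whole file at load time.
#include "lm/model.hh"  // lm::ngram::Config, lm::ngram::LoadVirtual
#include <memory>

std::unique_ptr<lm::base::Model> load_lm_lazily(const char *lm_path) {
  lm::ngram::Config config;
  config.load_method = util::LoadMethod::LAZY;  // instead of the eager default
  return std::unique_ptr<lm::base::Model>(lm::ngram::LoadVirtual(lm_path, config));
}

The trade-off is visible in the timings above: the up-front LM load cost disappears (lm_time drops from ~6.8 s to ~0.002 s), but pages are pulled in during the first inferences instead, which is why cpu_time_overall on the first file rises and the cold-cache wall time is ~18 s rather than ~49 s.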

lissyx pushed a commit to lissyx/STT that referenced this issue Sep 26, 2019

lissyx commented Sep 26, 2019

@reuben Tested on a Pixel 2 after rebooting, using SpeechModule (part of androidspeech): the UI no longer blocks for several seconds when starting inference.

lissyx pushed a commit to lissyx/STT that referenced this issue Sep 26, 2019
Deep Speech v0.6.0 automation moved this from To do to Done Sep 26, 2019

lissyx commented Sep 27, 2019

FTR, also checked with htop on an RPi4 with 4 GB of RAM:

  • prior to the change, memory usage reported by htop was ~45%;
  • after the change, it reports only ~8%.


lock bot commented Oct 27, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Oct 27, 2019