Running 50 seconds of audio on a GPU certainly has its benefits over the CPU, but running either from the command line seems to take about 5 seconds to start. Is there a way to start up Whisper, load the model, and have it ready to read the audio file?
Replies: 1 comment 1 reply
That's because the command line has to launch a fresh process each time, which then reads the model from disk (it also runs a SHA256 hash check, which may take a few seconds). Loading the model onto the GPU should be faster, but the CUDA runtime also takes some time to initialize. If you have multiple files, running `whisper *.mp3` will load the model only once and run faster than invoking the command separately for each file. (See #153)

In Python you can reuse the model returned by `load_model()` to avoid the startup delay. If you want to go further, keeping an API server running like in #132 lets you send requests from either the command line or Python without the loading delay.
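To illustrate the Python route, here is a minimal sketch that loads the model once and reuses it for several files. It assumes the `openai-whisper` package is installed; `transcribe_all` and its `loader` parameter are hypothetical names introduced for this example (the `loader` hook is only there so the function can be exercised without downloading a model), and `"base"` is just an example model size.

```python
from typing import Callable, Dict, Iterable, Optional


def transcribe_all(
    paths: Iterable[str],
    model_name: str = "base",
    loader: Optional[Callable] = None,
) -> Dict[str, str]:
    """Load the model a single time, then transcribe every file with it.

    `loader` is a hypothetical hook for this sketch; by default it is
    whisper.load_model from the openai-whisper package.
    """
    if loader is None:
        import whisper  # imported lazily so the sketch can be tested without it

        loader = whisper.load_model

    # The slow part (disk read, checksum, CUDA initialization) happens once here.
    model = loader(model_name)

    results = {}
    for path in paths:
        # transcribe() returns a dict; the full transcript is under "text".
        results[path] = model.transcribe(path)["text"]
    return results
```

Calling `transcribe_all(["a.mp3", "b.mp3"])` pays the startup cost once rather than once per file. In a long-running process you could keep the loaded model object around and call `transcribe()` on it whenever a new file arrives.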