Missing warmup process as in k2-sherpa #134
Could you help test the online or offline websocket server in sherpa-onnx and see whether the second request is much faster than the very first one? By the way, we have methods in sherpa-onnx similar to those in k2-fsa/sherpa:
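One way to check this without instrumenting the server itself is to time successive calls to the same decode function. The sketch below is illustrative only: `decode` is a hypothetical stand-in for one websocket request, with a simulated one-time setup cost mimicking lazy model initialization.

```python
import time

def measure_latencies(fn, n=3):
    """Time n successive calls to fn and return durations in seconds."""
    durations = []
    for _ in range(n):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations

# Stand-in "request": the first call pays a one-time setup cost,
# mimicking lazy graph/state initialization inside a real server.
_initialized = False
def decode():
    global _initialized
    if not _initialized:
        time.sleep(0.05)  # simulated one-time initialization
        _initialized = True

latencies = measure_latencies(decode)
print(f"first: {latencies[0]:.3f}s, second: {latencies[1]:.3f}s")
```

If the first duration dominates, a warmup pass at startup would hide that cost from real clients.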
sherpa-onnx/csrc/online-transducer-model.h, lines 44 to 45 (commit aa71087)
sherpa-onnx/csrc/online-transducer-model.h, lines 54 to 55 (commit aa71087)
It is not difficult to add warmup to sherpa-onnx if you find it is helpful to reduce the latency of the first request.
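A warmup hook could be as simple as pushing a few chunks of silence through the model at server startup, so the one-time cost is paid before the first real connection. A minimal Python sketch follows; the names `warm_up` and `run_inference` are illustrative and not part of the sherpa-onnx API.

```python
calls = []  # records chunk sizes seen, just so the sketch is checkable

def run_inference(samples):
    """Stand-in for a real decode call; real code would run the ONNX model."""
    calls.append(len(samples))
    return [0.0] * len(samples)

def warm_up(infer_fn, num_runs=2, chunk_size=1600):
    """Feed silent audio chunks through the model before serving traffic."""
    silence = [0.0] * chunk_size  # 0.1 s of silence at 16 kHz
    for _ in range(num_runs):
        infer_fn(silence)

warm_up(run_inference)
print("warmup done after", len(calls), "dummy runs")
```

In the real server this would run once, right after model loading and before the websocket endpoint starts accepting connections.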
You don't need to save the initial states during model exporting. The initial state is constructed in the code in sherpa-onnx.
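Constructing the initial state in code, rather than baking it into the exported ONNX file, can amount to allocating zero tensors of the right shapes. A sketch under assumed shapes (the layer count and state dimension below are made up; real code would read them from the model's metadata):

```python
def get_init_states(num_layers=12, batch=1, state_dim=512):
    """Build zeroed encoder states.

    Shapes here are illustrative; a real implementation would derive
    them from the exported model's metadata at load time.
    """
    return [
        [[0.0] * state_dim for _ in range(batch)]
        for _ in range(num_layers)
    ]

states = get_init_states()
print(len(states), len(states[0]), len(states[0][0]))
```

Because the states are all zeros, nothing model-specific needs to be serialized at export time; only the shapes must be known.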
As I gained a better understanding of the code base, I realized that …
@csukuangfj @jingzhaoou Could you point me to the code where model initialization is rewritten in C++? I didn't quite understand the point from the previous comment. We see that warmup is still not part of the online websocket server.
There is no warmup in sherpa-onnx, I think.
In k2-sherpa, during warm-up, the encoder is initialized to the model's initial states. However, I don't see a similar warm-up process in sherpa-onnx. This may cause slightly worse performance at the beginning of a stream. It may be tricky to save the model's initial states during ONNX export, I guess. I'd appreciate any suggestions.