Missing warmup process as in k2-sherpa #134

Closed · jingzhaoou opened this issue Apr 28, 2023 · 4 comments

@jingzhaoou (Contributor)

In k2-sherpa, during warm-up, the encoder is initialized with the model's initial states:

```cpp
void WarmUp(torch::Tensor features, torch::Tensor features_length) {
  torch::IValue states = GetEncoderInitStates();
  states = StackStates({states});
```

However, I don't see a similar warm-up process in sherpa-onnx. This may cause slightly worse performance at the beginning of a stream. It may be tricky to save the model's initial states during ONNX export, I guess.

Appreciate any suggestions.

@csukuangfj (Collaborator)

Could you help test the online or offline websocket server in sherpa-onnx and see whether the second request is much faster than the very first request?

By the way, we have methods in sherpa-onnx similar to those in k2-fsa/sherpa:

```cpp
virtual std::vector<Ort::Value> GetEncoderInitStates() = 0;

virtual std::vector<Ort::Value> StackStates(
    const std::vector<std::vector<Ort::Value>> &states) const = 0;

virtual std::vector<std::vector<Ort::Value>> UnStackStates(
    const std::vector<Ort::Value> &states) const = 0;
```

It would not be difficult to add warmup to sherpa-onnx if you find it helps reduce the latency of the first request.
> It may be tricky to save the model's initial states during ONNX export, I guess.

You don't need to save the initial states during model export. The initial states are constructed in code in sherpa-onnx; we only need some model parameters to construct them.

@jingzhaoou (Contributor, Author)

As I gained a better understanding of the code base, I realized that sherpa-onnx rewrites the Python initialization code in C++, so the warm-up process is effectively already in place. I am closing this issue.

@uni-sagar-raikar

@csukuangfj @jingzhaoou Could you point me to the code where the model initialization is rewritten in C++? I didn't quite understand the point of the previous comment. We also see that warmup is still not part of the online websocket server.

@csukuangfj (Collaborator)

There is no warmup in sherpa-onnx, I think.
