added triton config for streaming zipformer #430

Merged: 5 commits into k2-fsa:master on Jul 10, 2023

Conversation

@uni-saurabh-vyas (Contributor)

Based on #412

@yuekaizhang (Collaborator)

Many thanks. Would you mind adding a script such that we could launch the streaming service with one button? e.g. build_librispeech_pruned_transducer_stateless7_streaming.sh
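
For illustration, a one-button script of this kind might look roughly like the sketch below; every concrete path, flag, and value is an assumption for the example, not the script that was eventually added in this PR.

#!/usr/bin/env bash
# Hypothetical sketch of build_librispeech_pruned_transducer_stateless7_streaming.sh.
# Paths and flags below are illustrative assumptions only.
set -e

# 1. Obtain a pretrained streaming zipformer checkpoint (download step omitted here).
model_dir=./pretrained_streaming_zipformer

# 2. Export encoder/decoder/joiner to ONNX for Triton (exact flags depend on the export script).
python3 scripts/export_onnx.py --exp-dir $model_dir/exp --output-dir ./model_repo_streaming_zipformer

# 3. Fill in the config.pbtxt templates and start the Triton server against the generated repo.
tritonserver --model-repository=./model_repo_streaming_zipformer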

Review comment (Collaborator):

Would you mind using soft links for files in this repo that are duplicated from https://github.com/k2-fsa/sherpa/tree/master/triton/model_repo_streaming ?
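
For reference, replacing a duplicated file with a relative soft link would look roughly like this; the scorer file is used as the example and the relative path depth is an assumption about the final directory layout.

# Replace a duplicated file with a relative symlink to the copy in model_repo_streaming
# (the "../../.." depth depends on where the new model repo actually lives).
cd triton/model_repo_streaming_zipformer/scorer/1
rm model.py
ln -s ../../../model_repo_streaming/scorer/1/model.py model.py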

@uni-saurabh-vyas (Contributor, Author)

Yes, give me one day; I will push everything and add a README.md as well.

@uni-saurabh-vyas (Contributor, Author)

@yuekaizhang I have made the requested changes and tested with the bash script; the Triton server is up and running with all modules without any issues. I also tested via the decode_manifest client script and I get ASR outputs.

Also, I feel a README file is not required, as the build script already lists all the steps to start the server; users should be able to follow it easily. Let me know if anything else is required.

Review comment (Collaborator):

It should be a .gitkeep file.

@yuekaizhang (Collaborator)

All config.pbtxt, config.pbtxt.template, and model.py files under model_repo_streaming_zipformer/ might be the same as the files in model_repo_streaming, except for encoder/config.pbtxt.template. Would you mind checking them? (diff file1 file2 first, then decide if we need a soft link.)

The bash scripts look nice. Thanks again.
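
For reference, the suggested check can be scripted roughly as below; the list of files is illustrative and would be adapted to whatever actually exists in both model repos.

# Compare candidate files between the two model repos; identical pairs are
# candidates for soft links (the file list here is illustrative).
cd triton
for f in decoder/config.pbtxt.template joiner/config.pbtxt.template scorer/1/model.py; do
  if diff -q model_repo_streaming/$f model_repo_streaming_zipformer/$f >/dev/null; then
    echo "$f: identical -> could be a soft link"
  else
    echo "$f: differs -> keep a separate copy"
  fi
done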

Review comment (Collaborator):

check model_repo_streaming/decoder/config.pbtxt.template

Review comment (Collaborator):

check model_repo_streaming/feature_extractor/1/model.py

@uni-saurabh-vyas (Contributor, Author)

I already checked them one by one; they are not the same. I made some changes (for example, changed the type to FP32 to make it compatible with the pretrained model), and there was a tensor name change for the encoder tensors (x and x_lens instead of speech and speech_lengths) in the Python file for the feature_extractor.

The only identical file is scorer/1/model.py.

Review comment (Collaborator):

check model_repo_streaming/feature_extractor

Review comment (Collaborator):

check model_repo_streaming/joiner

Review comment (Collaborator):

This might be a duplicate of model_repo_streaming/transducer.

Review comment (Contributor, Author):

There were some minor changes here, such as changing tensor names (to make it compatible with the input and output names of the ONNX export script, export.py).

Review comment (Collaborator):

check model_repo_streaming/scorer

@yuekaizhang (Collaborator)

> I already checked them one by one; they are not the same … the only identical file is scorer/1/model.py.

I see. https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/pruned_transducer_stateless7_streaming/export.py#L811-L820 Actually, you could set --fp16 to export fp16.onnx. Then the whole model_repo could run inference in fp16, which could increase throughput a lot. I was wondering if you would like to try fp16.
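
For reference, the fp16 export would be invoked roughly as below; only the --fp16 option comes from the comment above, while the script location and the remaining arguments are assumptions to be checked against export.py --help.

# Export the streaming zipformer to ONNX with fp16 weights so the whole
# model_repo can run fp16 inference. Only --fp16 is taken from the discussion;
# the other arguments are illustrative.
cd icefall/egs/librispeech/ASR
python3 pruned_transducer_stateless7_streaming/export.py \
  --exp-dir pruned_transducer_stateless7_streaming/exp \
  --fp16 True   # or just --fp16, depending on how the flag is declared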

@uni-saurabh-vyas (Contributor, Author)

Yes, sure, I will try that; I didn't use that option before. I will update the script(s) accordingly to make them as generic as possible.

@yuekaizhang (Collaborator)

Also, it would be better if we make a PR against https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/pruned_transducer_stateless7_streaming/export.py#L377-L394 to change the string "x" into "speech" and "x_lens" into "speech_lengths", such that we could reuse the feature_extractor code.

Or change the files here https://github.com/k2-fsa/sherpa/tree/master/triton/model_repo_streaming/feature_extractor and scripts/export_onnx.py from "speech" to "x", if that way is more convenient for you.
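
A mechanical way to apply the first option would look roughly like the sketch below; editing the input_names/output_names lists in export.py by hand is just as easy, and note that the PR ultimately went with the second option instead (renaming speech/speech_lengths to x/x_lens on the sherpa side).

# Sketch of the first option: rename the ONNX tensor names in icefall's export.py
# so sherpa's existing feature_extractor code could be reused unchanged.
# Review the resulting diff carefully; a blanket sed may touch unintended lines.
sed -i 's/"x_lens"/"speech_lengths"/g; s/"x"/"speech"/g' \
  egs/librispeech/ASR/pruned_transducer_stateless7_streaming/export.py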

@uni-saurabh-vyas (Contributor, Author)

I am still a little bit confused as to why we have so many different ONNX export scripts?

@yuekaizhang (Collaborator) commented Jul 7, 2023

> I am still a little bit confused as to why we have so many different ONNX export scripts?

Ideally, we would like to reuse the scripts from the sherpa-onnx project. However, several minor differences block that, e.g. whether processed_len is regarded as a state tensor for streaming ASR.

The workaround for now is to use the scripts/build_X bash scripts to avoid confusion. Also, the differences between conformer and zipformer mean we have to keep separate model_repo files and export scripts.

… changed tensor names (speech and speech_lengths) in export_onnx.py to x and x_lens, and did the same in model_repo_streaming/feature_extractor and model_repo_streaming/transducer
@uni-saurabh-vyas (Contributor, Author)

Added the required changes.

I also tested with fp16; the script works, however Triton gives a warning when loading:

I0708 04:34:38.286388 11446 model_lifecycle.cc:694] successfully loaded 'feature_extractor' version 1
2023-07-08 04:34:38.472352679 [W:onnxruntime:, constant_folding.cc:179 ApplyImpl] Could not find a CPU kernel and hence can't constant fold Pow node '/Pow_1'
2023-07-08 04:34:38.472412110 [W:onnxruntime:, constant_folding.cc:179 ApplyImpl] Could not find a CPU kernel and hence can't constant fold Pow node '/Pow'
2023-07-08 04:34:38.598780814 [W:onnxruntime:, constant_folding.cc:179 ApplyImpl] Could not find a CPU kernel and hence can't constant fold Pow node '/Pow_1'
2023-07-08 04:34:38.598831005 [W:onnxruntime:, constant_folding.cc:179 ApplyImpl] Could not find a CPU kernel and hence can't constant fold Pow node '/Pow'
2023-07-08 04:34:38.612756773 [W:onnxruntime:, constant_folding.cc:179 ApplyImpl] Could not find a CPU kernel and hence can't constant fold Pow node '/Pow_1'
2023-07-08 04:34:38.612800814 [W:onnxruntime:, constant_folding.cc:179 ApplyImpl] Could not find a CPU kernel and hence can't constant fold Pow node '/Pow'
2023-07-08 04:34:39.567382515 [W:onnxruntime:, session_state.cc:1030 VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.

@yuekaizhang (Collaborator)

> Also tested with fp16; the script works, however Triton gives a warning when loading. [onnxruntime log quoted above]

This warning is fine.

@@ -390,16 +390,16 @@ def export_encoder_model_onnx_triton(
         verbose=False,
         opset_version=opset_version,
         input_names=[
-            "speech",
-            "speech_lengths",
+            "x",
Review comment (Collaborator):

This function's input names are for offline ASR and should be kept unchanged.

@yuekaizhang (Collaborator)

Many thanks @uni-saurabh-vyas

@yuekaizhang merged commit 478b4bc into k2-fsa:master on Jul 10, 2023