ONNX bug #1634

Closed
AlexandderGorodetski opened this issue May 21, 2024 · 9 comments

@AlexandderGorodetski

Guys,

I got the following error during ONNX decoding:

  File "/opt/conda/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "./zipformer/onnx_pretrained.py", line 393, in main
    encoder_out, encoder_out_lens = model.run_encoder(features, feature_lengths)
  File "./zipformer/onnx_pretrained.py", line 199, in run_encoder
    out = self.encoder.run(
  File "/opt/conda/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 188, in run
    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Non-zero status code returned while running If node. Name:'/encoder/0/encoder_pos/If' Status Message: Non-zero status code returned while running ScatterND node. Name:'/encoder/0/encoder_pos/ScatterND_2' Status Message: invalid indice found, indice = -1

Any help is appreciated.

Thanks,
AlexG.

@csukuangfj
Collaborator

Please tell us the exact command you are using and the duration of your test wave.

@AlexandderGorodetski
Author

The test waves are 30 sec long.
I use the following command for decoding:

export PYTHONPATH='/workspace/inputs/alexg/asr/src/models/k2_2024/icefall/egs/tedlium3/ASR/zipformer:/workspace/inputs/alexg/asr/src/models/k2_2024/icefall:$PYTHONPATH'

export CUDA_VISIBLE_DEVICES="0"

python ./zipformer/onnx_pretrained.py \
  --encoder-model-filename zipformer/exp/encoder-epoch-50-avg-1.onnx \
  --decoder-model-filename zipformer/exp/decoder-epoch-50-avg-1.onnx \
  --joiner-model-filename zipformer/exp/joiner-epoch-50-avg-1.onnx \
  --tokens data/lang_bpe_500/tokens.txt \
  /workspace/inputs/alexg/asr/src/projects/en_eval/input/test_1.wav

@csukuangfj
Collaborator

Could you try a shorter wave, e.g., one shorter than 10 or 20 seconds?

@AlexandderGorodetski
Author

Great.
Well done.

With a 10-second wave, ONNX decoding works properly. Is it possible to add support for 30 sec, or should I update my VAD so that it does not produce segments longer than 10 sec?
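
As a stopgap for the VAD option mentioned above, over-long segments could be cut into shorter pieces before decoding. The following is only a minimal sketch: `split_into_chunks` is a hypothetical helper, not part of icefall or of any VAD library, and it cuts at fixed boundaries, so it may split a word in the middle. A real VAD would cut at silence instead.

```python
from typing import List, Sequence


def split_into_chunks(
    samples: Sequence[float],
    sample_rate: int,
    max_seconds: float = 10.0,
) -> List[Sequence[float]]:
    """Split a 1-D sample sequence into consecutive chunks of at most max_seconds.

    Hypothetical helper for illustration; it does not look for silence.
    """
    if sample_rate <= 0 or max_seconds <= 0:
        raise ValueError("sample_rate and max_seconds must be positive")

    # Largest number of samples allowed in one chunk.
    max_samples = int(sample_rate * max_seconds)
    if max_samples == 0:
        raise ValueError("max_seconds is too small for this sample_rate")

    # Consecutive, non-overlapping slices; the last one may be shorter.
    return [
        samples[start:start + max_samples]
        for start in range(0, len(samples), max_samples)
    ]
```

Each returned chunk can then be decoded separately and the resulting texts concatenated.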

@csukuangfj
Collaborator

> Is it possible to add support to 30 sec

Yes, absolutely.

Please change


to a larger value, re-export your model, and re-try.

@AlexandderGorodetski
Author

I changed the value from 1,000 to 10,000, re-exported, and it did not help.

I found that the maximum duration I can decode is 20 sec. With 21 sec I already get the error.

Maybe I should change this value during training? Or is this value saved somewhere in the model?

@csukuangfj
Collaborator

There must be a constant somewhere in the code that fixes the length of a positional encoding vector. You need to find it and change it.
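
One way to hunt for such a constant is to scan the recipe's Python files for integer literals assigned to a length-like name. The sketch below assumes the constant is called `max_len` (the name reported later in this thread); `find_max_len_assignments` is a hypothetical helper, and the regular expression only catches simple forms such as `max_len = 1000` or `max_len: int = 1000`.

```python
import re
from pathlib import Path
from typing import List, Tuple

# Matches "max_len = 1000" and "max_len: int = 1000" (assumed name, simple forms only).
PATTERN = re.compile(r"\bmax_len\b\s*(?::\s*\w+\s*)?=\s*\d+")


def find_max_len_assignments(root: str) -> List[Tuple[str, int, str]]:
    """Return (file, line number, line text) for each match under root."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if PATTERN.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits


if __name__ == "__main__":
    # "zipformer" is the recipe directory used in the command earlier in this thread.
    for file, lineno, line in find_max_len_assignments("zipformer"):
        print(f"{file}:{lineno}: {line}")
```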

@AlexandderGorodetski
Author

You are right. max_len currently corresponds to a maximum duration of 20 sec. It can be increased to 2,000, which raises the maximum duration to 40 sec. It is important to make this change BEFORE exporting the ONNX model.
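
The numbers above suggest a simple proportional rule for picking `max_len`. The sketch below assumes the current `max_len` is 1,000 (implied by the doubling to 2,000 for 40 sec) and that the limit scales linearly, i.e. about 50 positions per second of audio. Both function names are hypothetical, and the exact cutoff may differ slightly in practice, so some headroom is advisable.

```python
import math

# Assumption from this thread: max_len = 1000 covers about 20 sec of audio.
POSITIONS_PER_SECOND = 1000 / 20  # 50 positions per second


def required_max_len(duration_seconds: float) -> int:
    """Smallest max_len that should cover the given duration (linear estimate)."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return math.ceil(duration_seconds * POSITIONS_PER_SECOND)


def max_duration_seconds(max_len: int) -> float:
    """Approximate longest wave that a given max_len can handle."""
    return max_len / POSITIONS_PER_SECOND
```

Under these assumptions, the 30 sec waves from this issue would need `max_len` of at least 1,500; the model must be re-exported after the change.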

Thank you so much, this issue can be closed.

@AlexandderGorodetski
Author

Completed.
