
Model optimization fails with Protobuf serialization failed error #20371

Closed

tuhinpahari opened this issue Apr 18, 2024 · 6 comments
Labels

converter (related to ONNX converters), model:transformer (issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc.)

Comments

tuhinpahari commented Apr 18, 2024

Describe the issue

Exported a Codellama (codellama/CodeLlama-7b-hf) model to ONNX, then tried to optimize the float model with the ORT transformer optimizer (https://github.com/microsoft/onnxruntime/blob/v1.17.0/onnxruntime/python/tools/transformers/optimizer.py) using the command below:

python optimizer.py --input <input_model> --output <out_dir> --use_external_data_format

This fails with the following error:

onnxruntime.capi.onnxruntime_pybind11_state.InvalidProtobuf: [ONNXRuntimeError] : 7 : INVALID_PROTOBUF : Protobuf serialization failed.

To reproduce

1. Export the codellama/CodeLlama-7b-hf ONNX model:

optimum-cli export onnx --model codellama/CodeLlama-7b-hf codellama --no-post-process

2. Clone the onnxruntime repo (https://github.com/microsoft/onnxruntime/tree/v1.17.0) and change into the tool directory:

cd onnxruntime/python/tools/transformers

3. Run the optimizer:

python optimizer.py --input <input_model> --output <out_dir> --use_external_data_format

Urgency

Basic functionality is not working, resulting in project delay.

Platform

Linux

OS Version

CENTOS 7.4

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.17.0

ONNX Runtime API

Python

Architecture

X86

Execution Provider

Default CPU

Execution Provider Library Version

No response

carzh added the converter and model:transformer labels Apr 18, 2024
carzh (Contributor) commented Apr 18, 2024

Could you provide the stack trace for the protobuf serialization error?

tuhinpahari (Author) commented Apr 18, 2024

> Could you provide the stack trace for the protobuf serialization error?

Traceback (most recent call last):
  File "/scratch/tuhinp/onnxruntime/onnxruntime/python/tools/transformers/optimizer.py", line 610, in <module>
    main()
  File "/scratch/tuhinp/onnxruntime/onnxruntime/python/tools/transformers/optimizer.py", line 573, in main
    optimizer = optimize_model(
  File "/scratch/tuhinp/onnxruntime/onnxruntime/python/tools/transformers/optimizer.py", line 379, in optimize_model
    temp_model_path = optimize_by_onnxruntime(
  File "/scratch/tuhinp/onnxruntime/onnxruntime/python/tools/transformers/optimizer.py", line 204, in optimize_by_onnxruntime
    onnxruntime.InferenceSession(onnx_model, sess_options, providers=providers, **kwargs)
  File "/scratch/tuhinp/miniconda3/envs/x/lib/python3.9/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 419, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/scratch/tuhinp/miniconda3/envs/x/lib/python3.9/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 463, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidProtobuf: [ONNXRuntimeError] : 7 : INVALID_PROTOBUF : Protobuf serialization failed.
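
The traceback shows the failure happens while optimize_by_onnxruntime creates an InferenceSession to run ORT's basic graph optimizations, before any transformer-specific fusion. A minimal sketch that isolates just that step (paths are placeholders):

import onnxruntime

# Placeholder path to the exported model; external weight files must sit
# next to the .onnx file so ORT can resolve them.
model_path = "/path/to/model.onnx"

sess_options = onnxruntime.SessionOptions()
# optimize_by_onnxruntime writes the ORT-optimized graph to a temp file like this.
sess_options.optimized_model_filepath = "/path/to/model_ort_opt.onnx"
sess_options.graph_optimization_level = onnxruntime.GraphOptimizationLevel.ORT_ENABLE_BASIC

# If this raises INVALID_PROTOBUF, the problem is in the exported model or its
# external data, not in the transformer fusion logic.
onnxruntime.InferenceSession(model_path, sess_options, providers=["CPUExecutionProvider"])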

carzh (Contributor) commented Apr 19, 2024

Hello, give the following commands a try:

cd onnxruntime/onnxruntime/python/tools/transformers/
python3 optimizer.py --input /path/to/<filename>.onnx --output /path/to/<filename>.onnx --model_type gpt2 --num_heads <number of attention heads> --hidden_size <attention hidden size> --use_external_data_format --opt_level 0
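
For codellama/CodeLlama-7b-hf, the Hugging Face config reports 32 attention heads and a hidden size of 4096, so the filled-in command would look like this (paths are placeholders):

python3 optimizer.py --input /path/to/model.onnx --output /path/to/model_opt.onnx --model_type gpt2 --num_heads 32 --hidden_size 4096 --use_external_data_format --opt_level 0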

You can also try the convert_to_onnx tool for Llama, which converts and optimizes in the same script.
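
A sketch of that route, assuming the invocation documented in the Llama README under onnxruntime/python/tools/transformers (the flags and output directory name here are illustrative, not confirmed for this exact model):

cd onnxruntime/onnxruntime/python/tools/transformers
python3 -m models.llama.convert_to_onnx -m codellama/CodeLlama-7b-hf --output codellama-7b-fp32 --precision fp32 --execution_provider cpu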

Thanks @kunal-vaishnavi for the suggestions :)

tuhinpahari (Author) commented Apr 22, 2024

Thanks @carzh and @kunal-vaishnavi for the suggestion. This command works for codellama.

I tried to optimize the Qwen/Qwen1.5-7B-Chat ONNX model with the same optimizer.py script, but I am getting a "Segmentation fault". I used the same command as mentioned above:

python3 optimizer.py --input /path/to/<filename>.onnx --output /path/to/<filename>.onnx --model_type gpt2 --num_heads <number of attention heads> --hidden_size <attention hidden size> --use_external_data_format --opt_level 0

Can you please help me with this?

kunal-vaishnavi (Contributor) commented
Can you clone ORT from the main branch and try again? I can run the ORT transformer optimizer successfully with the following steps.

git clone https://github.com/microsoft/onnxruntime
cd onnxruntime/onnxruntime/python/tools/transformers
optimum-cli export onnx --model Qwen/Qwen1.5-7B-Chat ./qwen1.5 --no-post-process
mkdir -p ./qwen1.5-opt
python3 optimizer.py --input ./qwen1.5/model.onnx --output ./qwen1.5-opt/model_opt.onnx --model_type gpt2 --num_heads 32 --hidden_size 4096 --use_external_data_format --opt_level 0
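
For other models, the --num_heads and --hidden_size values can be read from the model's Hugging Face config; a minimal sketch:

from transformers import AutoConfig

# Qwen1.5-7B-Chat reports 32 attention heads and a hidden size of 4096,
# matching the values used in the command above.
config = AutoConfig.from_pretrained("Qwen/Qwen1.5-7B-Chat")
print(config.num_attention_heads, config.hidden_size)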

yuslepukhin (Member) commented
You can also run onnx.checker.check_model to get more information on the nature of the protobuf issue.
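
A minimal sketch of that check (the model path is a placeholder):

import onnx

# Pass the file path rather than a loaded ModelProto: for models over the
# 2 GB protobuf limit, the checker then resolves the external data files
# that sit next to the model instead of serializing everything in memory.
onnx.checker.check_model("/path/to/model.onnx")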
