
Model optimization fails with Protobuf serialization failed error #20371

Closed

tuhinpahari opened this issue Apr 18, 2024 · 6 comments
Labels

converter (related to ONNX converters), model:transformer (issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc.)

Comments

tuhinpahari commented Apr 18, 2024

Describe the issue

Exported a Codellama (codellama/CodeLlama-7b-hf) model to ONNX, then tried to optimize the float model with the ORT transformer optimizer (https://github.com/microsoft/onnxruntime/blob/v1.17.0/onnxruntime/python/tools/transformers/optimizer.py) using the command below:

python optimizer.py --input <input_model> --output <out_dir> --use_external_data_format

This fails with the following error:

onnxruntime.capi.onnxruntime_pybind11_state.InvalidProtobuf: [ONNXRuntimeError] : 7 : INVALID_PROTOBUF : Protobuf serialization failed.

To reproduce

1. Export the codellama/CodeLlama-7b-hf ONNX model:

optimum-cli export onnx --model codellama/CodeLlama-7b-hf codellama --no-post-process

2. Clone the onnxruntime repo (https://github.com/microsoft/onnxruntime/tree/v1.17.0) and change into the tool directory:

cd onnxruntime/python/tools/transformers

3. Run the optimizer:

python optimizer.py --input <input_model> --output <out_dir> --use_external_data_format

Urgency

Basic functionality is not working, resulting in project delay.

Platform

Linux

OS Version

CENTOS 7.4

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.17.0

ONNX Runtime API

Python

Architecture

X86

Execution Provider

Default CPU

Execution Provider Library Version

No response

carzh added the converter and model:transformer labels Apr 18, 2024
carzh (Contributor) commented Apr 18, 2024

Could you provide the stack trace for the protobuf serialization error?

tuhinpahari (Author) commented Apr 18, 2024

> Could you provide the stack trace for the protobuf serialization error?

Traceback (most recent call last):
  File "/scratch/tuhinp/onnxruntime/onnxruntime/python/tools/transformers/optimizer.py", line 610, in <module>
    main()
  File "/scratch/tuhinp/onnxruntime/onnxruntime/python/tools/transformers/optimizer.py", line 573, in main
    optimizer = optimize_model(
  File "/scratch/tuhinp/onnxruntime/onnxruntime/python/tools/transformers/optimizer.py", line 379, in optimize_model
    temp_model_path = optimize_by_onnxruntime(
  File "/scratch/tuhinp/onnxruntime/onnxruntime/python/tools/transformers/optimizer.py", line 204, in optimize_by_onnxruntime
    onnxruntime.InferenceSession(onnx_model, sess_options, providers=providers, **kwargs)
  File "/scratch/tuhinp/miniconda3/envs/x/lib/python3.9/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 419, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/scratch/tuhinp/miniconda3/envs/x/lib/python3.9/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 463, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidProtobuf: [ONNXRuntimeError] : 7 : INVALID_PROTOBUF : Protobuf serialization failed.
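
The traceback shows the failure happens while optimize_by_onnxruntime creates an InferenceSession to run ORT's basic graph optimizations, before any transformer-specific fusion. A minimal sketch that isolates just that step (paths are placeholders):

import onnxruntime

# Placeholder path to the exported model; external weight files must sit
# next to the .onnx file so ORT can resolve them.
model_path = "/path/to/model.onnx"

sess_options = onnxruntime.SessionOptions()
# optimize_by_onnxruntime writes the ORT-optimized graph to a temp file like this.
sess_options.optimized_model_filepath = "/path/to/model_ort_opt.onnx"
sess_options.graph_optimization_level = onnxruntime.GraphOptimizationLevel.ORT_ENABLE_BASIC

# If this raises INVALID_PROTOBUF, the problem is in the exported model or its
# external data, not in the transformer fusion logic.
onnxruntime.InferenceSession(model_path, sess_options, providers=["CPUExecutionProvider"])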

carzh (Contributor) commented Apr 19, 2024

Hello, give the following commands a try:

cd onnxruntime/onnxruntime/python/tools/transformers/
python3 optimizer.py --input /path/to/<filename>.onnx --output /path/to/<filename>.onnx --model_type gpt2 --num_heads <number of attention heads> --hidden_size <attention hidden size> --use_external_data_format --opt_level 0
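
For codellama/CodeLlama-7b-hf, the Hugging Face config reports 32 attention heads and a hidden size of 4096, so the filled-in command would look like this (paths are placeholders):

python3 optimizer.py --input /path/to/model.onnx --output /path/to/model_opt.onnx --model_type gpt2 --num_heads 32 --hidden_size 4096 --use_external_data_format --opt_level 0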

You can also try the convert_to_onnx tool for Llama, which converts and optimizes in the same script.
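
A sketch of that route, assuming the invocation documented in the Llama README under onnxruntime/python/tools/transformers (the flags and output directory name here are illustrative, not confirmed for this exact model):

cd onnxruntime/onnxruntime/python/tools/transformers
python3 -m models.llama.convert_to_onnx -m codellama/CodeLlama-7b-hf --output codellama-7b-fp32 --precision fp32 --execution_provider cpu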

Thanks @kunal-vaishnavi for the suggestions :)

tuhinpahari (Author) commented Apr 22, 2024

Thanks @carzh and @kunal-vaishnavi for the suggestion. This command works for codellama.

I tried to optimize the Qwen/Qwen1.5-7B-Chat ONNX model with the same optimizer.py script, but I am getting a "Segmentation fault". I used the same command as mentioned above:

python3 optimizer.py --input /path/to/<filename>.onnx --output /path/to/<filename>.onnx --model_type gpt2 --num_heads <number of attention heads> --hidden_size <attention hidden size> --use_external_data_format --opt_level 0

Can you please help me with this?

kunal-vaishnavi (Contributor) commented
Can you clone ORT from the main branch and try again? I can run the ORT transformer optimizer successfully with the following steps.

git clone https://github.com/microsoft/onnxruntime
cd onnxruntime/onnxruntime/python/tools/transformers
optimum-cli export onnx --model Qwen/Qwen1.5-7B-Chat ./qwen1.5 --no-post-process
mkdir -p ./qwen1.5-opt
python3 optimizer.py --input ./qwen1.5/model.onnx --output ./qwen1.5-opt/model_opt.onnx --model_type gpt2 --num_heads 32 --hidden_size 4096 --use_external_data_format --opt_level 0
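
For other models, the --num_heads and --hidden_size values can be read from the model's Hugging Face config; a minimal sketch:

from transformers import AutoConfig

# Qwen1.5-7B-Chat reports 32 attention heads and a hidden size of 4096,
# matching the values used in the command above.
config = AutoConfig.from_pretrained("Qwen/Qwen1.5-7B-Chat")
print(config.num_attention_heads, config.hidden_size)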

yuslepukhin (Member) commented
You can also run onnx.checker.check_model to get more information on the nature of the protobuf issue.
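
A minimal sketch of that check (the model path is a placeholder):

import onnx

# Pass the file path rather than a loaded ModelProto: for models over the
# 2 GB protobuf limit, the checker then resolves the external data files
# that sit next to the model instead of serializing everything in memory.
onnx.checker.check_model("/path/to/model.onnx")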
