
std::bad_alloc when loading a model with a sparse tensor constant node #24530

@sh1ng

Describe the issue

Hi Team!

We are developing a MOJO 2 to ONNX converter and hit this issue when creating an inference session for a converted model.
Without going into too much detail: we use sparse tensors for one of the MOJO transformations. You can find a failing ONNX model in the attachments. Uncompress

model.zip

and run:

from onnx import load
import onnxruntime as rt

onnx_model = load("model.onnx")
sess = rt.InferenceSession(
    onnx_model.SerializeToString(),
    providers=["CPUExecutionProvider"],
)
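
For context, here is a minimal sketch of the kind of sparse Constant node our converter emits (names, shapes, and values are illustrative, not taken from the attached model; a tensor this small loads fine, the failure shows up with the large constants in model.onnx):

import numpy as np
from onnx import TensorProto, helper, numpy_helper

# Two non-zero float32 values at COO coordinates (0, 1) and (2, 3) of a 4x4 tensor.
values = numpy_helper.from_array(np.array([1.0, 2.0], dtype=np.float32), "values")
indices = numpy_helper.from_array(np.array([[0, 1], [2, 3]], dtype=np.int64), "indices")
sparse = helper.make_sparse_tensor(values, indices, dims=[4, 4])

# Constant nodes may carry their tensor in a sparse_value attribute (opset >= 11).
const_node = helper.make_node("Constant", inputs=[], outputs=["C"], sparse_value=sparse)
out = helper.make_tensor_value_info("C", TensorProto.FLOAT, [4, 4])
graph = helper.make_graph([const_node], "sparse_const_demo", inputs=[], outputs=[out])
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 13)])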

Tested on onnxruntime 1.18 and 1.21.0.
After a debugging session, I found that the exception comes from:

#0  0x00007fff4d0ae4a1 in __cxa_throw () from /lib/x86_64-linux-gnu/libstdc++.so.6
#1  0x00007fff4d0a27ac in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#2  0x00007fff4d14bd0c in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct(unsigned long, char) ()
   from /lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x00007fff4baed265 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string<std::allocator<char> > (__a=..., 
    __c=0 '\000', __n=<optimized out>, this=0x7fffffffa4c0) at /usr/include/c++/11/bits/basic_string.h:555
#4  onnxruntime::utils::SparseTensorProtoToDenseTensorProto (sparse=..., model_path=filesystem::path "", dense=...)
    at /home/ubuntu/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc:1570
#5  0x00007fff4baee440 in onnxruntime::utils::ConstantNodeProtoToTensorProto (node=..., model_path=..., tensor=..., 
    tensor_name="mojo/predict_stacked_base_46b39022-12fa-481a-baac-7e25f38a90a9/1_LightGBMModel_0/pipeline.pb__MapOp_inputs_Age|#1_Amount_invested_monthly|#3_Annual_Income|#7_Changed_Credit_Limit|#4_Interest_Rate|#2_N"...) at /home/ubuntu/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc:1385
#6  0x00007fff4baee626 in onnxruntime::utils::ConstantNodeProtoToTensorProto (node=..., model_path=filesystem::path "", tensor=...)
    at /home/ubuntu/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc:1406
#7  0x00007fff4bba2da2 in onnxruntime::Graph::Graph (this=0x555572bfef20, owning_model=..., graph_proto=<optimized out>, domain_to_version=..., 
    ir_version=<optimized out>, schema_registry=..., parent_graph=0x0, parent_node=0x0, logger=..., strict_shape_type_inference=false)
    at /home/ubuntu/onnxruntime/build/Linux/RelWithDebInfo/_deps/gsl-src/include/gsl/pointers:115
#8  0x00007fff4bba3e71 in onnxruntime::Graph::Graph (this=this@entry=0x555572bfef20, owning_model=..., graph_proto=graph_proto@entry=0x555563f489d0, 
    domain_to_version=std::unordered_map with 10 elements = {...}, ir_version=ir_version@entry=10, 
    schema_registry=std::shared_ptr<onnxruntime::IOnnxRuntimeOpSchemaCollection> (use count 4, weak count 0) = {...}, logger=..., strict_shape_type_inference=false)
    at /home/ubuntu/onnxruntime/onnxruntime/core/graph/graph.cc:1207
#9  0x00007fff4bbcbdb6 in onnxruntime::Model::Model (this=0x555572bfde70, model_proto=..., model_path=..., local_registries=<optimized out>, logger=..., options=...)
    at /home/ubuntu/onnxruntime/onnxruntime/core/graph/model.cc:281
#10 0x00007fff4bbcc2f8 in std::make_unique<onnxruntime::Model, onnx::ModelProto, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::list<std::shared_ptr<onnxruntime::IOnnxRuntimeOpSchemaCollection>, std::allocator<std::shared_ptr<onnxruntime::IOnnxRuntimeOpSchemaCollection> > > const*&, onnxruntime::logging::Logger const&, onnxruntime::ModelOptions const&> () at /usr/include/c++/11/bits/unique_ptr.h:962
#11 onnxruntime::Model::Load (model_proto=..., model_path="", model=std::shared_ptr<onnxruntime::Model> (empty) = {...}, local_registries=0x0, logger=..., 
    options=...) at /home/ubuntu/onnxruntime/onnxruntime/core/graph/model.cc:474
#12 0x00007fff4b2c0d54 in operator() (__closure=0x555563efd540, model=std::shared_ptr<onnxruntime::Model> (empty) = {...})
    at /home/ubuntu/onnxruntime/onnxruntime/core/session/inference_session.cc:1102
#13 0x00007fff4b2c0ea5 in std::__invoke_impl<onnxruntime::common::Status, onnxruntime::InferenceSession::Load(void const*, int)::<lambda(std::shared_ptr<onnxruntime::Model>&)>&, std::shared_ptr<onnxruntime::Model>&> (__f=...) at /usr/include/c++/11/bits/invoke.h:60
#14 std::__invoke_r<onnxruntime::common::Status, onnxruntime::InferenceSession::Load(void const*, int)::<lambda(std::shared_ptr<onnxruntime::Model>&)>&, std::shared_ptr<onnxruntime::Model>&> (__fn=...) at /usr/include/c++/11/bits/invoke.h:116
#15 std::_Function_handler<onnxruntime::common::Status(std::shared_ptr<onnxruntime::Model>&), onnxruntime::InferenceSession::Load(void const*, int)::<lambda(std::shared_ptr<onnxruntime::Model>&)> >::_M_invoke(const std::_Any_data &, std::shared_ptr<onnxruntime::Model> &) (__functor=..., __args#0=...)
    at /usr/include/c++/11/bits/std_function.h:291
#16 0x00007fff4b2bed3c in std::function<onnxruntime::common::Status (std::shared_ptr<onnxruntime::Model>&)>::operator()(std::shared_ptr<onnxruntime::Model>&) const (
    __args#0=std::shared_ptr<onnxruntime::Model> (empty) = {...}, this=0x7fffffffb9c0) at /usr/include/c++/11/bits/std_function.h:590
#17 onnxruntime::InferenceSession::LoadWithLoader(std::function<onnxruntime::common::Status (std::shared_ptr<onnxruntime::Model>&)>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (this=0x55555eb08c20, loader=..., event_name="model_loading_array")
    at /home/ubuntu/onnxruntime/onnxruntime/core/session/inference_session.cc:964
#18 0x00007fff4b2c24e5 in onnxruntime::InferenceSession::Load (this=0x55555eb08c20, model_data=0x55557a81e8c0, model_data_len=201454176)
    at /home/ubuntu/onnxruntime/onnxruntime/core/session/inference_session.cc:1105

and is related to the inefficient SparseTensorProtoToDenseTensorProto operation (#3304). It seems that sparse tensors are not fully supported.
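
A quick way to see why that conversion can exhaust memory is to print the dense footprint of each sparse constant (an illustrative sketch; it assumes the attached model and 4-byte float32 elements):

import onnx

model = onnx.load("model.onnx")
for node in model.graph.node:
    if node.op_type != "Constant":
        continue
    for attr in node.attribute:
        if attr.name == "sparse_value":
            st = attr.sparse_tensor
            n_elems = 1
            for d in st.dims:
                n_elems *= d
            # assumes float32 (4 bytes per element); adjust for other dtypes
            print(node.output[0], list(st.dims), f"~{n_elems * 4 / 2**30:.2f} GiB dense")

A multi-GiB figure here would match a std::bad_alloc thrown while allocating the dense buffer in frame #3 of the backtrace.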

Is there a workaround?
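
The only one we could come up with is densifying the sparse constants in Python before creating the session, roughly like this (a sketch that assumes the dense form fits in memory and reuses onnx_model and rt from the snippet above; densify_sparse_constants is our own helper, not an ONNX Runtime API):

import numpy as np
import onnx
from onnx import helper, numpy_helper

def densify_sparse_constants(model: onnx.ModelProto) -> onnx.ModelProto:
    # Replace each Constant node's sparse_value attribute with a dense value
    # attribute, so ONNX Runtime never runs its sparse-to-dense conversion.
    for node in model.graph.node:
        if node.op_type != "Constant":
            continue
        for attr in list(node.attribute):
            if attr.name != "sparse_value":
                continue
            st = attr.sparse_tensor
            values = numpy_helper.to_array(st.values)
            indices = numpy_helper.to_array(st.indices)
            dense = np.zeros(list(st.dims), dtype=values.dtype)
            if indices.ndim == 2:
                # COO layout: one row of coordinates per non-zero value
                dense[tuple(indices.T)] = values
            else:
                # linearized (flat) indices into the dense buffer
                dense.reshape(-1)[indices] = values
            node.attribute.remove(attr)
            node.attribute.append(helper.make_attribute("value", numpy_helper.from_array(dense)))
    return model

sess = rt.InferenceSession(
    densify_sparse_constants(onnx_model).SerializeToString(),
    providers=["CPUExecutionProvider"],
)

This of course gives up the on-disk size advantage of sparse tensors, which is why we would prefer a fix on the runtime side.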

Thanks!

cc @yuslepukhin

To reproduce

  1. Uncompress the attached model.
  2. Create an inference session from it using the snippet above.

Urgency

It's quite urgent: some converted models are unusable as-is, and the alternative is to completely rewrite parts of our code to avoid sparse tensors.

Platform

Linux

OS Version

Ubuntu 22

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.21.0

ONNX Runtime API

Python

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response
