
[Build] 0.18.0 release breaks Hummingbird build pipeline #20715

Open
ksaur opened this issue May 17, 2024 · 4 comments
Labels
- build: build issues; typically submitted using template
- ep:CUDA: issues related to the CUDA execution provider
- ep:MIGraphX: issues related to AMD MI GraphX execution provider
- ep:ROCm: questions/issues related to ROCm execution provider
- ep:TensorRT: issues related to TensorRT execution provider

Comments

ksaur commented May 17, 2024

Describe the issue

With the release of 0.18.0, we are having issues with the Transpose op:

>           sess = C.InferenceSession(session_options, self._model_bytes, False, self._read_config_from_model)
E           onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Node (/_operators.0/Transpose) Op (Transpose) [TypeInferenceError] Invalid attribute perm {1, 0}, input shape = {

Can you please point us to the changes that might have broken us? Thank you!

Please see microsoft/hummingbird#770
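
(For anyone triaging a similar failure, a quick diagnostic sketch; this is hypothetical and not part of the original report, with "model.onnx" as a placeholder path. It prints the opsets the model imports and the perm attribute of each Transpose node.)

    # Diagnostic sketch: list opset imports and each Transpose node's perm.
    import onnx

    model = onnx.load("model.onnx")  # placeholder path
    print("opset imports:", [(imp.domain or "ai.onnx", imp.version) for imp in model.opset_import])

    for node in model.graph.node:
        if node.op_type == "Transpose":
            perm = [list(attr.ints) for attr in node.attribute if attr.name == "perm"]
            print(node.name or "<unnamed>", "perm =", perm)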

Urgency

This is blocking the Microsoft Hummingbird runners.

Target platform

all

Build script

This is part of the Hummingbird build, which depends on onnxruntime. Can you please point us to the relevant changes in your 0.18.0 release?

Error / output

self = <onnxruntime.capi.onnxruntime_inference_collection.InferenceSession object at 0x7fb91dde3e90>
providers = [], provider_options = [], disabled_optimizers = None

    def _create_inference_session(self, providers, provider_options, disabled_optimizers=None):
        available_providers = C.get_available_providers()
    
        # Tensorrt can fall back to CUDA if it's explicitly assigned. All others fall back to CPU.
        if "TensorrtExecutionProvider" in available_providers:
            if providers and any(
                provider == "CUDAExecutionProvider"
                or (isinstance(provider, tuple) and provider[0] == "CUDAExecutionProvider")
                for provider in providers
            ):
                self._fallback_providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
            else:
                self._fallback_providers = ["CPUExecutionProvider"]
        # MIGraphX can fall back to ROCM if it's explicitly assigned. All others fall back to CPU.
        elif "MIGraphXExecutionProvider" in available_providers:
            if providers and any(
                provider == "ROCMExecutionProvider"
                or (isinstance(provider, tuple) and provider[0] == "ROCMExecutionProvider")
                for provider in providers
            ):
                self._fallback_providers = ["ROCMExecutionProvider", "CPUExecutionProvider"]
            else:
                self._fallback_providers = ["CPUExecutionProvider"]
        else:
            self._fallback_providers = ["CPUExecutionProvider"]
    
        # validate providers and provider_options before other initialization
        providers, provider_options = check_and_normalize_provider_args(
            providers, provider_options, available_providers
        )
    
        session_options = self._sess_options if self._sess_options else C.get_default_session_options()
    
        self._register_ep_custom_ops(session_options, providers, provider_options, available_providers)
    
        if self._model_path:
            sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
        else:
>           sess = C.InferenceSession(session_options, self._model_bytes, False, self._read_config_from_model)
E           onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Node (/_operators.0/Transpose) Op (Transpose) [TypeInferenceError] Invalid attribute perm {1, 0}, input shape = {
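
For readers unfamiliar with the fallback logic in the code above: the TensorRT fallback list only includes CUDA when CUDAExecutionProvider is explicitly requested; otherwise everything falls back to CPU. A minimal usage sketch (not from the report; assumes onnxruntime is installed and "model.onnx" is a placeholder path):

    # Usage sketch: explicitly requesting CUDA is what makes the fallback
    # list above become [CUDA, CPU] instead of just [CPU].
    import onnxruntime as ort

    sess = ort.InferenceSession(
        "model.onnx",  # placeholder path
        providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
    )
    print(sess.get_providers())  # providers actually assigned to the session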

Visual Studio Version

No response

GCC / Compiler Version

No response

ksaur added the "build" label May 17, 2024
github-actions bot added the "ep:CUDA", "ep:MIGraphX", "ep:ROCm", and "ep:TensorRT" labels May 17, 2024
edgchen1 (Contributor) commented

sophies927 (Contributor) commented

@snnn @yufenglee @jywu-msft @pranavsharma for visibility

jywu-msft (Member) commented May 18, 2024

This looks like it's due to an update to the Transpose spec in opset 21.
See https://onnx.ai/onnx/operators/text_diff_Transpose_13_21.html for the difference between Transpose in opset 13 vs. 21.
This was added to the description of the perm attribute:
"Its length must be equal to the rank of the input."
It looks like that is now being enforced (see @edgchen1's link above).
From the main error message,
"[TypeInferenceError] Invalid attribute perm {1, 0}, input shape = {"
the input shape seems to be missing? I guess the Transpose nodes in the model don't conform to the new spec.
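
To make the failure mode concrete, here is a minimal sketch of my own construction (not the Hummingbird model; assumes an onnx build new enough to know opset 21): a Transpose whose perm length doesn't match the input rank. Strict shape inference should reject it with an error in the same family as the one above.

    # Repro sketch: perm has 2 entries but the input is rank 3, which
    # violates "Its length must be equal to the rank of the input."
    import onnx
    from onnx import TensorProto, helper

    inp = helper.make_tensor_value_info("X", TensorProto.FLOAT, [2, 3, 4])
    out = helper.make_tensor_value_info("Y", TensorProto.FLOAT, None)
    node = helper.make_node("Transpose", ["X"], ["Y"], perm=[1, 0])

    graph = helper.make_graph([node], "transpose_repro", [inp], [out])
    model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 21)])

    try:
        # strict_mode surfaces inference failures instead of swallowing them
        onnx.shape_inference.infer_shapes(model, strict_mode=True)
    except Exception as e:  # expect something like "Invalid attribute perm ..."
        print(type(e).__name__, e)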

ksaur (Author) commented May 20, 2024

Thanks so much for the response and for looking into it! :)

Digging in a bit more, I see some warnings about [ShapeInferenceError] Inference error(s). Were there any changes to the way dynamic axes work? (I put some debug notes here.) Thanks!!
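
For context on the dynamic-axes question: when a model is exported with dynamic axes, the affected dimension is stored as a symbolic dim_param rather than a concrete dim_value, which is what shape inference then has to reason about. A toy sketch (assumes PyTorch; a stand-in model, not Hummingbird's actual export path):

    # Sketch: a dynamic batch axis becomes a named dim_param in the ONNX graph.
    import torch
    import onnx

    model = torch.nn.Linear(4, 2)
    dummy = torch.randn(1, 4)
    torch.onnx.export(
        model, dummy, "toy.onnx",
        input_names=["input"], output_names=["output"],
        dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
    )

    m = onnx.load("toy.onnx")
    dim0 = m.graph.input[0].type.tensor_type.shape.dim[0]
    print("dim_param:", dim0.dim_param)  # "batch", not a fixed dim_value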

ksaur added a commit to microsoft/hummingbird that referenced this issue May 20, 2024
See #770 and microsoft/onnxruntime#20715

We need to investigate what's going on with the dynamic args