
ONNX shape inference does not infer shapes #2903

Closed
dtch1997 opened this issue Jul 15, 2020 · 26 comments
Labels
bug, shape inference

Comments

@dtch1997

Bug Report

Describe the bug

onnx.shape_inference.infer_shapes does not correctly infer the shape of each layer.

System information

  • OS Platform and Distribution: Windows 10
  • ONNX version: 1.7.0
  • Python version: 3.7.4

Reproduction instructions

  • Code to reproduce the behavior:
import onnx

model = onnx.load("models/conv_dummy.onnx")
onnx.checker.check_model(model)
inferred_model = onnx.shape_inference.infer_shapes(model)
print(inferred_model.graph.value_info)

output:

[name: "9"
type {
  tensor_type {
    elem_type: 1
  }
}
, name: "10"
type {
  tensor_type {
    elem_type: 1
  }
}
, name: "11"
type {
  tensor_type {
    elem_type: 1
  }
}
, name: "12"
type {
  tensor_type {
    elem_type: 1
  }
}
, name: "13"
type {
  tensor_type {
    elem_type: 1
  }
}
, name: "14"
type {
  tensor_type {
    elem_type: 1
  }
}
]

Model file: models.zip

Expected behavior

Expected each entry in model.graph.value_info to have a tensor shape field that tells me the shape of that layer.
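
For illustration, a minimal sketch (assuming the shape field is populated) of how the inferred shapes could then be read from value_info; dim_param handling is included for symbolic dimensions:

for vi in inferred_model.graph.value_info:
    dims = [d.dim_value if d.HasField("dim_value") else d.dim_param
            for d in vi.type.tensor_type.shape.dim]
    print(vi.name, dims)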

Notes

Model was exported from PyTorch using torch.onnx.export
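
For context, a minimal sketch of the kind of export call that produces such a model; the module and input shape below are hypothetical and are not the actual conv_dummy network:

import torch

# Hypothetical module and input shape, only to illustrate the export call.
net = torch.nn.Sequential(torch.nn.Conv2d(3, 32, 3, padding=1), torch.nn.ReLU())
dummy_input = torch.randn(10, 3, 128, 128)
torch.onnx.export(net, dummy_input, "models/conv_dummy.onnx", opset_version=11)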

@dtch1997 dtch1997 added the bug label Jul 15, 2020
@jcwchen
Member

jcwchen commented Jul 15, 2020

Hi @dtch1997,
This looks like an IR gap issue. It should be fixed by #2901.
With that PR I get the following result:

[name: "9"
type {
  tensor_type {
    elem_type: 1
    shape {
      dim {
        dim_value: 10
      }
      dim {
        dim_value: 32
      }
      dim {
        dim_value: 128
      }
      dim {
        dim_value: 128
      }
    }
  }
}
, name: "10"
type {
  tensor_type {
    elem_type: 1
    shape {
      dim {
        dim_value: 10
      }
      dim {
        dim_value: 32
      }
      dim {
        dim_value: 128
      }
      dim {
        dim_value: 128
      }
    }
  }
}
, name: "11"
type {
  tensor_type {
    elem_type: 1
    shape {
      dim {
        dim_value: 10
      }
      dim {
        dim_value: 32
      }
      dim {
        dim_value: 128
      }
      dim {
        dim_value: 128
      }
    }
  }
}
, name: "12"
type {
  tensor_type {
    elem_type: 1
    shape {
      dim {
        dim_value: 10
      }
      dim {
        dim_value: 32
      }
      dim {
        dim_value: 128
      }
      dim {
        dim_value: 128
      }
    }
  }
}
, name: "13"
type {
  tensor_type {
    elem_type: 1
    shape {
      dim {
        dim_value: 10
      }
      dim {
        dim_value: 32
      }
      dim {
        dim_value: 1
      }
      dim {
        dim_value: 1
      }
    }
  }
}
, name: "14"
type {
  tensor_type {
    elem_type: 1
    shape {
      dim {
        dim_value: 10
      }
      dim {
        dim_value: 32
      }
    }
  }
}
]

@dtch1997
Author

@jcwchen I'm having trouble building from source (as seen in #2916). Is there any workaround I can use while I wait for this to be fixed? E.g. use an older ONNX version or older opset, or manually add all initializers to the model inputs?

@jcwchen
Member

jcwchen commented Jul 22, 2020

As a temporary solution, you could manually add all initializers to the model inputs, as @TMVector mentioned here. Thanks.

@dtch1997
Author

Good workaround. After adding a call to add_value_info_for_constants(model) before shape inference, it runs correctly.
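
For reference, a minimal sketch of what such a helper might look like, assuming it simply registers each initializer's element type and shape as value_info so shape inference can see it (the actual add_value_info_for_constants from the linked comment may differ, e.g. it may also handle subgraphs):

import onnx
from onnx import helper

def add_value_info_for_constants(model):
    # Assumed behavior: expose each initializer's type and shape as value_info
    # so that pre-#2901 shape inference does not lose track of constants.
    graph = model.graph
    known = {vi.name for vi in list(graph.input) + list(graph.value_info)}
    for init in graph.initializer:
        if init.name in known:
            continue
        vi = helper.make_tensor_value_info(init.name, init.data_type, list(init.dims))
        graph.value_info.append(vi)

model = onnx.load("models/conv_dummy.onnx")
add_value_info_for_constants(model)
inferred_model = onnx.shape_inference.infer_shapes(model)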

@jcwchen
Member

jcwchen commented Aug 11, 2020

Hi @dtch1997,
The PR #2901 has been merged. Now you can solve this issue by using the master branch of onnx. Thanks.

@deepak0896

Hi @jcwchen ,
I'm on the latest branch of onnx but I'm still facing this issue. I'm using the shufflenet ONNX model, which you can get from here.

@jcwchen
Member

jcwchen commented Dec 28, 2020

Hi @deepak0896,
Did you use ONNX 1.8? I can successfully run shape inference on shufflenet.

@deepak0896

deepak0896 commented Dec 28, 2020

Hi @jcwchen,
Yes, I'm using ONNX 1.8, and I'm getting inferred shapes for some of the nodes but not all.
For example, part of the output is:

name: "377"
type {
tensor_type {
elem_type: 1
shape {
dim {
dim_value: 1
}
dim {
dim_value: 116
}
dim {
dim_value: 28
}
dim {
dim_value: 28
}
}
}
}
, name: "378"
type {
tensor_type {
elem_type: 1
}
}
, name: "379"
type {
tensor_type {
elem_type: 1
}
}
, name: "380"
type {
tensor_type {
elem_type: 1
}
}
, name: "381"
type {
tensor_type {
elem_type: 1
}
}
, name: "382"
type {
tensor_type {
elem_type: 1
}
}
, name: "383"
type {
tensor_type {
elem_type: 1
}
}
, name: "384"
type {
tensor_type {
elem_type: 1
}
}
, name: "385"
type {
tensor_type {
elem_type: 1
}
}
, name: "386"
type {
tensor_type {
elem_type: 1
}
}
, name: "387"
type {
tensor_type {
elem_type: 1
}
}
, name: "388"
type {
tensor_type {
elem_type: 1
}
}
, name: "389"
type {
tensor_type {
elem_type: 7
shape {
dim {
dim_value: 5
}
}
}
}
, name: "390"
type {
tensor_type {
elem_type: 1
shape {
dim {
dim_value: 1
}
dim {
dim_value: 2
}
dim {
dim_value: 58
}
dim {
dim_value: 28
}
dim {
dim_value: 28
}
}
}
}

@jcwchen
Copy link
Member

jcwchen commented Dec 28, 2020

Thank you for the details. I found that the shapes of the nodes after every Split node are missing... There might be a shape inference bug for Split-10. Perhaps #2549 can help.

@jcwchen
Copy link
Member

jcwchen commented Dec 28, 2020

I have confirmed that the PR can help, so you can try that first. I will push that PR forward. Thanks.

@rfforelli

I have tried all of the solutions mentioned and I'm still facing the same issue, "IndexError: Input hidden_x1.0.weight is undefined!", with the ONNX model here. I'm on ONNX 1.8; do you know what I might be doing wrong?

@jcwchen
Member

jcwchen commented Jan 29, 2021

I have tried all of the solutions mentioned and I'm still facing the same issue, "IndexError: Input hidden_x1.0.weight is undefined!", with the ONNX model here. I'm on ONNX 1.8; do you know what I might be doing wrong?

Actually, I can run onnx.shape_inference.infer_shapes on sho_filtter41 with ONNX 1.8. Are you sure you are using ONNX 1.8? Maybe try pip uninstall -y onnx several times and then pip install onnx==1.8.0 again. Thank you.

@nicklhy

nicklhy commented Oct 21, 2021

Still having this problem in onnx==1.10.1. A lot of shapes are missing when analyzing a GPT model.

@jcwchen
Member

jcwchen commented Oct 21, 2021

Hi @nicklhy,
I guess something might be wrong and shape inference was silent -- could you please try onnx.shape_inference.infer_shapes("your_model.onnx", strict_mode=True)? If there is no error message, could you please provide the model for me to take a closer look? Thank you.
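
For reference, a minimal sketch of running strict mode on a loaded ModelProto (the file name is illustrative); with strict_mode=True, inference raises an error instead of silently skipping nodes it cannot handle:

import onnx

model = onnx.load("your_model.onnx")  # illustrative path
inferred = onnx.shape_inference.infer_shapes(model, strict_mode=True)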

@nicklhy

nicklhy commented Oct 21, 2021

Hi @nicklhy, I guess something might be wrong and shape inference was silent -- could you please try onnx.shape_inference.infer_shapes("your_model.onnx", strict_mode=True)? If there is no error message, could you please provide the model for me to take a closer look? Thank you.

Thanks for the quick reply ~
I already tried enabling strict_mode, but it gives no errors at all. There are still some shapes missing after shape inference. You can download my models (EfficientNet + GPT-2, exported from PyTorch) and the test script from here.

@jcwchen
Member

jcwchen commented Oct 21, 2021

Thanks for providing the link -- it seems that I cannot open it somehow. Will try it later.

@nicklhy

nicklhy commented Oct 21, 2021

Thanks for providing the link -- it seems that I cannot open it somehow. Will try it later.

Just execute wget http://aivc.ks3-cn-beijing.ksyun.com/public_data/onnx_shape_infer_bug.tgz

@jcwchen
Member

jcwchen commented Oct 21, 2021

Sorry, I cannot get the model with wget either. Perhaps it cannot be accessed from the United States? Do you have another way to provide this model? Thank you!

@nicklhy

nicklhy commented Oct 22, 2021

Sorry, I cannot get the model with wget either. Perhaps it cannot be accessed from the United States? Do you have another way to provide this model? Thank you!

Try this google drive link: https://drive.google.com/file/d/1xWMrAozQEvrPAwXKJO69vOOsor66dIr7/view?usp=sharing

@jcwchen
Member

jcwchen commented Oct 22, 2021

Thank you for providing the models! I roughly checked the produced graph.value_info after onnx.shape_inference and the result looks normal to me. Please note that ONNX shape inference is not guaranteed to be complete. In particular, some dynamic behaviors block the flow of shape inference, for example a Reshape to a dynamically provided shape. As you can see, many outputs of Reshape nodes are missing shapes due to this kind of dynamic behavior.

Recently ONNX has improved shape inference for more dynamic scenarios (e.g., Reshape) through data propagation, but the set of supported ops is limited for now. Taking your model as an example, Gemm is not supported yet, which is why enabling data propagation cannot help the following Reshape infer its shape either.
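
For reference, a minimal sketch of enabling data propagation (the file name is illustrative); data_prop only helps when every op feeding the dynamically computed shape is supported:

import onnx
from onnx import shape_inference

model = onnx.load("your_model.onnx")  # illustrative path
# data_prop propagates constant tensor data (e.g. computed shape tensors)
# so that ops like Reshape with computed shapes can sometimes be inferred.
inferred = shape_inference.infer_shapes(model, data_prop=True)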

However, if shape inference with a registered static shape inference function ever fails (just like the Split op bug in this thread), please do let me know and let's try to resolve it. Thanks!

More references: https://github.com/onnx/onnx/blob/master/docs/ShapeInference.md, https://github.com/onnx/onnx/blob/master/docs/proposals/SymbolicShapeInfProposal.md

@leimao
Contributor

leimao commented Nov 11, 2021

@jcwchen Quick question: is there any convenient one-line way to remove the shape inference information from an ONNX model? Thank you.
I could keep calling pop until the value_info list is empty, but that is inconvenient.

@jcwchen
Member

jcwchen commented Nov 11, 2021

@leimao perhaps try something like your_model.graph.ClearField('value_info')
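
A minimal sketch of that approach (the model path is illustrative):

import onnx

model = onnx.load("your_model.onnx")  # illustrative path
model.graph.ClearField("value_info")  # drop all value_info entries at once
# equivalently: del model.graph.value_info[:]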

@leimao
Contributor

leimao commented Nov 11, 2021

.ClearField('value_info')

That does work, thank you. Perhaps you could consider adding this interface to the ONNX Python API.
I suggest supporting

model.graph.value_info.clear()

which is more similar to Python list syntax, given value_info already has pop and append methods.

@jcwchen
Member

jcwchen commented Nov 11, 2021

Actually, these utilities come from protobuf, since the model is a protobuf message. Perhaps you can raise this concern there. Thank you for the suggestion.

@mananta

mananta commented Mar 15, 2023

The shape inference problem still persists: a lot of shapes are missing when trying to parse the GPT-2 ONNX model from https://github.com/onnx/models/blob/main/text/machine_comprehension/gpt-2/README.md. I am using onnx version 1.13.1. Perhaps the problem is about dynamic shape inference. Can anyone suggest a way to tackle dynamic shape inference?

@MaxenceBouvier

MaxenceBouvier commented Jul 4, 2023

The shape inference problem still persists: a lot of shapes are missing when trying to parse the GPT-2 ONNX model from https://github.com/onnx/models/blob/main/text/machine_comprehension/gpt-2/README.md. I am using onnx version 1.13.1. Perhaps the problem is about dynamic shape inference. Can anyone suggest a way to tackle dynamic shape inference?

@mananta did you find any solution regarding this?
I decided to re-build small models of the missing parts based on the shape information known before and after the Reshape layers, but that does not sound like a good solution.
