
TensorRT EP subgraph dump option. #3604

Merged
jywu-msft merged 1 commit into master from jywu_trt_dump on Apr 21, 2020


@jywu-msft
Member

Add a debug option (controlled via the ORT_TENSORRT_DUMP_SUBGRAPHS environment variable)
to dump the subgraphs assigned to the TensorRT execution provider.
This is useful for debugging TRT engine build failures.
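
For readers who want a picture of the idea, here is a minimal standalone sketch, not the code in this PR. It assumes the option is considered enabled when the variable holds a nonzero integer, and that each serialized subgraph is written to a `<name>.onnx` file; the helper names `DumpSubgraphsEnabled` and `DumpSerializedModel` are hypothetical and do not come from onnxruntime.

```cpp
#include <cstdlib>
#include <fstream>
#include <string>

// Hypothetical helper (not an onnxruntime API): treats the option as enabled
// when ORT_TENSORRT_DUMP_SUBGRAPHS is set to a nonzero integer.
// The exact parsing rule is an assumption made for this sketch.
bool DumpSubgraphsEnabled() {
  const char* value = std::getenv("ORT_TENSORRT_DUMP_SUBGRAPHS");
  return value != nullptr && std::atoi(value) != 0;
}

// Hypothetical helper: writes already-serialized model bytes (for example the
// string filled by model_proto.SerializeToString) to "<name>.onnx".
// Returns false if the file could not be opened or written.
bool DumpSerializedModel(const std::string& name, const std::string& bytes) {
  std::ofstream out(name + ".onnx", std::ios::out | std::ios::binary);
  if (!out) {
    return false;
  }
  out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
  return out.good();
}
```

A caller would check `DumpSubgraphsEnabled()` once, cache the result, and call `DumpSerializedModel` right after serializing each subgraph, so the dump costs nothing when the variable is unset. The resulting file can then be opened in a model viewer to inspect the nodes handed to TensorRT.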

@jywu-msft requested a review from a team as a code owner April 21, 2020 00:11
std::string string_buf;
model_proto.SerializeToString(&string_buf);

if (dump_subgraphs_) {
Contributor

Thanks. One more place we can dump the model for debugging,
GetSupportedList():
after the line: model_proto.SerializeToString(&string_buf)
Here we can capture onnx-tensorrt parser failures during GetCapability.

Member Author

@jywu-msft Apr 21, 2020

> Thanks. One more place we can dump the model for debugging,
> GetSupportedList():
> after the line: model_proto.SerializeToString(&string_buf)
> Here we can capture onnx-tensorrt parser failures during GetCapability.

Can I handle that in a separate PR? I have some questions about the GetCapability/GetSupportedList failure path. Also, is there potential for duplicate subgraphs getting dumped then?

My main motivation for this PR is to allow dumping the subgraph corresponding to the fused node referred to by this error message: https://github.com/microsoft/onnxruntime/blob/master/onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.cc#L723

It's also useful for visualizing the nodes in the subgraph that were executed by the TensorRT EP, even for successful executions.

Contributor

There are two cases in which an error can happen in the parser. In the first, the parser fails while parsing the model; dumping the model in GetCapability will help with this case. In the second, the parser doesn't fail during model parsing but fails to report exceptions to ORT, and the ORT compile step (engine creation) then fails. But sure, we can handle the second case in this PR. I will include the first case in the next PR.

@jywu-msft merged commit 1c37d5e into master Apr 21, 2020
@jywu-msft deleted the jywu_trt_dump branch April 21, 2020 03:55
2 participants