@tianleiwu
Contributor

@tianleiwu tianleiwu commented Sep 11, 2023

Description

During optimization of the SDXL UNet, prune_graph takes up to 5 minutes. The cause is that finding a node by searching through all nodes is time-consuming. This optimization reduces the latency of prune_graph to 2 seconds.

The new algorithm uses a hash table (key is the node's first output, value is the node) to speed up the lookup.
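
To illustrate the idea, here is a minimal sketch of hash-table-based pruning. It is not the code from this PR: the `Node` class and the `prune_nodes` function are hypothetical stand-ins that avoid a dependency on `onnx`, and the sketch assumes every node has a non-empty, unique first output. The producer lookup becomes a dictionary access instead of a scan over all nodes.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical stand-in for an ONNX node: a name plus input/output tensor names.
    name: str
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)


def prune_nodes(nodes, graph_outputs):
    """Return the nodes that contribute to graph_outputs, in their original order."""
    # Build the output-name -> producer-node table once, instead of
    # scanning the whole node list for every lookup.
    producer = {out: node for node in nodes for out in node.outputs if out}

    # Nodes to keep, identified by their first output (assumed non-empty and unique).
    keep = set()
    queue = deque(graph_outputs)
    while queue:
        tensor = queue.popleft()
        node = producer.get(tensor)
        if node is None:
            # No producer: the tensor is a graph input or an initializer.
            continue
        key = node.outputs[0]
        if key in keep:
            continue
        keep.add(key)
        # Walk backwards through the inputs of the kept node.
        queue.extend(name for name in node.inputs if name)

    return [n for n in nodes if n.outputs and n.outputs[0] in keep]


nodes = [
    Node("a", ["x"], ["t1"]),
    Node("b", ["t1"], ["y"]),
    Node("c", ["x"], ["unused"]),  # does not feed the graph output
]
print([n.name for n in prune_nodes(nodes, ["y"])])  # ['a', 'b']
```

With the table in place, each producer lookup takes constant time on average, so the traversal is roughly linear in the number of nodes and edges rather than quadratic in the number of nodes.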

wangyems previously approved these changes Sep 11, 2023
@tianleiwu tianleiwu marked this pull request as draft September 11, 2023 23:42
@tianleiwu tianleiwu marked this pull request as ready for review September 12, 2023 00:50
@tianleiwu
Contributor Author

@microsoft-github-policy-service agree

@tianleiwu
Contributor Author

tianleiwu commented Sep 12, 2023

The license/cla check is stuck on "Waiting for status to be reported". Trying to close and re-open the pull request.

@tianleiwu tianleiwu closed this Sep 12, 2023
@tianleiwu tianleiwu reopened this Sep 12, 2023
@tianleiwu tianleiwu merged commit 49511b5 into main Sep 12, 2023
@tianleiwu tianleiwu deleted the tlwu/improve_prune_graph_perf branch September 12, 2023 18:38
@faxu faxu added the triage:approved (Approved for cherrypicks for release) and sdxl_llama labels Oct 25, 2023
tianleiwu added a commit that referenced this pull request Oct 31, 2023
@tianleiwu tianleiwu removed the triage:approved (Approved for cherrypicks for release) and release:1.16.2 labels Nov 1, 2023
kleiti pushed a commit to kleiti/onnxruntime that referenced this pull request Mar 22, 2024
5 participants