Enabling L2+ Optimizations for EPs #23517

Merged: 61 commits, Mar 7, 2025

Contributor

@chilo-ms chilo-ms commented Jan 28, 2025

Some graph modifications are specific to a particular EP or hardware target.
ORT keeps a hardcoded list of EPs for its optimizations, but that approach doesn't scale and is hard to extend to support EP-specific custom optimizations.

This PR enables L2+ optimizations for EPs (the original prototype was provided by @skottmckay) and adds the TRT EP implementation of the ConstantFoldingDQ optimization.

The graph_optimizer_registry is designed to enable L2+ graph optimizations tailored to EPs.
These optimizations are applied after the graph partitioner assigns a ComputeCapability to the EP and before the EP's "Compile" or fusion.

Signatures for selection and optimization functions:

  - Selection: std::function<std::vector<std::unique_ptr<ComputeCapability>>(const GraphViewer&, const KeyValueConfig&, const GraphOptimizerRegistry&)>
  - Optimization: std::function<Status(Graph&, const ComputeCapability& this_optimization, ComputeCapability& cc_to_update, const GraphOptimizerRegistry&)>
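
As a rough illustration, the two signatures can be written as type aliases. This is a minimal sketch: `Status`, `Graph`, `GraphViewer`, `GraphOptimizerRegistry` and `KeyValueConfig` are empty stand-ins declared here so the snippet compiles on its own (they are not the real ORT classes), the alias names `SelectionFunc` and `OptimizationFunc` are chosen for this example, and the graph is taken as mutable, following the GraphPartitioner notes in this description.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Stand-in types (assumptions for this sketch, not the real ORT headers).
struct Status { bool ok = true; };
struct Graph {};
struct GraphViewer {};
struct GraphOptimizerRegistry {};
struct ComputeCapability;
using KeyValueConfig = std::unordered_map<std::string, std::string>;

// Selection: inspects a read-only view of the graph and returns one
// ComputeCapability per set of nodes that should be optimized.
using SelectionFunc = std::function<std::vector<std::unique_ptr<ComputeCapability>>(
    const GraphViewer&, const KeyValueConfig&, const GraphOptimizerRegistry&)>;

// Optimization: rewrites the graph for one selected set of nodes and keeps
// the overall ComputeCapability (cc_to_update) in sync with the changes.
using OptimizationFunc = std::function<Status(
    Graph&, const ComputeCapability& this_optimization,
    ComputeCapability& cc_to_update, const GraphOptimizerRegistry&)>;

// Only the two members discussed in this description are modelled here.
struct ComputeCapability {
  OptimizationFunc optimize_func;
  std::vector<std::unique_ptr<ComputeCapability>> nodes_to_optimize;
};
```

Any callable with a matching parameter list, such as a lambda or a free function, can be stored in either alias.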

Optimizer's selection function:

  • Selects a set of nodes from a given graph for optimization. Additional key/value strings can be provided to configure the optimizer. If needed, use graph_optimizer_registry to access the session options, the CPU EP and the logger.

Optimizer's optimization function:

  • Gets the nodes to optimize from the ComputeCapability entry in nodes_to_optimize. If needed to create the optimizer, use graph_optimizer_registry to access the session options, the CPU EP and the logger. Runs the optimization on the nodes/subgraph and, finally, updates the ComputeCapability.

GetCapability

  • Calls the (new) provider bridge API to look up a pre-defined optimizer by name and get its selection function.

    • ComputeCapability.optimize_func, i.e. the optimization function, is set by the optimizer to the function that performs the optimization.
  • The EP has to update the ComputeCapability it returns to include the optimization ComputeCapability in nodes_to_optimize, so that ORT can later perform the optimization/transformation accordingly.
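
The GetCapability steps above could look roughly like the following. This is a sketch under stated assumptions, not the TRT EP code: every type is a simplified stand-in, the registry lookup `GetSelectionFunc` and its pointer return type are hypothetical, `SelectDQNodes` is a toy selection function, and only the optimizer name "ConstantFoldingDQ" comes from this PR.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Stand-in types (hypothetical simplifications, not the real ORT classes).
struct Status { bool ok = true; };
struct Graph { std::vector<std::string> op_types; };  // node index -> op type
struct GraphViewer { const Graph& graph; };
struct GraphOptimizerRegistry;
struct ComputeCapability;
using KeyValueConfig = std::unordered_map<std::string, std::string>;
using SelectionFunc = std::function<std::vector<std::unique_ptr<ComputeCapability>>(
    const GraphViewer&, const KeyValueConfig&, const GraphOptimizerRegistry&)>;
using OptimizationFunc = std::function<Status(
    Graph&, const ComputeCapability&, ComputeCapability&, const GraphOptimizerRegistry&)>;

struct ComputeCapability {
  std::vector<std::size_t> nodes;  // indices of the nodes this capability covers
  OptimizationFunc optimize_func;
  std::vector<std::unique_ptr<ComputeCapability>> nodes_to_optimize;
};

// Hypothetical registry: maps an optimizer name to its selection function.
struct GraphOptimizerRegistry {
  std::unordered_map<std::string, SelectionFunc> selection_funcs;

  // Returns nullptr when no optimizer is registered under `name`.
  const SelectionFunc* GetSelectionFunc(const std::string& name) const {
    auto it = selection_funcs.find(name);
    return it == selection_funcs.end() ? nullptr : &it->second;
  }
};

// Toy selection function: groups all DequantizeLinear nodes into one
// ComputeCapability and attaches a placeholder optimization function.
std::vector<std::unique_ptr<ComputeCapability>> SelectDQNodes(
    const GraphViewer& viewer, const KeyValueConfig&, const GraphOptimizerRegistry&) {
  auto selected = std::make_unique<ComputeCapability>();
  for (std::size_t i = 0; i < viewer.graph.op_types.size(); ++i) {
    if (viewer.graph.op_types[i] == "DequantizeLinear") selected->nodes.push_back(i);
  }
  selected->optimize_func = [](Graph&, const ComputeCapability&, ComputeCapability&,
                               const GraphOptimizerRegistry&) { return Status{}; };
  std::vector<std::unique_ptr<ComputeCapability>> result;
  if (!selected->nodes.empty()) result.push_back(std::move(selected));
  return result;
}

// EP-side sketch: claim every node, then look up the optimizer by name and
// record its selections in nodes_to_optimize so they can be applied later.
std::unique_ptr<ComputeCapability> GetCapability(const GraphViewer& viewer,
                                                 const GraphOptimizerRegistry& registry) {
  auto capability = std::make_unique<ComputeCapability>();
  for (std::size_t i = 0; i < viewer.graph.op_types.size(); ++i) capability->nodes.push_back(i);

  if (const SelectionFunc* select = registry.GetSelectionFunc("ConstantFoldingDQ")) {
    capability->nodes_to_optimize = (*select)(viewer, KeyValueConfig{}, registry);
  }
  return capability;
}
```

If no optimizer is registered under the requested name, the sketch simply returns the capability with an empty nodes_to_optimize list, so the EP still gets its nodes assigned.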

GraphPartitioner

  • After assigning the ComputeCapability to the EP and prior to Compile, if the ComputeCapability has nodes_to_optimize, iterate over that list.
    • The optimization function needs to be called with:
      • a mutable Graph instance
      • the ComputeCapability for the individual optimization
      • the overall ComputeCapability so it can be updated
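
The partitioner step above can be sketched as a simple loop. This is an illustration only, with simplified stand-in types: `ApplyEpOptimizations` and `FoldSelectedNodes` are hypothetical names, and "folding" is modelled by relabelling nodes and dropping them from the overall capability rather than by real constant folding.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in types (hypothetical simplifications, not the real ORT classes).
struct Status { bool ok = true; };
struct Graph { std::vector<std::string> op_types; };  // node index -> op type
struct GraphOptimizerRegistry {};
struct ComputeCapability;
using OptimizationFunc = std::function<Status(
    Graph&, const ComputeCapability&, ComputeCapability&, const GraphOptimizerRegistry&)>;

struct ComputeCapability {
  std::vector<std::size_t> nodes;  // indices of the nodes this capability covers
  OptimizationFunc optimize_func;
  std::vector<std::unique_ptr<ComputeCapability>> nodes_to_optimize;
};

// Toy optimization function: marks the selected nodes as folded in the
// mutable graph and removes them from the overall capability.
Status FoldSelectedNodes(Graph& graph, const ComputeCapability& this_optimization,
                         ComputeCapability& cc_to_update, const GraphOptimizerRegistry&) {
  for (std::size_t index : this_optimization.nodes) {
    graph.op_types[index] = "Folded";
    auto& nodes = cc_to_update.nodes;
    nodes.erase(std::remove(nodes.begin(), nodes.end(), index), nodes.end());
  }
  return Status{};
}

// Partitioner-side sketch: after the capability is assigned to the EP and
// before Compile, run each requested optimization against the mutable graph.
Status ApplyEpOptimizations(Graph& graph, ComputeCapability& capability,
                            const GraphOptimizerRegistry& registry) {
  for (const auto& optimization : capability.nodes_to_optimize) {
    if (!optimization || !optimization->optimize_func) continue;  // nothing to run
    Status status = optimization->optimize_func(graph, *optimization, capability, registry);
    if (!status.ok) return status;  // stop at the first failure
  }
  return Status{};
}
```

Each call receives the three arguments listed above: the mutable graph, the ComputeCapability for the individual optimization, and the overall ComputeCapability to update.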

@github-actions github-actions bot left a comment

You can commit the suggested changes from lintrunner.

skottmckay previously approved these changes Feb 28, 2025
skottmckay previously approved these changes Mar 7, 2025
@skottmckay skottmckay merged commit 1199dc0 into main Mar 7, 2025
98 of 100 checks passed
@skottmckay skottmckay deleted the chi/ort_enable_l2_plus_opt_for_ep branch March 7, 2025 04:24