forked from microsoft/onnxruntime
Backmerging with Msft Commits #733
Merged
### Description

Add a new ORT API `GetSessionOptionConfigEntries`.

### Motivation and Context

microsoft#24887 allows plugin-EPs to interface with ORT through a binary-stable interface. microsoft#24445 allows an EP to handle the extraction of EP options from the session option configurations. For an EP such as the VitisAI EP to comply with these requirements, plugin-EPs need access to all config entries in a session option.

```c++
OrtKeyValuePairs* kvps = nullptr;
auto status = GetSessionOptionConfigEntries(session_option, &kvps);
if (status) {
  throw status;
}
// Release the key-value pairs automatically when config_entries goes out of scope.
std::unique_ptr<OrtKeyValuePairs, void (*)(OrtKeyValuePairs*)>
    config_entries(kvps, ort_api.ReleaseKeyValuePairs);
const char* const* keys = nullptr;
const char* const* values = nullptr;
size_t num_keys = 0;
// Get keys and values from the config entries
Ort::GetApi().GetKeyValuePairs(config_entries.get(), &keys, &values, &num_keys);
for (size_t i = 0; i < num_keys; ++i) {
  // process keys[i] and values[i]
}
```
…5192)

### Description

This PR optimizes the Intel GPU path for the `DP4AMatMulNBitsSmallMProgram` by tuning `tile_size` and `tile_size_k_vec`.

### Motivation and Context

With this change, we achieved a >8% performance boost on Intel iGPUs (Xe-LP and Xe2-LPG) for the phi-4-mini-accuracy4 model.
Follow up microsoft#24980
Fix microsoft#24556

Add ONNX RotaryEmbedding(23) following https://github.com/onnx/onnx/blob/main/docs/Operators.md#RotaryEmbedding.

The PR uses the contrib op RotaryEmbedding implementation under the hood. The main difference between this op and the contrib op is that `position_ids` in ONNX RotaryEmbedding is optional. When it is not provided, `cos_cache` and `sin_cache` should be 3D.
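
To illustrate the optional `position_ids` case, here is a minimal standalone sketch. It is not the ORT implementation. It assumes a 2D cache laid out as `[max_position, half_dim]` when `position_ids` is given, a 3D cache laid out as `[batch, seq_len, half_dim]` when it is not, and the non-interleaved (rotate-half) layout. The helper names `CacheRowOffset` and `RotateHalf` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: flat offset of the cos/sin cache row for token (b, s).
// With position_ids, the cache is assumed 2D and indexed by position id.
// Without position_ids, the cache is assumed 3D and indexed by (b, s) directly.
inline std::size_t CacheRowOffset(const std::vector<std::int64_t>* position_ids,
                                  std::size_t b, std::size_t s,
                                  std::size_t seq_len, std::size_t half_dim) {
  const std::size_t token = b * seq_len + s;
  const std::size_t row =
      position_ids ? static_cast<std::size_t>((*position_ids)[token]) : token;
  return row * half_dim;
}

// Hypothetical helper: rotate one head vector in place (non-interleaved layout),
// pairing x[i] with x[i + half_dim].
inline void RotateHalf(std::vector<float>& x, const float* cos_row,
                       const float* sin_row, std::size_t half_dim) {
  for (std::size_t i = 0; i < half_dim; ++i) {
    const float x1 = x[i];
    const float x2 = x[i + half_dim];
    x[i] = x1 * cos_row[i] - x2 * sin_row[i];
    x[i + half_dim] = x2 * cos_row[i] + x1 * sin_row[i];
  }
}
```

A caller would pass `nullptr` for `position_ids` in the 3D-cache case and add the returned offset to the base pointers of `cos_cache` and `sin_cache` before calling `RotateHalf`.
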
…mization (microsoft#25296)

In a model containing EPContext nodes, it is highly unlikely that two EPContext nodes will produce the same results. Furthermore, the EquivalenceClass constructor includes the node and all its attributes in the hash calculation, which can be particularly time-consuming when the "ep_cache_context" attribute contains a large binary blob. Therefore, we exclude the EPContext op from common subexpression elimination (CSE).
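
The idea can be sketched on a toy graph. This is not the ORT optimizer code: `ToyNode` and `FindCommonSubexpressions` are hypothetical names, and attributes are modeled as a single serialized string. The point is that the op-type check happens before any key over the attributes is built, so a large blob is never hashed or compared.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Toy stand-in for a graph node (hypothetical, for illustration only).
struct ToyNode {
  std::string op_type;
  std::vector<int> inputs;   // ids of input values
  std::string attributes;    // serialized attributes, possibly a large blob
};

// For each node, returns the index of the earlier equivalent node it could be
// replaced with, or its own index if there is none.
inline std::vector<std::size_t> FindCommonSubexpressions(
    const std::vector<ToyNode>& nodes) {
  using Key = std::tuple<std::string, std::vector<int>, std::string>;
  std::map<Key, std::size_t> seen;
  std::vector<std::size_t> representative(nodes.size());
  for (std::size_t i = 0; i < nodes.size(); ++i) {
    representative[i] = i;
    // Skip EPContext nodes before touching their attributes.
    if (nodes[i].op_type == "EPContext") continue;
    Key key{nodes[i].op_type, nodes[i].inputs, nodes[i].attributes};
    auto [it, inserted] = seen.emplace(std::move(key), i);
    if (!inserted) representative[i] = it->second;
  }
  return representative;
}
```

In this sketch two identical `Add` nodes collapse to one representative, while two identical-looking `EPContext` nodes are left untouched.
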
…oft#25285)

### Description

Support smooth softmax for the non-FA GQA implementation.

This change depends on:
- microsoft#25269

Work items:
- [x] support smooth softmax
- [x] support bias
- [x] support head sink (per-head smooth softmax)

The following will not be included in this PR:
- support for FlashAttention
- support for sliding window
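
For readers unfamiliar with the terms, the following is a minimal CPU sketch, not the code from this PR. It assumes that smooth softmax means adding an extra term of 1 (that is, `exp(0)`) to the softmax denominator, and that a head sink replaces that term with `exp(sink_logit)` for a per-head `sink_logit`. The function name `SoftmaxWithSink` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Softmax over attention scores with one extra "sink" term in the denominator.
// Assumption: sink_logit == 0 corresponds to smooth softmax; a per-head value
// corresponds to head sink. The sink receives probability mass but produces
// no output entry, so the returned values sum to less than 1.
inline std::vector<float> SoftmaxWithSink(const std::vector<float>& scores,
                                          float sink_logit) {
  // Subtract the maximum (including the sink) for numerical stability.
  float max_val = sink_logit;
  for (float s : scores) max_val = std::max(max_val, s);

  float denom = std::exp(sink_logit - max_val);
  std::vector<float> out(scores.size());
  for (std::size_t i = 0; i < scores.size(); ++i) {
    out[i] = std::exp(scores[i] - max_val);
    denom += out[i];
  }
  for (float& v : out) v /= denom;
  return out;
}
```

With two scores of 0 and `sink_logit = 0`, each output is 1/3 rather than 1/2, because the sink absorbs the remaining third.
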
### Description

1. Fix the build break in NV TRT RTX EP
### Description

Fix the Windows build with MSVC 17.14.7 and CUDA 12.9.1. The build error was:

`CUDACOMPILE : nvcc error : 'cudafe++' died with status 0xC0000005 (ACCESS_VIOLATION)`

The cause is unknown (possibly a cudafe bug). The code change resolved the issue. I've verified it on two machines.
…icrosoft#25308)

### Description

- Infer `OrtDevice` for a plugin EP from the registered `OrtMemoryInfo` for device memory.
- Fix potential `nullptr` dereference when a `PluginExecutionProvider` tries to log a message without a valid logger. Now, constructing a `PluginExecutionProvider` requires passing a valid logger.

### Motivation and Context

Address a `TODO` to properly set the `OrtDevice` for a `PluginExecutionProvider` instance.
ankitm3k approved these changes on Jul 8, 2025