[QNN-EP] Add additional guards for file mapping#27871
adrianlizarraga merged 12 commits into microsoft:main
- Only use the file mapping feature if the context bin version is >= 3.3.3
- Also disable file mapping on a per-model basis for use cases where the model contains embedded and external EP context bins
/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline
Azure Pipelines successfully started running 4 pipeline(s).
Pull request overview
Adds additional safety guards around QNN “file mapped weights” usage when loading cached QNN contexts, so the feature is only used when compatible and can be disabled per EPContext node.
Changes:
- Introduces a per-node `use_file_mapping` flag (derived from `file_mapped_weights_enabled_`) to avoid globally affecting other nodes/models.
- Disables file mapping for embedded cached contexts (nonzero `buffer_length`).
- Adds a context blob version gate intended to disable file mapping for context binaries older than 3.3.3.
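The three changes listed above can be read as one per-node decision. The following is a minimal sketch of that decision, not the code from this PR: `BlobVersion`, `AtLeast`, and `ShouldUseFileMapping` are hypothetical names, and the version triple is assumed to be available as three integers. Only `use_file_mapping`, `file_mapped_weights_enabled_`, `buffer_length`, and the 3.3.3 threshold come from the description.

```cpp
#include <cstddef>
#include <cstdint>
#include <tuple>

// Hypothetical version triple; the real QNN context binary info layout may differ.
struct BlobVersion {
  std::uint32_t major_version;
  std::uint32_t minor_version;
  std::uint32_t patch_version;
};

// True when `v` is greater than or equal to `minimum`, compared field by field.
inline bool AtLeast(const BlobVersion& v, const BlobVersion& minimum) {
  return std::tie(v.major_version, v.minor_version, v.patch_version) >=
         std::tie(minimum.major_version, minimum.minor_version, minimum.patch_version);
}

// Decides file mapping for one EPContext node. The global setting is only read,
// never modified, so a "no" for this node does not affect other nodes or models.
inline bool ShouldUseFileMapping(bool file_mapped_weights_enabled,
                                 std::size_t buffer_length,
                                 const BlobVersion& blob_version) {
  constexpr BlobVersion kMinVersion{3, 3, 3};

  bool use_file_mapping = file_mapped_weights_enabled;
  if (buffer_length != 0) {
    use_file_mapping = false;  // embedded cached context: no external file to map
  }
  if (!AtLeast(blob_version, kMinVersion)) {
    use_file_mapping = false;  // context binary older than 3.3.3
  }
  return use_file_mapping;
}
```

Comparing the fields numerically rather than as strings matters here: a hypothetical version 3.10.0 is newer than 3.3.3, which a plain string comparison would get wrong.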
GetMaxSpillFillBufferSize()
creation/destruct and context binary info retrieval into new functions
Description
Porting over additional fixes to file mapping from the QNN EP ABI repo:
Motivation and Context
During testing based on the QNN EP ABI repo, QNN context creation from an EP context failed because the EP context binary was too old, and with file mapping enabled this failure prevented the QNN API from freeing all resources. The failure was caused by the context binary version being older than 3.3.3, so there is now a check that disables file mapping for any EP context binary that is too old.
Prior to these changes, if file mapping was enabled and QNN context creation failed for any reason, the feature was disabled for all other graphs. This did not account for use cases where (1) a model contains multiple EP context nodes and some of them are incompatible with the file mapping feature; or (2) multiple sessions share the same EP context and one or more of the models used are incompatible with the file mapping feature. The code has been updated to handle these use cases.
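To illustrate the difference between disabling the feature globally and disabling it per node, here is a small self-contained sketch. It is not the QNN EP implementation: `EpContextNode`, `CreateContext`, `Loader`, and `LoadResult` are hypothetical stand-ins, and the retry without file mapping after a failed attempt is an assumption of this sketch rather than something the description states. The point it shows is that the decision lives in a local `use_file_mapping` copy, so one incompatible node leaves `file_mapped_weights_enabled_` untouched for the others.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of one EPContext node.
struct EpContextNode {
  std::string name;
  std::size_t buffer_length;  // nonzero means the context is embedded in the model
  bool mapping_compatible;    // stand-in for "context creation succeeds with file mapping"
};

// Stand-in for QNN context creation: fails only when file mapping is requested
// for a node that cannot support it.
inline bool CreateContext(const EpContextNode& node, bool use_file_mapping) {
  return !use_file_mapping || node.mapping_compatible;
}

struct LoadResult {
  std::string name;
  bool loaded;
  bool used_file_mapping;
};

class Loader {
 public:
  explicit Loader(bool enabled) : file_mapped_weights_enabled_(enabled) {}

  std::vector<LoadResult> LoadAll(const std::vector<EpContextNode>& nodes) const {
    std::vector<LoadResult> results;
    for (const EpContextNode& node : nodes) {
      // Per-node copy of the global setting; embedded contexts never use mapping.
      bool use_file_mapping = file_mapped_weights_enabled_ && node.buffer_length == 0;
      bool ok = CreateContext(node, use_file_mapping);
      if (!ok && use_file_mapping) {
        // Assumed fallback: retry this node only, without file mapping.
        use_file_mapping = false;
        ok = CreateContext(node, use_file_mapping);
      }
      results.push_back({node.name, ok, use_file_mapping});
    }
    return results;
  }

  bool file_mapped_weights_enabled() const { return file_mapped_weights_enabled_; }

 private:
  bool file_mapped_weights_enabled_;  // global setting, never modified by LoadAll
};
```

Because `LoadAll` is `const`, the compiler enforces that a failure on one node cannot flip the shared setting, which is the behavior the paragraph above describes for models with mixed EP context nodes.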