[Performance] QNN intermittent failure with error code 5005 #25128

### Describe the issue

System:
- Manufacturer: Dell
- Processor: Snapdragon(R) X Elite - X1E80100 - Qualcomm(R) Oryon(TM) CPU, 3.42 GHz
- Installed RAM: 32.0 GB (31.6 GB usable)
- System type: 64-bit operating system, ARM-based processor

onnxruntime-qnn version: 1.22.0

When running the same notebook or script, onnxruntime-qnn will intermittently throw a 5005 error. To recover, I have to close the terminal (if running main.py), restart the kernel (Jupyter notebook), or sometimes shut the computer down and leave it off for a few minutes. I suspect the issue is a cache that is not being reset, but I'm not 100% sure. As noted, this is a very intermittent issue.

This is not caused by having more than one QnnHTP.dll installed or by another QnnHTP.dll on the system path. I'm only using the HTP driver that is installed with onnxruntime-qnn. Below is an example of the error:

<img width="824" alt="Image" src="https://github.com/user-attachments/assets/128be64e-a5ee-4570-83ed-de000a088013" />
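The claim above that no second QnnHTP.dll is reachable through the system path can be double-checked with a short scan. This is a minimal sketch, not something from the original report: `find_on_path` is a hypothetical helper name, the file name is spelled `QnnHtp.dll` (Windows lookups are case-insensitive), and the scan only covers `PATH` entries, not every location Windows may load a DLL from.

```python
import os
from pathlib import Path


def find_on_path(filename, path_value=None):
    """Return every distinct copy of `filename` found in the PATH directories."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    seen = set()
    for entry in path_value.split(os.pathsep):
        if not entry:
            continue  # skip empty PATH segments
        candidate = Path(entry) / filename
        if candidate.is_file():
            # Normalise so the same directory listed twice is reported once.
            key = os.path.normcase(str(candidate.resolve()))
            if key not in seen:
                seen.add(key)
                hits.append(candidate)
    return hits


if __name__ == "__main__":
    for hit in find_on_path("QnnHtp.dll"):
        print(hit)
```

If this prints nothing, or only paths the reporter expects, a stray copy on `PATH` is unlikely to be the cause.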

### To reproduce

Because the error is very random, there is only one way I'm able to force it to occur: keep a Jupyter notebook running with an active InferenceSession that uses the QNNExecutionProvider, then try to run main.py from the command line.
You can use this repo as an example:
1. Follow instructions in README.md to download models.
2. Run /qnn_sample_apps/notebooks/llm/Deepseek_r1_7b_Optimized_Temperature_TopK.ipynb
3. Open PowerShell and run `python /qnn_sample_apps/src/deepseek_r1/main.py --query "how to resolve this 5005 error"`
4. The 5005 error will show up in the terminal.

Repo: github.com/DerrickJ1612/qnn_sample_apps
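For anyone who wants to reproduce the two-process condition without the full notebook, the "first process" role could be approximated with a script like the one below. This is a sketch under assumptions, not code from the repo: `qnn_session_kwargs` and `hold_session` are hypothetical names, the model path is whatever QNN-compatible ONNX file you have locally, and `backend_path` is set to `QnnHtp.dll` on the assumption that the copy shipped with onnxruntime-qnn is the one that gets resolved.

```python
import time


def qnn_session_kwargs(backend_path="QnnHtp.dll"):
    """Keyword arguments that select the QNN EP with the HTP backend."""
    return {
        "providers": ["QNNExecutionProvider"],
        "provider_options": [{"backend_path": backend_path}],
    }


def hold_session(model_path, seconds=60.0):
    """Create a QNN session and keep it alive, like the open notebook does."""
    # Imported lazily so the helper above is usable without onnxruntime-qnn.
    import onnxruntime as ort

    session = ort.InferenceSession(model_path, **qnn_session_kwargs())
    print("active providers:", session.get_providers())
    time.sleep(seconds)  # the session stays open while a second process starts
```

Calling `hold_session(...)` in one terminal and then starting main.py in another should mimic step 2 followed by step 3; whether it triggers the 5005 error as reliably as the notebook has not been verified.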

### Urgency

I work for Qualcomm, so this is urgent for us: we've been showcasing this workflow for running LLMs. I'm trying to narrow down whether or not this is an onnxruntime-qnn issue.

### Platform

Windows

### OS Version

Windows 11

### ONNX Runtime Installation

Released Package

### ONNX Runtime Version or Commit ID

1.22.0

### ONNX Runtime API

Python

### Architecture

ARM64

### Execution Provider

Other / Unknown

### Execution Provider Library Version

QNN Execution Provider. I'm not sure why the Execution Provider field above has SNPE and not QNN.

### Model File

Download from referenced repo
https://drive.google.com/drive/folders/1hCopYw7rMdeOm3zV6NC2do9orzpKqAMf

### Is this a quantized model?

Yes
