[Performance] GPU time exceeding CPU time

### Describe the issue

ONNX Runtime does not utilize the GPU when running with `CUDAExecutionProvider`, so inference on the GPU session takes longer than on the CPU session.
![1713402897809](https://github.com/microsoft/onnxruntime/assets/25957122/347de106-1b33-442d-9884-d042ca104497)
During inference, GPU utilization stays at 0%.
![image](https://github.com/microsoft/onnxruntime/assets/25957122/947d8713-80fd-4c42-947c-65f0819a3d7c)
However, while the model is being loaded, GPU utilization reaches 100%.
![image](https://github.com/microsoft/onnxruntime/assets/25957122/adaba1a7-ff17-419c-aa01-17428438d77a)


### To reproduce

```
import onnxruntime as ort
import numpy as np
import time

# The model is a YOLOv5 model
session1 = ort.InferenceSession(
    path_or_bytes=r"D:\data\liujc\model\person_vehicle_onnx\1\person_vehicle_v3.4.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
session2 = ort.InferenceSession(
    path_or_bytes=r"D:\data\liujc\model\person_vehicle_onnx\1\person_vehicle_v3.4.onnx",
    providers=["CPUExecutionProvider"],
)

while True:
    img = np.random.rand(1, 3, 640, 640).astype(np.float32)

    # Time one run on the CUDA session
    t = time.time()
    session1.run(None, {session1.get_inputs()[0].name: img})
    print(session1.get_providers())
    print("gpu cost time %fs" % (time.time() - t))
    time.sleep(0.1)

    # Time one run on the CPU-only session
    t = time.time()
    session2.run(None, {session2.get_inputs()[0].name: img})
    print(session2.get_providers())
    print("cpu cost time %fs" % (time.time() - t))
    time.sleep(0.1)
```
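
The loop above times single runs with `time.time()`, so the first iterations include one-time CUDA initialization cost. To compare the two sessions on steady-state latency only, a warm-up phase can be excluded from the measurement. The helper below is a minimal sketch of that idea, not part of the original reproduction; `benchmark`, `warmup`, and `iters` are hypothetical names introduced here for illustration.

```python
import statistics
import time


def benchmark(run_once, warmup=5, iters=20):
    """Return (mean, median) seconds per call, excluding warm-up calls.

    run_once: zero-argument callable that performs one inference.
    warmup:   number of untimed calls made before measuring.
    iters:    number of timed calls.
    """
    # Untimed warm-up calls absorb one-time initialization cost.
    for _ in range(warmup):
        run_once()

    samples = []
    for _ in range(iters):
        start = time.perf_counter()  # monotonic, high-resolution timer
        run_once()
        samples.append(time.perf_counter() - start)

    return statistics.mean(samples), statistics.median(samples)
```

With the sessions from the reproduction, this could be called as `benchmark(lambda: session1.run(None, {session1.get_inputs()[0].name: img}))` and likewise for `session2`. If the GPU session is still slower after warm-up, the difference is not explained by initialization alone.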

### Urgency

No

### Platform

Windows

### OS Version

Windows 11

### ONNX Runtime Installation

Released Package

### ONNX Runtime Version or Commit ID

onnxruntime-gpu 1.17.1

### ONNX Runtime API

Python

### Architecture

X64

### Execution Provider

CUDA

### Execution Provider Library Version

CUDA 11.8

### Model File

_No response_

### Is this a quantized model?

Yes
