[Performance] GPU time exceeding CPU time #20361

@liujiachang

Describe the issue

ONNX Runtime does not appear to use the GPU when running with the CUDA execution provider, so inference takes longer on the GPU than on the CPU.
[image: 1713402897809]
Meanwhile, GPU utilization stays at 0% during inference.
[image]
While the model is loading, however, GPU utilization is 100%.
[image]

To reproduce

  import onnxruntime as ort
  import numpy as np
  import time
 
 # The model is YOLOv5 model
  session1 = ort.InferenceSession(
      path_or_bytes=r"D:\data\liujc\model\person_vehicle_onnx\1\person_vehicle_v3.4.onnx",
      providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
  )
  session2 = ort.InferenceSession(
      path_or_bytes=r"D:\data\liujc\model\person_vehicle_onnx\1\person_vehicle_v3.4.onnx",
      providers=["CPUExecutionProvider"],
  )

  while True:
      img = np.random.rand(1, 3, 640, 640).astype(np.float32)
      t = time.time()
      session1.run(None, {session1.get_inputs()[0].name: img})
      print(session1.get_providers())
      print("gpu cost time %fs" % (time.time() - t))
      time.sleep(0.1)

      t = time.time()
      session2.run(None, {session2.get_inputs()[0].name: img})
      print(session2.get_providers())
      print("cpu cost time %fs" % (time.time() - t))
      time.sleep(0.1)
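
The loop above times every call, including the first one, and the first calls on the CUDA provider can include one-time setup cost. A minimal sketch of a comparison that discards warm-up runs and reports mean and median latency follows. The helper names `benchmark` and `compare_providers` are hypothetical and not part of ONNX Runtime; the input shape is taken from the script above.

```python
import statistics
import time


def benchmark(run_once, warmup=5, iterations=20):
    """Return (mean, median) latency in seconds of run_once(), excluding warm-up calls."""
    # Warm-up calls are executed but not timed.
    for _ in range(warmup):
        run_once()
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_once()
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.median(samples)


def compare_providers(model_path, warmup=5, iterations=20):
    """Hypothetical helper: time the same model with CUDA+CPU and with CPU only."""
    import numpy as np
    import onnxruntime as ort

    img = np.random.rand(1, 3, 640, 640).astype(np.float32)
    results = {}
    for label, providers in (
        ("gpu", ["CUDAExecutionProvider", "CPUExecutionProvider"]),
        ("cpu", ["CPUExecutionProvider"]),
    ):
        session = ort.InferenceSession(model_path, providers=providers)
        input_name = session.get_inputs()[0].name
        # The lambda is called inside this iteration, so it sees the current session.
        results[label] = benchmark(
            lambda: session.run(None, {input_name: img}), warmup, iterations
        )
    return results
```

Calling `compare_providers(path_to_model)` returns a dict such as `{"gpu": (mean, median), "cpu": (mean, median)}`. If the GPU numbers are still higher after warm-up, the slowdown is probably not a first-run effect.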

Urgency

Platform

Windows

OS Version

Windows 11

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

onnxruntime-gpu 1.17.1

ONNX Runtime API

Python

Architecture

X64

Execution Provider

CUDA

Execution Provider Library Version

CUDA 11.8

Model File

No response

Is this a quantized model?

Yes
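
Because the model is reported as quantized, it may be worth checking which execution provider each node actually runs on; nodes that the CUDA provider does not support are assigned to the CPU provider. The sketch below enables ONNX Runtime profiling and sums node time per provider. It assumes the profile file is a JSON list of events in which node events have `cat == "Node"`, an `args.provider` field, and a `dur` in microseconds; check this against your own profile output. The helper names `provider_time_us` and `profile_model` are hypothetical.

```python
import json
from collections import Counter


def provider_time_us(profile_path):
    """Sum the 'dur' of node events per execution provider (assumed profile layout)."""
    with open(profile_path, encoding="utf-8") as f:
        events = json.load(f)
    totals = Counter()
    for event in events:
        if event.get("cat") != "Node":
            continue
        provider = event.get("args", {}).get("provider")
        if provider:
            totals[provider] += event.get("dur", 0)
    return dict(totals)


def profile_model(model_path, feeds):
    """Hypothetical helper: run once with profiling on and summarize per provider."""
    import onnxruntime as ort

    options = ort.SessionOptions()
    options.enable_profiling = True
    session = ort.InferenceSession(
        model_path,
        sess_options=options,
        providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
    )
    session.run(None, feeds)
    # end_profiling() returns the path of the profile file that was written.
    return provider_time_us(session.end_profiling())
```

If most of the time is attributed to `CPUExecutionProvider`, the quantized operators are likely falling back to the CPU, which would be consistent with near-zero GPU utilization during inference.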

Metadata

Assignees

No one assigned

    Labels

    ep:CUDA (issues related to the CUDA execution provider)
    performance (issues related to performance regressions)
    quantization (issues related to quantization)
    stale (issues that have not been addressed in a while; categorized by a bot)
