
onnxruntime.InferenceSession.run sometimes gets stuck, sometimes not #21418

Open · quarrying opened this issue Jul 19, 2024 · 5 comments

Labels: ep:CUDA (issues related to the CUDA execution provider), stale (issues that have not been addressed in a while; categorized by a bot)

Comments

quarrying commented Jul 19, 2024

Describe the issue

I have built onnxruntime-gpu 1.4.0 following https://github.com/microsoft/onnxruntime/blob/v1.4.0/dockerfiles/Dockerfile.cuda. The outputs of import onnxruntime and onnxruntime.get_device() are both normal, and onnxruntime.InferenceSession() seems fine too. However, sess.run() sometimes runs smoothly and sometimes gets stuck (GPU memory is not full: only ~2 GB of 11 GB is used). I have tried various SessionOptions, but the issue persists. Note: the code runs inside a Docker container.

To reproduce

import time
from datetime import datetime

import numpy as np
import onnxruntime

if __name__ == '__main__':

    sess_options = onnxruntime.SessionOptions()
    sess_options.log_severity_level = 1
    # sess_options.intra_op_num_threads = 1
    # sess_options.inter_op_num_threads = 1
    # sess_options.enable_profiling = True
    # sess_options.graph_optimization_level = onnxruntime.GraphOptimizationLevel.ORT_DISABLE_ALL
    # sess_options.graph_optimization_level = onnxruntime.GraphOptimizationLevel.ORT_ENABLE_ALL
    # sess_options.graph_optimization_level = onnxruntime.GraphOptimizationLevel.ORT_ENABLE_BASIC
    # sess_options.graph_optimization_level = onnxruntime.GraphOptimizationLevel.ORT_ENABLE_EXTENDED
    providers = ['CUDAExecutionProvider', 'CPUExecutionProvider'] 
    model_path = 'model.onnx'
    sess = onnxruntime.InferenceSession(model_path, sess_options, providers)

    input_names = [item.name for item in sess.get_inputs()]
    output_names = [item.name for item in sess.get_outputs()]
    
    while True:
        image = np.random.uniform(-1, 1, size=(1, 3, 1280, 1280)).astype(np.float32)
        start_time = time.time()
        print(f'{datetime.now()} starts')
        sess.run(output_names, {input_names[0]: image})
        print(f'{datetime.now()} elapsed {time.time() - start_time}')

Urgency

No response

Platform

Linux

OS Version

Ubuntu 18.04

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

v1.4.0

ONNX Runtime API

Python

Architecture

X64

Execution Provider

CUDA

Execution Provider Library Version

CUDA 10.1, cuDNN 7.6.5, driver 430.50, NVIDIA RTX 2080 Ti

github-actions bot added the ep:CUDA label on Jul 19, 2024
tianleiwu (Contributor) commented Jul 19, 2024

What do you mean by "gets stuck"? (Or could you share the output of the above script that looks abnormal?)
The first inference run is likely to take longer because of cuDNN convolution algorithm tuning and resource allocation; the remaining runs should be faster.
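For reference, newer onnxruntime-gpu releases expose CUDA execution provider options that control this tuning step. The following is a minimal sketch assuming onnxruntime-gpu 1.8 or later (these options do not exist in 1.4), using the heuristic convolution algorithm search plus an explicit warm-up run:

import numpy as np
import onnxruntime

# CUDA EP option (onnxruntime-gpu >= ~1.8): use cuDNN's heuristic convolution
# algorithm search instead of the default exhaustive search, which shortens
# the one-time tuning cost of the first run.
cuda_options = {'cudnn_conv_algo_search': 'HEURISTIC'}

sess = onnxruntime.InferenceSession(
    'model.onnx',
    providers=[('CUDAExecutionProvider', cuda_options), 'CPUExecutionProvider'],
)

# Warm-up run so later timings exclude one-time allocation/tuning costs.
dummy = np.zeros((1, 3, 1280, 1280), dtype=np.float32)
sess.run(None, {sess.get_inputs()[0].name: dummy})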

quarrying (Author) commented:

The output is as follows:

2024-07-19 17:58:51.796326681 [I:onnxruntime:, inference_session.cc:174 ConstructorCommon] Creating and using per session threadpools since use_per_session_threads_ is true
2024-07-19 17:58:53.146773527 [I:onnxruntime:, inference_session.cc:840 Initialize] Initializing session.
2024-07-19 17:58:53.151257663 [I:onnxruntime:, reshape_fusion.cc:37 ApplyImpl] Total fused reshape node count: 0
2024-07-19 17:58:53.154013361 [I:onnxruntime:, reshape_fusion.cc:37 ApplyImpl] Total fused reshape node count: 0
2024-07-19 17:58:53.162242333 [V:onnxruntime:, inference_session.cc:679 TransformGraph] Node placements
2024-07-19 17:58:53.162261331 [V:onnxruntime:, inference_session.cc:681 TransformGraph] All nodes have been placed on [CUDAExecutionProvider].
2024-07-19 17:58:53.166021272 [V:onnxruntime:, session_state.cc:71 CreateGraphInfo] SaveMLValueNameIndexMapping
2024-07-19 17:58:53.166334752 [V:onnxruntime:, session_state.cc:116 CreateGraphInfo] Done saving OrtValue mappings.
2024-07-19 17:58:55.055747308 [I:onnxruntime:, finalize_session_state.cc:173 SaveInitializedTensors] Saving initialized tensors.
2024-07-19 17:58:55.269780199 [I:onnxruntime:, finalize_session_state.cc:225 SaveInitializedTensors] Done saving initialized tensors
2024-07-19 17:58:55.289089454 [I:onnxruntime:, inference_session.cc:954 Initialize] Session successfully initialized.
2024-07-19 17:58:55.344849 starts
2024-07-19 17:58:55.350650700 [I:onnxruntime:, sequential_executor.cc:150 Execute] Begin execution
2024-07-19 17:58:55.783378 elapsed 0.43862199783325195
2024-07-19 17:58:55.860368 starts
2024-07-19 17:58:55.869268259 [I:onnxruntime:, sequential_executor.cc:150 Execute] Begin execution
2024-07-19 17:58:55.922098 elapsed 0.06176280975341797
2024-07-19 17:58:55.984943 starts
2024-07-19 17:58:55.989988341 [I:onnxruntime:, sequential_executor.cc:150 Execute] Begin execution
2024-07-19 17:58:56.042832 elapsed 0.05792117118835449
2024-07-19 17:58:56.097794 starts
2024-07-19 17:58:56.102932070 [I:onnxruntime:, sequential_executor.cc:150 Execute] Begin execution
2024-07-19 17:58:56.154957 elapsed 0.057192087173461914
2024-07-19 17:58:56.209661 starts
2024-07-19 17:58:56.214824273 [I:onnxruntime:, sequential_executor.cc:150 Execute] Begin execution
2024-07-19 17:58:56.267699 elapsed 0.058066606521606445
2024-07-19 17:58:56.322444 starts
2024-07-19 17:58:56.327602975 [I:onnxruntime:, sequential_executor.cc:150 Execute] Begin execution
2024-07-19 17:58:56.379816 elapsed 0.05740189552307129
2024-07-19 17:58:56.434758 starts
2024-07-19 17:58:56.439920970 [I:onnxruntime:, sequential_executor.cc:150 Execute] Begin execution
2024-07-19 17:58:56.492616 elapsed 0.05788826942443848
2024-07-19 17:58:56.548453 starts
2024-07-19 17:58:56.553534271 [I:onnxruntime:, sequential_executor.cc:150 Execute] Begin execution
2024-07-19 17:58:56.605954 elapsed 0.05753040313720703
2024-07-19 17:58:56.662649 starts
2024-07-19 17:58:56.667794682 [I:onnxruntime:, sequential_executor.cc:150 Execute] Begin execution
2024-07-19 17:58:56.719645 elapsed 0.05702567100524902
2024-07-19 17:58:56.775130 starts
2024-07-19 17:58:56.780257027 [I:onnxruntime:, sequential_executor.cc:150 Execute] Begin execution
2024-07-19 17:58:56.833278 elapsed 0.05822920799255371
2024-07-19 17:58:56.891268 starts
2024-07-19 17:58:56.896471060 [I:onnxruntime:, sequential_executor.cc:150 Execute] Begin execution
2024-07-19 17:58:56.948649 elapsed 0.057410240173339844
2024-07-19 17:58:57.003492 starts
2024-07-19 17:58:57.008576873 [I:onnxruntime:, sequential_executor.cc:150 Execute] Begin execution

The program may hang on any inference iteration: the first, the second, or any later one.
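One way to see where a run is hanging is to arm a watchdog around each sess.run call. The sketch below uses only the Python standard library (the 60-second timeout and model path are placeholder assumptions) and prints the Python-level stack traces of all threads if a run does not return in time; native (CUDA/cuDNN) stacks would still need gdb or py-spy on the stuck process.

import faulthandler
import time

import numpy as np
import onnxruntime

sess = onnxruntime.InferenceSession(
    'model.onnx',
    providers=['CUDAExecutionProvider', 'CPUExecutionProvider'],
)
input_name = sess.get_inputs()[0].name

while True:
    image = np.random.uniform(-1, 1, size=(1, 3, 1280, 1280)).astype(np.float32)
    # If this run has not returned within 60 seconds, dump the Python stack
    # traces of all threads to stderr so the blocking call becomes visible.
    faulthandler.dump_traceback_later(60, exit=False)
    start = time.time()
    sess.run(None, {input_name: image})
    faulthandler.cancel_dump_traceback_later()
    print(f'elapsed {time.time() - start:.3f}s')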

tianleiwu (Contributor) commented:
1.4 is too old.
Could you upgrade to onnxruntime-gpu 1.18.1 with CUDA 11.8 and cuDNN 8.9?
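For reference, a quick way to confirm that an upgraded build can actually see the GPU is to check the available providers (a sketch; the pip line assumes a CUDA 11.x-compatible environment):

# pip install onnxruntime-gpu==1.18.1
import onnxruntime

print(onnxruntime.__version__)
# Should include 'CUDAExecutionProvider' if the CUDA/cuDNN libraries are found.
print(onnxruntime.get_available_providers())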

github-actions bot commented:

This issue has been automatically marked as stale due to inactivity and will be closed in 30 days if no further activity occurs. If further support is needed, please provide an update and/or more details.

github-actions bot added the stale label on Aug 20, 2024
andreaslenz3 commented:

I have a similar issue with CPU execution: execution times increase roughly 10x after about one hour.
