How to specify priority for streams used by myelin inside TensorRT #2528

@oathdruid

Description

We run multiple models concurrently on the same GPU, and each model has a different priority.

For a model without Myelin, the priority can be specified by passing a pre-created stream to enqueueV2.
TensorRT then runs the ops on that stream, at the priority we want.
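
As a rough illustration of the setup described above, here is a minimal sketch of creating a stream with an explicit priority using the CUDA runtime. `pickStreamPriority` and `createPriorityStream` are hypothetical helper names, not part of CUDA or TensorRT, and the "level" scheme (0 = most urgent) is an assumption made for the example. The CUDA part is guarded with `__has_include` (C++17) so the pure helper still compiles on a machine without the CUDA toolkit.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: map a logical level (0 = most urgent) onto the
// device's stream priority range. In CUDA, numerically lower values mean
// higher priority, so greatestPriority <= leastPriority.
inline int pickStreamPriority(int leastPriority, int greatestPriority, int level) {
    // Clamp so out-of-range levels fall back to the nearest supported priority.
    return std::max(greatestPriority,
                    std::min(leastPriority, greatestPriority + level));
}

#if __has_include(<cuda_runtime.h>)
#include <cuda_runtime.h>

// Hypothetical helper: create a non-blocking stream at the given logical level.
inline cudaError_t createPriorityStream(cudaStream_t* stream, int level) {
    int least = 0;
    int greatest = 0;
    // Query the priority range supported by the current device.
    cudaError_t err = cudaDeviceGetStreamPriorityRange(&least, &greatest);
    if (err != cudaSuccess) {
        return err;
    }
    return cudaStreamCreateWithPriority(
        stream, cudaStreamNonBlocking,
        pickStreamPriority(least, greatest, level));
}
#endif
```

The resulting stream would then be passed to the execution context, for example `context->enqueueV2(bindings, stream, nullptr)`. This only controls the stream supplied by the caller; it does not affect any streams that are created internally.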

For a model with Myelin, however, ops may also run on internal streams managed by Myelin.
These internal streams appear to be created with the default priority, and there seems to be no public API to change that.

So, is there any way to make stream priority work for a model that uses Myelin?
Or can we disable Myelin's concurrency support, so that all ops run on the pre-created stream?

Environment

TensorRT Version: 8.4.3
NVIDIA GPU: A10
NVIDIA Driver Version: 470.82.01
CUDA Version: 11.4
Operating System: Ubuntu 20.04

Relevant Files

Steps To Reproduce

Metadata

Labels: triaged (Issue has been triaged by maintainers)