This repository was archived by the owner on Jul 4, 2025. It is now read-only.

Create API for hardware activation (Nvidia) #1603

@vansangpfiev

Description


v1/hardware/activate

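For Nvidia cards we can use the API below (shown as a screenshot in the original issue; the description quoted below matches the CUDA runtime's `cudaSetDevice`):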

Records device as the device on which the active host thread executes the device code.
If the host thread has already initialized the CUDA runtime by calling non-device management
runtime functions or if there exists a CUDA driver context active on the host thread, then this
call returns cudaErrorSetOnActiveProcess.

Note that we may need to restart the server to apply this change.

Update: the documentation says the call "records device as the device on which the active host thread executes the device code", so the setting is per host thread. It seems we cannot apply the active device to all threads with a single call.
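To make the per-thread limitation concrete, here is a small Python analogy. It does not call CUDA at all: `set_device` and `get_device` are hypothetical stand-ins that keep the "current device" in thread-local storage, which is how the quoted documentation describes the setting. The default of 0 mirrors CUDA's default device.

```python
import threading

# Thread-local storage standing in for the per-thread "current device".
# This is an analogy only; no CUDA call is made here.
_state = threading.local()


def set_device(index):
    """Record the current device for the calling thread only."""
    _state.device = index


def get_device():
    """Return the calling thread's current device (0 if never set)."""
    return getattr(_state, "device", 0)


def device_seen_by_new_thread():
    """Start a fresh thread and report which device it considers current."""
    seen = []
    worker = threading.Thread(target=lambda: seen.append(get_device()))
    worker.start()
    worker.join()
    return seen[0]
```

If the request-handling thread calls `set_device(1)`, `device_seen_by_new_thread()` still reports 0, so every worker thread would need to repeat the call itself.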

Question: Should we use `CUDA_VISIBLE_DEVICES`?
Answer: This is the best approach that I know of. We will support TensorRT-LLM and ONNX(?), so it will reduce complexity because we don't need to change the logic.
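A rough sketch of the environment-variable approach, assuming the engine runs as a child process that we can (re)start with a modified environment. The helper names `build_env` and `visible_devices_in_child` are made up for illustration and are not part of any existing code; only `CUDA_VISIBLE_DEVICES` itself is a real CUDA setting.

```python
import os
import subprocess
import sys


def build_env(active_gpus, base_env=None):
    """Return a copy of the environment with CUDA_VISIBLE_DEVICES set.

    active_gpus: list of non-negative GPU indices, without duplicates.
    base_env: environment to copy (defaults to the current process's).
    """
    if any((not isinstance(i, int)) or i < 0 for i in active_gpus):
        raise ValueError("GPU indices must be non-negative integers")
    if len(set(active_gpus)) != len(active_gpus):
        raise ValueError("GPU indices must be unique")
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in active_gpus)
    return env


def visible_devices_in_child(active_gpus):
    """Spawn a child Python process and return the value it sees.

    The child stands in for the engine process; it only echoes the variable.
    """
    code = "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=build_env(active_gpus),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Because the variable is read when the CUDA runtime initializes in the child, this fits the earlier note that a restart may be needed to apply the change.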

For AMD, I think ROCm has equivalents (`ROCR_VISIBLE_DEVICES`, `HIP_VISIBLE_DEVICES`). Are there any other environment variables that could be useful?
