Create API for hardware activation (Nvidia)

`v1/hardware/activate`

For Nvidia cards, we can use the API below:
![image](https://github.com/user-attachments/assets/2913066f-8984-4365-a866-872d68511b44)

```
Records device as the device on which the active host thread executes the device code. 
If the host thread has already initialized the CUDA runtime by calling non-device management 
runtime functions or if there exists a CUDA driver context active on the host thread, then this
call returns cudaErrorSetOnActiveProcess
```
Note that we may need to restart the server to apply this change.

Update: `Records device as the device on which the active host thread executes the device code`. It seems the device is set per host thread, so we cannot apply the active device to all threads with a single call.

Question: Should we use `CUDA_VISIBLE_DEVICES`?
Answer: This is the best approach that I know of. We will support `TensorRT-LLM` and `ONNX`(?), so it will reduce complexity because we don't need to change the engine logic.
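
A minimal sketch of the environment-variable approach, assuming the server starts each engine as a child process. `engine_env` and `launch_engine` are hypothetical helper names, not existing code in this project, and the child command here is a stand-in that only echoes the variable so the sketch runs without a GPU. The CUDA runtime reads the variable when it initializes, so it has to be set before the process starts, which is also why a restart may be needed.

```python
import os
import subprocess
import sys


def engine_env(gpu_ids, base_env=None):
    """Return a copy of the environment with CUDA visibility limited to gpu_ids."""
    env = dict(os.environ if base_env is None else base_env)
    # Comma-separated device indices, in the order the engine should see them.
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


def launch_engine(cmd, gpu_ids):
    """Run a child process (hypothetical engine) that only sees gpu_ids."""
    return subprocess.run(
        cmd,
        env=engine_env(gpu_ids),
        capture_output=True,
        text=True,
        check=True,
    )


if __name__ == "__main__":
    # Stand-in for the real engine binary: print what the child process sees.
    child = [sys.executable, "-c", "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"]
    print(launch_engine(child, [0, 2]).stdout.strip())
```

Because the restriction lives in the process environment, the engine code itself does not need to know which GPUs were activated.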

For AMD, I think we have `ROCR_VISIBLE_DEVICES` (or `HIP_VISIBLE_DEVICES`). Are there any other environment variables that could be useful?
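
If both vendors end up using a visibility variable, the activation logic could share one code path. The sketch below is only an illustration: `VISIBILITY_VARS` and `visibility_setting` are hypothetical names, and picking `HIP_VISIBLE_DEVICES` for AMD is an assumption (the ROCm runtime also honors `ROCR_VISIBLE_DEVICES`).

```python
# Assumption: one visibility variable per vendor is enough for our engines.
VISIBILITY_VARS = {
    "nvidia": "CUDA_VISIBLE_DEVICES",
    "amd": "HIP_VISIBLE_DEVICES",
}


def visibility_setting(vendor, gpu_ids):
    """Return the (variable name, value) pair that activates gpu_ids for vendor."""
    key = vendor.lower()
    if key not in VISIBILITY_VARS:
        raise ValueError(f"unsupported vendor: {vendor!r}")
    if any(i < 0 for i in gpu_ids):
        raise ValueError("GPU indices must be non-negative")
    return VISIBILITY_VARS[key], ",".join(str(i) for i in gpu_ids)


if __name__ == "__main__":
    name, value = visibility_setting("amd", [1, 0])
    print(f"{name}={value}")
```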
