Feature request
NVIDIA Triton can run multiple models on the same GPU.
I didn't see this as an option for this project.
Is that possible today, or would it need to be a new feature?
Motivation
GPUs are expensive, especially when running large models that need an A40 or A100.
It would be great if I could run more models than I have GPUs.
Your contribution
Sure, I can contribute if development is needed.