Feature request
Allow multiple models, or multiple instances of the same model, to execute in parallel on the same system.
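To illustrate the request, here is a minimal sketch of the kind of setup meant, assuming one model instance per GPU with requests spread across the instances. The helper names `worker_env` and `assign_round_robin` are hypothetical and not part of any existing API. `CUDA_VISIBLE_DEVICES` is the standard CUDA environment variable that limits which devices a process can see. Loading and running the model are deliberately left out.

```python
import os
from itertools import cycle


def worker_env(gpu_index, base_env=None):
    """Return a copy of the environment that exposes a single GPU to a child process.

    Hypothetical helper: a launcher could pass this dict as `env=` when
    starting one model-serving process per GPU.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


def assign_round_robin(requests, num_gpus):
    """Distribute requests over GPU indices in round-robin order."""
    assignment = {gpu: [] for gpu in range(num_gpus)}
    for request, gpu in zip(requests, cycle(range(num_gpus))):
        assignment[gpu].append(request)
    return assignment


if __name__ == "__main__":
    # Example with 2 GPUs (the GPU count is an assumption for the demo).
    for gpu in range(2):
        print(gpu, worker_env(gpu, base_env={})["CUDA_VISIBLE_DEVICES"])
    print(assign_round_robin(["req-a", "req-b", "req-c"], num_gpus=2))
```

Round-robin is only the simplest possible policy; a real implementation would probably route each request to whichever instance is idle.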
Motivation
Make use of otherwise idle GPU resources to improve performance on multi-GPU systems.
Your contribution
None
Comment
Is there already a way to improve performance by loading the same model on each GPU?