Run Multiple Models on the Same GPU with Amazon SageMaker Multi-Model Endpoints Powered by NVIDIA Triton Inference Server. A Java client is also provided.
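As a rough illustration of what such a Java client does, the sketch below builds a Triton (KServe v2 protocol) JSON inference payload of the kind a client would POST to a SageMaker multi-model endpoint, where the specific model archive is selected per request via the `TargetModel` parameter (sent as the `X-Amzn-SageMaker-Target-Model` HTTP header). The input name, shapes, and model archive name here are illustrative assumptions, not values from the repository.

```java
import java.util.List;
import java.util.stream.Collectors;

public class TritonPayload {
    // Serialize a 1-D FP32 input tensor into Triton's KServe v2 JSON
    // request format: {"inputs": [{"name", "shape", "datatype", "data"}]}.
    static String buildRequest(String inputName, List<Float> values) {
        String data = values.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(", "));
        return String.format(
            "{\"inputs\": [{\"name\": \"%s\", \"shape\": [1, %d], " +
            "\"datatype\": \"FP32\", \"data\": [%s]}]}",
            inputName, values.size(), data);
    }

    public static void main(String[] args) {
        // "INPUT__0" is a hypothetical input tensor name.
        String body = buildRequest("INPUT__0", List.of(1.0f, 2.0f, 3.0f));
        System.out.println(body);
        // With SageMaker MME, the per-request model is chosen via a header
        // (archive name below is hypothetical):
        //   X-Amzn-SageMaker-Target-Model: resnet50-v1.tar.gz
    }
}
```

In practice the body above would be passed to `InvokeEndpoint` (e.g. via the AWS SDK's SageMaker Runtime client) along with the endpoint name, content type `application/json`, and the target model archive.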
Updated Nov 14, 2022 - Java