Installation

The FIL backend is a part of Triton and can be installed via the methods described in the main Triton documentation. To quickly get up and running with a Triton Docker image, follow these steps.

Note: Looking for instructions to build the FIL backend yourself? Check out our build guide.

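Prerequisites

Docker and, to use GPUs from inside the container, the NVIDIA Container Toolkit, which the --gpus all flag in the command below depends on.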

Getting the container

Triton containers are available from NGC and can be pulled with:

docker pull nvcr.io/nvidia/tritonserver:22.10-py3

Note that the FIL backend cannot be used in the 21.06 version of this container; the 21.06.1 patch release is the earliest Triton version with a working FIL backend implementation.

Starting the container

To deploy a model, you will need to provide the serialized model and its configuration file in a specially structured directory called the "model repository." Check out the configuration guide for details on how to do this for your model.
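As a rough illustration of what "specially structured" means, the sketch below creates an empty skeleton following the general Triton layout: one directory per model, holding config.pbtxt next to numbered version directories. The names example_model, the version 1, and the file name xgboost.json are placeholders assumed for this example; the configuration guide remains the reference for the file name and settings your model type needs.

```shell
# Hypothetical names: "example_model", version "1", and "xgboost.json" are
# placeholders chosen for this sketch, not values taken from this guide.
repo="${TMPDIR:-/tmp}/model_repository_sketch"
mkdir -p "${repo}/example_model/1"

# config.pbtxt sits beside the numbered version directories; the serialized
# model goes inside one of them. Empty files stand in for the real ones here.
touch "${repo}/example_model/config.pbtxt"
touch "${repo}/example_model/1/xgboost.json"

# List the resulting files to show the layout.
find "${repo}" -type f | sort
```

The server will not load these empty stand-in files; the sketch only shows where the real files would go.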

Assuming your model repository is on the host system, you can bind-mount it into the container and start the server via the following command:

docker run --gpus all -p 8000:8000 -p 8001:8001 -p 8002:8002 -v ${MODEL_REPO}:/models --name tritonserver nvcr.io/nvidia/tritonserver:22.10-py3 tritonserver --model-repository=/models

Remember that bind-mounts require an absolute path to the host directory, so ${MODEL_REPO} should be replaced by the absolute path to the model repository directory on the host.
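One way to avoid passing a relative path by accident is to resolve the directory before exporting MODEL_REPO. The helper name abs_dir is made up for this sketch, and "." stands in for wherever your model repository actually lives.

```shell
# abs_dir DIR: print the absolute, symlink-resolved path of an existing
# directory, or fail if DIR does not exist. (Helper name is hypothetical.)
abs_dir() {
  (cd "$1" 2>/dev/null && pwd -P)
}

# "." is a stand-in for a relative path such as ./model_repository.
MODEL_REPO="$(abs_dir .)" || { echo "model repository not found" >&2; exit 1; }
export MODEL_REPO
echo "MODEL_REPO=${MODEL_REPO}"
```

Using cd and pwd -P rather than realpath keeps the sketch independent of whether realpath is installed on the host.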

Assuming you started your container with the name "tritonserver" as in the above snippet, you can bring the server down again and remove the container with:

docker rm -f tritonserver
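The start and teardown steps above can be combined into a small smoke test, sketched below under a few assumptions: the container is started detached with docker run -d, readiness is checked by polling Triton's HTTP health endpoint /v2/health/ready on port 8000, and the image tag, retry count, and delay are arbitrary choices for illustration. The function names wait_for_ready and smoke_test are hypothetical.

```shell
# Assumed image tag; use whichever release you pulled.
IMAGE="nvcr.io/nvidia/tritonserver:22.10-py3"

# wait_for_ready URL [ATTEMPTS] [DELAY_SECONDS]
# Poll URL until it answers with an HTTP success status, or give up.
wait_for_ready() {
  url="$1"; attempts="${2:-30}"; delay="${3:-2}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if curl -sf -o /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# smoke_test MODEL_REPO
# Start Triton detached, wait for readiness, then remove the container.
smoke_test() {
  docker run -d --gpus all -p 8000:8000 -p 8001:8001 -p 8002:8002 \
    -v "$1":/models --name tritonserver "$IMAGE" \
    tritonserver --model-repository=/models >/dev/null || return 1
  if wait_for_ready "http://localhost:8000/v2/health/ready"; then
    echo "server ready"; status=0
  else
    echo "server not ready" >&2; status=1
  fi
  docker rm -f tritonserver >/dev/null
  return "$status"
}
```

Calling smoke_test with the absolute path to your model repository runs the whole cycle; nothing is executed until one of the functions is called.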