- Pull and run the Triton Inference Server Docker image (docker run pulls the image on first use; ports 8000, 8001, and 8002 are the HTTP, gRPC, and metrics endpoints, respectively)
docker run --gpus=all -it --shm-size=256m \
-p8000:8000 -p8001:8001 -p8002:8002 \
-v ${PWD}:/workspace/ -v ${PWD}/model_repository:/models \
nvcr.io/nvidia/tritonserver:22.12-py3
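The second -v flag mounts ./model_repository into the container as /models, which Triton will scan for models. Triton expects one subdirectory per model, each containing a config.pbtxt and at least one numbered version directory; a typical layout looks like the sketch below (the model and file names are illustrative, not part of this repo):

```
model_repository/
└── my_model/             # one directory per model
    ├── config.pbtxt      # model configuration
    └── 1/                # version directory
        └── model.py      # e.g., a Python-backend model
```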
- Inside the container, install the required packages
cd /models
pip install -r requirements.txt
- Start the Triton Inference Server
cd /opt/tritonserver
tritonserver --model-repository=/models
- Install the Triton client (in a separate terminal, since the server is running in the foreground; the quotes keep the shell from expanding the brackets)
pip install "tritonclient[http]"
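Once the client package is installed, you can sanity-check the connection before running the demo. A minimal sketch, assuming the server is reachable on the default HTTP port on localhost:

```python
import tritonclient.http as httpclient

# Connect to Triton's HTTP endpoint and check readiness.
client = httpclient.InferenceServerClient(url="localhost:8000")
print(client.is_server_ready())  # True once the server is up and ready
```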
- Run the demo client
python client.py
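client.py itself is not shown here, but a minimal HTTP client follows the pattern below. This is a sketch, not the repo's actual demo: the model name my_model and the tensor names INPUT0/OUTPUT0 are hypothetical placeholders; substitute the names and shapes from your model's config.pbtxt.

```python
# Minimal Triton HTTP client sketch, assuming a hypothetical model
# "my_model" with one FP32 input "INPUT0" and one output "OUTPUT0".
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Build the request: one input tensor filled with random FP32 data.
data = np.random.rand(1, 4).astype(np.float32)
infer_input = httpclient.InferInput("INPUT0", list(data.shape), "FP32")
infer_input.set_data_from_numpy(data)

# Send the inference request and read back the output as a numpy array.
response = client.infer("my_model", inputs=[infer_input])
print(response.as_numpy("OUTPUT0"))
```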