Build, run, and setup scripts for the complete TensorRT-LLM pipeline on RTX A6000 Ada (SM89). Reproducible path from HuggingFace checkpoint to deployable .engine file, with FP16 baseline and FP8 quantization. Companion material to the 4-part blog series on ai-box.eu — in preparation for the NVIDIA TensorRT Edge-LLM ecosystem.
-
Updated
May 16, 2026 - Shell