Releases: triton-inference-server/model_analyzer
Release 1.40.0 corresponding to NGC container 24.05
Release 1.39.0 corresponding to NGC container 24.04
New Features and Improvements
- Model Analyzer now supports profiling Large Language Models (LLMs) using GenAI-Perf
Release 1.38.0 corresponding to NGC container 24.03
v1.38.0 Update README.md for 1.38.0 / 24.03 (#848)
Release 1.37.0 corresponding to NGC container 24.02
v1.37.0 Update README.md for 24.02 (#830)
Release 1.36.0 corresponding to NGC container 24.01
New Features and Improvements
- Model Analyzer now correctly loads and optimizes ensemble models
- Model Analyzer now correctly works with SSL via gRPC
- Model Analyzer now handles the case of optimizing a model on a remote Triton server without requiring a local GPU
Release 1.35.0 corresponding to NGC container 23.12
Known issues
- Model Analyzer is not able to analyze and optimize ensemble model configs due to a bug in the way composing models are loaded.
- Model Analyzer does not work with SSL via gRPC
Release 1.34.0 corresponding to NGC container 23.11
v1.34.0 Update README and version for 1.34.0 / 23.11 (#788)
Release 1.33.0 corresponding to NGC container 23.10
v1.33.0 Update README and versions for 23.10 branch (#772)
Release 1.32.0 corresponding to NGC container 23.09
- Remote mode now has the same capabilities as other modes
- Supports profiling in both brute and quick search modes
Release 1.31.0 corresponding to NGC container 23.08
- Added Quick Start guides for Ensemble and BLS models