Release v0.1.0 Initial release of AIConfigurator · ai-dynamo/aiconfigurator

AIConfigurator is a tool designed for Dynamo to optimize disaggregated serving for generative AI models. It automatically finds optimal deployment configurations by searching thousands of candidates in tens of seconds, helping you achieve better throughput and latency in disaggregated serving.

Major Features

Automated Configuration Search: Searches thousands of deployment configurations to find the optimal one for both disaggregated and aggregated systems, and chooses between disaggregated and aggregated deployment.
SLA-based Optimization: Optimizes under TTFT (Time-To-First-Token) and TPOT (Time-Per-Output-Token) constraints to address the throughput@latency problem, i.e. maximizing throughput while meeting latency targets.
Dynamo Integration: Integrates with Dynamo by automatically generating deployment configurations.
Multi-framework Support: Compatible with the NVIDIA TensorRT-LLM backend, with an extensible architecture for other frameworks (coming soon).
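
The throughput@latency idea can be illustrated with a minimal sketch: filter candidates by the TTFT and TPOT limits, then keep the one with the highest throughput. This is not AIConfigurator's actual implementation or API. The `Candidate` record, the `pick_best` helper, the configuration names, and all numbers are hypothetical and exist only to make the selection rule concrete.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one evaluated deployment configuration."""
    name: str
    mode: str                     # "disagg" or "agg"
    ttft_ms: float                # estimated time to first token
    tpot_ms: float                # estimated time per output token
    tokens_per_s_per_gpu: float   # estimated throughput


def pick_best(candidates: Iterable[Candidate],
              max_ttft_ms: float,
              max_tpot_ms: float) -> Optional[Candidate]:
    """Return the highest-throughput candidate that meets both latency limits."""
    feasible = [c for c in candidates
                if c.ttft_ms <= max_ttft_ms and c.tpot_ms <= max_tpot_ms]
    # default=None covers the case where no candidate satisfies the SLA.
    return max(feasible, key=lambda c: c.tokens_per_s_per_gpu, default=None)


# Illustrative values only; these are not measurements.
candidates = [
    Candidate("agg-tp4", "agg", ttft_ms=250.0, tpot_ms=18.0, tokens_per_s_per_gpu=900.0),
    Candidate("disagg-a", "disagg", ttft_ms=280.0, tpot_ms=15.0, tokens_per_s_per_gpu=1200.0),
    Candidate("disagg-b", "disagg", ttft_ms=450.0, tpot_ms=12.0, tokens_per_s_per_gpu=1500.0),
]

best = pick_best(candidates, max_ttft_ms=300.0, max_tpot_ms=20.0)
print(best.name if best else "no feasible configuration")
```

With these made-up numbers, `disagg-b` has the highest throughput but misses the 300 ms TTFT limit, so the selection falls to `disagg-a`. Because aggregated and disaggregated candidates sit in the same pool, the same rule also decides between the two deployment modes.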

Model and System Support

Comprehensive Model Support: the following model families are supported:
- GPT
- LLAMA (2, 3)
- MoE
- QWEN
- DEEPSEEK_V3
- NEMOTRON
System Support: H200 SXM and H100 SXM

User Interfaces

Command Line Interface (Recommended): A simple CLI with three basic arguments for quick start and configuration generation.
Web Application: An interactive web interface for advanced configuration tuning and visualization.
