This page is accessible via roadmap.vllm.ai
This is a living document! For each item here, we intend to link the RFC as well as the discussion channel in the vLLM Slack.
Core Themes
Path to vLLM v1.0.0
We want to fully remove the V0 engine and clean the codebase of unpopular and unsupported features. vLLM v1.0.0 will be performant and easy to maintain, as well as modular and extensible, while preserving backward compatibility.
Cluster Scale Serving
As models grow in size, multi-node scale-out serving and disaggregated prefill/decode become the way to go. We are fully committed to making vLLM the best engine for cluster scale serving; a minimal sketch of what scale-out looks like from the user's side follows below.
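As a rough illustration (not part of the roadmap itself), the sketch below shows how a large model can already be sharded with vLLM's offline `LLM` API; the model name and parallelism degrees are placeholders chosen for the example.

```python
from vllm import LLM, SamplingParams

# Illustrative only: tensor parallelism splits each layer across GPUs in a node,
# pipeline parallelism splits the layer stack across nodes (coordinated via Ray).
llm = LLM(
    model="meta-llama/Llama-3.1-70B-Instruct",  # example model, not from the roadmap
    tensor_parallel_size=8,       # 8-way tensor parallelism within a node
    pipeline_parallel_size=2,     # 2-way pipeline parallelism across nodes
)

params = SamplingParams(temperature=0.8, max_tokens=64)
outputs = llm.generate(["Summarize the vLLM roadmap in one sentence."], params)
print(outputs[0].outputs[0].text)
```

Disaggregated prefill/decode goes a step further by running prefill and decode on separate instances and transferring KV caches between them, which is one of the main workstreams under this theme.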
vLLM for Production
vLLM is designed for production. We will continue to enhance stability and tune the systems around vLLM for optimal performance.
Features
Models
Use Case
Hardware
Optimizations
Community
vLLM Ecosystem
Hardware Plugins
AIBrix: v0.3.0 roadmap aibrix#698
Production Stack: [Roadmap] vLLM Production Stack roadmap for 2025 Q2 production-stack#300
Ray LLM: [llm] Roadmap for Data and Serve LLM APIs ray-project/ray#51313
LLM Compressor
GuideLLM
Dynamo
Prioritized Support for RLHF Systems: veRL, OpenRLHF, TRL, OpenInstruct, Fairseq2, ...
If an item you want is not on the roadmap, your suggestions and contributions are very welcome! Please feel free to comment in this thread, open a feature request, or create an RFC.
Historical Roadmap: #11862, #9006, #5805, #3861, #2681, #244