
vLLM v0.1.3

@WoosukKwon WoosukKwon released this 02 Aug 23:56
· 1287 commits to main since this release
aa84c92

What's Changed

Major changes

  • More model support: LLaMA 2, Falcon, GPT-J, Baichuan, etc.
  • Efficient support for multi-query attention (MQA) and grouped-query attention (GQA).
  • Changes in the scheduling algorithm: vLLM now uses TGI-style continuous batching.
  • Many bug fixes.
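To make the MQA/GQA bullet above concrete, here is a minimal sketch (not vLLM's actual implementation) of how grouped-query attention maps query heads onto a smaller set of shared key/value heads; the function name and structure are illustrative assumptions:

```python
def gqa_head_mapping(num_query_heads, num_kv_heads):
    """Map each query head to the KV head whose cache it reads in
    grouped-query attention (GQA).

    MQA is the special case num_kv_heads == 1 (all query heads share
    one KV head); standard multi-head attention is the case
    num_kv_heads == num_query_heads (no sharing).
    """
    assert num_query_heads % num_kv_heads == 0
    group_size = num_query_heads // num_kv_heads
    # Query heads q in the same group of size `group_size` read the
    # same KV head, shrinking the KV cache by a factor of group_size.
    return [q // group_size for q in range(num_query_heads)]
```

For example, `gqa_head_mapping(8, 2)` yields `[0, 0, 0, 0, 1, 1, 1, 1]`: eight query heads sharing two KV heads, so the KV cache is a quarter the size of full multi-head attention.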
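The scheduling change above can be illustrated with a toy simulation (an assumption for exposition, not vLLM's scheduler): in continuous batching, finished sequences leave the batch after every decoding iteration and waiting requests are admitted immediately, rather than waiting for the whole batch to drain:

```python
from collections import deque

def continuous_batching(requests, max_batch_size):
    """Simulate iteration-level (continuous) batching.

    `requests` is a list of (request_id, tokens_to_generate) pairs.
    Returns the batch composition at each decoding step.
    """
    waiting = deque(requests)
    running = {}   # request_id -> tokens still to generate
    trace = []     # which requests ran at each step
    while waiting or running:
        # Admit waiting requests into any free batch slots.
        while waiting and len(running) < max_batch_size:
            rid, n = waiting.popleft()
            running[rid] = n
        trace.append(sorted(running))
        # One decoding iteration: every running sequence emits one token.
        for rid in list(running):
            running[rid] -= 1
            if running[rid] == 0:
                del running[rid]  # frees a slot for the next request
    return trace
```

With requests `[("a", 3), ("b", 1), ("c", 2)]` and a batch size of 2, "c" is admitted the moment "b" finishes, so everything completes in 3 steps; static batching would run the first batch for 3 steps and then "c" alone for 2 more.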

All changes

New Contributors

Full Changelog: v0.1.2...v0.1.3