
[v0.2.2] Release Tracker #1551

Closed · 3 tasks done
WoosukKwon opened this issue Nov 2, 2023 · 7 comments · Fixed by #1689
Labels
release Related to new version release

Comments

@WoosukKwon
Collaborator

WoosukKwon commented Nov 2, 2023

ETA: ~~Nov 3rd (Fri) - Nov 6th (Mon)~~ Revised to Nov 17th (Fri) - 19th (Sun).

Major changes

  • Extensive refactoring for better tensor parallelism & quantization support
  • Changes in scheduler: from 1D flattened input tensor to 2D tensor
  • Bump up to PyTorch v2.1 + CUDA 12.1
  • New models: Yi, ChatGLM, Phi
  • Added LogitsProcessor API
  • Preliminary support for SqueezeLLM
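To illustrate the new LogitsProcessor API mentioned above: a minimal sketch, assuming vLLM's convention that a logits processor is a callable taking the previously generated token ids and the logits for the next token, and returning (possibly modified) logits, registered via `SamplingParams(logits_processors=[...])`. For illustration the logits here are a plain Python list; in vLLM itself they are a `torch.Tensor`.

```python
# Hypothetical example of a logits processor in the style of vLLM's
# callable interface: (generated_token_ids, logits) -> logits.

def ban_token(banned_id):
    """Return a processor that masks out a single token id."""
    def processor(token_ids, logits):
        logits = list(logits)
        # Setting the logit to -inf means the token can never be sampled,
        # regardless of temperature or sampling strategy.
        logits[banned_id] = float("-inf")
        return logits
    return processor

proc = ban_token(2)
out = proc([0, 1], [0.1, 0.5, 0.9, 0.2])
print(out)  # [0.1, 0.5, -inf, 0.2]
```

In real usage this callable would be passed as `SamplingParams(logits_processors=[proc])` so it runs on every decoding step; the function names above (`ban_token`, `processor`) are illustrative, not part of the API.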

PRs to be merged before the release

@WoosukKwon
Collaborator Author

@zhuohan123 @simon-mo Please feel free to add any PR that should be merged for the next release.

@WoosukKwon WoosukKwon added the release Related to new version release label Nov 2, 2023
@WoosukKwon WoosukKwon pinned this issue Nov 2, 2023
@simon-mo
Collaborator

simon-mo commented Nov 2, 2023

I can release the corresponding Docker image as well! Hopefully we can get the logits processor in too?

@esmeetu
Collaborator

esmeetu commented Nov 6, 2023

It seems that there is something wrong with batch processing on the current main branch.
When I start an API server serving the https://huggingface.co/WizardLM/WizardCoder-1B-V1.0 model and run HumanEval with batched requests (164 concurrent requests) and greedy sampling, it scores less than 10% on main, whereas 0.2.1-post1 gives me 23.17%.
Related issue: #1570
I have no idea what is causing this and hope it will be addressed in the coming v0.2.2. Thanks!
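The repro described above can be sketched as follows. This is a hypothetical harness, not the reporter's actual script: the model name comes from the comment, greedy sampling is requested via `temperature=0.0` (the usual convention for an OpenAI-compatible server), and `send` is a placeholder where a real HTTP POST to the server's completions endpoint would go.

```python
# Hypothetical sketch: build and dispatch many concurrent greedy-sampling
# requests, as in the 164-request HumanEval run described above.
import json
from concurrent.futures import ThreadPoolExecutor

def make_payload(prompt, model="WizardLM/WizardCoder-1B-V1.0"):
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": 256,
        "temperature": 0.0,  # temperature 0 => greedy sampling
    }

def send(payload):
    # Placeholder: in a real run this would POST the payload to the
    # server (e.g. with urllib.request or an OpenAI client) and return
    # the completion. Here it just serializes the request body.
    return json.dumps(payload)

prompts = [f"def task_{i}():" for i in range(164)]  # 164 concurrent reqs
with ThreadPoolExecutor(max_workers=32) as pool:
    bodies = list(pool.map(send, (make_payload(p) for p in prompts)))
```

Because greedy decoding is deterministic per prompt, a score that drops between versions under this setup points at the batching path rather than at sampling noise.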

@miko7879

miko7879 commented Nov 6, 2023

Will v0.2.2 work with CUDA 11.8?

@esmeetu
Collaborator

esmeetu commented Nov 7, 2023

> It seems that there is something wrong with batch processing on the current main branch. […]

#1546 fixes this.

@shuaiwang2022

When is v0.2.2 scheduled to be released?

@WoosukKwon
Collaborator Author

@incrementallearning It's in progress. We are planning to release it as soon as possible. ETA is Nov 17th (Fri) - 19th (Sun).

@WoosukKwon WoosukKwon linked a pull request Nov 16, 2023 that will close this issue
@WoosukKwon WoosukKwon unpinned this issue Nov 19, 2023