update readme for vLLM 0.17.0 release on Intel GPU by yma11 · Pull Request #971 · intel/ai-containers

yma11 · 2026-03-25T06:46:21Z

Description

Release notes for the v0.17.0 release.

Related Issue

Changes Made

The code follows the project's coding standards.
No Intel Internal IP is present within the changes.
The documentation has been updated to reflect any changes in functionality.

Validation

I have tested any changes in container groups locally with test_runner.py with all existing tests passing, and I have added new tests where applicable.

Signed-off-by: Yan Ma <yan.ma@intel.com>

yma11 · 2026-03-25T06:46:49Z

@rogerxfeng8 please take a look. Thanks.

rogerxfeng8 · 2026-03-25T06:50:35Z

+| KMD Driver | 6.14.0 |
+| oneAPI | 2025.3.2.4 with hotfix |
+| PyTorch | 2.10 |
+| vllm-xpu-kernels | 0.14.0 |


rogerxfeng8 · 2026-03-25T06:52:27Z

+
+* **torch.compile**: Can be enabled for the FP16/BF16 path.
+* **speculative decoding**: Supports methods `n-gram`, `EAGLE`, `EAGLE3`, `medusa` and `suffix`. For detailed usage, refer [document](https://docs.vllm.ai/en/stable/features/speculative_decoding/).
+* **async scheduling**: Can be enabled by `--async-scheduling`. This may help reduce the CPU overheads, leading to better latency and throughput.


Async scheduling is not supported in this release.

It's disabled by default, but users can explicitly enable it. It doesn't fail in all cases, so I think we can call it experimental.
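
To make the opt-in behaviour discussed above concrete, here is a minimal sketch of a launcher helper that only adds `--async-scheduling` when asked. The helper name `build_serve_command` and the model id `my-org/my-model` are hypothetical and used purely for illustration; only the `--async-scheduling` flag comes from the README change under review, and `vllm serve <model>` is assumed as the entry point.

```python
import shlex


def build_serve_command(model: str, async_scheduling: bool = False) -> list[str]:
    """Build an argv list for `vllm serve` (hypothetical helper, not part of vLLM)."""
    cmd = ["vllm", "serve", model]
    if async_scheduling:
        # Off by default; the thread above treats this flag as experimental.
        cmd.append("--async-scheduling")
    return cmd


if __name__ == "__main__":
    # "my-org/my-model" is a placeholder model id.
    print(shlex.join(build_serve_command("my-org/my-model", async_scheduling=True)))
```

Keeping the flag behind an explicit argument mirrors the "disabled by default, user can set it" behaviour described in the reply.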

rogerxfeng8 · 2026-03-26T06:01:28Z

+In addition, features such as [reasoning_outputs](https://docs.vllm.ai/en/latest/features/reasoning_outputs.html), [structured_outputs](https://docs.vllm.ai/en/latest/features/structured_outputs.html), and [tool calling](https://docs.vllm.ai/en/latest/features/tool_calling.html) are supported. The following experimental features are also available:
+
+* **torch.compile**: Can be enabled for the FP16/BF16 path.
+* **speculative decoding**: Supports methods `n-gram`, `EAGLE`, `EAGLE3`, `medusa` and `suffix`. For detailed usage, refer [document](https://docs.vllm.ai/en/stable/features/speculative_decoding/).


refer -> refer to
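
For the speculative decoding bullet quoted above, a rough sketch of how an n-gram configuration might be assembled and passed on the command line. This assumes the `speculative_config` dictionary form and the `--speculative-config` JSON flag from the upstream vLLM documentation linked in the bullet; the key names (`method`, `num_speculative_tokens`, `prompt_lookup_max`), the `ngram` spelling, and the numeric values are assumptions to verify against the 0.17.0 docs, and both helper functions are hypothetical.

```python
import json


def ngram_speculative_config(num_speculative_tokens: int = 5,
                             prompt_lookup_max: int = 4) -> dict:
    """Return an n-gram speculative decoding config (key names assumed from upstream docs)."""
    return {
        "method": "ngram",  # assumed spelling of the `n-gram` method
        "num_speculative_tokens": num_speculative_tokens,
        "prompt_lookup_max": prompt_lookup_max,
    }


def speculative_cli_args(config: dict) -> list[str]:
    """Serialize the config as a JSON argument for the assumed `--speculative-config` flag."""
    return ["--speculative-config", json.dumps(config)]


if __name__ == "__main__":
    print(speculative_cli_args(ngram_speculative_config()))
```

The other methods listed in the bullet (`EAGLE`, `EAGLE3`, `medusa`, `suffix`) would need their own keys, which are not sketched here.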

update readme for vLLM 0.17.0 release on Intel GPU

845ff29

Signed-off-by: Yan Ma <yan.ma@intel.com>

yma11 requested review from jitendra42, sharvil10 and sramakintel as code owners March 25, 2026 06:46

rogerxfeng8 reviewed Mar 25, 2026

address comments

068d3fd

Signed-off-by: Yan Ma <yan.ma@intel.com>

rogerxfeng8 reviewed Mar 26, 2026

update

0837dc7

Signed-off-by: Yan Ma <yan.ma@intel.com>

yma11 force-pushed the 017 branch from ec31878 to 0837dc7 on March 26, 2026 06:48

rogerxfeng8 approved these changes Mar 26, 2026

jitendra42 approved these changes Mar 26, 2026

jitendra42 merged commit 573910e into intel:main Mar 26, 2026
7 checks passed

jitendra42 merged 3 commits into intel:main from yma11:017