Releases · ROCm/vllm

08 Nov 08:14

github-actions

v0.6.3+vllm_whl_update_testing_

e735a4b

v0.6.3+vllm_whl_update_testing_ Pre-release

What's Changed

Base ROCm 6.2.2 by @gshtras in #260
Upstream merge 24 11 04 by @gshtras in #262
Add gfx1201 to supported ARCH list by @qli88 in #264
[Bugfix] A fix to enable FORCED sampling again. by @Alexei-V-Ivanov-AMD in #265
Eliminated -Wswitch-bool warning and a leftover incorrect import by @gshtras in #266
Navi correctness fix 1 to 300 count by @maleksan85 in #263
Navi 1 to 300 correctness fix follow up by @maleksan85 in #267

Full Changelog: v0.6.3.post2+rocm...v0.6.3+vllm_whl_update_testing_

Contributors

gshtras, Alexei-V-Ivanov-AMD, and 2 other contributors

Assets 2

07 Nov 23:54

github-actions

v0.6.3+vllm_whl_update_testing

f804da1

v0.6.3+vllm_whl_update_testing Pre-release

What's Changed

Base ROCm 6.2.2 by @gshtras in #260
Upstream merge 24 11 04 by @gshtras in #262
Add gfx1201 to supported ARCH list by @qli88 in #264
[Bugfix] A fix to enable FORCED sampling again. by @Alexei-V-Ivanov-AMD in #265
Eliminated -Wswitch-bool warning and a leftover incorrect import by @gshtras in #266
Navi correctness fix 1 to 300 count by @maleksan85 in #263
Navi 1 to 300 correctness fix follow up by @maleksan85 in #267

Full Changelog: v0.6.3.post2+rocm...v0.6.3+vllm_whl_update_testing

Contributors

gshtras, Alexei-V-Ivanov-AMD, and 2 other contributors

Assets 2

04 Nov 16:53

github-actions

v0.6.3+rocm_whl_testingg

35b0cc1

v0.6.3+rocm_whl_testingg Pre-release

What's Changed

fp8 moe configs. Mixtral-8x(7B,22B) TP=1,2,4,8 by @divakar-amd in #250
Sccache removal from Dockerfile.rocm by @omirosh in #253
Update Dockerfile.rocm by @shajrawi in #254
Using the correct type hints by @gshtras in #256
Revert "Update Dockerfile.rocm" by @gshtras in #257

Full Changelog: v0.6.3.post1+rocm...v0.6.3+rocm_whl_testingg

Contributors

shajrawi, omirosh, and 2 other contributors

Assets 2

04 Nov 16:53

github-actions

v0.6.3+rocm_whl_testing

35b0cc1

v0.6.3+rocm_whl_testing Pre-release

What's Changed

fp8 moe configs. Mixtral-8x(7B,22B) TP=1,2,4,8 by @divakar-amd in #250
Sccache removal from Dockerfile.rocm by @omirosh in #253
Update Dockerfile.rocm by @shajrawi in #254
Using the correct type hints by @gshtras in #256
Revert "Update Dockerfile.rocm" by @gshtras in #257

Full Changelog: v0.6.3.post1+rocm...v0.6.3+rocm_whl_testing

Contributors

shajrawi, omirosh, and 2 other contributors

Assets 2

01 Nov 23:03

github-actions

v0.6.3.post2+rocm

733f79a

v0.6.3.post2+rocm Latest

What's Changed

fp8 moe configs. Mixtral-8x(7B,22B) TP=1,2,4,8 by @divakar-amd in #250
Sccache removal from Dockerfile.rocm by @omirosh in #253
Update Dockerfile.rocm by @shajrawi in #254
Using the correct type hints by @gshtras in #256
Revert "Update Dockerfile.rocm" by @gshtras in #257
Creating ROCm whl upon release by @gshtras in #259

Full Changelog: v0.6.3.post1+rocm...v0.6.3.post2+rocm

What's Changed

Miscellaneous cosmetic changes by @mawong-amd in #166
V5.5 upstream merge rc by @gshtras in #167
fnuz support for fbgemm fp8 by @gshtras in #169
Fixing mypy after a rushed merge by @gshtras in #171
[fix] moe padding for reading correct tuned config by @divakar-amd in #172
Upstream merge 24/9/9 by @gshtras in #174
Restoring deleted .buildkite/test-template.j2 by @Alexei-V-Ivanov-AMD in #177
Support commandr on ROCm by @shajrawi in #180
Correct type hint by @gshtras in #173
update custom PA kernel with support for fp8 kv cache dtype by @sanyalington in #87
Support Grok-1 by @kkHuang-amd in #181
Adding MLPerf optimization to 0.6.0 by @charlifu in #182
6.2 dockerfile by @gshtras in #176
[Grok1] fix the name of input scale factor for autofp8 run by @kkHuang-amd in #183
[Grok-1] fix the run-time error "Can't pickle <class 'transformers_mo… by @kkHuang-amd in #184
Upstream merge 24/09/16 by @gshtras in #187
Perf improvement: remove redundant torch slice; Match decode PA partition size to csrc by @sanyalington in #188
refactor dbrx experts to use FusedMoe layer by @divakar-amd in #186
Disable moe padding by default and enable fp8 padding by default. by @charlifu in #190
Enabling Splitting HW by Buildkite Agents by @Alexei-V-Ivanov-AMD in #191
Revert "remove redundant slice; match decode PA partition size with csrc (#188)" by @gshtras in #194
[Grok-1] 1. upload moe configuration file for moe kernel optimization… by @kkHuang-amd in #193
Removing the original text in reminder_comment.yml by @Alexei-V-Ivanov-AMD in #195
Fix PA custom and PA v2 tests and partition sizes by @mawong-amd in #196
Adding P3L measurement to the benchmarks collection tools. by @Alexei-V-Ivanov-AMD in #197
Swapping the order of sampling operations in the conditional selector. by @Alexei-V-Ivanov-AMD in #199
remove redundant slice when chunked prefill feature is disabled by @sanyalington in #201
Fixing P3L incompatibility with cython. by @Alexei-V-Ivanov-AMD in #200
Bias and more metadata in gradlib and tuned gemm by @gshtras in #202
Upstream merge 24 9 23 by @gshtras in #203
Gating n=0 case from skinny gemm by @gshtras in #204
Revert "[Kernel] changing fused moe kernel chunk size default to 32k (vllm-project#7995)" by @gshtras in #207
re-enable avoid torch slice fix when chunked prefill is disabled by @sanyalington in #209
add block_manager_v2.py into setup_cython by @sanyalington in #210
extend moe padding to DUMMY weights by @divakar-amd in #211
[Int4-AWQ] Fix AWQ Marlin check for ROCm by @hegemanjw4amd in #206
RPD Profiling by @dllehr-amd in #208
Cythonize vllm build by @maleksan85 in #214
Fix Dockerfile.rocm by @gshtras in #215
fix dbrx weight loader by @divakar-amd in #212
Upstream merge 24 09 27 0.6.2 by @gshtras in #213
Make rpdtracer import only when required by @Rohan138 in #216
Improve profiling setup and documentation, sync benchmarks with main by @AdrianAbeyta in #218
Installing the requirements before invoking setup.py since it now imports setuptools_scm by @gshtras in #221
llama3.2 + cross attn test by @maleksan85 in #220
Optimize CAR for ROCm by @iotamudelta in #225
Custom PA perf improvements by @sanyalington in #222
Upstream merge 24 10 08 by @gshtras in #226
customPA write fp8 small ctx fix; enable customPA write fp8 by default by @sanyalington in #227
added timeout for vllm build in rocm by @maleksan85 in #230
Add fp8 for dbrx by @charlifu in #231
Update Buildkite env variable by @dhonnappa-amd in #232
cuda graph + num-scheduler-steps bug fix by @seungrokj in #236
[Model] [BUG] Fix code path logic to load mllama model by @tjtanaa in #234
prefix-enabled FA perf issue by @seungrokj in #239
Custom PA Partition size 256 to improve performance by @sanyalington in #238
[Build/CI] Minor changes to fix internal CI process. by @Alexei-V-Ivanov-AMD in #235
[BUGFIX] Restored handling of ROCM FA output as before adaptation of llama3.2 by @maleksan85 in #241
Upstream merge 24 10 21 by @gshtras in #240
Using the correct datatype on prefix prefill for fp8 kv cache by @gshtras in #242
Update CMakeLists.txt by @gshtras in #244
update block_manager usage in setup_cython by @saienduri in #243
[Bugfix][Kernel][Misc] Basic support for SmoothQuant, symmetric case by @rasmith in #237
Add fp8 support for llama model family on Navi4x by @qli88 in #245
Custom all reduce fix mi250 by @omirosh in #247
Upstream merge 24 10 28 by @gshtras in #248
fp8 moe configs. Mixtral-8x(7B,22B) TP=1,2,4,8 by @divakar-amd in #250
Sccache removal from Dockerfile.rocm by @omirosh in #253
Update Dockerfile.rocm by @shajrawi in #254
Using the correct type hints by @gshtras in #256
Revert "Update Dockerfile.rocm" by @gshtras in #257
Creating ROCm whl upon release by @gshtras in #259

New Contributors

@kkHuang-amd made their first contribution in #181
@Rohan138 made their first contribution in #216
@AdrianAbeyta made their first contribution in #218
@dhonnappa-amd made their first contribution in #232
@seungrokj made their first contribution in #236
@tjtanaa made their first contribution in #234
@saienduri made their first contribution in #243
@qli88 made their first contribution in #245
@omirosh made their first contribution in #247

Full Changelog: v0.4.3_rocm...v0.6.3.post2+rocm

Contributors

rasmith, charlifu, and 19 other contributors

Assets 4

01 Nov 22:33

github-actions

v0.6.3+test_whl9

0f820f3

v0.6.3+test_whl9 Pre-release

What's Changed

fp8 moe configs. Mixtral-8x(7B,22B) TP=1,2,4,8 by @divakar-amd in #250
Sccache removal from Dockerfile.rocm by @omirosh in #253
Update Dockerfile.rocm by @shajrawi in #254
Using the correct type hints by @gshtras in #256
Revert "Update Dockerfile.rocm" by @gshtras in #257

Full Changelog: v0.6.3.post1+rocm...v0.6.3+test_whl9

Contributors

shajrawi, omirosh, and 2 other contributors

Assets 4

01 Nov 00:21

github-actions

v0.6.3+rocm_whl_test

c4f6b8e

v0.6.3+rocm_whl_test Pre-release

What's Changed

fp8 moe configs. Mixtral-8x(7B,22B) TP=1,2,4,8 by @divakar-amd in #250
Sccache removal from Dockerfile.rocm by @omirosh in #253
Update Dockerfile.rocm by @shajrawi in #254
Using the correct type hints by @gshtras in #256
Revert "Update Dockerfile.rocm" by @gshtras in #257

Full Changelog: v0.6.3.post1+rocm...v0.6.3+rocm_whl_test

Contributors

shajrawi, omirosh, and 2 other contributors

Assets 2

29 Oct 21:12

github-actions

v0.6.3.post1+rocm

7aa6982

v0.6.3.post1+rocm Pre-release

What's Changed

Upstream merge 24 10 21 by @gshtras in #240
Using the correct datatype on prefix prefill for fp8 kv cache by @gshtras in #242
Update CMakeLists.txt by @gshtras in #244
update block_manager usage in setup_cython by @saienduri in #243
[Bugfix][Kernel][Misc] Basic support for SmoothQuant, symmetric case by @rasmith in #237
Add fp8 support for llama model family on Navi4x by @qli88 in #245
Custom all reduce fix mi250 by @omirosh in #247
Upstream merge 24 10 28 by @gshtras in #248

New Contributors

@saienduri made their first contribution in #243
@qli88 made their first contribution in #245
@omirosh made their first contribution in #247

Full Changelog: v0.6.2.post1+rocm...v0.6.3.post1+rocm

Contributors

rasmith, omirosh, and 3 other contributors

Assets 2

23 Oct 00:14

github-actions

v0.6.2.post1+rocm

69d5e1d

v0.6.2.post1+rocm Pre-release

What's Changed

Make rpdtracer import only when required by @Rohan138 in #216
Improve profiling setup and documentation, sync benchmarks with main by @AdrianAbeyta in #218
Installing the requirements before invoking setup.py since it now imports setuptools_scm by @gshtras in #221
llama3.2 + cross attn test by @maleksan85 in #220
Optimize CAR for ROCm by @iotamudelta in #225
Custom PA perf improvements by @sanyalington in #222
Upstream merge 24 10 08 by @gshtras in #226
customPA write fp8 small ctx fix; enable customPA write fp8 by default by @sanyalington in #227
added timeout for vllm build in rocm by @maleksan85 in #230
Add fp8 for dbrx by @charlifu in #231
Update Buildkite env variable by @dhonnappa-amd in #232
cuda graph + num-scheduler-steps bug fix by @seungrokj in #236
[Model] [BUG] Fix code path logic to load mllama model by @tjtanaa in #234
prefix-enabled FA perf issue by @seungrokj in #239
Custom PA Partition size 256 to improve performance by @sanyalington in #238
[Build/CI] Minor changes to fix internal CI process. by @Alexei-V-Ivanov-AMD in #235
[BUGFIX] Restored handling of ROCM FA output as before adaptation of llama3.2 by @maleksan85 in #241

New Contributors

@Rohan138 made their first contribution in #216
@AdrianAbeyta made their first contribution in #218
@dhonnappa-amd made their first contribution in #232
@seungrokj made their first contribution in #236
@tjtanaa made their first contribution in #234

Full Changelog: v0.6.2+rocm...v0.6.2.post1+rocm

Contributors

charlifu, iotamudelta, and 9 other contributors

Assets 2

02 Oct 17:29

github-actions

v0.6.2+rocm

030374b

v0.6.2+rocm Pre-release

What's Changed

fix dbrx weight loader by @divakar-amd in #212
Upstream merge 24 09 27 0.6.2 by @gshtras in #213

Full Changelog: v0.6.1.post1+rocm...v0.6.2+rocm

Contributors

divakar-amd and gshtras

Assets 2
