Merged
6 changes: 1 addition & 5 deletions docker/Dockerfile.rocm
@@ -65,7 +65,6 @@ COPY --from=build_vllm ${COMMON_WORKDIR}/vllm/tests /tests
 COPY --from=build_vllm ${COMMON_WORKDIR}/vllm/examples /examples
 COPY --from=build_vllm ${COMMON_WORKDIR}/vllm/docker/Dockerfile.rocm /docker/
 COPY --from=build_vllm ${COMMON_WORKDIR}/vllm/.buildkite /.buildkite
-# Centralized v1 package - copied to both test and final stages
 COPY --from=build_vllm ${COMMON_WORKDIR}/vllm/vllm/v1 /vllm_v1

# -----------------------
@@ -98,7 +97,7 @@ RUN --mount=type=cache,target=/root/.cache/uv \
     uv pip install --system hf_transfer
 ENV HF_HUB_ENABLE_HF_TRANSFER=1
 
-# Copy in the v1 package
+# Copy in the v1 package (for python-only install test group)
 COPY --from=export_vllm /vllm_v1 /usr/local/lib/python${PYTHON_VERSION}/dist-packages/vllm/v1

# Source code is used in the `python_only_compile.sh` test
@@ -130,9 +129,6 @@ RUN --mount=type=bind,from=export_vllm,src=/,target=/install \
     && pip uninstall -y vllm \
     && uv pip install --system *.whl
 
-# Copy in the v1 package
-COPY --from=export_vllm /vllm_v1 /usr/local/lib/python${PYTHON_VERSION}/dist-packages/vllm/v1
-
ARG COMMON_WORKDIR

# Copy over the benchmark scripts as well
4 changes: 2 additions & 2 deletions requirements/rocm-test.txt
@@ -70,8 +70,8 @@ torchgeo==0.7.0
 mteb==2.1.2
 
 # Data processing
-xgrammar @ git+https://github.com/mlc-ai/xgrammar.git@eafd4db51b78acc64b3f0764ef27dfd206c28628
-# Test async scheduling
+xgrammar==0.1.27
+# Test async scheduling
 
 # Utilities
 num2words==0.5.14
19 changes: 19 additions & 0 deletions tests/models/multimodal/generation/conftest.py
@@ -0,0 +1,19 @@
# SPDX-License-Identifier: Apache-2.0
# SPDX-FileCopyrightText: Copyright contributors to the vLLM project
"""Pytest configuration for vLLM tests."""

import torch

from vllm.platforms import current_platform


def pytest_configure(config):
"""Disable Flash/MemEfficient SDP on ROCm to avoid HF
Transformers accuracy issues.
"""
if not current_platform.is_rocm():
return

torch.backends.cuda.enable_flash_sdp(False)
torch.backends.cuda.enable_mem_efficient_sdp(False)
torch.backends.cuda.enable_math_sdp(True)
Comment on lines +10 to +19
Contributor

high

Using pytest_configure to modify global state like torch.backends can be risky as it doesn't provide a teardown mechanism. This change persists for the entire test session and could unintentionally affect other tests that rely on the default SDP backend settings.

A safer approach is to use a session-scoped, auto-use fixture. This allows you to set up the workaround before tests run and, crucially, tear it down afterward by restoring the original settings. This ensures that the test environment is left in a clean state, preventing side effects on other tests.

import pytest


@pytest.fixture(scope="session", autouse=True)
def rocm_sdp_workaround():
    """Disable Flash/MemEfficient SDP on ROCm to avoid HF
    Transformers accuracy issues.
    """
    if not current_platform.is_rocm():
        yield
        return

    # Save original state
    flash_enabled = torch.backends.cuda.flash_sdp_enabled()
    mem_efficient_enabled = torch.backends.cuda.mem_efficient_sdp_enabled()
    math_enabled = torch.backends.cuda.math_sdp_enabled()

    # Apply workaround
    torch.backends.cuda.enable_flash_sdp(False)
    torch.backends.cuda.enable_mem_efficient_sdp(False)
    torch.backends.cuda.enable_math_sdp(True)

    yield

    # Restore original state
    torch.backends.cuda.enable_flash_sdp(flash_enabled)
    torch.backends.cuda.enable_mem_efficient_sdp(mem_efficient_enabled)
    torch.backends.cuda.enable_math_sdp(math_enabled)

Contributor Author


The teardown is unnecessary here. Session-scoped fixtures tear down only when the process is about to exit; no subsequent code would observe the "restored" state, so the restore adds complexity without practical benefit. If individual tests needed different SDP settings, session scope wouldn't help regardless; you'd need function-scoped fixtures with their own save/restore.
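The reply above distinguishes session-scoped teardown (which the process exit makes moot) from per-test overrides, which would need a function-scoped save/restore. A minimal sketch of that pattern, using a plain dict as a stand-in for the `torch.backends.cuda` toggles so the example is self-contained; the names `SDP_STATE` and `sdp_override` are illustrative, not part of vLLM or PyTorch:

```python
import contextlib

# Stand-in for mutable global backend state. In the real conftest these
# would be the torch.backends.cuda flash/mem_efficient/math SDP toggles.
SDP_STATE = {"flash": True, "mem_efficient": True, "math": True}


@contextlib.contextmanager
def sdp_override(flash: bool, mem_efficient: bool, math: bool):
    """Function-scoped save/restore: apply per-test SDP settings on entry
    and restore the previous values on exit, even if the test raises."""
    saved = dict(SDP_STATE)  # snapshot the current settings
    SDP_STATE.update(flash=flash, mem_efficient=mem_efficient, math=math)
    try:
        yield
    finally:
        SDP_STATE.update(saved)  # restore, regardless of test outcome
```

In a real conftest this would be wrapped in a function-scoped `@pytest.fixture` (or used via `with` inside a test), so each test that needs non-default settings gets its own apply/restore cycle instead of mutating session-wide state.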
