11 changes: 1 addition & 10 deletions docs/source/getting-started/installation_gpu.md
@@ -44,16 +44,7 @@ export PLATFORM=cuda
pip install -v -e . --no-build-isolation
```

After installation, please apply patch to ensure uc_connector can be used:

```bash
cd $(pip show vllm | grep Location | awk '{print $2}')
git apply /vllm-workspace/unified-cache-management/ucm/integration/vllm/patch/0.9.2/vllm-adapt-pc.patch
git apply /vllm-workspace/unified-cache-management/ucm/integration/vllm/patch/0.9.2/vllm-adapt-aggre.patch
git apply /vllm-workspace/unified-cache-management/ucm/integration/vllm/patch/0.9.2/vllm-adapt-sparse.patch
```

Refer to this [issue](https://github.com/vllm-project/vllm/issues/21702) to see details of this patch's changes.
**Note:** Patches are now applied automatically via dynamic patching when the unified-cache-management package is imported (for example, when you use the `UnifiedCacheConnectorV1` connector), so you no longer need to apply them manually with `git apply`.

## Setup from docker
Download the pre-built `vllm/vllm-openai:v0.9.2` docker image and build the unified-cache-management docker image with the commands below:
11 changes: 5 additions & 6 deletions docs/source/getting-started/installation_npu.md
@@ -39,16 +39,15 @@ docker run --rm \
-v /root/.cache:/root/.cache \
-it $IMAGE bash
```
Codes of vLLM and vLLM Ascend are placed in /vllm-workspace, you can refer to [vLLM-Ascend Installation](https://vllm-ascend.readthedocs.io/en/latest/installation.html) for more information. After installation, please apply patches to ensure uc_connector can be used:
```bash
cd /vllm-workspace/vllm
git apply /vllm-workspace/unified-cache-management/ucm/integration/vllm/patch/0.9.2/vllm-adapt-pc.patch
git apply /vllm-workspace/unified-cache-management/ucm/integration/vllm/patch/0.9.2/vllm-adapt-aggre.patch
git apply /vllm-workspace/unified-cache-management/ucm/integration/vllm/patch/0.9.2/vllm-adapt-sparse.patch
The vLLM and vLLM Ascend source trees are located in /vllm-workspace; refer to [vLLM-Ascend Installation](https://vllm-ascend.readthedocs.io/en/latest/installation.html) for more information.

**Note:** vLLM patches are now applied automatically via dynamic patching when you import the unified-cache-management package. For vLLM-Ascend, however, you still need to apply the vLLM-Ascend-specific patch manually:

```bash
cd /vllm-workspace/vllm-ascend
git apply /vllm-workspace/unified-cache-management/ucm/integration/vllm/patch/0.9.2/vllm-ascend-adapt.patch
```

Refer to these issues [vllm-issue](https://github.com/vllm-project/vllm/issues/21702) and [vllm-ascend-issue](https://github.com/vllm-project/vllm-ascend/issues/2057) to see details of patches' changes.

### Build from source code
48 changes: 48 additions & 0 deletions ucm/__init__.py
@@ -0,0 +1,48 @@
#
# MIT License
#
# Copyright (c) 2025 Huawei Technologies Co., Ltd. All rights reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
#

"""
vLLM integration module for Unified Cache Management.

This module automatically applies patches to vLLM when imported,
eliminating the need for manual `git apply` commands.
"""

# Auto-apply patches when this module is imported
try:
from ucm.integration.vllm.patch.apply_patch import ensure_patches_applied

ensure_patches_applied()
except Exception as e:
# Don't fail if patches can't be applied - might be running in environment without vLLM
import warnings

warnings.warn(
f"Failed to apply vLLM patches: {e}. "
f"If you're using vLLM, ensure it's installed and patches are compatible."
)

from ucm.integration.vllm.uc_connector import UnifiedCacheConnectorV1

__all__ = ["UnifiedCacheConnectorV1"]
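The `try`/`except` in `ucm/__init__.py` is a best-effort import-time initializer: a patching failure is downgraded to a warning so that `import ucm` still succeeds in environments without vLLM. A standalone sketch of that pattern (the helper name `best_effort_init` is illustrative, not part of the ucm API):

```python
import warnings


def best_effort_init(initializer) -> bool:
    """Run an import-time initializer; downgrade any failure to a warning."""
    try:
        initializer()
        return True
    except Exception as e:
        # The package stays importable even when initialization fails.
        warnings.warn(f"Initialization skipped: {e}")
        return False
```

An initializer that raises produces a single `UserWarning` and returns `False`; a successful one returns `True`.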
187 changes: 187 additions & 0 deletions ucm/integration/vllm/patch/apply_patch.py
@@ -0,0 +1,187 @@
#
# MIT License
#
# Copyright (c) 2025 Huawei Technologies Co., Ltd. All rights reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
#
"""
Monkey patching module for vLLM to apply UCM patches automatically.
This replaces the need for manual `git apply` commands.
"""

import os
import sys
from typing import Optional

from ucm.logger import init_logger

logger = init_logger(__name__)

PLATFORM = os.getenv("PLATFORM")


def _patch_ascend() -> bool:
return PLATFORM == "ascend"


# Track if patches have been applied
_patches_applied = False
_import_hook_installed = False
_vllm_version: Optional[str] = None
_vllm_import_hook = None


def get_vllm_version() -> Optional[str]:
"""Detect vLLM version."""
global _vllm_version
if _vllm_version is not None:
return _vllm_version

    try:
        # Read and cache the version reported by the installed vllm package
        import vllm as vllm_pkg

        _vllm_version = vllm_pkg.__version__
        return _vllm_version
except ImportError:
logger.warning("vLLM is not installed")
return None
except Exception as e:
logger.warning(f"Failed to detect vLLM version: {e}")
return None


def get_supported_versions() -> list[str]:
"""Get list of supported vLLM versions."""
return ["0.9.2"]


def apply_all_patches() -> None:
"""Apply all vLLM patches based on detected version."""
global _patches_applied
if _patches_applied:
return

try:
version = get_vllm_version()
if version is None:
raise ValueError("Could not detect vLLM version")

        supported_versions = get_supported_versions()
        if version not in supported_versions:
            raise ValueError(
                f"vLLM version {version} is not supported. "
                f"Supported versions: {', '.join(supported_versions)}"
            )

# Apply version-specific patches
if version == "0.9.1":
_apply_patches_v091()
elif version == "0.9.2":
_apply_patches_v092()
else:
raise ValueError(f"Unsupported vLLM version: {version}")

_patches_applied = True
logger.info(f"All vLLM patches applied successfully for version {version}")
except Exception as e:
logger.error(f"Failed to apply vLLM patches: {e}", exc_info=True)
raise


def _apply_patches_v091() -> None:
"""Apply patches for vLLM 0.9.1."""
from .patch_funcs.v091.vllm_adapt import _apply_adapt_patch

_apply_adapt_patch() # apply vllm-adapt-pc.patch
if _patch_ascend():
from .patch_funcs.v091.vllm_ascend_adapt import _apply_ascend_patch

_apply_ascend_patch() # apply vllm-ascend-adapt.patch


def _apply_patches_v092() -> None:
"""Apply patches for vLLM 0.9.2."""
from .patch_funcs.v092.vllm_adapt import _apply_adapt_patches

_apply_adapt_patches()

if _patch_ascend():
from .patch_funcs.v092.vllm_ascend_adapt import _apply_ascend_patch

_apply_ascend_patch() # apply vllm-ascend-adapt.patch


def install_import_hook() -> None:
"""Install an import hook to automatically apply patches when vLLM is imported."""
global _import_hook_installed, _vllm_import_hook

if _import_hook_installed:
return

try:
# Check if vLLM is already imported
if "vllm" in sys.modules:
# vLLM already imported, apply patches immediately
apply_all_patches()
_import_hook_installed = True
else:
# Install import hook by wrapping the builtin __import__ function
# This intercepts all imports and applies patches when vLLM is imported
import builtins

original_import = builtins.__import__

def import_hook(name, globals=None, locals=None, fromlist=(), level=0):
# Call original import
module = original_import(name, globals, locals, fromlist, level)

# If the main vLLM module is being imported, apply patches
# We only check for 'vllm' (not submodules) to avoid multiple patch attempts
if name == "vllm" and not _patches_applied:
try:
apply_all_patches()
except Exception as e:
logger.warning(f"Failed to apply patches during import: {e}")

return module

# Replace builtin __import__
builtins.__import__ = import_hook
_vllm_import_hook = import_hook
_import_hook_installed = True
logger.debug("Import hook installed to intercept vLLM imports")

except Exception as e:
logger.warning(f"Failed to install import hook: {e}")


def ensure_patches_applied() -> None:
"""Ensure patches are applied, installing import hook if needed."""
if not _patches_applied:
# Try to apply patches immediately
try:
apply_all_patches()
except Exception:
# If it fails (vLLM not imported yet), install hook
install_import_hook()
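The import hook above works by wrapping `builtins.__import__`, which every `import` statement goes through even when the module is already cached in `sys.modules`. A minimal, self-contained sketch of the same technique (names here are illustrative, not part of the ucm API):

```python
import builtins


def install_once_hook(target: str, callback) -> None:
    """Invoke `callback` once, right after the first `import <target>` statement."""
    original_import = builtins.__import__
    fired = {"done": False}

    def hook(name, globals=None, locals=None, fromlist=(), level=0):
        # Delegate to the real import first so `target` is fully initialized
        # before the callback (here: a patch function) runs.
        module = original_import(name, globals, locals, fromlist, level)
        if name == target and not fired["done"]:
            fired["done"] = True  # same role as the `_patches_applied` guard
            callback()
        return module

    builtins.__import__ = hook
```

Because the import statement always calls `builtins.__import__`, the hook fires even if the target module is already cached in `sys.modules`; the guard ensures the callback runs only once.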
28 changes: 28 additions & 0 deletions ucm/integration/vllm/patch/patch_funcs/v091/vllm_adapt.py
@@ -0,0 +1,28 @@
#
# MIT License
#
# Copyright (c) 2025 Huawei Technologies Co., Ltd. All rights reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
#


def _apply_adapt_patch() -> None:
    """Apply adapt patches for vLLM 0.9.1."""
    raise NotImplementedError("Dynamic patching for vLLM 0.9.1 is not implemented")
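These per-version stub modules plug into the dispatch in `apply_all_patches`, which selects a patch set by detected vLLM version. The dispatch shape can be sketched as a small registry (names and return values here are illustrative, not the ucm implementation):

```python
def _patches_v092() -> str:
    # Stand-in for _apply_patches_v092(); returns a marker for illustration.
    return "applied 0.9.2 patches"


def _patches_v091() -> str:
    # Stand-in for the 0.9.1 stubs, which currently raise NotImplementedError.
    raise NotImplementedError("0.9.1 patches are stubs")


PATCH_REGISTRY = {"0.9.1": _patches_v091, "0.9.2": _patches_v092}


def dispatch_patches(version: str) -> str:
    """Select and run the patch set for a detected version."""
    if version not in PATCH_REGISTRY:
        raise ValueError(f"Unsupported vLLM version: {version}")
    return PATCH_REGISTRY[version]()
```

Unknown versions fail fast with `ValueError`, mirroring the version gate in `apply_all_patches`.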
@@ -0,0 +1,28 @@
#
# MIT License
#
# Copyright (c) 2025 Huawei Technologies Co., Ltd. All rights reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
#


def _apply_ascend_patch() -> None:
"""Apply patches for vLLM 0.9.1."""
raise NotImplementedError("vLLM 0.9.1 is not supported for Ascend")