11 changes: 1 addition & 10 deletions docs/source/getting-started/installation_gpu.md
@@ -44,16 +44,7 @@ export PLATFORM=cuda
pip install -v -e . --no-build-isolation
```

After installation, please apply patch to ensure uc_connector can be used:

```bash
cd $(pip show vllm | grep Location | awk '{print $2}')
git apply /vllm-workspace/unified-cache-management/ucm/integration/vllm/patch/0.9.2/vllm-adapt-pc.patch
git apply /vllm-workspace/unified-cache-management/ucm/integration/vllm/patch/0.9.2/vllm-adapt-aggre.patch
git apply /vllm-workspace/unified-cache-management/ucm/integration/vllm/patch/0.9.2/vllm-adapt-sparse.patch
```

Refer to this [issue](https://github.com/vllm-project/vllm/issues/21702) to see details of this patch's changes.
**Note:** Patches are now applied automatically via dynamic patching when the unified-cache-management package is imported (for example, when you use the `UnifiedCacheConnectorV1` connector), so you no longer need to apply them manually with `git apply`.

## Setup from docker
Download the pre-built `vllm/vllm-openai:v0.9.2` docker image and build the unified-cache-management docker image with the commands below:
11 changes: 5 additions & 6 deletions docs/source/getting-started/installation_npu.md
@@ -39,16 +39,15 @@ docker run --rm \
-v /root/.cache:/root/.cache \
-it $IMAGE bash
```
Codes of vLLM and vLLM Ascend are placed in /vllm-workspace, you can refer to [vLLM-Ascend Installation](https://vllm-ascend.readthedocs.io/en/latest/installation.html) for more information. After installation, please apply patches to ensure uc_connector can be used:
```bash
cd /vllm-workspace/vllm
git apply /vllm-workspace/unified-cache-management/ucm/integration/vllm/patch/0.9.2/vllm-adapt-pc.patch
git apply /vllm-workspace/unified-cache-management/ucm/integration/vllm/patch/0.9.2/vllm-adapt-aggre.patch
git apply /vllm-workspace/unified-cache-management/ucm/integration/vllm/patch/0.9.2/vllm-adapt-sparse.patch
The vLLM and vLLM Ascend source trees are located in /vllm-workspace; refer to [vLLM-Ascend Installation](https://vllm-ascend.readthedocs.io/en/latest/installation.html) for more information.

**Note:** vLLM patches are now applied automatically via dynamic patching when you import the unified-cache-management package. For vLLM-Ascend, however, you still need to apply the vLLM-Ascend-specific patch manually:

```bash
cd /vllm-workspace/vllm-ascend
git apply /vllm-workspace/unified-cache-management/ucm/integration/vllm/patch/0.9.2/vllm-ascend-adapt.patch
```

Refer to these issues [vllm-issue](https://github.com/vllm-project/vllm/issues/21702) and [vllm-ascend-issue](https://github.com/vllm-project/vllm-ascend/issues/2057) to see details of patches' changes.

### Build from source code
48 changes: 48 additions & 0 deletions ucm/__init__.py
@@ -0,0 +1,48 @@
#
# MIT License
#
# Copyright (c) 2025 Huawei Technologies Co., Ltd. All rights reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
#

"""
vLLM integration module for Unified Cache Management.

This module automatically applies patches to vLLM when imported,
eliminating the need for manual `git apply` commands.
"""

# Auto-apply patches when this module is imported
try:
from ucm.integration.vllm.patch.apply_patch import ensure_patches_applied

ensure_patches_applied()
except Exception as e:
# Don't fail if patches can't be applied - might be running in environment without vLLM
import warnings

warnings.warn(
f"Failed to apply vLLM patches: {e}. "
f"If you're using vLLM, ensure it's installed and patches are compatible."
)

from ucm.integration.vllm.uc_connector import UnifiedCacheConnectorV1

__all__ = ["UnifiedCacheConnectorV1"]
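The `try`/`except` in `ucm/__init__.py` is a best-effort import-time initializer: a patching failure is downgraded to a warning so that `import ucm` still succeeds in environments without vLLM. A standalone sketch of that pattern (the helper name `best_effort_init` is illustrative, not part of the ucm API):

```python
import warnings


def best_effort_init(initializer) -> bool:
    """Run an import-time initializer; downgrade any failure to a warning."""
    try:
        initializer()
        return True
    except Exception as e:
        # The package stays importable even when initialization fails.
        warnings.warn(f"Initialization skipped: {e}")
        return False
```

An initializer that raises produces a single `UserWarning` and returns `False`; a successful one returns `True`.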
187 changes: 187 additions & 0 deletions ucm/integration/vllm/patch/apply_patch.py
@@ -0,0 +1,187 @@
#
# MIT License
#
# Copyright (c) 2025 Huawei Technologies Co., Ltd. All rights reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
#
"""
Monkey patching module for vLLM to apply UCM patches automatically.
This replaces the need for manual `git apply` commands.
"""

import os
import sys
from typing import Optional

from ucm.logger import init_logger

logger = init_logger(__name__)

PLATFORM = os.getenv("PLATFORM")


def _patch_ascend() -> bool:
return PLATFORM == "ascend"


# Track if patches have been applied
_patches_applied = False
_import_hook_installed = False
_vllm_version: Optional[str] = None
_vllm_import_hook = None


def get_vllm_version() -> Optional[str]:
"""Detect vLLM version."""
global _vllm_version
if _vllm_version is not None:
return _vllm_version

    try:
        # Read and cache the version reported by the installed vllm package
        import vllm as vllm_pkg

        _vllm_version = vllm_pkg.__version__
        return _vllm_version
except ImportError:
logger.warning("vLLM is not installed")
return None
except Exception as e:
logger.warning(f"Failed to detect vLLM version: {e}")
return None


def get_supported_versions() -> list[str]:
"""Get list of supported vLLM versions."""
return ["0.9.2"]


def apply_all_patches() -> None:
"""Apply all vLLM patches based on detected version."""
global _patches_applied
if _patches_applied:
return

try:
version = get_vllm_version()
if version is None:
raise ValueError("Could not detect vLLM version")

        supported_versions = get_supported_versions()
        if version not in supported_versions:
            raise ValueError(
                f"vLLM version {version} is not supported. "
                f"Supported versions: {', '.join(supported_versions)}"
            )

# Apply version-specific patches
if version == "0.9.1":
_apply_patches_v091()
elif version == "0.9.2":
_apply_patches_v092()
else:
raise ValueError(f"Unsupported vLLM version: {version}")

_patches_applied = True
logger.info(f"All vLLM patches applied successfully for version {version}")
except Exception as e:
logger.error(f"Failed to apply vLLM patches: {e}", exc_info=True)
raise


def _apply_patches_v091() -> None:
"""Apply patches for vLLM 0.9.1."""
from .patch_funcs.v091.vllm_adapt import _apply_adapt_patch

_apply_adapt_patch() # apply vllm-adapt-pc.patch
if _patch_ascend():
from .patch_funcs.v091.vllm_ascend_adapt import _apply_ascend_patch

_apply_ascend_patch() # apply vllm-ascend-adapt.patch


def _apply_patches_v092() -> None:
"""Apply patches for vLLM 0.9.2."""
from .patch_funcs.v092.vllm_adapt import _apply_adapt_patches

_apply_adapt_patches()

if _patch_ascend():
from .patch_funcs.v092.vllm_ascend_adapt import _apply_ascend_patch

_apply_ascend_patch() # apply vllm-ascend-adapt.patch


def install_import_hook() -> None:
"""Install an import hook to automatically apply patches when vLLM is imported."""
global _import_hook_installed, _vllm_import_hook

if _import_hook_installed:
return

try:
# Check if vLLM is already imported
if "vllm" in sys.modules:
# vLLM already imported, apply patches immediately
apply_all_patches()
_import_hook_installed = True
else:
# Install import hook by wrapping the builtin __import__ function
# This intercepts all imports and applies patches when vLLM is imported
import builtins

original_import = builtins.__import__

def import_hook(name, globals=None, locals=None, fromlist=(), level=0):
# Call original import
module = original_import(name, globals, locals, fromlist, level)

# If the main vLLM module is being imported, apply patches
# We only check for 'vllm' (not submodules) to avoid multiple patch attempts
if name == "vllm" and not _patches_applied:
try:
apply_all_patches()
except Exception as e:
logger.warning(f"Failed to apply patches during import: {e}")

return module

# Replace builtin __import__
builtins.__import__ = import_hook
_vllm_import_hook = import_hook
_import_hook_installed = True
logger.debug("Import hook installed to intercept vLLM imports")

except Exception as e:
logger.warning(f"Failed to install import hook: {e}")


def ensure_patches_applied() -> None:
"""Ensure patches are applied, installing import hook if needed."""
if not _patches_applied:
# Try to apply patches immediately
try:
apply_all_patches()
except Exception:
# If it fails (vLLM not imported yet), install hook
install_import_hook()
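The import hook above works by wrapping `builtins.__import__`, which every `import` statement goes through even when the module is already cached in `sys.modules`. A minimal, self-contained sketch of the same technique (names here are illustrative, not part of the ucm API):

```python
import builtins


def install_once_hook(target: str, callback) -> None:
    """Invoke `callback` once, right after the first `import <target>` statement."""
    original_import = builtins.__import__
    fired = {"done": False}

    def hook(name, globals=None, locals=None, fromlist=(), level=0):
        # Delegate to the real import first so `target` is fully initialized
        # before the callback (here: a patch function) runs.
        module = original_import(name, globals, locals, fromlist, level)
        if name == target and not fired["done"]:
            fired["done"] = True  # same role as the `_patches_applied` guard
            callback()
        return module

    builtins.__import__ = hook
```

Because the import statement always calls `builtins.__import__`, the hook fires even if the target module is already cached in `sys.modules`; the guard ensures the callback runs only once.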
28 changes: 28 additions & 0 deletions ucm/integration/vllm/patch/patch_funcs/v091/vllm_adapt.py
@@ -0,0 +1,28 @@
#
# MIT License
#
# Copyright (c) 2025 Huawei Technologies Co., Ltd. All rights reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
#


def _apply_adapt_patch() -> None:
    """Apply adapt patches for vLLM 0.9.1."""
    raise NotImplementedError("Dynamic patching for vLLM 0.9.1 is not implemented")
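These per-version stub modules plug into the dispatch in `apply_all_patches`, which selects a patch set by detected vLLM version. The dispatch shape can be sketched as a small registry (names and return values here are illustrative, not the ucm implementation):

```python
def _patches_v092() -> str:
    # Stand-in for _apply_patches_v092(); returns a marker for illustration.
    return "applied 0.9.2 patches"


def _patches_v091() -> str:
    # Stand-in for the 0.9.1 stubs, which currently raise NotImplementedError.
    raise NotImplementedError("0.9.1 patches are stubs")


PATCH_REGISTRY = {"0.9.1": _patches_v091, "0.9.2": _patches_v092}


def dispatch_patches(version: str) -> str:
    """Select and run the patch set for a detected version."""
    if version not in PATCH_REGISTRY:
        raise ValueError(f"Unsupported vLLM version: {version}")
    return PATCH_REGISTRY[version]()
```

Unknown versions fail fast with `ValueError`, mirroring the version gate in `apply_all_patches`.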
@@ -0,0 +1,28 @@
#
# MIT License
#
# Copyright (c) 2025 Huawei Technologies Co., Ltd. All rights reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
#


def _apply_ascend_patch() -> None:
"""Apply patches for vLLM 0.9.1."""
raise NotImplementedError("vLLM 0.9.1 is not supported for Ascend")