From 4fd830e6d8966ac749dbe1cf246b67cda250334a Mon Sep 17 00:00:00 2001
From: harrisonyhq <harrisonyhq@gmail.com>
Date: Fri, 26 Sep 2025 23:01:02 +0800
Subject: [PATCH 1/3] [Docs] Update DRAM performance data

---
 .../user-guide/prefix-cache/dram_store.md     | 27 ++++++++++++++-----
 1 file changed, 21 insertions(+), 6 deletions(-)
diff --git a/docs/source/user-guide/prefix-cache/dram_store.md b/docs/source/user-guide/prefix-cache/dram_store.md
index b51bc6cd..46a3c603 100644
--- a/docs/source/user-guide/prefix-cache/dram_store.md
+++ b/docs/source/user-guide/prefix-cache/dram_store.md
@@ -4,12 +4,27 @@ This document provides a usage example and configuration guide for the **DRAM Co
 
 ## Performance
 
-Combining UCM with vLLM delivers 3–10× improvements in latency and GPU efficiency, especially for long-context LLM tasks.
-
-<p align="center">
-  <img alt="UCM" src="../../images/dram_perform.png" width="90%">
-</p>
-
+### Overview
+The results below come from multi-concurrency performance tests of UCM in the Prefix Cache scenario in a CUDA environment and show the improvement UCM delivers on two different models.
+During the tests, HBM cache was disabled, and KV Cache was retrieved and matched only from DRAM.
+
+For the QwQ-32B model, the test ran on one H20 server with two GPUs.
+
+Here, Full Compute refers to vLLM inference alone, while DRAM80% means that, with UCM pooling enabled, the KV cache hit rate in DRAM is 80%.
+
+The following table shows the results on the QwQ-32B model:
+
+|     Input length |    Concurrency |    Full Compute (s) |    DRAM80% (s) | Speedup      |
+| ---------------: | -------------: | ------------------: | -------------: | :----------- |
+|            4 000 |              1 |              1.0269 |         0.3102 | **+230.9 %** |
+|            8 000 |              1 |              2.0902 |         0.5718 | **+265.5 %** |
+|           16 000 |              1 |              4.4852 |         1.1914 | **+276.4 %** |
+|            4 000 |              2 |              1.5383 |         0.4209 | **+265.4 %** |
+|            8 000 |              2 |              3.1323 |         0.8231 | **+280.5 %** |
+|           16 000 |              2 |              6.7984 |         1.7420 | **+290.2 %** |
+|            4 000 |              4 |              2.8173 |         0.9444 | **+198.2 %** |
+|            8 000 |              4 |              5.2643 |         1.8290 | **+187.8 %** |
+|           16 000 |              4 |             11.3651 |         3.6706 | **+209.6 %** |
 ## Features
 
 The DRAM connector supports the following functionalities:

From 0191943a71556479a4a499159ff6fbc3eead716f Mon Sep 17 00:00:00 2001
From: harrisonyhq <harrisonyhq@gmail.com>
Date: Fri, 26 Sep 2025 23:15:04 +0800
Subject: [PATCH 2/3] [Fix] Make workflow exit when a bash command fails

---
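Reviewer note (not part of the commit message): a minimal sketch of what
the added `set -euo pipefail` line changes, assuming `bash` is on the PATH.
The throwaway commands (`false`, `echo`, `cat`) are illustrative only and
are not taken from the workflow.

```shell
# Without strict mode, bash keeps going after a failing command, so the
# exit status is that of the last command (echo), i.e. success.
lenient_status=0
bash -c 'false; echo "still running"' >/dev/null || lenient_status=$?

# With "set -e", the first failing command aborts the script.
strict_status=0
bash -c 'set -euo pipefail; false; echo "never printed"' >/dev/null || strict_status=$?

# With "pipefail", a failure anywhere in a pipeline fails the whole
# pipeline, which "set -e" then treats as fatal.
pipe_status=0
bash -c 'set -euo pipefail; false | cat; echo "never printed"' >/dev/null || pipe_status=$?

echo "lenient=$lenient_status strict=$strict_status pipe=$pipe_status"
```

Only the strict variants report a non-zero status, and a non-zero status
from the `-c` script is what the CI step needs in order to fail.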
 .github/workflows/unifiedcache_test.yml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.github/workflows/unifiedcache_test.yml b/.github/workflows/unifiedcache_test.yml
index 3242ae24..9754691e 100644
--- a/.github/workflows/unifiedcache_test.yml
+++ b/.github/workflows/unifiedcache_test.yml
@@ -43,6 +43,7 @@ jobs:
             --entrypoint /bin/bash \
             vllm/vllm-openai:v0.9.2 \
             -c "
+              set -euo pipefail
               pip install -v -e . --no-build-isolation
               cd \$(pip show vllm | grep Location | awk '{print \$2}') &&
               git apply /workspace/unified-cache-management/ucm/integration/vllm/patch/0.9.2/vllm-adapt.patch &&

From 5e2d19a36c8886d3f94a185da6a2b4bec513bbb9 Mon Sep 17 00:00:00 2001
From: harrisonyhq <harrisonyhq@gmail.com>
Date: Fri, 26 Sep 2025 23:33:26 +0800
Subject: [PATCH 3/3] [Fix] Stop applying vllm-adapt-sparse.patch in workflow

---
 .github/workflows/unifiedcache_test.yml | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/.github/workflows/unifiedcache_test.yml b/.github/workflows/unifiedcache_test.yml
index 9754691e..1cdb4208 100644
--- a/.github/workflows/unifiedcache_test.yml
+++ b/.github/workflows/unifiedcache_test.yml
@@ -46,8 +46,7 @@ jobs:
               set -euo pipefail
               pip install -v -e . --no-build-isolation
               cd \$(pip show vllm | grep Location | awk '{print \$2}') &&
-              git apply /workspace/unified-cache-management/ucm/integration/vllm/patch/0.9.2/vllm-adapt.patch &&
-              git apply /workspace/unified-cache-management/ucm/integration/vllm/patch/0.9.2/vllm-adapt-sparse.patch
+              git apply /workspace/unified-cache-management/ucm/integration/vllm/patch/0.9.2/vllm-adapt.patch
               cd /workspace/unified-cache-management
               python3 -m unittest discover -s test
             "