Purpose

Add a DRAM Connector that offloads the KV cache from GPU HBM to CPU DRAM, reducing GPU memory pressure and supporting larger models or batch sizes.
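For context, the core mechanism behind such a connector is swapping KV blocks between GPU HBM and pinned host (DRAM) memory. The sketch below illustrates that raw transfer pattern in PyTorch; the shapes and variable names are illustrative assumptions, not the actual API in this PR.

```python
import torch

# Hypothetical KV block layout: (K/V, num_blocks, block_size, heads, head_dim).
kv_gpu = torch.randn(2, 4, 16, 8, 128, device="cuda", dtype=torch.float16)

# Pinned (page-locked) DRAM buffer; pinning enables asynchronous DMA copies.
kv_dram = torch.empty(kv_gpu.shape, dtype=kv_gpu.dtype, pin_memory=True)

# Offload: HBM -> DRAM, asynchronous w.r.t. the current CUDA stream.
kv_dram.copy_(kv_gpu, non_blocking=True)
torch.cuda.synchronize()  # wait before reusing the HBM block or reading DRAM

# ... the GPU block can now be freed or reused for other requests ...

# Reload on a cache hit: DRAM -> HBM.
kv_gpu.copy_(kv_dram, non_blocking=True)
torch.cuda.synchronize()
```

A real connector additionally manages block mapping, lookup, and lifetimes on top of this transfer path; the snippet only shows the HBM/DRAM copy itself.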

Modifications

Test

Unit Test

Passes the unit test in test/test_ucm_dram.py:

```bash
python test/test_ucm_dram.py
```

Performance Test

Using llmperf with the following command:

```bash
python token_benchmark_ray.py --model "/home/models/QwQ-32B" --mean-input-tokens 16000 --mean-output-tokens 1 --max-num-completed-requests 10 --num-concurrent-requests 1
```

to evaluate the connector's TTFT (time to first token) across a range of input token lengths with the QwQ-32B model, we got the following results:

| Input tokens | Local Disk Connector | vLLM, prefix cache disabled | DRAM Connector |
|---|---|---|---|
| 8K | 0.88 s | 2.16 s | 0.65 s |
| 16K | 1.79 s | 5.01 s | 1.26 s |
| 32K | 3.51 s | 12.54 s | 2.21 s |

The DRAM Connector gives the lowest TTFT at every input length, and its advantage grows as the input gets longer.
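The relative gains can be read straight off the table; the short script below merely restates it as speedup ratios (no new measurements):

```python
# TTFT from the table above, in seconds.
ttft = {
    "8K":  {"disk": 0.88, "no_prefix_cache": 2.16, "dram": 0.65},
    "16K": {"disk": 1.79, "no_prefix_cache": 5.01, "dram": 1.26},
    "32K": {"disk": 3.51, "no_prefix_cache": 12.54, "dram": 2.21},
}
for length, t in ttft.items():
    print(f"{length}: {t['disk'] / t['dram']:.2f}x faster than local disk, "
          f"{t['no_prefix_cache'] / t['dram']:.2f}x faster than no prefix cache")
```

This works out to roughly 1.35x–1.59x over the Local Disk Connector and 3.3x–5.7x over vLLM with the prefix cache disabled.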

@harrisonyhq requested a review from @ygwpz on July 29, 2025.
@ygwpz merged commit dfe3e14 into ModelEngine-Group:develop on July 30, 2025.