@qyh111 (Contributor) commented Nov 25, 2025

Purpose

What this PR does / why we need it?

* Adapt to GQA (grouped-query attention).
* Modify config.yaml so that multiple connectors can be specified.

Modifications

Does this PR introduce any user-facing change?

Test

The vLLM online server starts successfully. We sent the same prompt twice, and the two replies are almost the same.
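
The check above could be scripted roughly as follows. This is a minimal sketch, not the procedure actually used for this PR: it assumes the server exposes vLLM's OpenAI-compatible `/v1/completions` endpoint, and the helper names (`reply_similarity`, `complete`, `check_repeat_prompt`) and the 0.9 threshold are hypothetical choices made for illustration. The PR does not state how "almost the same" was judged; a character-level `difflib` ratio is swapped in here as one simple option.

```python
import difflib
import json
import urllib.request


def reply_similarity(first: str, second: str) -> float:
    """Return a ratio in [0, 1]; 1.0 means the two replies are identical."""
    return difflib.SequenceMatcher(None, first, second).ratio()


def complete(base_url: str, model: str, prompt: str, max_tokens: int = 64) -> str:
    """POST one prompt to an OpenAI-compatible /v1/completions endpoint.

    base_url and model are placeholders supplied by the caller.
    """
    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0,  # greedy decoding, so repeated runs are comparable
    }).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]


def check_repeat_prompt(send, prompt: str, threshold: float = 0.9):
    """Send the same prompt twice via `send` and compare the two replies.

    `send` is any callable mapping a prompt string to a reply string.
    The threshold is an arbitrary illustrative value, not one taken from the PR.
    """
    first = send(prompt)
    second = send(prompt)
    score = reply_similarity(first, second)
    return score, score >= threshold
```

With a server running, a call might look like `check_repeat_prompt(lambda p: complete("http://localhost:8000", "my-model", p), "Hello")`, where the host, port, and model name are placeholders. Passing `send` as a callable keeps the comparison logic testable without a live server.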

How was this patch tested?

ygwpz previously approved these changes Nov 25, 2025
@ygwpz dismissed their stale review November 25, 2025 06:31

Some problems still exist.

@ygwpz merged commit 0986b89 into ModelEngine-Group:dev-ucm-v1 Nov 26, 2025
3 checks passed
qyh111 added a commit to qyh111/unified-cache-management that referenced this pull request Nov 28, 2025
* adapt GQA & modify config.yaml

* move process to UCMDirectConnector

* fix comment

* modify hash function

* fix style

* code style and modify hash

* init parent_block_hash_value