@sumingZero
Contributor

Purpose

What this PR does / why we need it?
Fix bug 1: The task check returns -50005 during async load.
Fix bug 2: In a PD-separated (prefill/decode) deployment, the decode instance hits a "create task error" when performing an async load.

Modifications

bug 1: start_load_kv()
bug 2: get_num_new_matched_tokens()

Test

How was this patch tested?
Launched a 1P1D service (one prefill instance, one decode instance) and sent 200 requests to it using trace replay.
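The replay script itself is not part of this PR. As a rough illustration of what "trace replay" means here, the following is a minimal sketch, assuming a trace is a list of (time offset, prompt) pairs and that requests are sent through a caller-supplied coroutine. `TraceEntry`, `replay`, and `send` are hypothetical names for illustration, not part of this repository.

```python
import asyncio
import time
from dataclasses import dataclass
from typing import Awaitable, Callable, List


@dataclass
class TraceEntry:
    # Hypothetical trace record: when to send (seconds from trace start) and what to send.
    offset_s: float
    prompt: str


async def replay(
    trace: List[TraceEntry],
    send: Callable[[str], Awaitable[str]],
    speedup: float = 1.0,
) -> List[str]:
    """Send every trace entry at its recorded offset (scaled by speedup).

    Results are returned in trace order, regardless of completion order.
    """
    start = time.monotonic()

    async def fire(entry: TraceEntry) -> str:
        # Wait until this entry's scheduled time, then issue the request.
        delay = entry.offset_s / speedup - (time.monotonic() - start)
        if delay > 0:
            await asyncio.sleep(delay)
        return await send(entry.prompt)

    return await asyncio.gather(*(fire(e) for e in trace))
```

In a real run, `send` would be replaced by a coroutine that posts the prompt to the service endpoint; requests stay concurrent, so slow responses do not delay later entries.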

@ygwpz ygwpz merged commit 6e2eb03 into ModelEngine-Group:develop Sep 19, 2025
