[BugFix] race condition [is_fetching] causing multiple fetch requests#5238
Jiang-Jia-Jun merged 3 commits into PaddlePaddle:develop
Conversation
Thanks for your contribution!
Codecov Report ❌ Patch coverage is
Additional details and impacted files

```
@@           Coverage Diff            @@
##             develop    #5238   +/- ##
========================================
  Coverage           ?   60.53%
========================================
  Files              ?      320
  Lines              ?    39054
  Branches           ?     5871
========================================
  Hits               ?    23642
  Misses             ?    13549
  Partials           ?     1863
```

Flags with carried forward coverage won't be shown.
Pull request overview
This PR addresses a race condition in the request scheduling logic where multiple fetch requests could be submitted concurrently due to the is_fetching flag being set inside the worker thread instead of before task submission. The fix correctly moves the flag assignment to occur immediately before the submit() call, eliminating the race window.
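The race window can be illustrated with a minimal, self-contained sketch (the names `is_fetching` and `_fetch_request` follow the PR; the scheduler loop and fetch latency are simulated, not taken from `common_engine.py`):

```python
import threading
from concurrent.futures import ThreadPoolExecutor

is_fetching = False
submit_count = 0
fetch_started = threading.Event()
pool = ThreadPoolExecutor(max_workers=2)

def _fetch_request():
    """Worker-side fetch. The buggy version set is_fetching = True HERE,
    so between submit() and the worker actually starting, the scheduler
    could observe is_fetching == False and submit a second fetch."""
    global is_fetching
    fetch_started.wait()   # simulate fetch latency
    is_fetching = False    # fetch finished; allow the next one

def try_submit():
    """Scheduler-side check: submit only when no fetch is in flight."""
    global is_fetching, submit_count
    if not is_fetching:
        is_fetching = True              # the fix: claim the flag BEFORE submit()
        pool.submit(_fetch_request)
        submit_count += 1

try_submit()
try_submit()                 # sees is_fetching == True, correctly skipped
fetch_started.set()
pool.shutdown(wait=True)
print(submit_count)          # 1 with the fix; could be 2 with the old ordering
```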
Key changes:
- Fixed the race condition by setting `is_fetching = True` before `submit()` instead of inside the worker thread
- Corrected the port type from `str` to `int` in the router configuration for type consistency
- Removed a duplicate `paddle.to_tensor()` call in the resource manager
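Of these, the port-type fix is the usual pattern of converting to `int` at the configuration boundary rather than carrying a string through. A minimal sketch of that pattern (the `argparse` flag name here is illustrative, not the PR's actual router code):

```python
import argparse

parser = argparse.ArgumentParser()
# Declaring type=int makes the parser reject non-numeric ports up front,
# instead of passing a str that fails later (e.g. when binding a socket
# or doing arithmetic on the port number).
parser.add_argument("--port", type=int, default=8000)

args = parser.parse_args(["--port", "9904"])
print(type(args.port).__name__, args.port)
```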
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `fastdeploy/router/router.py` | Changed port parameter type from `str` to `int` for type correctness |
| `fastdeploy/engine/sched/resource_manager_v1.py` | Removed duplicate `paddle.to_tensor()` call that was creating the same tensor twice |
| `fastdeploy/engine/common_engine.py` | Fixed race condition by moving the `is_fetching = True` assignment from inside `_fetch_request()` to before thread pool submission, preventing multiple concurrent fetch requests. Applied the fix to both the "mixed" and non-"mixed" execution paths |
Comments suppressed due to low confidence (1)
fastdeploy/engine/common_engine.py:817
- Critical bug: If `get_request_pool.submit(_fetch_request)` raises a `RuntimeError` after `is_fetching = True` is set, the exception handler at lines 812-817 will either break or re-raise the exception without resetting `is_fetching` back to `False`. This will permanently block all future fetch request submissions. You need to reset `is_fetching = False` before the `break` statement (line 815) and before the `raise` statement (line 817) to prevent the flag from being stuck in the `True` state.
```python
try:
    is_fetching = True
    get_request_pool.submit(_fetch_request)
except RuntimeError as e:
    if "shutdown" in str(e):
        llm_logger.info("Thread pool shutdown detected, exiting scheduler loop")
        break
    else:
        raise
```
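The reset Copilot asks for can be sketched as follows (the `submit_fetch` helper and its return values are illustrative scaffolding so the behavior is observable; in `common_engine.py` the fix would simply be an `is_fetching = False` before the `break` and before the `raise`):

```python
from concurrent.futures import ThreadPoolExecutor

def submit_fetch(pool, fetch_fn):
    """Set the flag before submit(), but reset it if submit() fails."""
    is_fetching = True
    try:
        pool.submit(fetch_fn)
    except RuntimeError as e:
        is_fetching = False               # reset so the flag is not stuck at True
        if "shutdown" in str(e):
            return is_fetching, "break"   # scheduler loop would `break` here
        raise
    return is_fetching, "ok"

pool = ThreadPoolExecutor(max_workers=1)
flag1, status1 = submit_fetch(pool, lambda: None)   # normal path

pool.shutdown(wait=True)
# submit() on a shut-down executor raises
# RuntimeError("cannot schedule new futures after shutdown"),
# exercising the failure path without leaving the flag stuck.
flag2, status2 = submit_fetch(pool, lambda: None)
print(flag1, status1, flag2, status2)
```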
Motivation
Modifications
- Fixed a port type error in the router that caused a traceback in `router.log`
- Fixed a race condition on `is_fetching` that caused multiple concurrent fetch requests
Usage or Command
```bash
bash start_v1_tp1.sh
```
Accuracy Tests
Checklist
- [x] Tag the PR title with one of: [FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
- [x] Run `pre-commit` before commit
- [x] When targeting a `release` branch, make sure the PR has been submitted to the `develop` branch first, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag