Fix data race on should_stop_ flag in LLM runner #18652
Changes from all commits:

```diff
@@ -9,6 +9,8 @@
 // Generate tokens in a loop.
 #pragma once

+#include <atomic>
+
 #include <executorch/extension/llm/runner/stats.h>
 #include <executorch/extension/llm/runner/text_decoder_runner.h>
 #include <executorch/extension/tensor/tensor.h>

@@ -83,7 +85,7 @@ class ET_EXPERIMENTAL TextTokenGenerator {
   auto tokens_managed = from_blob(
       token_data.data(), token_shape, executorch::aten::ScalarType::Long);

-  should_stop_ = false;
+  should_stop_.store(false, std::memory_order_relaxed);
```

Suggested change:

```diff
-  should_stop_.store(false, std::memory_order_relaxed);
+  // Clear any stale stop request from a previous run without losing a
+  // concurrent early stop for this run. If a stop was already requested,
+  // honor it immediately for this generation call.
+  if (should_stop_.exchange(false, std::memory_order_relaxed)) {
+    return 0;
+  }
```
Copilot AI · Apr 2, 2026
There are existing unit tests for the runner/token generation path (e.g., test_text_llm_runner.cpp), but none appear to cover calling stop() concurrently with generate() to validate cancellation behavior and prevent regressions of this race fix. Adding a focused test (potentially in Python bindings where the GIL is released) would better exercise the cross-thread stop path.
Removing `TextDecoderRunner::stop()` is an API-breaking change for a public header (it's exported and also has a `torch::executor` alias below). If downstream code may be calling this, consider keeping the method (even as a deprecated no-op) or providing a migration path rather than deleting it outright.