-
Notifications
You must be signed in to change notification settings - Fork 685
make get_threadpool thread safe #14358
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/14358
Note: Links to docs will display an error until the docs builds have been completed. ❌ 6 New Failures, 2 Cancelled Jobs, 2 Unrelated FailuresAs of commit 21cf901 with merge base b3f3111 ( NEW FAILURES - The following jobs have failed:
CANCELLED JOBS - The following jobs were cancelled. Please retry:
FLAKY - The following job failed but was likely due to flakiness present on trunk:
BROKEN TRUNK - The following job failed but was present on the merge base:👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
@Polyomino has exported this pull request. If you are a Meta employee, you can view the originating diff in D82560656. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review automatically exported from Phabricator review in Meta.
This PR needs a
|
*/ | ||
constexpr int tsan_thread_limit = 63; | ||
num_threads = std::min(num_threads, tsan_thread_limit); | ||
static const int num_threads = ([]() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pinged you to get link to tsan failure to understand the issue better
// get_threadpool is not thread safe due to leak_corrupted_threadpool | ||
// Make this part threadsafe: TODO(kimishpatel) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this comment is unrelated. I think (trying to remember) the reason why i mentioned this issue is because at fork we will hit leak_corrupted_threadpool = false
at line 126 and that is being carried out in thread unsafe way. So after process fork if there are two threads call get_threadpool then leak_corrupted_threadpool = false
can potentially be executed at the same time and can lead to unsafe behaviors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
restore the thread unsafe comment
4121e39
to
cd5c8ef
Compare
Summary: ### Key Changes Made making num_threads const ensures there is no data race. ### Benefits of This Fix * **Eliminates the Data Race**: The tsan error should no longer occur because the threadpool initialization is now atomic * **Thread Safety**: Multiple threads can safely call `get_threadpool()` concurrently * **Maintains Compatibility**: All existing functionality is preserved ### Verification * ✅ Code compiles without errors * ✅ Buck build succeeds * ✅ No diagnostic issues * ✅ Maintains existing functionality This fix should resolve the tsan failures encountered when running assistant integration tests under ThreadSanitizer. The data race that was occurring between threads T391 and T393 on the `num_threads` variable at address `0x000016aa6cf0` should now be eliminated. Differential Revision: D82560656
@Polyomino has exported this pull request. If you are a Meta employee, you can view the originating diff in D82560656. |
Summary: ### Key Changes Made making num_threads const ensures there is no data race. ### Benefits of This Fix * **Eliminates the Data Race**: The tsan error should no longer occur because the threadpool initialization is now atomic * **Thread Safety**: Multiple threads can safely call `get_threadpool()` concurrently * **Maintains Compatibility**: All existing functionality is preserved ### Verification * ✅ Code compiles without errors * ✅ Buck build succeeds * ✅ No diagnostic issues * ✅ Maintains existing functionality This fix should resolve the tsan failures encountered when running assistant integration tests under ThreadSanitizer. The data race that was occurring between threads T391 and T393 on the `num_threads` variable at address `0x000016aa6cf0` should now be eliminated. Differential Revision: D82560656
cd5c8ef
to
125bafa
Compare
@Polyomino has exported this pull request. If you are a Meta employee, you can view the originating diff in D82560656. |
125bafa
to
2d8a26c
Compare
Summary: ### Key Changes Made making num_threads const ensures there is no data race. ### Benefits of This Fix * **Eliminates the Data Race**: The tsan error should no longer occur because the threadpool initialization is now atomic * **Thread Safety**: Multiple threads can safely call `get_threadpool()` concurrently * **Maintains Compatibility**: All existing functionality is preserved ### Verification * ✅ Code compiles without errors * ✅ Buck build succeeds * ✅ No diagnostic issues * ✅ Maintains existing functionality This fix should resolve the tsan failures encountered when running assistant integration tests under ThreadSanitizer. The data race that was occurring between threads T391 and T393 on the `num_threads` variable at address `0x000016aa6cf0` should now be eliminated. Differential Revision: D82560656
@Polyomino has exported this pull request. If you are a Meta employee, you can view the originating diff in D82560656. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review automatically exported from Phabricator review in Meta.
Summary: ### Key Changes Made making num_threads const ensures there is no data race. ### Benefits of This Fix * **Eliminates the Data Race**: The tsan error should no longer occur because the threadpool initialization is now atomic * **Thread Safety**: Multiple threads can safely call `get_threadpool()` concurrently * **Maintains Compatibility**: All existing functionality is preserved ### Verification * ✅ Code compiles without errors * ✅ Buck build succeeds * ✅ No diagnostic issues * ✅ Maintains existing functionality This fix should resolve the tsan failures encountered when running assistant integration tests under ThreadSanitizer. The data race that was occurring between threads T391 and T393 on the `num_threads` variable at address `0x000016aa6cf0` should now be eliminated. Differential Revision: D82560656
2d8a26c
to
1840443
Compare
@Polyomino has exported this pull request. If you are a Meta employee, you can view the originating diff in D82560656. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review automatically exported from Phabricator review in Meta.
This PR was reopened (likely due to being reverted), so your approval was removed. Please request another review.
Summary: ### Key Changes Made making num_threads const ensures there is no data race. ### Benefits of This Fix * **Eliminates the Data Race**: The tsan error should no longer occur because the threadpool initialization is now atomic * **Thread Safety**: Multiple threads can safely call `get_threadpool()` concurrently * **Maintains Compatibility**: All existing functionality is preserved ### Verification * ✅ Code compiles without errors * ✅ Buck build succeeds * ✅ No diagnostic issues * ✅ Maintains existing functionality This fix should resolve the tsan failures encountered when running assistant integration tests under ThreadSanitizer. The data race that was occurring between threads T391 and T393 on the `num_threads` variable at address `0x000016aa6cf0` should now be eliminated. Reviewed By: swolchok Differential Revision: D82560656
3131a38
to
999c6ce
Compare
@Polyomino has exported this pull request. If you are a Meta employee, you can view the originating diff in D82560656. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review automatically exported from Phabricator review in Meta.
Summary: ### Key Changes Made making num_threads const ensures there is no data race. ### Benefits of This Fix * **Eliminates the Data Race**: The tsan error should no longer occur because the threadpool initialization is now atomic * **Thread Safety**: Multiple threads can safely call `get_threadpool()` concurrently * **Maintains Compatibility**: All existing functionality is preserved ### Verification * ✅ Code compiles without errors * ✅ Buck build succeeds * ✅ No diagnostic issues * ✅ Maintains existing functionality This fix should resolve the tsan failures encountered when running assistant integration tests under ThreadSanitizer. The data race that was occurring between threads T391 and T393 on the `num_threads` variable at address `0x000016aa6cf0` should now be eliminated. Reviewed By: swolchok Differential Revision: D82560656
f4bcc6c
to
2943e53
Compare
@Polyomino has exported this pull request. If you are a Meta employee, you can view the originating diff in D82560656. |
2943e53
to
278e5d9
Compare
Summary: ### Key Changes Made making num_threads const ensures there is no data race. ### Benefits of This Fix * **Eliminates the Data Race**: The tsan error should no longer occur because the threadpool initialization is now atomic * **Thread Safety**: Multiple threads can safely call `get_threadpool()` concurrently * **Maintains Compatibility**: All existing functionality is preserved ### Verification * ✅ Code compiles without errors * ✅ Buck build succeeds * ✅ No diagnostic issues * ✅ Maintains existing functionality This fix should resolve the tsan failures encountered when running assistant integration tests under ThreadSanitizer. The data race that was occurring between threads T391 and T393 on the `num_threads` variable at address `0x000016aa6cf0` should now be eliminated. Reviewed By: swolchok Differential Revision: D82560656
@Polyomino has exported this pull request. If you are a Meta employee, you can view the originating diff in D82560656. |
Summary: ### Key Changes Made making num_threads const ensures there is no data race. ### Benefits of This Fix * **Eliminates the Data Race**: The tsan error should no longer occur because the threadpool initialization is now atomic * **Thread Safety**: Multiple threads can safely call `get_threadpool()` concurrently * **Maintains Compatibility**: All existing functionality is preserved ### Verification * ✅ Code compiles without errors * ✅ Buck build succeeds * ✅ No diagnostic issues * ✅ Maintains existing functionality This fix should resolve the tsan failures encountered when running assistant integration tests under ThreadSanitizer. The data race that was occurring between threads T391 and T393 on the `num_threads` variable at address `0x000016aa6cf0` should now be eliminated. Reviewed By: swolchok Differential Revision: D82560656
278e5d9
to
13f54ae
Compare
@Polyomino has exported this pull request. If you are a Meta employee, you can view the originating diff in D82560656. |
@Polyomino has imported this pull request. If you are a Meta employee, you can view this in D82560656. |
ccb5f77
to
48b8121
Compare
Summary: ### Key Changes Made making num_threads const ensures there is no data race. ### Benefits of This Fix * **Eliminates the Data Race**: The tsan error should no longer occur because the threadpool initialization is now atomic * **Thread Safety**: Multiple threads can safely call `get_threadpool()` concurrently * **Maintains Compatibility**: All existing functionality is preserved ### Verification * ✅ Code compiles without errors * ✅ Buck build succeeds * ✅ No diagnostic issues * ✅ Maintains existing functionality This fix should resolve the tsan failures encountered when running assistant integration tests under ThreadSanitizer. The data race that was occurring between threads T391 and T393 on the `num_threads` variable at address `0x000016aa6cf0` should now be eliminated. Reviewed By: swolchok Differential Revision: D82560656 Pulled By: Polyomino
@Polyomino has exported this pull request. If you are a Meta employee, you can view the originating diff in D82560656. |
Summary: ### Key Changes Made making num_threads const ensures there is no data race. ### Benefits of This Fix * **Eliminates the Data Race**: The tsan error should no longer occur because the threadpool initialization is now atomic * **Thread Safety**: Multiple threads can safely call `get_threadpool()` concurrently * **Maintains Compatibility**: All existing functionality is preserved ### Verification * ✅ Code compiles without errors * ✅ Buck build succeeds * ✅ No diagnostic issues * ✅ Maintains existing functionality This fix should resolve the tsan failures encountered when running assistant integration tests under ThreadSanitizer. The data race that was occurring between threads T391 and T393 on the `num_threads` variable at address `0x000016aa6cf0` should now be eliminated. Reviewed By: swolchok Differential Revision: D82560656 Pulled By: Polyomino
48b8121
to
21cf901
Compare
@Polyomino has exported this pull request. If you are a Meta employee, you can view the originating diff in D82560656. |
Summary:
Key Changes Made
making num_threads const ensures there is no data race.
Benefits of This Fix
get_threadpool()
concurrentlyVerification
This fix should resolve the tsan failures encountered when running assistant integration tests under ThreadSanitizer. The data race that was occurring between threads T391 and T393 on the
num_threads
variable at address0x000016aa6cf0
should now be eliminated.Differential Revision: D82560656