testFWCoreConcurrencyCatch2 failing in MULTIARCH_X, ROOT632_X, ROOT6_X #45194
Comments
assign FWCore/Concurrency |
New categories assigned: core @Dr15Jones,@makortel,@smuzaffar you have been requested to review this Pull request/Issue and eventually sign? Thanks |
cms-bot internal usage |
A new Issue was created by @iarspider. @smuzaffar, @rappoccio, @Dr15Jones, @sextonkennedy, @makortel, @antoniovilela can you please review it and eventually sign/assign? Thanks. cms-bot commands are listed here |
First occurrence: CMSSW_14_1_ROOT632_X_2024-06-09-2300 |
The problem does not appear to be reproducible (as is true for many threading tests). Was the machine running the tests under a high load? |
The actual failure in the log is
|
I believe there is a race condition in |
Happened again in CMSSW_14_1_GEANT4_X_2024-07-07-2300:
|
We took a deeper look with @Dr15Jones. The problem is likely the following. The code in cmssw/FWCore/Concurrency/src/WaitingThreadPool.cc, lines 47 to 55 in 94ac0f5,
expects thisPtr_.reset() (which returns the WaitingThread object to the ReusableObjectHolder) to be called before WaitingTaskWithArenaHolder::doneWaiting() in the swap() line (which then decrements the refcount of the WaitingTask, and may lead to the contained task being scheduled if the refcount reaches 0). If the user-provided function throws an exception, the exception is communicated in cmssw/FWCore/Concurrency/interface/WaitingThreadPool.h, lines 33 to 38 in 94ac0f5.
The problem is that holder.doneWaiting(std::current_exception()) causes the refcount to be decremented already during
which can cause the task to be scheduled before the WaitingThread is returned to the ReusableObjectHolder.
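The ordering problem described above can be illustrated with a small single-threaded toy model. This is only a sketch under stated assumptions: `ToyPool`, `ToyHolder`, `recordException` and `idleSeenByContinuation` are hypothetical names invented for illustration, not CMSSW classes or functions, and the deferred variant shown here is one possible ordering, not necessarily what #45402 implements. In the real code the continuation may run on another thread, so the failure is intermittent; the toy runs the continuation inline, which makes the effect deterministic.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <stdexcept>
#include <utility>

// Hypothetical stand-in for the pool of reusable worker objects.
struct ToyPool {
  int idle = 1;  // a single worker, initially idle
  void acquire() {
    assert(idle > 0);
    --idle;
  }
  void release() { ++idle; }
};

// Hypothetical stand-in for the task holder: signalling completion
// runs the continuation immediately (at most once).
class ToyHolder {
public:
  explicit ToyHolder(std::function<void(std::exception_ptr)> continuation)
      : continuation_(std::move(continuation)) {}

  // Signals completion right away, passing the exception along.
  void doneWaiting(std::exception_ptr e) {
    if (continuation_) {
      auto c = std::move(continuation_);
      continuation_ = nullptr;
      c(e);
    }
  }

  // Illustrative alternative: only remember the exception for later.
  void recordException(std::exception_ptr e) { pending_ = e; }

  // Signals completion with whatever exception was recorded, if any.
  void doneWaiting() { doneWaiting(pending_); }

private:
  std::function<void(std::exception_ptr)> continuation_;
  std::exception_ptr pending_;
};

// Returns how many idle workers the continuation sees when it runs.
int idleSeenByContinuation(bool deferSignal) {
  ToyPool pool;
  int seen = -1;
  ToyHolder holder([&](std::exception_ptr) { seen = pool.idle; });

  pool.acquire();
  try {
    throw std::runtime_error("user function failed");
  } catch (...) {
    if (deferSignal) {
      holder.recordException(std::current_exception());
    } else {
      holder.doneWaiting(std::current_exception());  // signals too early
    }
  }
  pool.release();        // plays the role of thisPtr_.reset()
  holder.doneWaiting();  // plays the role of the swap() line; no-op if already signalled
  return seen;
}
```

Calling `idleSeenByContinuation(false)` models the exception path described above: the continuation runs while the worker is still checked out. With `idleSeenByContinuation(true)` the exception is only recorded inside the catch block, so the continuation runs after the worker is back in the pool.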
The fix would be to add a function |
Hopefully fixed by #45402 |
+core We can reopen the issue if #45402 turns out not to be sufficient |
This issue is fully signed and ready to be closed. |
In CMSSW_14_1_{MULTIARCH_X, ROOT632_X, ROOT6_X}, test testFWCoreConcurrencyCatch2 is failing: