IB Failure with workflow 23234.103 #35979
A new Issue was created by @qliphy Qiang Li. @Dr15Jones, @perrotta, @dpiparo, @makortel, @smuzaffar, @qliphy can you please review it and eventually sign/assign? Thanks. cms-bot commands are listed here
assign dqm,l1
New categories assigned: dqm,l1 @jfernan2,@ahmad3213,@rvenditti,@emanueleusai,@rekovic,@pbo0,@cecilecaillol,@pmandrik you have been requested to review this Pull request/Issue and eventually sign? Thanks
I believe I've traced down the problem to a thread safety issue in ROOT. The call to and in It then checks So if any other module is calling another ROOT function which also uses the "tmp" histogram then the histogram could be reset right as the call to The upshot is it is not safe to call FYI @pcanal.
NOTE: It appears
@Dr15Jones thanks a lot for tracking this down! Ok, I will go ahead and implement
@drankincms it looks like if you first set ROOT to a unique new directory, it will place the "tmp" histogram into that directory, which should isolate you from any other threads.
So the ROOT code does search
@Dr15Jones Thanks a bunch for the tip, your TContext trick seems to fix this.
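For readers following along, the isolation idea behind that tip can be sketched without ROOT at all. The following is a toy model only, not ROOT's API: `Directory`, `ScopedDirectory`, `sumViaTmp` and `runIsolated` are hypothetical names invented for illustration, and the per-thread "current directory" pointer is an assumption about how the lookup behaves. It shows a routine that reuses a scratch object named "tmp" in whatever directory is current, and an RAII guard that makes a thread-private directory current so each thread gets its own "tmp".

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <thread>
#include <vector>

// Toy stand-in for a directory of named objects (NOT ROOT's TDirectory).
struct Directory {
  std::map<std::string, std::vector<int>> objects;
};

Directory globalDir;                              // shared by every thread
thread_local Directory* currentDir = &globalDir;  // per-thread "current directory"

// RAII guard: make `dir` current for this thread, restore the old one on exit.
class ScopedDirectory {
public:
  explicit ScopedDirectory(Directory* dir) : previous_(currentDir) { currentDir = dir; }
  ~ScopedDirectory() { currentDir = previous_; }
  ScopedDirectory(const ScopedDirectory&) = delete;
  ScopedDirectory& operator=(const ScopedDirectory&) = delete;

private:
  Directory* previous_;
};

// Hypothetical library routine that reuses a scratch object named "tmp"
// found in the current directory: reset it, fill it, then read it back.
int sumViaTmp(int value, int n) {
  std::vector<int>& tmp = currentDir->objects["tmp"];
  tmp.clear();
  for (int i = 0; i < n; ++i) tmp.push_back(value);
  int sum = 0;
  for (int v : tmp) sum += v;
  return sum;
}

// Run one worker per value, each inside its own private directory, so no
// two threads ever touch the same "tmp".
std::vector<int> runIsolated(const std::vector<int>& values, int n) {
  std::vector<int> results(values.size(), 0);
  std::vector<std::thread> workers;
  for (std::size_t i = 0; i < values.size(); ++i) {
    workers.emplace_back([&results, &values, i, n] {
      Directory privateDir;                // unique to this thread
      ScopedDirectory guard(&privateDir);  // "tmp" now lives in privateDir
      results[i] = sumViaTmp(values[i], n);
    });
  }
  for (auto& w : workers) w.join();
  return results;
}
```

Calling `runIsolated({1, 2, 3, 4}, 1000)` gives each worker its own scratch object, and `globalDir` never acquires a "tmp" entry. Without the guard, every worker would reset and fill the same `globalDir` entry concurrently, which is the kind of interference described above.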
+1
@qliphy I think we can close this (please reopen if I'm wrong)
It first appeared in IB CMSSW_12_2 2021-10-31-2300, and persists up to the most recent IB.
Previously we thought it was due to #35744, and that #35941 might fix this as a byproduct:
#35744 (comment)
The PR test with #35941 applied and multithreading enabled worked well with 23234.103:
#35941 (comment)
However, after merging #35941 the issue persists up to the most recent IB. We may need to wait for the next IB results.
Let's open a thread here to track this issue.
Adding @drankincms
P.S. I tried several local tests and they work well with:
runTheMatrix.py -l 23234.103 --job-reports -t -4 --ibeos