wf 8.0 failing in all Jenkins tests #34890
Comments
A new Issue was created by @tvami Tamas Vami. @Dr15Jones, @perrotta, @dpiparo, @makortel, @smuzaffar, @qliphy can you please review it and eventually sign/assign? Thanks. cms-bot commands are listed here |
Just to clarify, the "failure" in PR tests is in the DQM output comparisons, while the
checks only if the workflow technically runs or not. |
Following the link to the failing comparison leads to https://cmssdt.cern.ch/SDT/jenkins-artifacts/baseLineComparisons/CMSSW_12_1_X_2021-08-13-1700+7d20f9/44780/8.0_BeamHalo+BeamHalo+DIGICOS+RECOCOS+ALCABH+HARVESTCOS/Pixel_AdditionalPixelErrors.html, which only says "Skipped: 0.2% (1)" without pointing out what exactly was skipped. |
assign dqm |
New categories assigned: dqm @jfernan2,@kmaeshima,@rvenditti,@andrius-k,@ErnestaP,@ahmad3213 you have been requested to review this Pull request/Issue and eventually sign? Thanks |
Thanks @makortel for the clarification. Indeed following the links I see that and "Skiped Comparisons byLumiErrors" here |
Is there any way to see from RelMon the log error of those skipped/failing comparisons? The DQM bin by bin tool does not show any problem on those Pixel folders |
@jfernan2, you can go to the parent directory (just remove the file part of the URL in your browser, e.g. https://cmssdt.cern.ch/SDT/jenkins-artifacts/baseLineComparisons/CMSSW_12_1_X_2021-08-13-1700+7d20f9/44780/ ) and then look for the 8.0 log file. |
Hi @jfernan2, thanks for looking into this. I'm wondering why the comparison of NaN vs NaN doesn't give an agreement. |
By definition nan!=nan (or anything else) - Better if the code producing the nan would instead produce some valid result.
… On Aug 16, 2021, at 7:10 PM, Tamas Vami ***@***.***> wrote:
Hi @jfernan2, thanks for looking into this. I'm wondering why the comparison of NaN vs NaN doesn't give an agreement.
Anyway, would you then recommend to remove wf 8.0 from the Jenkins tests?
|
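The remark that NaN never compares equal to anything, itself included, can be illustrated with a short Python sketch. The helper `values_agree` is a hypothetical name introduced here for illustration; it is not RelMon's actual comparison logic, only one way a comparison could be made to treat NaN vs NaN as agreement.

```python
import math


def values_agree(a, b, rel_tol=1e-9):
    """Hypothetical helper: compare two floats, counting NaN vs NaN as agreement."""
    if math.isnan(a) and math.isnan(b):
        # IEEE 754 says NaN != NaN, so this case must be handled explicitly.
        return True
    # math.isclose returns False whenever exactly one side is NaN.
    return math.isclose(a, b, rel_tol=rel_tol)


nan = float("nan")
print(nan == nan)              # plain equality on NaN
print(values_agree(nan, nan))  # NaN-aware comparison
print(values_agree(nan, 1.0))  # NaN against a regular value
```

A plain `==` on two NaN values is false, which is why a bin-by-bin or value-by-value comparison that relies on equality reports a disagreement even when both sides are "the same" NaN.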
The code probably cannot give valid results in this case since there is no FED info for Pixel. What about blacklisting them in RelMon? Indeed I see they are in a blacklist: cmssw/Utilities/RelMon/data/blacklist.txt Lines 75 to 76 in 6d2f660
I am adding the Tracker DQM contacts here in case they can shed some light: @sroychow @mmusich |
It's typically straightforward to protect against generating NaNs. What's the code that is doing this?
(Of course, the histograms will still not be useful in this workflow, but then they can be enabled for other workflows.)
… On Aug 16, 2021, at 7:22 PM, jfernan2 ***@***.***> wrote:
The code probably cannot give valid results in this case since there is no FED info for Pixel.
What about blacklisting them in RelMon? Indeed I see they are in a blacklist: https://github.com/cms-sw/cmssw/blob/6d2f66057131baacc2fcbdd203588c41c885b42c/Utilities/RelMon/data/blacklist.txt#L75-L76
I am adding the Tracker DQM contacts here in case they can shed some light:
@sroychow @mmusich
|
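As a generic illustration of what "protecting against generating NaNs" can look like, here is a minimal Python sketch of a guarded ratio. It is not the SiPixelDigiSource code (which is C++ and is referenced in the next comment); the function name `safe_ratio` and the `fallback` parameter are assumptions made for this example.

```python
import math


def safe_ratio(numerator, denominator, fallback=0.0):
    """Hypothetical guard: return `fallback` instead of NaN or infinity."""
    # In C++ a floating-point 0.0 / 0.0 silently yields NaN; in Python it
    # raises ZeroDivisionError. Checking the denominator first covers both.
    if denominator == 0:
        return fallback
    ratio = numerator / denominator
    # Catches remaining non-finite cases such as inf / inf.
    return ratio if math.isfinite(ratio) else fallback


print(safe_ratio(3.0, 4.0))  # ordinary division
print(safe_ratio(0.0, 0.0))  # guarded: returns the fallback
```

With a guard of this kind the filled value is always a valid number, so a downstream comparison never has to deal with NaN vs NaN. Whether a fallback of zero is physically meaningful for a given histogram is a separate question for the code owners.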
cmssw/DQM/SiPixelMonitorDigi/src/SiPixelDigiSource.cc Lines 133 to 146 in 639903d
|
cmssw/Utilities/RelMon/data/blacklist.txt Line 50 in 6d2f660
Maybe that could be changed to |
see #34900. |
+1 |
This issue is fully signed and ready to be closed. |
wf 8.0 fails even with the simplest PRs in Jenkins, for example:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/baseLineComparisons/CMSSW_12_1_X_2021-08-13-1700+7d20f9/44780/validateJR.html
I understand that this is a long-standing issue, and I'm wondering how it should be resolved. I cannot reproduce the problem in local testing
leads to
1 1 1 1 1 tests passed, 0 0 0 0 0 failed
Should we maybe just remove this from the Jenkins tests? It seems to be a BeamHalo run1_mc cosmics workflow.