Comparison failed for tests of wf 39434.911 #37315

Closed
francescobrivio opened this issue Mar 23, 2022 · 23 comments

@francescobrivio
Contributor

The test results of recent PRs contain a message saying:

comparisons for the following workflows were not done due to missing matrix map:
 - /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/data/PR-cadc83/39434.911_TTbar_14TeV+2026D88_DD4hep+TTbar_14TeV_TuneCP5_GenSimHLBeamSpot14+DigiTrigger+RecoGlobal+HARVESTGlobal

But nonetheless the default comparison reports more than 50k differences for this wf.
Some examples:

@cmsbuild
Contributor

A new Issue was created by @francescobrivio .

@Dr15Jones, @perrotta, @dpiparo, @makortel, @smuzaffar, @qliphy can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@francescobrivio
Contributor Author

assign reconstruction

@cmsbuild
Contributor

New categories assigned: reconstruction

@jpata,@slava77,@clacaputo you have been requested to review this Pull request/Issue and eventually sign? Thanks

@jpata
Contributor

jpata commented Mar 23, 2022

.911 is dd4hep, we often had differences in it not related to reco. did something change, or is it still a geometry-related issue?

@makortel
Contributor

I believe the message

comparisons for the following workflows were not done due to missing matrix map:
 - /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/data/PR-cadc83/39434.911_TTbar_14TeV+2026D88_DD4hep+TTbar_14TeV_TuneCP5_GenSimHLBeamSpot14+DigiTrigger+RecoGlobal+HARVESTGlobal

refers only to the "validateJR" / "reco comparisons", and not to RelMon-based comparisons or DQM bin by bin.

@makortel
Contributor

assign geometry

@cmsbuild
Contributor

New categories assigned: geometry

@cvuosalo,@mdhildreth,@ianna,@Dr15Jones,@makortel,@civanch you have been requested to review this Pull request/Issue and eventually sign? Thanks

@francescobrivio
Contributor Author

.911 is dd4hep, we often had differences in it not related to reco. did something change, or is it still a geometry-related issue?

@jpata from the alca point of view there was no change in the MC dd4hep geometry. Thanks Matti for assigning geometry.

@perrotta
Contributor

I believe the message

comparisons for the following workflows were not done due to missing matrix map:
 - /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/data/PR-cadc83/39434.911_TTbar_14TeV+2026D88_DD4hep+TTbar_14TeV_TuneCP5_GenSimHLBeamSpot14+DigiTrigger+RecoGlobal+HARVESTGlobal

refers only to the "validateJR" / "reco comparisons", and not to RelMon-based comparisons or DQM bin by bin.

There are however also quite a lot of differences in the bin by bin DQM comparisons for that workflow in the test outputs, as you can verify by opening any of the links to the PR tests listed in the issue description above.

@makortel
Contributor

I believe the message

comparisons for the following workflows were not done due to missing matrix map:
 - /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/data/PR-cadc83/39434.911_TTbar_14TeV+2026D88_DD4hep+TTbar_14TeV_TuneCP5_GenSimHLBeamSpot14+DigiTrigger+RecoGlobal+HARVESTGlobal

refers only to the "validateJR" / "reco comparisons", and not to RelMon-based comparisons or DQM bin by bin.

There are however also quite a lot of differences in the bin by bin DQM comparisons for that workflow in the test outputs, as you can verify by opening any of the links to the PR tests listed in the issue description above.

Right, but the existence of the differences (whose cause should be identified) is a separate issue from one piece of comparisons infrastructure not recognizing this workflow.

@cvuosalo
Contributor

39434.911 is Phase 2 D88 DD4hep. It is quite new and may still be under development.
@srimanob Could you please comment about the stability of this workflow?

This workflow runs DD4hep from XML files. That makes it very sensitive to any perturbations in the source files or test process.

@cvuosalo
Contributor

assign upgrade

@srimanob
Contributor

srimanob commented Mar 23, 2022

How can we test the stability of the workflow offline? I mean, how would we know whether some unexpected random behavior occurs somewhere without this kind of PR test?

The test was done with 9k ttbar events, but that was to compare DDD with DD4hep. I have never tried comparing DD4hep with DD4hep across two runs to check its stability.

@francescobrivio
Contributor Author

@cvuosalo @srimanob what I find strange is that in all the tests there are ~50k failed comparisons out of ~100k in total (see this log for example). So, if basically half of the comparisons fail, maybe it's not just a "stability" problem?

@cmsbuild
Contributor

New categories assigned: upgrade

@AdrianoDee,@srimanob you have been requested to review this Pull request/Issue and eventually sign? Thanks

@srimanob
Contributor

@francescobrivio
Yes, I saw it. I can comment it out for now. Clearly, this is not expected. However, as I said, I still have no idea why it happens or how I can test it offline. Since we now allow creating a baseline in PR tests (#37289), this may help when I try to enable it again.

@makortel
Contributor

@srimanob Could you elaborate on what you mean by "test it offline"? Is it e.g. about comparing the DQM ROOT files of different invocations of the workflow?
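
To illustrate what comparing the output of two invocations could look like, here is a minimal sketch in Python. It assumes the histograms have already been read out of the two DQM ROOT files into plain dictionaries (histogram name to list of bin contents); the reading step is not shown, and `compare_histograms` and the histogram names are hypothetical, not part of the cms-bot comparison tools.

```python
from math import isclose


def compare_histograms(run_a, run_b, rel_tol=0.0):
    """Return the sorted names of histograms that differ between two runs.

    run_a, run_b: dict mapping histogram name -> list of bin contents
    (assumed to have been extracted from the DQM files beforehand).
    A histogram that is missing from one run, or that has a different
    number of bins, counts as a difference.
    """
    failed = []
    for name in sorted(set(run_a) | set(run_b)):
        bins_a = run_a.get(name)
        bins_b = run_b.get(name)
        if bins_a is None or bins_b is None or len(bins_a) != len(bins_b):
            failed.append(name)
            continue
        # With rel_tol=0.0 this is an exact bin-by-bin comparison.
        if any(not isclose(x, y, rel_tol=rel_tol, abs_tol=0.0)
               for x, y in zip(bins_a, bins_b)):
            failed.append(name)
    return failed


# Hypothetical bin contents from two invocations of the same workflow.
run1 = {"Tracking/nTracks": [0, 5, 12, 3], "Muon/pt": [1.0, 2.0]}
run2 = {"Tracking/nTracks": [0, 5, 11, 3], "Muon/pt": [1.0, 2.0]}

failed = compare_histograms(run1, run2)
print(f"{len(failed)} of {len(run1)} histograms differ: {failed}")
```

If the workflow were fully reproducible, two invocations on the same release would give an empty list here.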

@srimanob
Contributor

Hi @makortel
I mean that I don't expect any change in the .911 DD4hep Phase 2 wf. Given that the test results of several PRs show comparison failures, should I test it locally somehow? For example, run twice, with and without the PR, and compare locally?

By the way, the latest fresh test of the following PR does not show anything peculiar in the workflow comparison:

@makortel
Contributor

For example, run twice, with and without the PR, and compare locally?

Right, run twice or many times. The comparison failure appears to be random, so it is hard to say beforehand how many times to run. In the past we've seen occurrence rates between O(1 %) and O(100 %) or so with the (Run 3) DD4Hep workflow.
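
As a rough guide to how many repetitions might be needed, the sketch below estimates the number of runs required to see at least one failure with a given confidence. It assumes each run fails independently with the same probability, which may not hold for this workflow; `runs_needed` is a hypothetical helper, not an existing tool.

```python
import math


def runs_needed(failure_rate, confidence=0.95):
    """Smallest n with 1 - (1 - failure_rate)**n >= confidence.

    Assumes independent runs that each fail with probability failure_rate.
    """
    if not 0.0 < failure_rate <= 1.0:
        raise ValueError("failure_rate must be in (0, 1]")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be in (0, 1)")
    if failure_rate == 1.0:
        return 1  # every run fails, so one run is enough
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


for rate in (0.01, 0.5, 1.0):
    print(f"rate {rate:.0%}: {runs_needed(rate)} runs for 95% confidence")
```

Under these assumptions, a 1 % occurrence rate needs about 300 runs, while a 50 % rate needs only 5.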

@clacaputo
Contributor

comparisons for the following workflows were not done due to missing matrix map:
 - /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/data/PR-cadc83/39434.911_TTbar_14TeV+2026D88_DD4hep+TTbar_14TeV_TuneCP5_GenSimHLBeamSpot14+DigiTrigger+RecoGlobal+HARVESTGlobal

I think this problem should be addressed by cms-sw/cms-bot@31e5f20

@srimanob
Contributor

I've prepared a pull request to disable it in case we still see random comparison failures:
#37337

@jpata
Contributor

jpata commented May 16, 2022

+reconstruction

  • looks like the 39434.911 was disabled

@francescobrivio
Contributor Author

Wf disabled from short matrix in #37337
