Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes to the ZeroBias DQM sequences #28336

Merged
merged 2 commits into from Nov 3, 2019
Merged

Conversation

fabiocos
Copy link
Contributor

@fabiocos fabiocos commented Nov 2, 2019

PR description:

This PR is a follow-up of #28325 to fix the crashes after the revision of DQM sequences in #28156 .
There are two changes:

  • the DQMMessageLogger sequence for harvesting step is updated;

  • the tracker ZeroBias sequence is moved back to the old value.

Both changes need to be validated by @jfernan2

PR validation:

workflows 134.71 and 136.856 run, the previously appearing warning on missing sequence in harvesting has disappeared.

@cmsbuild
Copy link
Contributor

cmsbuild commented Nov 2, 2019

The code-checks are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

cmsbuild commented Nov 2, 2019

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-28336/12600

  • This PR adds an extra 12KB to repository

@cmsbuild
Copy link
Contributor

cmsbuild commented Nov 2, 2019

A new Pull Request was created by @fabiocos (Fabio Cossutti) for master.

It involves the following packages:

DQMOffline/Configuration

@andrius-k, @kmaeshima, @schneiml, @cmsbuild, @jfernan2, @fioriNTU can you please review it and eventually sign? Thanks.
@threus, @rociovilar this is something you requested to watch as well.
@davidlange6, @slava77, @fabiocos you are the release manager for this.

cms-bot commands are listed here

@fabiocos
Copy link
Contributor Author

fabiocos commented Nov 2, 2019

please test workflow 134.71,136.789,136.801,136.88

@cmsbuild
Copy link
Contributor

cmsbuild commented Nov 2, 2019

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-run-pr-tests/3317/console Started: 2019/11/02 22:22

@cmsbuild
Copy link
Contributor

cmsbuild commented Nov 2, 2019

@cmsbuild
Copy link
Contributor

cmsbuild commented Nov 2, 2019

Comparison job queued.

@cmsbuild
Copy link
Contributor

cmsbuild commented Nov 3, 2019

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-730f79/3317/summary.html

@slava77 comparisons for the following workflows were not done due to missing matrix map:

  • /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/results/JR-comparison/PR-730f79/134.71_RunZeroBias2015B+RunZeroBias2015B+HLTDR2_50ns+RECODR2_50nsreHLT_ZB_HIPM+HARVESTDR2ZB
  • /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/results/JR-comparison/PR-730f79/136.789_RunZeroBias2017B+RunZeroBias2017B+HLTDR2_2017+RECODR2_2017reHLT_ZBPrompt+HARVEST2017ZB
  • /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/results/JR-comparison/PR-730f79/136.801_RunZeroBias2017C+RunZeroBias2017C+HLTDR2_2017+RECODR2_2017reHLT_ZBPrompt+HARVEST2017ZB
  • /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/results/JR-comparison/PR-730f79/136.88_RunZeroBias2018C+RunZeroBias2018C+HLTDR2_2018+RECODR2_2018reHLT_ZBOffline+HARVEST2018ZB

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 2 differences found in the comparisons
  • DQMHistoTests: Total files compared: 34
  • DQMHistoTests: Total histograms compared: 2939026
  • DQMHistoTests: Total failures: 2
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2938683
  • DQMHistoTests: Total skipped: 341
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 33 files compared)
  • Checked 147 log files, 16 edm output root files, 34 DQM output files

@fabiocos
Copy link
Contributor Author

fabiocos commented Nov 3, 2019

@jfernan2 this PR appears to fix the residual crashes in the IB. I am not sure this is what you intended to get in your revision, please have a look. In case I will integrate them in one of the next IBs and then you may go through a further update.

@fabiocos
Copy link
Contributor Author

fabiocos commented Nov 3, 2019

@smuzaffar FYI

@fabiocos
Copy link
Contributor Author

fabiocos commented Nov 3, 2019

+1

@fabiocos
Copy link
Contributor Author

fabiocos commented Nov 3, 2019

merge

@cmsbuild cmsbuild merged commit 08503e8 into cms-sw:master Nov 3, 2019
@jfernan2
Copy link
Contributor

jfernan2 commented Nov 3, 2019

+1
Thanks @fabiocos
I was away for some days, I don't understand how these workflows passed the test in my original PR

@cmsbuild
Copy link
Contributor

cmsbuild commented Nov 3, 2019

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will be automatically merged.

@fabiocos
Copy link
Contributor Author

fabiocos commented Nov 4, 2019

@jfernan2 this was a minimal solution to get this running. Is this that you intend to have finally? This was originally introduced by @mmusich but of course I am not so expert in it to judge whether it is the finally desired sequence (apart that it is what used previously)

@jfernan2
Copy link
Contributor

jfernan2 commented Nov 4, 2019

I understand, I am working now on a fix to your fix to have it as desired, I will commit a PR this morning
Thanks

@@ -1,18 +1,18 @@
autoDQM = { 'common': ['@dcs+DQMMessageLoggerSeq+@strip+@pixel+@tracking+@L1TMon+@hlt+@beam+@castor+@physics+@tau',
'PostDQMOffline',
'@dcs+DQMMessageLoggerSeq+@strip+@pixel+@tracking+@L1TMon+@hlt+@beam+@fed+@tau+dqmFastTimerServiceClient'],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fabiocos and @jfernan2 wouldn't be easier to define a new key, like:
'DQMMessageLogger' : ['DQMMessageLoggerSeq',
'PostDQMOffline',
'DQMMessageLoggerClientSeq']
and then call @DQMMessageLogger in both step1 and step2 ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was the approach I was testing now... Thanks!

@mmusich
Copy link
Contributor

mmusich commented Nov 4, 2019

I don't understand how these workflows passed the test in my original PR

I am not sure the ZB workflows are part of the limited or short matrices (so might have escaped internal scrutiny).
@fabiocos as ZB are a key ingredient of tracker/tracking data certification would it be possible to pick at least one of them in the matrix run for PR tests?

@fabiocos
Copy link
Contributor Author

fabiocos commented Nov 4, 2019

@mmusich no, these ZB workflows are a bit special and not part of the current short matrix:

found a total of  803  workflows:
      of which the following 34 were selected:
4.22   RunCosmics2011A+RECOCOSD+ALCACOSD+SKIMCOSD+HARVESTDC 
4.53   RunPhoton2012B+HLTD+RECODR1reHLT+HARVESTDR1reHLT 
5.1    TTbarFS+HARVESTFS                   
7.3    CosmicsSPLoose_UP18+DIGICOS_UP18+RECOCOS_UP18+ALCACOS_UP18+HARVESTCOS_UP18 
8.0    BeamHalo+DIGICOS+RECOCOS+ALCABH+HARVESTCOS 
9.0    Higgs200ChargedTaus+DIGI+RECO+HARVEST 
25.0   TTbar+DIGI+RECOAlCaCalo+HARVEST+ALCATT 
135.4  ZEEFS_13+HARVESTUP15FS+MINIAODMCUP15FS 
136.731 RunSinglePh2016B+HLTDR2_2016+RECODR2_2016reHLT_skimSinglePh_HIPM+HARVESTDR2 
136.7611 RunJetHT2016E_reminiaod+REMINIAOD_data2016_HIPM+HARVESTDR2_REMINIAOD_data2016_HIPM 
136.788 RunSinglePh2017B+HLTDR2_2017+RECODR2_2017reHLT_skimSinglePh_Prompt+HARVEST2017 
136.8311 RunJetHT2017F_reminiaod+REMINIAOD_data2017+HARVEST2017_REMINIAOD_data2017 
136.85 RunEGamma2018A+HLTDR2_2018+RECODR2_2018reHLT_skimEGamma_Offline_L1TEgDQM+HARVEST2018_L1TEgDQM 
140.53 RunHI2011+RECOHID11+HARVESTDHI      
140.56 RunHI2018+RECOHID18+HARVESTDHI18    
158.0  HydjetQ_B12_5020GeV_2018_ppReco+DIGIHI2018PPRECO+RECOHI2018PPRECO+ALCARECOHI2018PPRECO+HARVESTHI2018PPRECO 
1306.0 SingleMuPt1_UP15+DIGIUP15+RECOUP15+HARVESTUP15 
1325.7 TTbar_13_94Xv2NanoAODINPUT+NANOEDMMC2017_94XMiniAODv2+HARVESTNANOAODMC2017_94XMiniAODv2 
1330.0 ZMM_13+DIGIUP15+RECOUP15_L1TMuDQM+HARVESTUP15_L1TMuDQM+NANOUP15 
101.0  SingleElectronE120EHCAL             
25202.0 TTbar_13+DIGIUP15_PU25+RECOUP15_PU25+HARVESTUP15_PU25+NANOUP15_PU25 
1000.0 RunMinBias2011A+TIER0+SKIMD+HARVESTDfst2+ALCASPLIT 
1001.0 RunMinBias2011A+TIER0EXP+ALCAEXP+ALCAHARVDSIPIXELCALRUN1+ALCAHARVD1+ALCAHARVD2+ALCAHARVD3+ALCAHARVD4+ALCAHARVD5 
10024.0 TTbar_13TeV_TuneCUETP8M1_2017_GenSimFull+DigiFull_2017+RecoFull_2017+HARVESTFull_2017+ALCAFull_2017+NanoFull_2017 
10042.0 ZMM_13TeV_TuneCUETP8M1_2017_GenSimFull+DigiFull_2017+RecoFull_2017+HARVESTFull_2017+ALCAFull_2017+NanoFull_2017 
10224.0 TTbar_13TeV_TuneCUETP8M1_2017PU_GenSimFull+DigiFullPU_2017PU+RecoFullPU_2017PU+HARVESTFullPU_2017PU+NanoFull_2017PU 
10824.0 TTbar_13TeV_TuneCUETP8M1_2018_GenSimFull+DigiFull_2018+RecoFull_2018+HARVESTFull_2018+ALCAFull_2018+NanoFull_2018 
11634.0 TTbar_14TeV_TuneCUETP8M1_2021_GenSimFull+DigiFull_2021+RecoFull_2021+HARVESTFull_2021+ALCAFull_2021 
12434.0 TTbar_14TeV_TuneCUETP8M1_2023_GenSimFull+DigiFull_2023+RecoFull_2023+HARVESTFull_2023+ALCAFull_2023 
20034.0 TTbar_14TeV_TuneCUETP8M1_2026D35_GenSimHLBeamSpotFull14+DigiFullTrigger_2026D35+RecoFullGlobal_2026D35+HARVESTFullGlobal_2026D35 
20434.0 TTbar_14TeV_TuneCUETP8M1_2026D41_GenSimHLBeamSpotFull14+DigiFullTrigger_2026D41+RecoFullGlobal_2026D41+HARVESTFullGlobal_2026D41 
21234.0 TTbar_14TeV_TuneCUETP8M1_2026D44_GenSimHLBeamSpotFull14+DigiFullTrigger_2026D44+RecoFullGlobal_2026D44+HARVESTFullGlobal_2026D44 
22034.0 TTbar_14TeV_TuneCUETP8M1_2026D46_GenSimHLBeamSpotFull14+DigiFullTrigger_2026D46+RecoFullGlobal_2026D46+HARVESTFullGlobal_2026D46 
250202.181 TTbar_13UP18+PREMIXUP18_PU25+DIGIPRMXLOCALUP18_PU25+RECOPRMXUP18_PU25+HARVESTUP18_PU25 

I am not terribly concerned about this, as they are run in any IB, and this is likely the reason why @jfernan2 missed it, unless he did an explicit test. Breaking the IB for a few rounds for such a large refactoring is not something terrible, provided we fix it quickly, and I try to wait the last moment before an important build for this kind of migrations. And if we pick one we might miss an issue in another (there were two distinct categories of problems in this case).

It is clear that whatever new implementation will have to be explicitly tested against this subset of dedicated workflows.

@jfernan2
Copy link
Contributor

jfernan2 commented Nov 4, 2019

I second the proposal by @mmusich . Despite I made runThematrix -i all it seems I totally missed these failures in the jungle of outputs provided by the command which went beyond the total capacity of my local account. Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants