addition to Join Opt Meeting 2017.10.20 action item : rollback HLTDQM #21231

Merged
10 commits merged into cms-sw:master on Nov 9, 2017


davidlange6
Contributor

Rearrange the changes from #20980 so that the extra HLT histograms can also be enabled in workflows that do not run the "standard DQM".

@cmsbuild
Contributor

cmsbuild commented Nov 9, 2017

The code-checks are being triggered in jenkins.

@cmsbuild
Contributor

cmsbuild commented Nov 9, 2017

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/PR-21231/1891

@cmsbuild
Contributor

cmsbuild commented Nov 9, 2017

A new Pull Request was created by @davidlange6 (David Lange) for master.

It involves the following packages:

Configuration/PyReleaseValidation
Configuration/StandardSequences
DQMOffline/Configuration
DQMOffline/Trigger
HLTriggerOffline/Btag
HLTriggerOffline/Common
HLTriggerOffline/Muon

@prebello, @vazzolini, @dmitrijus, @kmaeshima, @kpedro88, @fabozzi, @cmsbuild, @franzoni, @jfernan2, @GurpreetSinghChahal, @vanbesien, @davidlange6 can you please review it and eventually sign? Thanks.
@ghellwig, @felicepantaleo, @abbiendi, @Martin-Grunewald, @threus, @battibass, @makortel, @acaudron, @jhgoh, @HuguesBrun, @ferencek, @trocino, @rociovilar, @GiacomoSguazzoni, @rovere, @VinInn, @ebrondol, @mtosi, @dgulhan, @swertz, @imarches, @calderona, @mverzett, @JyothsnaKomaragiri, @pvmulder this is something you requested to watch as well.
@davidlange6, @slava77 you are the release manager for this.


@davidlange6
Contributor Author

please test

@cmsbuild
Contributor

cmsbuild commented Nov 9, 2017

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/24303/console Started: 2017/11/09 13:23

@cmsbuild
Contributor

cmsbuild commented Nov 9, 2017

Comparison job queued.

@cmsbuild
Contributor

cmsbuild commented Nov 9, 2017

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-21231/24303/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 18 differences found in the comparisons
  • DQMHistoTests: Total files compared: 27
  • DQMHistoTests: Total histograms compared: 2829115
  • DQMHistoTests: Total failures: 6
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2828931
  • DQMHistoTests: Total skipped: 178
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram bins added: 0.0 ( 6 files compared)
  • Checked 111 log files, 8 edm output root files, 27 DQM output files

@davidlange6
Contributor Author

merge

cmsbuild merged commit 34487ba into cms-sw:master on Nov 9, 2017

@Dr15Jones
Contributor

I'm pretty sure this pull request broke the IB RelVal workflow 136.7802. See
https://cmssdt.cern.ch/SDT/cgi-bin/buildlogs/slc7_amd64_gcc630/CMSSW_10_0_X_2017-11-09-2300/pyRelValMatrixLogs/run/136.7802_RunHLTPhy2017B_AODextra+RunHLTPhy2017B_AODextra+DQMHLTonAODextra_2017+HARVESTDQMHLTonAOD_2017/step2_RunHLTPhy2017B_AODextra+RunHLTPhy2017B_AODextra+DQMHLTonAODextra_2017+HARVESTDQMHLTonAOD_2017.log

The problem boils down to:

An exception of category 'ScheduleExecutionFailure' occurred while
   [0] Calling beginJob
Exception Message:
Unrunnable schedule
Module run order problem found: 
egmGsfElectronIDsForDQM after VBFSUSYmonitoring [path dqmoffline_step],
...{all the other modules on path dqmoffline_step}...
 SinglePhoton300_monitoring after B2GegHLTDQMOfflineTnPSource [path dqmoffline_step],
 B2GegHLTDQMOfflineTnPSource consumes egmGsfElectronIDsForDQM
 Running in the threaded framework would lead to indeterminate results.
 Please change order of modules in mentioned Path(s) to avoid inconsistent module ordering.
----- End Fatal Exception -------------------------------------------------

The file containing B2GegHLTDQMOfflineTnPSource was modified by this pull request.
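
To make the ordering problem in the exception concrete, here is a minimal sketch in plain Python. It is not the CMSSW framework's actual check, and `find_order_violations` is a hypothetical helper; it only assumes that a path can be modelled as an ordered list of module labels and that `consumes` maps a module to the modules whose products it reads. The path below is heavily simplified: it uses only labels quoted in the log, while the real `dqmoffline_step` contains many more modules.

```python
def find_order_violations(path, consumes):
    """Return (consumer, producer) pairs where the producer sits after its consumer."""
    # Position of each module label on the (toy) path.
    position = {label: index for index, label in enumerate(path)}
    violations = []
    for consumer in path:
        for producer in consumes.get(consumer, ()):
            # Only producers that are on the same path can be mis-ordered.
            if producer in position and position[producer] > position[consumer]:
                violations.append((consumer, producer))
    return violations


# Toy model of the situation in the log: the consumer comes first,
# the module it consumes comes last.
broken_path = [
    "B2GegHLTDQMOfflineTnPSource",
    "SinglePhoton300_monitoring",
    "VBFSUSYmonitoring",
    "egmGsfElectronIDsForDQM",
]
consumes = {"B2GegHLTDQMOfflineTnPSource": ["egmGsfElectronIDsForDQM"]}

# One possible repair in this toy model: move the producer to the front.
fixed_path = ["egmGsfElectronIDsForDQM"] + [
    label for label in broken_path if label != "egmGsfElectronIDsForDQM"
]

print(find_order_violations(broken_path, consumes))
print(find_order_violations(fixed_path, consumes))
```

In this toy model the first call reports the `B2GegHLTDQMOfflineTnPSource` / `egmGsfElectronIDsForDQM` pair, and the second reports nothing once the producer precedes its consumer.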

@davidlange6
Contributor Author

davidlange6 commented Nov 10, 2017 via email

@makortel
Contributor

I think we should, in addition, move DQM modules from EndPath to Path as much as possible (for many reasons besides how easily one can end up with an unrunnable schedule).

@Dr15Jones
Contributor

@makortel I've been thinking about adding another path type such as ParallelPath (or AsyncPath) which works like an EndPath (i.e. filtering is ignored) but all modules are started simultaneously. This would help the framework schedule efficiently but still allow people to group modules they want to run together in a job.
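
One possible reading of this idea, sketched in plain Python: if a group of modules carries no declared order, a scheduler could derive the run order from data dependencies alone. This is only an illustration under that assumption. `ParallelPath` does not exist in this thread beyond the proposal, `parallel_waves` is a hypothetical helper, and the module labels are borrowed from the log above purely as examples.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def parallel_waves(modules, consumes):
    """Group modules into waves; every module in a wave could start at the same time."""
    wanted = set(modules)
    # Map each module to the modules (within the group) whose products it consumes.
    graph = {
        module: {dep for dep in consumes.get(module, ()) if dep in wanted}
        for module in modules
    }
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the dependencies are cyclic
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to make the output stable
        waves.append(ready)
        sorter.done(*ready)
    return waves


# The declared order is deliberately "wrong"; it does not matter here.
modules = [
    "B2GegHLTDQMOfflineTnPSource",
    "VBFSUSYmonitoring",
    "egmGsfElectronIDsForDQM",
]
consumes = {"B2GegHLTDQMOfflineTnPSource": ["egmGsfElectronIDsForDQM"]}

for wave in parallel_waves(modules, consumes):
    print(wave)
```

With ordering derived this way, the position of a module inside the group cannot produce an inconsistent schedule; only a genuine dependency cycle can.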

@davidlange6
Contributor Author

davidlange6 commented Nov 10, 2017 via email

@dmitrijus
Contributor

> @makortel I've been thinking about adding another path type such as ParallelPath (or AsyncPath) which works like an EndPath (i.e. filtering is ignored) but all modules are started simultaneously. This would help the framework schedule efficiently but still allow people to group modules they want to run together in a job.

Yes, YES!

@Dr15Jones Would it be possible to have a chat with you, let's say, tomorrow, skype/vidyo/real life?

Yes, I agree - the dqm seems not to support this philosophy unfortunately...

I personally support that fully.
