make HLT use "schedule" #35858

Merged
merged 2 commits into from Nov 5, 2021

Contributor

@missirol missirol commented Oct 27, 2021

PR description:

The first step towards the resolution of #35842 could be the renaming of HLTSchedule to schedule, using one of the standard HLT customisation functions. This PR is an attempt at that.

The final solution will have to come eventually with a change in ConfDB.

Standalone configs in test/ that use/assume HLTSchedule have been ignored for the moment (except for one that I know is used to run in unit tests).

Merely technical; no changes expected.

FYI: @fwyzard @Sam-Harper

PR validation:

addOnTests.py and a few runTheMatrix wfs passed.

If this PR is a backport, please specify the original PR and why you need to backport that PR:

N/A

For more information, see
cms-sw#35842
@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-35858/26238

  • This PR adds an extra 48KB to repository

@cmsbuild
Contributor

A new Pull Request was created by @missirol (Marino Missiroli) for master.

It involves the following packages:

  • Configuration/Applications (operations)
  • HLTrigger/Configuration (hlt)
  • RecoTauTag/HLTProducers (hlt)

@perrotta, @Martin-Grunewald, @cmsbuild, @missirol, @qliphy, @fabiocos, @davidlange6 can you please review it and eventually sign? Thanks.
@silviodonato, @Martin-Grunewald, @makortel, @azotz, @mbluj, @fabiocos this is something you requested to watch as well.
@perrotta, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@missirol
Contributor Author

enable gpu

To test some of the wfs that originally gave problems in #35624

@missirol
Contributor Author

please test

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-e1900e/19972/summary.html
COMMIT: 957ed95
CMSSW: CMSSW_12_1_X_2021-10-26-2300/slc7_amd64_gcc900
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/35858/19972/install.sh to create a dev area with all the needed externals and cmssw changes.

GPU Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 4
  • DQMHistoTests: Total histograms compared: 19782
  • DQMHistoTests: Total failures: 8
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 19774
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 3 files compared)
  • Checked 12 log files, 9 edm output root files, 4 DQM output files
  • TriggerResults: no differences found

Comparison Summary

@slava77 comparisons for the following workflows were not done due to missing matrix map:

  • /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/data/PR-e1900e/11634.914_TTbar_14TeV+2021_DDDDB+TTbar_14TeV_TuneCP5_GenSim+Digi+Reco+HARVEST+ALCA
  • /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/data/PR-e1900e/38634.0_TTbar_14TeV+2026D86+TTbar_14TeV_TuneCP5_GenSimHLBeamSpot14+DigiTrigger+RecoGlobal+HARVESTGlobal

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 6 differences found in the comparisons
  • DQMHistoTests: Total files compared: 42
  • DQMHistoTests: Total histograms compared: 2901440
  • DQMHistoTests: Total failures: 229
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 2901188
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.004 KiB( 41 files compared)
  • DQMHistoSizes: changed ( 312.0 ): 0.004 KiB MessageLogger/Warnings
  • Checked 177 log files, 37 edm output root files, 42 DQM output files
  • TriggerResults: no differences found

@missirol
Contributor Author

There are indeed some changes in outputs from this PR.
As it stands, the order of the paths in the schedule differs from the baseline in some wfs: when HLT is one of the wf steps, the current implementation puts the HLT paths first in the schedule instead of following the order of the wf steps exactly.
I can look into a way to fix this if needed (suggestions from experts are welcome, obviously).

@Martin-Grunewald
Contributor

Hmm, the order needs to be maintained, for example, we run in some tests the steps L1REPACK followed by HLT (using the repacked L1 information), so all path(s) of L1REPACK must be run before all paths of HLT.

@missirol
Contributor Author

Okay, thanks. In any case, I think it's better to have no changes in outputs at all. Working on it. Apologies for the iteration.

@makortel
Contributor

Thanks @missirol!

About

"Hmm, the order needs to be maintained, for example, we run in some tests the steps L1REPACK followed by HLT (using the repacked L1 information), so all path(s) of L1REPACK must be run before all paths of HLT."

just to clarify, the framework runs all Paths and EndPaths in the Schedule concurrently regardless of their order. Modules are run primarily in the order of their data dependencies (modules in Path, as opposed to Task, also in the order they are in Path). So from the framework module scheduling point of view, the order of Paths in the Schedule does not matter.

There may, of course, be other reasons (like convincing ourselves that this PR is indeed purely technical, as mentioned above) to prefer to preserve the earlier order of Paths.

@Martin-Grunewald
Contributor

Martin-Grunewald commented Oct 27, 2021

Hmm, I can't believe that as L1REPACK is used to change the L1 menu and then the HLT is using the changed L1 seeds. If L1REPACK is not run before HLT, HLT would crash with invalid L1 seeds...
Or we run --step=DIGI,L1,DIGI2RAW,HLT and that works only if DIGI2RAW has run before HLT (as HLT expects the FEDRAWDataCollection created by DIGI2RAW)

@makortel
Contributor

"Or we run --step=DIGI,L1,DIGI2RAW,HLT and that works only if DIGI2RAW has run before HLT (as HLT expects the FEDRAWDataCollection created by DIGI2RAW)"

The framework runs the module producing FEDRAWDataCollection before any module that consumes it (or rather, any module that has declared that it consumes it). This is what I meant with "framework runs modules in the order of their data dependencies".
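
As a toy illustration of that statement (not the framework's actual scheduler), the sketch below orders modules purely by their declared data dependencies, using Python's standard graphlib. The module names and the consumes mapping are hypothetical stand-ins; the only point is that the order in which modules are listed does not change the resulting run order.

```python
from graphlib import TopologicalSorter

# Hypothetical modules, each mapped to the set of modules whose products it consumes.
# These names are placeholders, not real CMSSW module labels.
consumes = {
    "digi2rawProducer": set(),                # produces the raw data collection
    "hltRawConsumer": {"digi2rawProducer"},   # declared consumer of the raw data
    "hltFilter": {"hltRawConsumer"},          # consumes the product of the module above
}


def run_order(consumes):
    """Return an execution order in which every producer precedes its consumers."""
    return list(TopologicalSorter(consumes).static_order())


# List the same modules in the opposite order, as if the paths had been reordered.
reordered = dict(reversed(list(consumes.items())))

print(run_order(consumes))
print(run_order(reordered))
```

Both calls give the same order, with the producer first, because only the declared dependencies enter the ordering.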

@cmsbuild
Contributor

cmsbuild commented Nov 3, 2021

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-e1900e/20217/summary.html
COMMIT: 8cc72bd
CMSSW: CMSSW_12_2_X_2021-11-02-2300/slc7_amd64_gcc900
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/35858/20217/install.sh to create a dev area with all the needed externals and cmssw changes.

GPU Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 4
  • DQMHistoTests: Total histograms compared: 19782
  • DQMHistoTests: Total failures: 7
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 19775
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 3 files compared)
  • Checked 12 log files, 9 edm output root files, 4 DQM output files
  • TriggerResults: no differences found

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 42
  • DQMHistoTests: Total histograms compared: 2901890
  • DQMHistoTests: Total failures: 0
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2901868
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 41 files compared)
  • Checked 177 log files, 37 edm output root files, 42 DQM output files
  • TriggerResults: no differences found

@missirol
Contributor Author

missirol commented Nov 3, 2021

As expected, I see no differences in outputs from this PR (modulo the unrelated numerical differences in 11634.506, as discussed previously). I'll wait another bit in case there are any comments from others, then I would sign this for HLT.

@missirol
Contributor Author

missirol commented Nov 4, 2021

+hlt

  • technical update specific to HLT

  • name of process schedule is changed from HLTSchedule to schedule (when needed) inside the standard HLT customisation function customizeHLTforAll; other HLT customisation functions are updated accordingly (HLTSchedule still appears in some test/ configs; this could be cleaned up in a future PR)

  • this is a preliminary step towards the resolution of #35842 ("Enforce that Process can have at most one Schedule object, and the label must be schedule"); the final solution will require a (simple) change in the ConfDB source code (will follow up in the GH issue)

  • no changes in outputs, as intended
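
For readers unfamiliar with the idea, the renaming described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code merged in this PR: ToyProcess and rename_hlt_schedule are hypothetical names, ToyProcess is a plain Python stand-in rather than the real cms.Process API, and the path names are placeholders.

```python
class ToyProcess:
    """Hypothetical stand-in for a process object; NOT the real cms.Process API."""

    def __init__(self):
        self.schedule = None  # the single slot for the standard schedule


def rename_hlt_schedule(process):
    """Move a schedule stored under 'HLTSchedule' to the attribute 'schedule'."""
    if not hasattr(process, "HLTSchedule"):
        return process  # nothing to rename
    if process.schedule is not None:
        # Refuse to overwrite a schedule that already exists under the standard name.
        raise RuntimeError('process already has a "schedule"')
    process.schedule = process.HLTSchedule
    del process.HLTSchedule
    return process


process = ToyProcess()
process.HLTSchedule = ["path_a", "path_b"]  # placeholder path names
rename_hlt_schedule(process)
print(process.schedule)
print(hasattr(process, "HLTSchedule"))
```

The guard on an existing schedule is a design choice made for this sketch only; it reflects the constraint discussed in #35842 that a process should have at most one schedule object.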

@perrotta
Contributor

perrotta commented Nov 5, 2021

+1

@cmsbuild
Contributor

cmsbuild commented Nov 5, 2021

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will be automatically merged.

@cmsbuild cmsbuild merged commit 06f15e3 into cms-sw:master Nov 5, 2021
@qliphy
Contributor

qliphy commented Nov 6, 2021

@missirol There is a unit test failure in IB after merging this PR, would you please have a look?

https://cmssdt.cern.ch/SDT/cgi-bin/logreader/slc7_amd64_gcc900/CMSSW_12_2_X_2021-11-05-2300/unitTestLogs/SLHCUpgradeSimulations/Geometry#/

@missirol missirol deleted the devel_hltSchedule branch November 6, 2021 07:35
@missirol
Contributor Author

missirol commented Nov 6, 2021

@qliphy @perrotta

Indeed, my mistake. One of the test/ configs I didn't modify was used in a unit-test, so that failed.
#36019 fixes the problem. In a future PR I will remove all references to HLTSchedule, just in case.

Comment on lines -233 to -235
if "HLTSchedule" in process.__dict__:
    ind = process.HLTSchedule.index(process.AlCa_LumiPixelsCounts_Random_v1)
    process.HLTSchedule.remove(process.AlCa_LumiPixelsCounts_Random_v1)
Contributor

@missirol are these additions/removals of paths to/from the schedule no longer necessary because the "official" schedule object handles them automatically?

Contributor Author

Yes, that's my understanding from Matti's explanations (and I verified it by hand on a few examples). Since in this PR I used customizeHLTforAll to do the trick, that should always be used before other customisations, and afaik that's always the case.
