Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add combined premixing workflows for Run 3 #27753

Merged
merged 4 commits into from Aug 23, 2019
Merged

Conversation

makortel
Copy link
Contributor

@makortel makortel commented Aug 12, 2019

PR description:

On an e-mail thread on #27672 it was noticed that there are no premixing workflows in the matrix for Run 3. This PR adds them for the upgrade matrix in a generic way, and brings three of them to the standard matrix (for years 2021, 2023, 2024).

I have two questions for which this PR is an RFC for now.

PR validation:

New workflows run

NANO is configured to use step3_inMINIAODSIM.root input file, but in
combined premixing workflow the file is step4_inMINIAODSIM.root.
Simplest option is to just drop the NANO step from these workflows.
@cmsbuild
Copy link
Contributor

The code-checks are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-27753/11412

  • This PR adds an extra 16KB to repository

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @makortel (Matti Kortelainen) for master.

It involves the following packages:

Configuration/PyReleaseValidation

@cmsbuild, @prebello, @chayanit, @zhenhu, @kpedro88, @pgunnell can you please review it and eventually sign? Thanks.
@Martin-Grunewald this is something you requested to watch as well.
@davidlange6, @slava77, @fabiocos you are the release manager for this.

cms-bot commands are listed here

@makortel
Copy link
Contributor Author

@cmsbuild, please test workflow 11834.99,12634.99,13034.99

@cmsbuild
Copy link
Contributor

cmsbuild commented Aug 12, 2019

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-run-pr-tests/1981/console Started: 2019/08/12 21:35

@cmsbuild
Copy link
Contributor

@cmsbuild
Copy link
Contributor

Comparison job queued.

@cmsbuild
Copy link
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9ec1f4/1981/summary.html

@slava77 comparisons for the following workflows were not done due to missing matrix map:

  • /build/cmsbld/jenkins/workspace/compare-root-files-short-matrix/results/JR-comparison/PR-9ec1f4/11834.99_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2021PU_GenSimFull+SingleNuE10_cf_2021PU_PremixFullPU_Premix+DigiFullPUPRMXCombined_2021PU+RecoFullPUPRMX_2021PU+HARVESTFullPU_2021PU
  • /build/cmsbld/jenkins/workspace/compare-root-files-short-matrix/results/JR-comparison/PR-9ec1f4/12634.99_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2023PU_GenSimFull+SingleNuE10_cf_2023PU_PremixFullPU_Premix+DigiFullPUPRMXCombined_2023PU+RecoFullPUPRMX_2023PU+HARVESTFullPU_2023PU
  • /build/cmsbld/jenkins/workspace/compare-root-files-short-matrix/results/JR-comparison/PR-9ec1f4/13034.99_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2024PU_GenSimFull+SingleNuE10_cf_2024PU_PremixFullPU_Premix+DigiFullPUPRMXCombined_2024PU+RecoFullPUPRMX_2024PU+HARVESTFullPU_2024PU

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 34
  • DQMHistoTests: Total histograms compared: 2939508
  • DQMHistoTests: Total failures: 1
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2939166
  • DQMHistoTests: Total skipped: 341
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 33 files compared)
  • Checked 142 log files, 14 edm output root files, 34 DQM output files

@makortel
Copy link
Contributor Author

An additional question that came to my mind: would it make sense to replace the combined premixing workflow in the limited matrix with the one for 2021?

@fabiocos
Copy link
Contributor

@makortel in future perspective yes IMO, I would say that for the UL reprocessing that part is now frozen (@civanch @mdhildreth @srimanob may comment further)
From the RECO point of view I do not think this should matter @slava77 @perrotta please comment in case

@cmsbuild
Copy link
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9ec1f4/2065/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 34
  • DQMHistoTests: Total histograms compared: 2939508
  • DQMHistoTests: Total failures: 1
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2939166
  • DQMHistoTests: Total skipped: 341
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 33 files compared)
  • Checked 142 log files, 14 edm output root files, 34 DQM output files

@chayanit
Copy link

Done. Now .98 and .99 are defined for all PU workflows for the years 2021, 2023, 2024, and 2026 for the "upgrade" matrix (but only TTbar .99 are still brought in "2017" matrix).

There are still couple of issues to be addressed before .98 (and .97) become generally usable

Right. I have noticed this as well. will look how to implement it in.

would be needed (or the mechanism extended to cover premixed pileup library as well)

  • The "GEN fragment" name in .97 is "SingleNuE10", whereas in "premix" matrix the PREMIXUP18_PU25 (etc) notation is used. This string ends up in the dataset name in DBS, right? Should the PREMIX* be kept there (or at least attempted to)? (I find some value from the pileup library dataset name being different from the usual datasets)

Yes. I found this part is a bit tricky as .97 seems to take GEN fragment as you pointed. Any suggestion?

@chayanit
Copy link

+1

@makortel
Copy link
Contributor Author

  • The "GEN fragment" name in .97 is "SingleNuE10", whereas in "premix" matrix the PREMIXUP18_PU25 (etc) notation is used. This string ends up in the dataset name in DBS, right? Should the PREMIX* be kept there (or at least attempted to)? (I find some value from the pileup library dataset name being different from the usual datasets)

Yes. I found this part is a bit tricky as .97 seems to take GEN fragment as you pointed. Any suggestion?

I might have found a way to replace "SingleNuE10" with "PREMIX" in the workflow name, but let's leave that to a subsequent PR.

From where exactly is the dataset name picked up? E.g. the directory created by runTheMatrix has the form <num>_<name>+<step>[+<step2>], is it the <name> part? Or from one of the steps?

@kpedro88
Copy link
Contributor

+upgrade

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @davidlange6, @slava77, @smuzaffar, @fabiocos (and backports should be raised in the release meeting by the corresponding L2)

@kpedro88
Copy link
Contributor

@makortel is this ready to be merged?

@makortel
Copy link
Contributor Author

@kpedro88 Yes.

(the only follow up that I recall would be #27753 (comment), and that I'd really leave for a subsequent PR)

@kpedro88
Copy link
Contributor

+1

@chayanit
Copy link

  • The "GEN fragment" name in .97 is "SingleNuE10", whereas in "premix" matrix the PREMIXUP18_PU25 (etc) notation is used. This string ends up in the dataset name in DBS, right? Should the PREMIX* be kept there (or at least attempted to)? (I find some value from the pileup library dataset name being different from the usual datasets)

Yes. I found this part is a bit tricky as .97 seems to take GEN fragment as you pointed. Any suggestion?

I might have found a way to replace "SingleNuE10" with "PREMIX" in the workflow name, but let's leave that to a subsequent PR.

From where exactly is the dataset name picked up? E.g. the directory created by runTheMatrix has the form <num>_<name>+<step>[+<step2>], is it the <name> part? Or from one of the steps?

@makortel I would like to come back to this after we had been busy on UL campaigns for several months. Could you point me how to replace "SingleNuE10" with "PREMIX" in the workflow name and change the dataset name to have "PREMIX" instead?

Now Phase2 (2026) workflows also start testing premixing workflows but as we noticed that *.98 still need to include --pileup_input.

@makortel
Copy link
Contributor Author

I would like to come back to this after we had been busy on UL campaigns for several months. Could you point me how to replace "SingleNuE10" with "PREMIX" in the workflow name and change the dataset name to have "PREMIX" instead?

Let me try to find out what I did back then...

@makortel
Copy link
Contributor Author

@chayanit
Let's move the discussion to #28470.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants