Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve autoAlca dictionaries layout #37360

Merged
merged 1 commit into from Apr 7, 2022

Conversation

francescobrivio
Copy link
Contributor

@francescobrivio francescobrivio commented Mar 26, 2022

PR description:

Following suggestion in #37306 (comment) this PR restructures the layout of the AlCaRecoMatrix dictionaries in autoAlCa.py.

AlCaRecoMatrix is maintained unchanged, while AlCaRecoMatrix2017, AlCaRecoMatrix2018 and AlCaRecoMatrixRereco (now renamed AlCaRecoMatrix2016) are copied from AlCaRecoMatrix dict and then customized for the specific changes in 2016, 2017 and 2018.
Special customization dicts are used to define the specific customizations for each year and, as suggested during the review, the update method of dict is used.

PR validation:

Code compiles.

Backport:

Not a backport but a backport for 12_3_X might be needed.

FYI @cms-sw/alca-l2 @mmusich

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37360/29010

  • This PR adds an extra 12KB to repository

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @francescobrivio for master.

It involves the following packages:

  • Configuration/AlCa (alca)

@cmsbuild, @malbouis, @tvami, @yuanchao, @francescobrivio can you please review it and eventually sign? Thanks.
@Martin-Grunewald, @missirol, @mmusich, @fabiocos, @tocheng this is something you requested to watch as well.
@perrotta, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@mmusich
Copy link
Contributor

mmusich commented Mar 26, 2022

I'm now wondering if your suggestion was rather to have a AlCaRecoMatrixDefault dictionary as baseline and then ad hoc customizations for:

@francescobrivio, no, this is what I meant (anyway AlCaRecoMatrix is the only one that counts, as it is used in RelVals).
Incidentally:

  • using update method of dict should make the changes more compact (?)
  • what about AlCaRecoMatrixRereco ?

@francescobrivio
Copy link
Contributor Author

no, this is what I meant

Ok! I was just wondering because with the current changes each time we modify the AlCaRecoMatrix (which should not be too often anyway) we should add a new AlCaRecoMatrix[YEAR] and check if the customizations for 2017, 2018.... need update as well, right? While having a "static" AlCaRecoMatrixDefault dictionary, on which all the others are based on, would allow to only modify the standard AlCaRecoMatrix...

  • using update method of dict should make the changes more compact (?)

Thanks! I'll look into using the update method.

  • what about AlCaRecoMatrixRereco?

Yea I was about to do it, but then I was doubious if this was the best way to implement the changes and I wanted to get feedback before doing more work.

@francescobrivio
Copy link
Contributor Author

as it is used in RelVals

btw I think it's also used in the configBuilder IIUC. See:

from Configuration.AlCa.autoAlca import autoAlca, AlCaNoConcurrentLumis

@mmusich
Copy link
Contributor

mmusich commented Mar 26, 2022

While having a "static" AlCaRecoMatrixDefault dictionary, on which all the others are based on, would allow to only modify the standard AlCaRecoMatrix...

A small price to pay to avoid obfuscation of the actual main dictionary IMHO.

btw I think it's also used in the configBuilder IIUC.

Among other things yes, though the effect on @allForPrompt is what it makes it more "visible" in PR tests.

@mmusich
Copy link
Contributor

mmusich commented Mar 26, 2022

Yea I was about to do it, but then I was doubious if this was the best way to implement the changes and I wanted to get feedback before doing more work.

I think that was used by PdmV only for the 2016 rereco, so you could take the occasion to rename according to the convention of the other ones

@francescobrivio
Copy link
Contributor Author

A small price to pay to avoid obfuscation of the actual main dictionary IMHO.

I totally agree!

@yuanchao
Copy link
Contributor

@cmsbuild please test

@francescobrivio
Copy link
Contributor Author

@cmsbuild please abort

  • @yuanchao I still need to apply the suggestions from Marco :)

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37360/29036

  • This PR adds an extra 12KB to repository

@francescobrivio
Copy link
Contributor Author

@smuzaffar are the tests for this PR stuck? They were triggered 15 hours ago.

@smuzaffar
Copy link
Contributor

sorry there was a bug introduced in bot which cause few PRs jobs to fail ( this PR is one of those https://cmssdt.cern.ch/jenkins/job/ib-schedule-pr-tests/56047/console ) . I have started the tested now

@francescobrivio
Copy link
Contributor Author

sorry there was a bug introduced in bot which cause few PRs jobs to fail ( this PR is one of those https://cmssdt.cern.ch/jenkins/job/ib-schedule-pr-tests/56047/console ) . I have started the tested now

ok thanks a lot!

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-a5d256/23481/summary.html
COMMIT: 3b11879
CMSSW: CMSSW_12_4_X_2022-03-28-1100/slc7_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/37360/23481/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 12 differences found in the comparisons
  • DQMHistoTests: Total files compared: 48
  • DQMHistoTests: Total histograms compared: 3589460
  • DQMHistoTests: Total failures: 19
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 3589418
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: -0.004 KiB( 47 files compared)
  • DQMHistoSizes: changed ( 312.0 ): -0.004 KiB MessageLogger/Warnings
  • Checked 200 log files, 45 edm output root files, 48 DQM output files
  • TriggerResults: no differences found

@francescobrivio
Copy link
Contributor Author

@tvami @malbouis feel free to sign this:

  • A part from the few know issues (in MTD and MessageLogger comparisons), tests are clean
  • Just for reference we discussed internally in @cms-sw/alca-l2 and we think that, once the final alcareco matrix for Run 3 is built, we could also clean up the AlCaRecoMatrix[YEAR] dicts (which are currently unused and only kept for bookkeeping) and move them to some other documentation

@mmusich
Copy link
Contributor

mmusich commented Mar 29, 2022

and we think that, once the final alcareco matrix for Run 3 is built, we could also clean up the AlCaRecoMatrix[YEAR] dicts (which are currently unused and only kept for bookkeeping) and move them to some other documentation

the (good) reason why these have been introduced, was in order to have releases for the EOY re-recos which are self-contained, such that PdmV could fetch the matrix to run directly from here, instead of having to piece it together manually over twiki or e-mail exchanges in a highly error prone way. While for the past it is maybe less important (though I don't see exactly the compelling reason to delete them), I would encourage to maintain the good habit for future re-recos.

@tvami
Copy link
Contributor

tvami commented Mar 29, 2022

Hi All, sorry for joining late on the discussion here (I'm back from vacation for good now). Do I understand this correctly that every time when a new stream is added for the future, we'll need to edit all the past matrices so that we remove something that was never present in that year? That sounds a bit counterintuitive to me.

The argument against having a default ALCARECO matrix and adding new lines to that when a new stream appears, and then in the first line of the file, equaling the latest to the AlCaRecoMatrix is that the only relevant AlCaRecoMatrix part is present in the code only on the first line? And by not reading the code carefully one would assume that the default is used? What if the default is just named well (AlCaRecoMatrix2016), isnt that problem just gone?

@mmusich
Copy link
Contributor

mmusich commented Mar 29, 2022

Do I understand this correctly that every time when a new stream is added for the future, we'll need to edit all the past matrices so that we remove something that was never present in that year? That sounds a bit counterintuitive to me.

see #37360 (comment)

The argument against having a default ALCARECO matrix and adding new lines to that when a new stream appears, and then in the first line of the file, equaling the latest to the AlCaRecoMatrix is that the only relevant AlCaRecoMatrix part is present in the code only on the first line? And by not reading the code carefully one would assume that the default is used? What if the default is just named well (AlCaRecoMatrix2016), isnt that problem just gone?

I have honestly difficulty to unpack what you mean here :D can you try to re-phrase?

@tvami
Copy link
Contributor

tvami commented Mar 29, 2022

I have honestly difficulty to unpack what you mean here :D can you try to re-phrase?

Hahaha, sorry.

Something like

AlCaRecoMatrix = AlCaRecoMatrix2021.copy()

AlCaRecoMatrix2021 = AlCaRecoMatrix2018.copy()
AlCaRecoMatrix2021.update(customizationFor2021)

AlCaRecoMatrix2018 = AlCaRecoMatrix2017.copy()
AlCaRecoMatrix2018.update(customizationFor2018)

AlCaRecoMatrix2017 = AlCaRecoMatrix2016.copy()
AlCaRecoMatrix2016.update(customizationFor2017)

and then defining AlCaRecoMatrix2016 as the original dict, and then all the customizationFor* just adds the new streams

@mmusich
Copy link
Contributor

mmusich commented Mar 29, 2022

and then defining AlCaRecoMatrix2016 as the original dict, and then all the customizationFor* just adds the new streams

repeating myself, IMHO it obfuscates the values used in the AlCaRecoMatrix constructed in this way because you can't read it directly but you need to run it through python. Matter of taste I guess.

@tvami
Copy link
Contributor

tvami commented Apr 5, 2022

+alca

@cmsbuild
Copy link
Contributor

cmsbuild commented Apr 5, 2022

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@malbouis
Copy link
Contributor

malbouis commented Apr 5, 2022

+alca

  • the autoAlCa script has been cleaned up.
  • there might be other changes in the future but for now AlCaDB thinks it is good as is.

@qliphy
Copy link
Contributor

qliphy commented Apr 6, 2022

please test
just to refresh the previous one done one week ago.

@cmsbuild
Copy link
Contributor

cmsbuild commented Apr 6, 2022

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-a5d256/23670/summary.html
COMMIT: 3b11879
CMSSW: CMSSW_12_4_X_2022-04-05-1100/slc7_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/37360/23670/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 6 differences found in the comparisons
  • DQMHistoTests: Total files compared: 48
  • DQMHistoTests: Total histograms compared: 3593039
  • DQMHistoTests: Total failures: 14
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3593003
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 47 files compared)
  • Checked 200 log files, 45 edm output root files, 48 DQM output files
  • TriggerResults: no differences found

@qliphy
Copy link
Contributor

qliphy commented Apr 7, 2022

+1

@cmsbuild cmsbuild merged commit 9c08edf into cms-sw:master Apr 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants