Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updated parameters of the Mustache SC algorithm, optimized for Run 3 #31823

Merged
merged 2 commits into from Nov 4, 2020

Conversation

rchatter
Copy link

@rchatter rchatter commented Oct 16, 2020

This PR updates the hard coded parameters defining the Mustache superclustering algorithm used to cluster electrons and photons in the CMS ECAL. Here the optimization has been performed based on expected Run 3 conditions (but doesn't break anything if used with Run 2 conditions).

A more elegant implementation is underway where these parameters would be loaded from the DB, but this will not meet the last 11_2_X pre-release deadline next Wednesday, hence the hardcoded parameters are being updated.

These parameters have been approved by the ECAL DPG. The details of derivation and validation of the same may be seen in these slides from N. Marinelli and L. Zygala https://indico.cern.ch/event/949294/contributions/3988389/attachments/2091573/3514649/2020_08_26_Clustering.pdf

@nancymarinelli @lzygala @thomreis @fcouderc @simonepigazzini @mdonega

@cmsbuild
Copy link
Contributor

The code-checks are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-31823/19124

  • This PR adds an extra 12KB to repository

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @rchatter (Rajdeep Mohan Chatterjee) for master.

It involves the following packages:

RecoEcal/EgammaCoreTools

@perrotta, @jpata, @cmsbuild, @slava77 can you please review it and eventually sign? Thanks.
@Sam-Harper, @jainshilpi, @argiro, @sobhatta, @thomreis, @afiqaize, @lgray, @varuns23 this is something you requested to watch as well.
@silviodonato, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@jpata
Copy link
Contributor

jpata commented Oct 16, 2020

@cmsbuild please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Oct 16, 2020

The tests are being triggered in jenkins.

@@ -68,25 +68,25 @@ namespace reco {
bool inDynamicDPhiWindow(
const float seedEta, const float seedPhi, const float ClustE, const float ClusEta, const float ClusPhi) {
// from Rishi's fits 06 June 2013 in log base 10
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new parameters are not taken from the Rishi's fit of more than 7 years ago, I imagine: either you update the comment, or better to remove it

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the comment to add the correct reference.
Thanks,
Rajdeep

@cmsbuild
Copy link
Contributor

+1
Tested at: 8ed0959
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-5a93d1/10017/summary.html
CMSSW: CMSSW_11_2_X_2020-10-15-2300
SCRAM_ARCH: slc7_amd64_gcc820

@cmsbuild
Copy link
Contributor

Comparison job queued.

@cmsbuild
Copy link
Contributor

The code-checks are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-31823/19129

  • This PR adds an extra 12KB to repository

@cmsbuild
Copy link
Contributor

+1
Tested at: 191a216
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-5a93d1/10022/summary.html
CMSSW: CMSSW_11_2_X_2020-10-15-2300
SCRAM_ARCH: slc7_amd64_gcc820

@cmsbuild
Copy link
Contributor

Comparison job queued.

@cmsbuild
Copy link
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-5a93d1/10022/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 15979 differences found in the comparisons
  • DQMHistoTests: Total files compared: 35
  • DQMHistoTests: Total histograms compared: 2543752
  • DQMHistoTests: Total failures: 16116
  • DQMHistoTests: Total nulls: 7
  • DQMHistoTests: Total successes: 2527607
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 34 files compared)
  • DQMHistoSizes: changed ( 1000.0,... ): -0.004 KiB JetMET/SUSYDQM
  • DQMHistoSizes: changed ( 136.793 ): 0.008 KiB JetMET/SUSYDQM
  • Checked 149 log files, 22 edm output root files, 35 DQM output files

@jpata
Copy link
Contributor

jpata commented Oct 26, 2020

@rchatter we ran some additional validation on this PR with gun samples to be sure.

We see an increase of photon multiplicity per jet (16.0):
DQMData__Run1__JetMET__Runsummary__JetValidation__slimmedJets__photonMultiplicity.pdf

We also see a slightly better jet resolution (19.0):
DQMData__Run1__JetMET__Runsummary__JetValidation__ak4PFJetsCHS__PtRecoOverGen_E_20_40.pdf

Here's a list of all the DQM plots with MET or JetValidation:
http://jpata.web.cern.ch/jpata/reco/31823/16.0/
http://jpata.web.cern.ch/jpata/reco/31823/19.0/

Let me know if this is in line with your expectations and if you see anything else out of the ordinary in the DQM plots. Thanks!

@jpata
Copy link
Contributor

jpata commented Oct 26, 2020

Here's phase2 upgrade workflow DQM, please check this one, instead of the Run1 I posted above (thx Slava for noticing!)

DQMData__Run1__JetMET__Runsummary__JetValidation__ak4PFJetsCHS__photonMultiplicity

http://jpata.web.cern.ch/jpata/reco/31823/11705.0/
http://jpata.web.cern.ch/jpata/reco/31823/11602.0/

@simonepigazzini
Copy link
Contributor

Hi @jpata, do you have DQM plots for ECAL superclusters or pfPhotons/Electrons with a e/gamma gun sample? The mustache is really only used in the reco of e/gamma so any effect on jets in an indirect one. My guess here is that having a tighter parameters in the mustache superclustering means that we cluster less energy from PU into superclusters and therefore the jets peaks up more photon.

Thank you,

simone

@simonepigazzini
Copy link
Contributor

Hi again,

there should be a Hgg sample run as part of the standard validation matrix, @nancymarinelli could have a look at that if you saved the DQM from that workflow.

Here https://lzygala.web.cern.ch/lzygala/Envelope_OptPFRecHitThresholds34/mustache_comparison_SingleSetParams_Weighted/h_SC_nPFClusters.pdf you can also see the number of pfClusters (i.e. the same thing as photon multiplicity but for ECAL SC) with the old and new set of parameters. For ECAL SC the number of constituents decreases with the new parameters so it can be reasonable to expect that the rejected pfClusters are clustered by nearby jets, this would explain the plot above.

Could you also please confirm that these are Run2 DQM plots and which release is "ref"?

Thank you,

simone

@jpata
Copy link
Contributor

jpata commented Oct 28, 2020

Thx for the suggestion @simonepigazzini! The idea here was to check downstream things that might not otherwise have been in the DPG view in the validations, but we can also produce plots for specific EGamma validation. For the photon gun sample, here are a few that see bigger changes with this update. The rest of the plots are at the cern link above.

DQMData__Run1__EgammaV__Runsummary__pfPhotonValidator__Efficiencies__convEffVsEtTotal
DQMData__Run1__Egamma__Runsummary__gedPhotonAnalyzer__AllPhotons__Etabove20GeV__h_15_e1x5VsEt

Could you also please confirm that these are Run2 DQM plots and which release is "ref"?

11602.0 and 11705.0 are run3 workflows as far as I understand (2021), ref was CMSSW_11_2_X_2020-10-22-1100 without this PR included.

From the reco point of view, I would say as long as the changes are understood and noticed by the relevant POGs also downstream, this is acceptable and we can proceed. :)

@jpata
Copy link
Contributor

jpata commented Nov 3, 2020

Just to be sure of no surprises downstream, I ran some extended tests. While I don't find any major differences in downstream jet or met outputs, there are some fluctuations.

For example, 11630.0_QCD_Pt_3000_3500_14TeV+2021 on CMSSW_11_2_X_2020-11-01-2300:
DQMData__Run1__JetMET__Runsummary__METValidation__pfMet__METResolution_GenMETTrue_InMETBins
DQMData__Run1__JetMET__Runsummary__JetValidation__slimmedJets__photonMultiplicity

The full DQM output for JetValidation\|METValidation can be found here:
http://jpata.web.cern.ch/jpata/reco/31823/11630.0/
http://jpata.web.cern.ch/jpata/reco/31823/11671.0/
http://jpata.web.cern.ch/jpata/reco/31823/11705.0/

Overall, I think we can proceed with this PR from the reco side.

@rchatter
Copy link
Author

rchatter commented Nov 3, 2020

Hi Joosep,

Just to understand the second plot: Is the comparison on precisely the same events with and w/o this update, in which case the difference seems a "little" large. Also I don't precisely recall how exactly the photon multiplicity in a jet is evaluated. But most likely these are not photons using the Mustache SC algorithm so like you said shouldn't be a cause for worry [unless JETMET thinks otherwise].

Best,
Rajdeep

@jpata
Copy link
Contributor

jpata commented Nov 3, 2020

These should be the same events, generated with the workflows mentioned above. It is my understanding that running the GEN-SIM workflow twice with the same seed should give exactly the same outputs, modulo changes brought in by this PR.

@simonepigazzini
Copy link
Contributor

Hi All,

I think that despite the events being the same a change in what goes into the superclusters means that we can end up looking at different jets. I would say that there are no major trend differences w/ and w/o this PR. On the ECAL side we are confident that the new parameters are ok for the superclusters and we will soon have a deeper look at what happens with jets (validation with more events). I think the PR can be merged, keeping in mind that soon we will have a new PR moving these parameters to the database, at which point we will be easily able to fix any problem with the parameters on the fly through a new GT.

Best,

simone

@jpata
Copy link
Contributor

jpata commented Nov 3, 2020

+reconstruction

  • updates ECAL Moustache supercluster parameters for Run3 directly in the algorithm implementation
  • jenkins tests and DQM result in a large number of differences for EGamma-related outputs, which are understood
  • EGamma performance has been validated in the ECAL DPG
  • based on quick tests on the reco side, we don't observe major trends in downstream variables such as jets or MET

@cmsbuild
Copy link
Contributor

cmsbuild commented Nov 3, 2020

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@qliphy
Copy link
Contributor

qliphy commented Nov 4, 2020

+1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants