Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove vertex type from PFCandidate (addresses #30895) #31298

Closed
wants to merge 2 commits into from

Conversation

veelken
Copy link
Contributor

@veelken veelken commented Aug 31, 2020

this commit addressed the GitHub issue #30895 :

  • set vertex position of PFCandidate in PFAlgo, PFEGammaAlgo, and PFMuonAlgo
  • remove vertex related member-functions from PFCandidate class (use the member-fuctions defined in the LeafCandidate base-class instead)
  • updated ClassDefs in classes_def.xml files accordingly

PR description:

This PR adresses the GitHub issue #30895 .

No changes are expected in the output, except (maybe) for PFCandidates of type electron.
This is because the enum value PFCandidate::kGSFVertex was not used before, causing the function PFCandidate::vertex() to never return the vertex associated with the gsfTrackRef(), which I believe is a (minor) bug. In this PR, the vertex of PFCandidates of type electron is set to the vertex assocciated with the gsfTrackRef() in PFEGammaAlgo, which I believe is the intended behaviour.

PR validation:

if this PR is a backport please specify the original PR and why you need to backport that PR:

Before submitting your pull requests, make sure you followed this checklist:

- set vertex position of PFCandidate in PFAlgo, PFEGammaAlgo, and PFMuonAlgo
- remove vertex related member-functions from PFCandidate class (use the member-fuctions defined in the LeafCandidate base-class instead)
- updated ClassDefs in classes_def.xml files accordingly
@cmsbuild cmsbuild changed the base branch from CMSSW_11_2_X to master August 31, 2020 09:09
@cmsbuild
Copy link
Contributor

@veelken, CMSSW_11_2_X branch is closed for direct updates. cms-bot is going to move this PR to master branch.
In future, please use cmssw master branch to submit your changes.

@cmsbuild
Copy link
Contributor

The code-checks are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-31298/18025

  • This PR adds an extra 92KB to repository

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@slava77
Copy link
Contributor

slava77 commented Aug 31, 2020

Code check has found code style and quality issues which could be resolved by applying following patch(s)

these should be addressed before further tests of this PR can proceed

@slava77
Copy link
Contributor

slava77 commented Aug 31, 2020

@veelken
please edit the PR title so that it at least includes #30895 in full (a URL can be replaced by #30895 ),
although something more direct is better, e.g. "remove vertex type from PFCandidate (addresses #20895)"

@slava77
Copy link
Contributor

slava77 commented Aug 31, 2020

@hatakeyamak @bendavid
please check this and clarify if the changes are OK for PF.
Thank you.

@veelken veelken changed the title this commit addressed the GitHub issue https://github.com/cms-sw/cmss… remove vertex type from PFCandidate (addresses #30895) Aug 31, 2020
@cmsbuild
Copy link
Contributor

The code-checks are being triggered in jenkins.

@veelken
Copy link
Contributor Author

veelken commented Aug 31, 2020

Hi,

I have renamed the title of the PR and applied the patches code-format.patch and code-format.patch .

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-31298/18031

  • This PR adds an extra 92KB to repository

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @veelken (Christian Veelken) for master.

It involves the following packages:

DataFormats/ParticleFlowCandidate
DataFormats/PatCandidates
RecoParticleFlow/PFProducer

@perrotta, @jpata, @cmsbuild, @santocch, @slava77 can you please review it and eventually sign? Thanks.
@mmarionncern, @gouskos, @cbernet, @rovere, @lgray, @hatakeyamak, @gpetruc, @peruzzim, @seemasharmafnal this is something you requested to watch as well.
@silviodonato, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@jpata
Copy link
Contributor

jpata commented Aug 31, 2020

@veelken thanks for preparing a first implementation for the issue #30895, as suggested in #30895 (comment).

However, if you are planning to hand this over to someone else, they will not be able to update your PR.

Hence, if the plan is that the PF group takes over this proposed modification, it would be best to close this PR and simply link your new branch (good to put it with a descriptive branch name, like veelken:issue_30895) in the issue. Then the next person can check out your branch and open a new PR that they can update.

@cmsbuild
Copy link
Contributor

cmsbuild commented Sep 1, 2020

Comparison job queued.

@cmsbuild
Copy link
Contributor

cmsbuild commented Sep 1, 2020

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-b9c548/9032/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 38 differences found in the comparisons
  • DQMHistoTests: Total files compared: 35
  • DQMHistoTests: Total histograms compared: 2609667
  • DQMHistoTests: Total failures: 7
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2609638
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 34 files compared)
  • Checked 149 log files, 22 edm output root files, 35 DQM output files

@jpata
Copy link
Contributor

jpata commented Sep 1, 2020

@veelken @hatakeyamak as reported by the test, there are a number of minor differences in the isolation values, at least, see the example below:

image
https://cmssdt.cern.ch/SDT/jenkins-artifacts/baseLineComparisons/CMSSW_11_2_X_2020-08-31-2300+b9c548/38573/validateJR/all_OldVSNew_RunJetHT2016EreMINIAODwf136p7611/all_OldVSNew_RunJetHT2016EreMINIAODwf136p7611c_patElectrons_slimmedElectrons__PAT_obj___isolations__4_.png

Given that this PR is expected to be technical (no reco change), please let us know how you want to move forward.

@veelken
Copy link
Contributor Author

veelken commented Sep 2, 2020

Hi,

just to clarify: a small change for electrons is ok, I think, resulting from the fact that the code so far took the vertex position for electrons from the KF track and not from the GSF track, due to the issue with the enum value PFCandidate::kGSFVertex not being used before. @hatakeyamak , do you agree that the changes are small enough to be ok ?

@jpata
Copy link
Contributor

jpata commented Sep 2, 2020

Note that there are differences also in e.g. muon and photon isolation values:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/baseLineComparisons/CMSSW_11_2_X_2020-08-31-2300+b9c548/38573/validateJR/all_OldVSNew_RunJetHT2018DreMINIAODULwf136p88811/

Let's wait for @hatakeyamak decision on how to proceed. If this PR goes ahead, I will do a code review.

@hatakeyamak
Copy link
Contributor

hatakeyamak commented Sep 3, 2020

Thank you @veelken and @jpata, and sorry for delays from my side.
To me this looks good, and I don't see the issue with moving forward.
@bendavid ok with you as well?

The use of vertex assocciated with the gsfTrackRef() in PFEGammaAlgo seems like a right thing to do, but @bendavid is more familiar with the PFEGamma related code than me.

@jpata
Copy link
Contributor

jpata commented Sep 3, 2020

I've gone through the code and it looks fine to me. Vertex is just a point, which previously was loaded from a referenced collection, but is now saved directly. Since we no longer save the vertex type, the event size decreases. What I don't understand is why it was implemented through the references in the first place, rather than in the straightforward way as this PR does.

Do you know @hatakeyamak?

@hatakeyamak
Copy link
Contributor

hatakeyamak commented Sep 3, 2020

Nope. So, to me this (in this PR) looks like a fine approach.

@jpata
Copy link
Contributor

jpata commented Sep 4, 2020

@slava77 @perrotta what do you think, can you recall any reason why the vertex location was loaded from a referenced collection, rather than saved in the candidate directly? Otherwise, the code looks fine to me.

@perrotta
Copy link
Contributor

perrotta commented Sep 4, 2020

@slava77 @perrotta what do you think, can you recall any reason why the vertex location was loaded from a referenced collection, rather than saved in the candidate directly? Otherwise, the code looks fine to me.

No idea, and looking at the history available doesn't help either

@jpata
Copy link
Contributor

jpata commented Sep 7, 2020

Thanks! Before proceeding with this, I just want to check also with EGamma contacts @afiqaize @SohamBhattacharya. Do you have any thoughts on this change to PFCandidate vertex, which has some minor effects on electron and photon isolation?

@afiqaize
Copy link
Contributor

afiqaize commented Sep 7, 2020

We agree that the change for electrons is the intended behavior. For photons, we are wondering if the changes are understood (maybe the changes happen only to converted photons etc)?

@jpata
Copy link
Contributor

jpata commented Sep 9, 2020

note: I'm digging through the isolation code to understand why the photon, muon and tau isolation changes, but haven't hit the reason yet.

@veelken
Copy link
Contributor Author

veelken commented Sep 9, 2020

Hi Joosep,

the tau isolation takes into account all charged PFCandidates within a dz < 0.2 cm window around the tau production vertex. My hypothesis that using the vertex associated to the GSF track instead of the KF track for electrons causes some electrons to either move into or move out of the dz < 0.2 cm window. I am not an expert for the isolation of photons and muons, but I believe they can be subject to a similar effect (arising from a dz cut that is applied in order to suppress pileup contributions to their isolation pT-sum).

@slava77
Copy link
Contributor

slava77 commented Sep 11, 2020

@jpata
in https://user-images.githubusercontent.com/69717/91840145-e61a9380-ec58-11ea-9867-21e6a8f22dae.png
isolations_[4] is PfChargedHadronIso .
These are filled in https://github.com/cms-sw/cmssw/blob/master/PhysicsTools/PatAlgos/plugins/PATElectronProducer.cc#L894-L909
Depending on the electron being PF or not, it will come from pfChargedHadrons = cms.InputTag("elPFIsoValueCharged04PFIdPAT") or "elPFIsoValueCharged04NoPFIdPAT".
It may be useful to check if the diffs are all from PFId or from NoPFId.

@jpata
Copy link
Contributor

jpata commented Sep 14, 2020

Thanks Slava! Following the thread for process.phPFIsoDepositChargedPAT, it seems that CandViewExtractor in

if ((dR < theDR_Max) && (dR > theDR_Veto) && (std::abs(it->vz() - cand.vz()) < theDiff_z) &&
is looping over pions which all now have vtx()=(0,0,0), causing incorrect behaviour in the dz and rho cuts in workflow 136.7611, thus the isolation is different.

this PR                                                                                            | reference
%MSG-d CandViewExtractor:  CandIsoDepositProducer:phPFIsoDepositChargedPAT  14-Sep-2020 15:50:59 EE|  %MSG-d CandViewExtractor:  CandIsoDepositProducer:phPFIsoDepositChargedPAT  14-Sep-2020 15:50:58 EE
cand eta=2.28516 phi=-0.894788 vtx=(0.0601291,0.102278,-1.90386)                                   |  cand eta=2.28516 phi=-0.894788 vtx=(0.0601291,0.102278,-1.90386)
%MSG                                                                                               |  %MSG
%MSG-d CandViewExtractor:  CandIsoDepositProducer:phPFIsoDepositChargedPAT  14-Sep-2020 15:50:59 EE|  %MSG-d CandViewExtractor:  CandIsoDepositProducer:phPFIsoDepositChargedPAT  14-Sep-2020 15:50:58 EE
pdgid=-211 vtx=(0,0,0) dR=4.01993 dvz=1.90386 drho=0.118644                                        |  pdgid=-211 vtx=(0.0555814,0.0833093,-1.9246) dR=4.01993 dvz=-0.0207404 drho=0.0195064              
%MSG                                                                                               |  %MSG
%MSG-d CandViewExtractor:  CandIsoDepositProducer:phPFIsoDepositChargedPAT  14-Sep-2020 15:50:59 EE|  %MSG-d CandViewExtractor:  CandIsoDepositProducer:phPFIsoDepositChargedPAT  14-Sep-2020 15:50:58 EE
pdgid=-211 vtx=(0,0,0) dR=1.74093 dvz=1.90386 drho=0.118644                                        |  pdgid=-211 vtx=(0.0673064,0.0994989,-1.89405) dR=1.74093 dvz=0.00980163 drho=0.00769662            
%MSG                                                                                               |  %MSG

I guess this has to do with requiring the vertex being saved in the RECO step that runs PF, which makes later workflows that do not run reconstruction in-place incorrect.
As we need to be able to process old files, I think a check might be needed with fallback to the old behaviour if we want to adopt this new mechanism going forward.

Thoughts?

@jpata
Copy link
Contributor

jpata commented Sep 14, 2020

Suggestion from Slava: add IO read rule to the class def XML, such that the old version of the class would be able to load the correct vertex collection in the case of e.g. reMINIAOD jobs.

@jpata
Copy link
Contributor

jpata commented Sep 14, 2020

@veelken will you do this, or should we (reco) take over at this point?

@veelken
Copy link
Contributor Author

veelken commented Sep 14, 2020

@jpata It would be great if you (reco) could take over at this point. Thank you!

@jpata
Copy link
Contributor

jpata commented Sep 14, 2020

-1

OK. I think the easiest will be that you close this PR, and I open a new one with a branch I can change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants