Muon arbitration #18321

drkovalskyi · 2017-04-11T23:20:55Z

Enforce Tracker Muon arbitration during reconstruction to reduce the number of fake muons it produces.

cmsbuild · 2017-04-11T23:21:10Z

A new Pull Request was created by @drkovalskyi for master.

It involves the following packages:

RecoMuon/MuonIdentification

@perrotta, @cmsbuild, @slava77, @davidlange6 can you please review it and eventually sign? Thanks.
@bellan, @abbiendi, @jhgoh, @echapon, @calderona, @HuguesBrun, @battibass, @trocino, @bachtis, @rociovilar this is something you requested to watch as well.
@Muzaffar, @davidlange6, @smuzaffar you are the release manager for this.

cms-bot commands are listed here #13028

slava77 · 2017-04-11T23:24:05Z

@cmsbuild please test

cmsbuild · 2017-04-11T23:24:22Z

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/19119/console Started: 2017/04/12 01:28

cmsbuild · 2017-04-12T00:09:54Z

+1
Tested at: c105343
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-18321/19119/summary.html

cmsbuild · 2017-04-12T00:09:57Z

Comparison job queued.

cmsbuild · 2017-04-12T02:29:02Z

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-18321/19119/summary.html

Comparison Summary:

No significant changes to the logs found
Reco comparison results: 4485 differences found in the comparisons
DQMHistoTests: Total files compared: 23
DQMHistoTests: Total histograms compared: 1921491
DQMHistoTests: Total failures: 21756
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 1899562
DQMHistoTests: Total skipped: 173
DQMHistoTests: Total Missing objects: 0
Checked 94 log files, 14 edm output root files, 23 DQM output files

slava77 · 2017-04-14T12:33:22Z

RecoMuon/MuonIdentification/plugins/MuonIdProducer.cc

@@ -637,6 +636,10 @@ void MuonIdProducer::produce(edm::Event& iEvent, const edm::EventSetup& iSetup)
     }
   }

+   // muon arbitration
+   fillArbitrationInfo( outputMuons.get() );


if (fillMatching_) was removed.
Does the flag fillMatching_ still make any sense or is it redundant?

It's redundant. Didn't want to break anything removing it completely. Will remove it in future.

slava77 · 2017-04-14T12:35:34Z

RecoMuon/MuonIdentification/plugins/MuonIdProducer.cc

@@ -637,6 +636,10 @@ void MuonIdProducer::produce(edm::Event& iEvent, const edm::EventSetup& iSetup)
     }
   }

+   // muon arbitration
+   fillArbitrationInfo( outputMuons.get() );
+   arbitrateMuons( outputMuons.get(), caloMuons.get() );


should this always be done or should there be an option to not do it (for studies etc).
Recall that this module is used also to make muons as seeds for inside-out tracking iteration.
I can guess that for that case it's OK to have a larger number of options.

similar question (as for earlyMuons used in iter tk seeding) for earlyDisplacedMuons.
And, also for muonsFromCosmics, muonsFromCosmics1Leg which are running both in pp and in proper cosmic data taking.
At least for proper cosmics it may be better to keep unarbitrated parts.

I agree, providing an option can be of some use, but I don't think we need it. Point me to a single case where anyone is using non-arbitrated muons of any type intentionally.

my example for iterative tracking seeding with "earlyMuons" still stands as an example of somewhat intentional use of non-arbitrated muons: seeding should better be loose

IMO, it's easier to add an option and apply arbitration just to the default _muons_ than to prove there are no issues and no cost in all alternative reconstructions.
But we can go the other way and then this PR needs tests for all of these cases (cosmics, seeding, displaced muons).

Agree. Will add this option. Do I need to submit a new PR in this case?

I looked at implementing it; it's too much hassle. The module is used in tons of places (HLT) and cloned in an even larger number of places. Providing it as an untracked parameter with a default is wrong, because it's an important parameter. So I think it should stay as is, i.e. TrackerMuons will get arbitrated. I don't see why earlyMuons should not be arbitrated if they are TrackerMuons. The whole zoo of muons that we have now won't be affected unless they are TrackerMuons, which is a specific algorithm and it can change.

please use fillDescriptions to set a default value for a parameter.
The default can be overridden by specifying the parameter in a module instance where it's needed.

slava77 · 2017-04-14T12:53:25Z

RecoMuon/MuonIdentification/plugins/MuonIdProducer.cc

+	}
+      } 
+    }
+    muon++;


what happens here if muon = muons->erase(muon); above was done on the last element?
It looks like undefined behavior. Better to change it to if (muon!=muons->end()) muon++;

if it happens on the last element muon=muons->end() and for-loop exits. See no reason to modify it.

erase happens on the last element, then in muon = muons->erase(muon); the muon is the same as the end() and in this line you will apply end()++, which is ill-defined and will not match the for loop condition

There is a continue statement after muons->erase(muon). There won't be end()++ call.
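The erase-then-continue pattern discussed above can be illustrated with a minimal standalone sketch. `ToyMuon` and `removeRejected` are hypothetical names used only for illustration; this is not the code in MuonIdProducer.cc, just the same iterator idiom on a plain `std::vector`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a muon candidate: only what the loop needs.
struct ToyMuon {
  int id;
  bool keep;  // false means the candidate is to be dropped
};

// Remove rejected candidates in place and return how many were erased.
// erase() returns the iterator following the erased element (possibly end()),
// so the erase path must skip the increment: hence the `continue`.
int removeRejected(std::vector<ToyMuon>& muons) {
  int nErased = 0;
  for (auto muon = muons.begin(); muon != muons.end();) {
    if (!muon->keep) {
      muon = muons.erase(muon);
      ++nErased;
      continue;  // go straight back to the loop condition; end() is never incremented
    }
    ++muon;
  }
  // Postcondition: only kept candidates remain.
  for (const auto& m : muons) assert(m.keep);
  return nErased;
}
```

If the last element is erased, `muon` equals `end()`, the `continue` returns control to the loop condition, and the loop exits without touching `++muon`.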

slava77 · 2017-04-14T13:01:35Z

RecoMuon/MuonIdentification/python/muons1stStep_cfi.py

@@ -20,7 +20,7 @@

    fillEnergy = cms.bool(True),
    # OR
-    maxAbsPullX = cms.double(4.0),
+    maxAbsPullX = cms.double(3.0),


this will probably be noticeable in B=0T setup as a loss of 1/4 at lower p values. IIRC default track p in B=0T is 5 GeV. So, below this value there will be some signal loss.
... probably OK.

Are you worried about muons of 1 GeV or less in the B=0T case getting multiple scattering underestimated? It's a non-issue in my opinion. The 4-sigma matching cut was very loose when it was introduced before Run1. All TrackerMuon selectors use at least a 3-sigma cut or tighter. 4-sigma gives too many fakes in real data.

slava77 · 2017-04-15T08:46:29Z

@drkovalskyi
comments to the code need a follow up before this PR can be merged.
Please check.
Thank you.

slava77 · 2017-04-15T18:46:00Z

On 4/15/17 11:42 AM, drkovalskyi wrote (in RecoMuon/MuonIdentification/python/muons1stStep_cfi.py, on the change of maxAbsPullX from cms.double(4.0) to cms.double(3.0)): Are you worried about 1GeV or less muons in B=0T case getting multiple scattering underestimated? It's non-issue in my opinion. 4sigma matching cut was very loose when it was introduced before Run1. All TrackerMuon selectors use at least 3 sigma cut or tighter. 4-sigma gives too many fakes in real data.

p = 1 GeV muons do not get through the calorimeter. I was talking about those in the range ~2.5 (rangeout) to 5 GeV. For a 2.5 GeV muon, since the track is reported as 5 GeV, the pulls are off by a factor of 2 and this cut becomes effectively 1.5 sigma (3/2), compared to the current 2 (4/2), for these muons
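The arithmetic in this comment can be written out as a small sketch. It assumes the position uncertainty is dominated by multiple scattering and scales as 1/p, so a track reported with a momentum that is too high gets an uncertainty that is too small by the same ratio. `effectivePullCut` is a hypothetical helper, not a CMSSW function.

```cpp
#include <cassert>

// Effective cut in units of the true sigma, assuming sigma scales as 1/p.
// A track with true momentum trueP that is reported with assumedP has its
// sigma underestimated by trueP/assumedP, so a nominal cut of N on the
// reported pull corresponds to N * trueP / assumedP true sigmas.
double effectivePullCut(double nominalCut, double trueP, double assumedP) {
  assert(trueP > 0.0 && assumedP > 0.0);
  return nominalCut * trueP / assumedP;
}
```

For a 2.5 GeV muon reported as 5 GeV, `effectivePullCut(3.0, 2.5, 5.0)` gives 1.5 and `effectivePullCut(4.0, 2.5, 5.0)` gives 2.0, matching the numbers quoted above.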

slava77 · 2017-04-15T18:47:11Z

On 4/15/17 11:46 AM, drkovalskyi wrote: There is a continue statement after muons->erase(muon). There won't be end()++ call.

ah, thanks. I missed it

drkovalskyi · 2017-04-15T18:49:17Z

I think we can live with a small inefficiency of tighter matching for low pt muons at B=0. It's not a reason to waste space for real data. Besides, we have other algorithms to recover if TrackerMuon fails.

slava77 · 2017-04-15T18:53:15Z

On 4/15/17 11:49 AM, drkovalskyi wrote: I think we can live with a small inefficiency of tighter matching for low pt muons at B=0. It's not a reason to waste space for real data. Besides, we have other algorithms to recover if TrackerMuon fails.

what are the other algorithms? (standalone doesn't work here)

slava77 · 2017-04-15T19:17:51Z

In cosmic muons the number of matches now returns zero. Looks like something didn't get filled.

e.g. from muon gun wf 10009.0

compared to that, the regular muons look OK

these efficiency plots look suspicious (I haven't checked the definitions, just showing the plots):
muon gun pt10 (wf 10007.0, 1K events)

ttbar PU35 (wf 10224.0 100 events)

[The same events are reconstructed so, the correlation is high.]

slava77 · 2017-04-18T17:04:40Z

@drkovalskyi
Please provide comments on the last post #18321 (comment) with plots showing some potential loss in efficiency and also missing matches in cosmic muons

cmsbuild · 2017-04-22T18:25:23Z

+1
Tested at: aa23e33
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-18321/19332/summary.html

cmsbuild · 2017-04-22T18:25:25Z

Comparison job queued.

cmsbuild · 2017-04-22T19:42:01Z

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-18321/19332/summary.html

Comparison Summary:

No significant changes to the logs found
Reco comparison results: 5612 differences found in the comparisons
DQMHistoTests: Total files compared: 23
DQMHistoTests: Total histograms compared: 1826502
DQMHistoTests: Total failures: 31870
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 1794459
DQMHistoTests: Total skipped: 173
DQMHistoTests: Total Missing objects: 0
Checked 94 log files, 14 edm output root files, 23 DQM output files

slava77 · 2017-04-23T02:04:17Z

RecoMuon/MuonIdentification/plugins/MuonIdProducer.cc

@@ -637,6 +637,11 @@ void MuonIdProducer::produce(edm::Event& iEvent, const edm::EventSetup& iSetup)
     }
   }

+   if (arbitrateTrackerMuons_){
+     fillArbitrationInfo( outputMuons.get() );


in case arbitrateTrackerMuons_ is true, what will happen with the call at L721

fillArbitrationInfo( outputMuons.get(), reco::Muon::TrackerMuon );

is the repeated call harmless or do we need an extra protection?

There is a protection already in place in fillArbitrationInfo. There is a check if a segment has already been arbitrated: https://github.com/drkovalskyi/cmssw/blob/muon-arbitration/RecoMuon/MuonIdentification/plugins/MuonIdProducer.cc#L1037
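The kind of protection described here, where a second call skips segments that were already handled, can be sketched as follows. `ToySegmentMatch`, `kToyArbitrated` and `fillToyArbitration` are hypothetical names; the real check lives in fillArbitrationInfo at the linked line and may differ in detail.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a segment match carrying an arbitration mask.
struct ToySegmentMatch {
  unsigned int mask = 0;  // 0 means "not arbitrated yet"
  bool isMask() const { return mask != 0; }
};

constexpr unsigned int kToyArbitrated = 1u;  // hypothetical flag bit

// Flag every segment that has not been arbitrated yet and return how many
// were newly flagged. Segments with a non-zero mask are skipped, which makes
// a repeated call a no-op.
int fillToyArbitration(std::vector<ToySegmentMatch>& segments) {
  int nNew = 0;
  for (auto& seg : segments) {
    if (seg.isMask()) continue;  // already arbitrated by an earlier call
    seg.mask |= kToyArbitrated;
    assert(seg.isMask());
    ++nNew;
  }
  return nNew;
}
```

Calling `fillToyArbitration` twice on the same three segments flags three on the first pass and none on the second, which is the sense in which the repeated call is harmless.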

drkovalskyi · 2017-04-24T10:51:24Z

Are there any open issues holding up the merge of this request?

slava77 · 2017-04-24T20:24:32Z

RecoMuon/MuonIdentification/python/muons1stStep_cfi.py

+    ),
+
+    # tracker muon arbitration
+    arbitrateTrackerMuons = cms.bool(True)


The effect of this flag is that it ends up being set in all reconstructions (earlyMuons, a variety of cosmics etc).
Since the default is false, it stays unset in HLT, but even that is probably until the next re-parsing of confDB.

I guess we can be "proactive" with this PR and have this change essentially universal.
The cases that will find some issues in the following more broad validation can have this flag set to false where needed.

The option is there. If we ever find a case where it needs to be set false it's easy to do. I really doubt that it will ever happen.

slava77 · 2017-04-24T21:52:51Z

The efficiency plots posted in #18321 (comment) are based on MC truth association and apparently have SIM in the denominator.
The "_Trk" in the plot corresponds to "isTrackerMuon" in the numerator.
So, the absolute efficiency drops by 5% in ttbar sample (likely by a larger value after taking out the W->[tau->]muon entries).
The distribution vs pt indeed shows that the 0-15 GeV bin efficiency for tracker muons is down from about 57% to 49%.
I guess something like J/psi reconstruction in ttbar b-jets will have large losses (this is a clean enough reco that wouldn't really need arbitration).

Slides sent in private emails stated that the impact is much smaller. It's unclear though if the difference is in the numerator definitions (e.g. global muons cover for some of the inefficiency in the post-arbitration tracker) or a sample like ttbar was not used to make final conclusions.
@drkovalskyi please comment

drkovalskyi · 2017-04-25T11:00:07Z

What plot do you refer to? Where do you get all these numbers?
In my study of the arbitration inefficiency I clearly see that it's often recovered by global muon algorithm. For muons at Pt>2GeV I see 3-4% efficiency loss, which is reduced to 1-2% overall efficiency loss taking into account all different muon types.
Regarding J/psi, can you point me to a single analysis doing what you said? As far as I can see people either use soft muons that rely on arbitration or have their own Muon ID from scratch like Bs->mm. B-tagging is also not using just any muon that they can find as far as I can see.
The bottom line is that while there can indeed be some efficiency loss that can theoretically be detected, we have a huge contribution of fake muons from not using arbitration, causing problems in physics analyses (fake MET) and a large amount of data stored in AOD, which doesn't fit well in the CMS budget anymore.

slava77 · 2017-04-25T12:40:20Z

On 4/25/17 4:00 AM, drkovalskyi wrote: What plot do you refer to? Where do you get all these numbers?

The last plot in the thread posted on April 15 Here is a direct link https://cloud.githubusercontent.com/assets/4676718/25066226/3d1c2fc4-21d5-11e7-8212-9474a2601a57.png

In my study of the arbitration inefficiency I clearly see that it's often recovered by global muon algorithm. For muons at Pt>2GeV I see 3-4% efficiency loss, which is reduced to 1-2% overall efficiency loss taking into account all different muon types.

Which samples were tested, just to have a good reference? It would be nice to have a link to your slides/study posted in the PR description for self-documentation.

Regarding J/psi, can you point me to a single analysis doing what you said? As far as I can see people either use soft muons that rely on arbitration or have their own Muon ID from scratch like Bs->mm. B-tagging is also not using just any muon that they can find as far as I can see. The bottom line is that while indeed there can be some efficiency loss that theoretically can be detected we have huge contribution of fake muons from not using arbitration causing problems in physics analyses (fake MET) and large amount of data stored in AOD, which doesn't fit well in CMS budget anymore.

I only follow the argument about the data storage so far (which appears to be about 2%). I miss the point about effect on MET etc: on one hand you claim that all analyses are using arbitrated muons, on the other hand there is some huge contribution from not using arbitration causing problems. Where are the non-arbitrated muons used? (and why can't that code simply start using arbitration)

drkovalskyi · 2017-04-25T15:47:02Z

I think all these questions are addressed here:
https://www.dropbox.com/s/htproz4z4j4vnn1/muon_cleanup_20170410.pdf?dl=0

If we find later that we can recover efficiency somewhere and can afford it, it's just a matter of changing one flag from true to false.

slava77 · 2017-04-25T16:05:31Z

Thank you for the link.
Which ttbar and ZMM samples were used (full names)?

drkovalskyi · 2017-04-25T18:58:56Z

/RelValTTbarLepton_13/CMSSW_9_1_0_pre2-PU25ns_90X_upgrade2017_realistic_v20-v1/GEN-SIM-RECO
/RelValZMM_13/CMSSW_9_1_0_pre2-PU25ns_90X_upgrade2017_realistic_v20-v1/GEN-SIM-RECO

slava77 · 2017-04-25T19:28:15Z

On 4/25/17 11:58 AM, drkovalskyi wrote: /RelValTTbarLepton_13/CMSSW_9_1_0_pre2-PU25ns_90X_upgrade2017_realistic_v20-v1/GEN-SIM-RECO /RelValZMM_13/CMSSW_9_1_0_pre2-PU25ns_90X_upgrade2017_realistic_v20-v1/GEN-SIM-RECO

Thanks. This should be well representative.

drkovalskyi · 2017-04-27T02:46:43Z

What's holding this request from being merged?

slava77 · 2017-04-27T16:17:05Z

this PR has a strictly negative impact on downstream physics at a cost of gaining 1% in AOD size.
So, the remaining acceptable way for this PR to go in is to show that this loss is actually small.
(arguments about cleaning for MET and jets don't work because to first order PF already applies cleaning and further refinement could have been done in PF algo if that matters).

My initial quick check pointed that there could be a significant loss in efficiency.
I took some time to increase the test sample size to confirm your findings and check another sample (H->bb was easily available) to be sure that the observation of small impact stands for "non-prompt" muons.

More details later.

drkovalskyi · 2017-04-27T16:41:58Z

In order to reproduce my numbers you need to do matching of muons to the generated muons. Otherwise you can observe a larger "inefficiency", which is simply fake reduction and not efficiency loss.
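The distinction drawn here, between a drop in the muon count and a drop in efficiency for generated muons, can be made concrete with a toy sketch. `ToyRecoMuon` and `truthMatchedEfficiency` are hypothetical; the matching itself is assumed to be already done and stored as an index.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical reco candidate: index of the matched generated muon, or -1 for a fake.
struct ToyRecoMuon {
  int genIndex;
};

// Fraction of generated muons matched by at least one reco candidate.
// Fakes (genIndex < 0) and duplicate matches do not raise the numerator.
double truthMatchedEfficiency(const std::vector<ToyRecoMuon>& reco, int nGen) {
  assert(nGen > 0);
  std::set<int> matched;
  for (const auto& mu : reco) {
    if (mu.genIndex >= 0) matched.insert(mu.genIndex);
  }
  return static_cast<double>(matched.size()) / nGen;
}
```

As an invented example: with 4 generated muons, 8 candidates before arbitration (4 matched, 4 fakes) and 4 candidates after (3 matched, 1 fake), the candidate count halves while the truth-matched efficiency goes from 1.0 to 0.75. Reading the count change alone as inefficiency would overstate the loss.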

slava77 · 2017-04-27T16:46:37Z

On 4/27/17 9:41 AM, drkovalskyi wrote: In order to reproduce my numbers you need to do matching of muons to the generated muons. Otherwise you can observe a larger "inefficiency", which is simply fake reduction and not efficiency loss.

That's somewhat clear. The DQM plots that gave me concerns (posted on April 15) have truth-level matching.

slava77 · 2017-04-27T21:45:05Z

+1

for #18321 aa23e33

changes are in line with description and provided slides: non-arbitrated muons are removed in existing reco configurations (with an option to allow them if needed).
- Benefits: disk space savings as well as increase of purity of reconstructed muons
- (minor) drawback: small unrecoverable loss of efficiency for real muons (mainly inside jets and not passing current recommended POG analysis selections).
jenkins tests pass and comparisons with baseline show changes in the number of muons, but show little effect on downstream monitored quantities.
local tests with higher stats and additional samples to those running in jenkins tests (includes e.g. high-pt dijets) show significant effects only on recoMuons_muons count and their direct monitoring plots and very little and all insignificant effect on downstream objects (e.g. pf muons, jets, MET). Efficiency loss for plain "tracker muon" selection is small and is considered acceptable based on the slides provided and the related discussion.

TTbar PU35 (phase-1 setup; same as the PR reference slides) on 1K events.
(This is an extension of the 100 events test with the "suspicious" plots posted on Apr 15. In retrospect, if that test didn't have a downward fluctuation, this would have been signed much earlier.)
This PR is in red.

tracker muons efficiency

"tracker or global" muons efficiency

So, the large difference seen on Apr 15 was a fluctuation. The effect is closer to only 1% on tracker muons alone and is reduced significantly by using "tracker or global"

The same test done in PU70 has basically the same results, to confirm that increasing PU is not a concern.

VBF H->BB sample in phase-0 (2016) setup with PU35 on 600 events

the eff loss is about 1% at lower PT and seems acceptable.
The "tracker or global" has the same curve (no improvement).

the same on PU70 (mind the larger tracking fake rates here compared to the phase-1 ttbar above)

The "tracker or global" helps to probably acceptable level (considering that the PU70 for phase-0 is not a very realistic sample).

In PU200 (100 events only) no efficiency change is observed. The sample size is small though and apparently only signal muons are considered here.

The impact of this PR on workflow resource use is significant only in the output collection size. There is a small effect reducing CPU use in modules downstream from the muonID, but since many non-arbitrated muons were already filtered out, the reduction is small (maybe 0.5% on the total reco, well within 1 sigma of typical comparison precision).
The number of muons decreases:

- in ttbar PU35: by about a factor of 2 total; about x3 considering those failing TM ID (using last station angle tight TMLSAT)
- in ttbar PU70: by a factor of 3; about x3.5 in trackerMuons overall; about x6 in those failing TMLSAT
- in ttbar PU200: by 15% for the total muon count; by a factor of 6 in tracker muons. The rest is apparently dominated by the ME0/GEM/RPC muons.

cmsbuild · 2017-04-27T21:45:24Z

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request requires discussion in the ORP meeting before it's merged. @Muzaffar, @davidlange6, @smuzaffar

slava77 · 2017-08-03T12:30:20Z

this is a change in wf 10224

I'm adding this since the question came up in
https://hypernews.cern.ch/HyperNews/CMS/get/recoDevelopment/1541/1.html

cmsbuild added this to the Next CMSSW_9_1_X milestone Apr 11, 2017

cmsbuild added comparison-pending orp-pending pending-signatures reconstruction-pending tests-pending labels Apr 11, 2017

cmsbuild added tests-started and removed tests-pending labels Apr 11, 2017

cmsbuild added tests-approved and removed tests-started labels Apr 12, 2017

cmsbuild added comparison-available and removed comparison-pending labels Apr 12, 2017

slava77 reviewed Apr 14, 2017

slava77 mentioned this pull request Apr 18, 2017

Muon puppi iso miniAOD 91x #18275

Merged

cmsbuild added tests-approved and removed tests-started labels Apr 22, 2017

cmsbuild added comparison-available and removed comparison-pending labels Apr 22, 2017

slava77 reviewed Apr 23, 2017

slava77 mentioned this pull request Apr 24, 2017

Items needed but not yet signed for pre3 #18449

Closed

slava77 reviewed Apr 24, 2017

cmsbuild added fully-signed reconstruction-approved and removed pending-signatures reconstruction-pending labels Apr 27, 2017

davidlange6 merged commit 366951b into cms-sw:master Apr 28, 2017
