Muon arbitration #18321

Merged
merged 6 commits into from Apr 28, 2017

drkovalskyi
Contributor

Enforce Tracker Muon arbitration during reconstruction to reduce the number of fake muons it produces.

@cmsbuild
Contributor

A new Pull Request was created by @drkovalskyi for master.

It involves the following packages:

RecoMuon/MuonIdentification

@perrotta, @cmsbuild, @slava77, @davidlange6 can you please review it and eventually sign? Thanks.
@bellan, @abbiendi, @jhgoh, @echapon, @calderona, @HuguesBrun, @battibass, @trocino, @bachtis, @rociovilar this is something you requested to watch as well.
@Muzaffar, @davidlange6, @smuzaffar you are the release manager for this.

cms-bot commands are listed here #13028

@slava77
Contributor

slava77 commented Apr 11, 2017

@cmsbuild please test

@cmsbuild
Contributor

cmsbuild commented Apr 11, 2017

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/19119/console Started: 2017/04/12 01:28

@cmsbuild
Contributor

Comparison job queued.

@cmsbuild
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-18321/19119/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4485 differences found in the comparisons
  • DQMHistoTests: Total files compared: 23
  • DQMHistoTests: Total histograms compared: 1921491
  • DQMHistoTests: Total failures: 21756
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 1899562
  • DQMHistoTests: Total skipped: 173
  • DQMHistoTests: Total Missing objects: 0
  • Checked 94 log files, 14 edm output root files, 23 DQM output files

@@ -637,6 +636,10 @@ void MuonIdProducer::produce(edm::Event& iEvent, const edm::EventSetup& iSetup)
}
}

// muon arbitration
fillArbitrationInfo( outputMuons.get() );
Contributor

if (fillMatching_) was removed.
Does the flag fillMatching_ still make any sense or is it redundant?

Contributor Author

It's redundant. Didn't want to break anything removing it completely. Will remove it in future.

@@ -637,6 +636,10 @@ void MuonIdProducer::produce(edm::Event& iEvent, const edm::EventSetup& iSetup)
}
}

// muon arbitration
fillArbitrationInfo( outputMuons.get() );
arbitrateMuons( outputMuons.get(), caloMuons.get() );
Contributor

Should this always be done, or should there be an option to not do it (for studies etc.)?
Recall that this module is used also to make muons as seeds for inside-out tracking iteration.
I can guess that for that case it's OK to have a larger number of options.

Contributor

similar question (as for earlyMuons used in iter tk seeding) for earlyDisplacedMuons.
And, also for muonsFromCosmics, muonsFromCosmics1Leg which are running both in pp and in proper cosmic data taking.
At least for proper cosmics it may be better to keep unarbitrated parts.

Contributor Author

I agree, providing an option can be of some use, but I don't think we need it. Point me to a single case where anyone is intentionally using non-arbitrated muons of any type.

Contributor

my example for iterative tracking seeding with "earlyMuons" still stands as an example of somewhat intentional use of non-arbitrated muons: seeding should better be loose

Contributor

IMO, it's easier to add an option and apply arbitration just to the default _muons_ than to prove there are no issues and no cost in all alternative reconstructions.
But we can go the other way and then this PR needs tests for all of these cases (cosmics, seeding, displaced muons).

Contributor Author

Agree. Will add this option. Do I need to submit a new PR in this case?

Contributor Author

I looked at implementing it, and it's too much hassle. The module is used in tons of places (HLT) and cloned in an even larger number of places. Providing it as an untracked parameter with a default is wrong, because it's an important parameter. So I think it should stay as is, i.e. TrackerMuons will get arbitrated. I don't see why earlyMuons should not be arbitrated if they are TrackerMuons. The whole zoo of muons that we have now won't be affected unless they are TrackerMuons, which is a specific algorithm and it can change.
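
To illustrate why only TrackerMuons are affected, here is a rough standalone sketch, not the MuonIdProducer code. It assumes a muon carries a bit mask of algorithm types and a count of arbitrated segment matches; the `Muon` struct, the bit values, and `arbitrateTrackerType` are all hypothetical names, and the `caloMuons` handling of the real `arbitrateMuons` call is ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Illustrative type bits; the values are assumptions, not reco::Muon constants.
constexpr unsigned int kGlobal = 1u << 0;
constexpr unsigned int kTracker = 1u << 1;

// Hypothetical stand-in for a reconstructed muon.
struct Muon {
  unsigned int type;      // OR of the algorithm bits that produced this muon
  int arbitratedMatches;  // segment matches that survived arbitration
};

void arbitrateTrackerType(std::vector<Muon>& muons, int minMatches) {
  for (auto& mu : muons) {
    if ((mu.type & kTracker) && mu.arbitratedMatches < minMatches) {
      mu.type &= ~kTracker;  // only the tracker-muon label is dropped
    }
  }
  // Muons left without any type are removed from the collection.
  muons.erase(std::remove_if(muons.begin(), muons.end(),
                             [](const Muon& mu) { return mu.type == 0; }),
              muons.end());
}
```

In this sketch a muon that is also a global muon survives with its global label intact; only a tracker-only muon with too few arbitrated matches disappears.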

Contributor

please use fillDescriptions to set a default value for a parameter.
The default can be overridden by specifying the parameter in a module instance where it's needed.

}
}
}
muon++;
Contributor

what happens here if muon = muons->erase(muon); above was done on the last element?
It looks like an undefined behavior. Better change to if (muon!=muons->end()) muon++;

Contributor Author

if it happens on the last element muon=muons->end() and for-loop exits. See no reason to modify it.

Contributor

erase happens on the last element, then in muon = muons->erase(muon); the muon is the same as the end() and in this line you will apply end()++, which is ill-defined and will not match the for loop condition

Contributor Author

There is a continue statement after muons->erase(muon). There won't be end()++ call.
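
The erase-then-continue pattern can be reproduced with a plain `std::vector`. This is a minimal sketch with hypothetical names (`Candidate`, `pruneUnmatched`), not the loop from MuonIdProducer; it only shows why the increment is never applied to `end()` when `continue` follows `erase`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a muon candidate; not a CMSSW type.
struct Candidate {
  int id;
  int arbitratedMatches;
};

// Erase candidates with too few arbitrated matches while iterating.
// erase() returns the iterator to the next element (or end()), so the
// increment must be skipped on that path; `continue` does exactly that.
void pruneUnmatched(std::vector<Candidate>& muons, int minMatches) {
  for (auto muon = muons.begin(); muon != muons.end();) {
    if (muon->arbitratedMatches < minMatches) {
      muon = muons.erase(muon);
      continue;  // go back to the loop condition; end() is never incremented
    }
    ++muon;
  }
}
```

If the last element is erased, `muon` equals `end()`, the `continue` jumps to the loop condition, and the loop exits.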

@@ -20,7 +20,7 @@

fillEnergy = cms.bool(True),
# OR
- maxAbsPullX = cms.double(4.0),
+ maxAbsPullX = cms.double(3.0),
Contributor

this will probably be noticeable in B=0T setup as a loss of 1/4 at lower p values. IIRC default track p in B=0T is 5 GeV. So, below this value there will be some signal loss.
... probably OK.

Contributor Author

Are you worried about muons of 1 GeV or less in the B=0T case having their multiple scattering underestimated? It's a non-issue in my opinion. The 4-sigma matching cut was very loose when it was introduced before Run1. All TrackerMuon selectors use a 3-sigma cut or tighter. 4-sigma gives too many fakes in real data.
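
For reference, a pull cut compares the track-to-segment residual with its estimated uncertainty. The sketch below is a hypothetical helper (`passesPullCut` is not a MuonIdProducer function) and assumes the pull is simply the residual divided by its uncertainty; it shows that a 3.5-sigma match passes the old 4.0 cut and fails the new 3.0 one.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not the MuonIdProducer implementation.
// dx: track-to-segment residual in local x.
// sigmaX: combined uncertainty on that residual, in the same units.
bool passesPullCut(double dx, double sigmaX, double maxAbsPull) {
  if (sigmaX <= 0.0) return false;  // no meaningful pull without an uncertainty
  return std::abs(dx / sigmaX) < maxAbsPull;
}
```

If the uncertainty is underestimated (for example from multiple scattering at low momentum), the pull is inflated and a real match is more likely to fail the tighter cut.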

@slava77
Contributor

slava77 commented Apr 15, 2017

@drkovalskyi
comments to the code need a follow up before this PR can be merged.
Please check.
Thank you.

@slava77
Contributor

slava77 commented Apr 15, 2017 via email

@slava77
Contributor

slava77 commented Apr 15, 2017 via email

@drkovalskyi
Contributor Author

I think we can live with a small inefficiency from tighter matching for low-pt muons at B=0. It's not a reason to waste space for real data. Besides, we have other algorithms to recover if TrackerMuon fails.

@slava77
Contributor

slava77 commented Apr 15, 2017 via email

@slava77
Contributor

slava77 commented Apr 15, 2017

In cosmic muons the number of matches now returns zero. Looks like something didn't get filled.

e.g. from muon gun wf 10009.0:
[plot: all_sign898vsorig_singlemupt1000in2017wf10009p0c_recomuons_muonsfromcosmics1leg__reco_obj_numberofmatches]

[plot: all_sign898vsorig_singlemupt1000in2017wf10009p0c_recomuons_muonsfromcosmics__reco_obj_numberofmatches]

compared to that, the regular muons look OK
[plot: all_sign898vsorig_singlemupt1000in2017wf10009p0c_recomuons_muons__reco_obj_numberofmatches]

these efficiency plots look suspicious (I haven't checked the definitions, just showing the plots):
muon gun pt10 (wf 10007.0, 1K events)
[plot: wf10007_eff_eta]
ttbar PU35 (wf 10224.0, 100 events)
[plot: wf10224_eff_eta]
[The same events are reconstructed, so the correlation is high.]

@slava77
Contributor

slava77 commented Apr 18, 2017

@drkovalskyi
Please provide comments on the last post #18321 (comment) with plots showing some potential loss in efficiency and also missing matches in cosmic muons

@cmsbuild
Contributor

Comparison job queued.

@cmsbuild
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-18321/19332/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 5612 differences found in the comparisons
  • DQMHistoTests: Total files compared: 23
  • DQMHistoTests: Total histograms compared: 1826502
  • DQMHistoTests: Total failures: 31870
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 1794459
  • DQMHistoTests: Total skipped: 173
  • DQMHistoTests: Total Missing objects: 0
  • Checked 94 log files, 14 edm output root files, 23 DQM output files

@@ -637,6 +637,11 @@ void MuonIdProducer::produce(edm::Event& iEvent, const edm::EventSetup& iSetup)
}
}

if (arbitrateTrackerMuons_){
fillArbitrationInfo( outputMuons.get() );
Contributor

in case arbitrateTrackerMuons_ is true, what will happen with the call at L721

 fillArbitrationInfo( outputMuons.get(), reco::Muon::TrackerMuon );

is the repeated call harmless or do we need an extra protection?

Contributor Author

There is a protection already in place in fillArbitrationInfo. There is a check if a segment has already been arbitrated: https://github.com/drkovalskyi/cmssw/blob/muon-arbitration/RecoMuon/MuonIdentification/plugins/MuonIdProducer.cc#L1037
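
The idea of such a protection can be sketched independently of CMSSW. The example below is an assumption-laden illustration: `SegmentMatch`, its `mask` field, `kArbitrated`, and `arbitrate` are hypothetical names, and the real check at the linked line may differ in detail. It only shows how an "already arbitrated" test makes a repeated call a no-op.

```cpp
#include <cassert>
#include <vector>

// Hypothetical segment match; `mask` stays zero until arbitration has run.
struct SegmentMatch {
  unsigned int mask = 0;
};

constexpr unsigned int kArbitrated = 1u << 0;  // illustrative flag value

// Marks every not-yet-arbitrated match and returns how many were processed.
int arbitrate(std::vector<SegmentMatch>& matches) {
  int processed = 0;
  for (auto& match : matches) {
    if (match.mask != 0) continue;  // already arbitrated: leave untouched
    match.mask |= kArbitrated;
    ++processed;
  }
  return processed;
}
```

Calling `arbitrate` a second time on the same collection processes nothing, which is the behavior one would want from the repeated `fillArbitrationInfo` call.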

@drkovalskyi
Contributor Author

Are there any open issues holding up the merge of this request?

),

# tracker muon arbitration
arbitrateTrackerMuons = cms.bool(True)
Contributor

The effect of this flag is that it ends up being set in all reconstructions (earlyMuons, a variety of cosmics etc).
Since the default is false, it stays unset in HLT, but even that is probably until the next re-parsing of confDB.

Contributor

I guess we can be "proactive" with this PR and have this change essentially universal.
The cases that will find some issues in the following more broad validation can have this flag set to false where needed.

Contributor Author

The option is there. If we ever find a case where it needs to be set false it's easy to do. I really doubt that it will ever happen.

@slava77
Contributor

slava77 commented Apr 24, 2017

The efficiency plots posted in #18321 (comment) are based on MC truth association and apparently have SIM in the denominator.
The "_Trk" in the plot corresponds to "isTrackerMuon" in the numerator.
So, the absolute efficiency drops by 5% in ttbar sample (likely by a larger value after taking out the W->[tau->]muon entries).
The distribution vs pt indeed shows that the 0-15 GeV bin efficiency for tracker muons is down from about 57% to 49%.
I guess something like J/psi reconstruction in ttbar b-jets will have large losses (this is a clean enough reco that wouldn't really need arbitration).

Slides sent in private emails stated that the impact is much smaller. It's unclear though if the difference is in the numerator definitions (e.g. global muons cover for some of the inefficiency in the post-arbitration tracker) or a sample like ttbar was not used to make final conclusions.
@drkovalskyi please comment

@drkovalskyi
Contributor Author

What plot do you refer to? Where do you get all these numbers?
In my study of the arbitration inefficiency I clearly see that it's often recovered by global muon algorithm. For muons at Pt>2GeV I see 3-4% efficiency loss, which is reduced to 1-2% overall efficiency loss taking into account all different muon types.
Regarding J/psi, can you point me to a single analysis doing what you said? As far as I can see people either use soft muons that rely on arbitration or have their own Muon ID from scratch like Bs->mm. B-tagging is also not using just any muon that they can find as far as I can see.
The bottom line is that while indeed there can be some efficiency loss that theoretically can be detected we have huge contribution of fake muons from not using arbitration causing problems in physics analyses (fake MET) and large amount of data stored in AOD, which doesn't fit well in CMS budget anymore.

@slava77
Contributor

slava77 commented Apr 25, 2017 via email

@drkovalskyi
Contributor Author

I think all these questions are addressed here:
https://www.dropbox.com/s/htproz4z4j4vnn1/muon_cleanup_20170410.pdf?dl=0

If we find later that we can recover efficiency somewhere and can afford it, it's just a matter of changing one flag from true to false.

@slava77
Contributor

slava77 commented Apr 25, 2017

Thank you for the link.
Which ttbar and ZMM samples were used (full names)?

@drkovalskyi
Contributor Author

/RelValTTbarLepton_13/CMSSW_9_1_0_pre2-PU25ns_90X_upgrade2017_realistic_v20-v1/GEN-SIM-RECO
/RelValZMM_13/CMSSW_9_1_0_pre2-PU25ns_90X_upgrade2017_realistic_v20-v1/GEN-SIM-RECO

@slava77
Contributor

slava77 commented Apr 25, 2017 via email

@drkovalskyi
Contributor Author

What's holding this request from being merged?

@slava77
Contributor

slava77 commented Apr 27, 2017

this PR has a strictly negative impact on downstream physics at a cost of gaining 1% in AOD size.
So, the remaining acceptable way for this PR to go in is to show that this loss is actually small.
(arguments about cleaning for MET and jets don't work because to first order PF already applies cleaning and further refinement could have been done in PF algo if that matters).

My initial quick check pointed that there could be a significant loss in efficiency.
I took some time to increase the test sample size to confirm your findings and check another sample (H->bb was easily available) to be sure that the observation of small impact stands for "non-prompt" muons.

More details later.

@drkovalskyi
Contributor Author

In order to reproduce my numbers you need to do matching of muons to the generated muons. Otherwise you can observe a larger "inefficiency", which is simply fake reduction and not efficiency loss.
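
A small standalone sketch of the distinction, with hypothetical names (`RecoMuon`, `genIndex`, `genMatchedEfficiency`) and made-up toy inputs rather than numbers from this study: efficiency is counted per generated muon, so removing unmatched reco muons lowers the raw muon count without changing it.

```cpp
#include <cassert>
#include <vector>

// Hypothetical reco muon record: genIndex < 0 means no generated muon matched.
struct RecoMuon {
  int genIndex;
};

// Fraction of generated muons that have at least one matched reco muon.
double genMatchedEfficiency(const std::vector<RecoMuon>& reco, int nGenMuons) {
  if (nGenMuons <= 0) return 0.0;
  std::vector<bool> found(nGenMuons, false);
  for (const auto& mu : reco) {
    if (mu.genIndex >= 0 && mu.genIndex < nGenMuons) found[mu.genIndex] = true;
  }
  int nFound = 0;
  for (bool f : found) nFound += f ? 1 : 0;
  return static_cast<double>(nFound) / nGenMuons;
}
```

In a toy case with two generated muons, going from six reco muons (two matched, four fakes) to three (two matched, one fake) halves the reco count while the gen-matched efficiency stays the same.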

@slava77
Contributor

slava77 commented Apr 27, 2017 via email

@slava77
Contributor

slava77 commented Apr 27, 2017

+1

for #18321 aa23e33

  • changes are in line with description and provided slides: non-arbitrated muons are removed in existing reco configurations (with an option to allow them if needed).
    • Benefits: disk space savings as well as increase of purity of reconstructed muons
    • (minor) drawback: small unrecoverable loss of efficiency for real muons (mainly inside jets and not passing current recommended POG analysis selections).
  • jenkins tests pass and comparisons with baseline show changes in the number of muons, but show little effect on downstream monitored quantities.
  • local tests with higher stats and additional samples to those running in jenkins tests (includes e.g. high-pt dijets) show significant effects only on recoMuons_muons count and their direct monitoring plots and very little and all insignificant effect on downstream objects (e.g. pf muons, jets, MET). Efficiency loss for plain "tracker muon" selection is small and is considered acceptable based on the slides provided and the related discussion.

TTbar PU35 (phase-1 setup; same as the PR reference slides) on 1K events.
(This is an extension of the 100 events test with the "suspicious" plots posted on Apr 15. In retrospect, if that test didn't have a downward fluctuation, this would have been signed much earlier.)
This PR is in red.

tracker muons efficiency
[plot: wf10224hs_mu_trk_effpt]
"tracker or global" muons efficiency
[plot: wf10224hs_mu_trkorglb_effpt]
So, the large difference seen on Apr 15 was a fluctuation. The effect is closer to only 1% on tracker muons alone and is reduced significantly by using "tracker or global".

The same test done in PU70 has basically the same results, to confirm that increasing PU is not a concern.

VBF H->BB sample in phase-0 (2016) setup with PU35 on 600 events
[plot: wf25213hs_mu_trk_effpt]
The efficiency loss is about 1% at lower pT and seems acceptable.
The "tracker or global" has the same curve (no improvement).

The same on PU70 (mind the larger tracking fake rates here compared to the phase-1 ttbar above)
[plot: wf25213pu70hs_mu_trk_effpt]
[plot: wf25213pu70hs_mu_trkorglb_effpt]
The "tracker or global" brings the loss to a probably acceptable level (considering that the PU70 for phase-0 is not a very realistic sample).

In PU200 (100 events only) no efficiency change is observed. The sample size is small though and apparently only signal muons are considered here.

The impact of this PR on workflow resource use is significant only in the output collection size. There is a small effect reducing CPU use in modules downstream from the muonID, but since many non-arbitrated muons were already filtered out, the reduction is small (maybe 0.5% on the total reco, well within 1 sigma of typical comparison precision).
The number of muons decreases:

  • in ttbar PU35: by about a factor of 2 total; about x3 considering those failing TM ID (using last station angle tight TMLSAT)
  • in ttbar PU70: by a factor of 3; about x3.5 in trackerMuons overall; about x6 in those failing TMLSAT
  • in ttbar PU200: by 15% for the total muon count; by a factor of 6 in tracker muons. The rest is apparently dominated by the ME0/GEM/RPC muons.

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request requires discussion in the ORP meeting before it's merged. @Muzaffar, @davidlange6, @smuzaffar

@slava77
Contributor

slava77 commented Aug 3, 2017

this is a change in wf 10224
[plot: wf10224_nummatches]
I'm adding this since the question came up in
https://hypernews.cern.ch/HyperNews/CMS/get/recoDevelopment/1541/1.html
