Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MkFit] Consolidate nan checks and filtering, expose propagation fail flag to steering code #40681

Merged
merged 1 commit into from Feb 9, 2023

Conversation

osschar
Copy link
Contributor

@osschar osschar commented Feb 3, 2023

PR description:

Changes introduced in this PR are mostly technical but they do include some bugfixes and improvements of corner cases that were potentially leading to failed propagations, Kalman updates and consequently to trashed/nan-ed output track parameters.

Further, there are some changes in standalone/ parts of the code, mostly introduction of interactive shell to be used for debugging.

1. Nan filtering

In MkBuilder::filter_comb_cands() add a flag allowing other TrackCands to be tested, not just the best one. If a later one passes, copy it over the first one as that will be the one used for backward fit/search or for export.

Properly handle not-best TrackCands in filtering after backward search.

  • After backward search representation of TrackCand hits is branching at the bottom and they need to be re-connected for scoring / filtering algorithms to process them correctly.

  • Call nan filters and other config specified filters at the same time through definition of a lambda when both of them need to be run.

2. Propagation fail flag & propagation changes
  • Added MPlexQI MkBase::m_FailFlag (parent class of MkFinder). One has to manually clear it when desired (before propagation). Now only 0 and 1 are used ... could have different error codes, eg., failed to reach, step too large, etc.

  • The fail-flag mplex is passed into all propagation and propaget+kalman_op functions (barrel and endcap - existing code had limited internal handling with forced restore for barrel propagation before, state was not visible outside).

  • Added WSR_Failed as member of enum WithinSensitiveRegion_e. Handled in selectHitIndices() and findTracksCloneEngine() -> stop the candidate.

  • In propagateToR reduce the dot-product correction for in-going tracks to prevent them from overshooting.

  • At this stage, changes that were leading to major physics changes have been disabled and are marked with PROP-FAIL-ENABLE comments. *
3. Changes in standalone/
  • Consolidate debug printouts, introduce mkfit::g_debug for additional, top-level debug output control.

  • Clarify --help and --all-seeds in usage, exit on --help.

  • Implement mkFit::Shell to allow targeted, interactive debugging through ROOT prompt. Includes:

    • event navigation, iteration selection, processing flag setting;
    • options for processing of a single seed (or a sequence of seeds for seed indices) based on label or seed position before/after cleaning;
    • some track comparison functions.

PR validation:

MTV plots, compare blue (baseline) vs. black:
http://uaf-10.t2.ucsd.edu/~mmasciov/MIC/updatedPR111/MTV_ttbarPU_mkFit_5iter_updatedPR111_update_fixPropagation/

… flag to steering code.

1. Nan filtering

In MkBuilder::filter_comb_cands() add a flag allowing other TrackCands to be tested, not just the best one.
If a later one passes, copy it over the first one as that will be the one used for backward fit/search or for export.

Properly handle not-best TrackCands in filtering after backward search.

- After backward search representation of TrackCand hits is branching at the
  bottom and they need to be re-connected for scoring / filtering algorithms
  to process them correctly.

- Call nan filters and other config specified filters at the same time through
  definition of a lambda when both of them need to be run.

2. Propagation fail flag

*** At this stage physics changing changes are not active, they are marked with
PROP-FAIL-ENABLE comments. ***

Added  MPlexQI MkBase::m_FailFlag (parent class of MkFinder). One has to
manually clear it when desired (before propagation).
  Now only 0 and 1 are used ... could have differnt error codes, eg.
    failed to reach, step too large, etc.

The fail-flag mplex is passed into all propagation and propaget+kalman_op
functions (barrel and endcap - there was limited handling with forced restore
for barrel propagation before, state was not visible outside).
  Are there some error states in KalmanOp? We could pass that in as well.

Added WSR_Failed as member of enum WithinSensitiveRegion_e
  Handled in selectHitIndices() and findTracksCloneEngine() -> stop the candidate.

With this we can now stop also endcap tracks when they are about to reach
apex. Until now endcap tracks went on looping. Another check done is how large
the helix angle along the step was attempted (pi/2 forces error).
  Other propToR/Z checks can be added / tried. fail codes, kamlnaOp fail

In backward fit the endcap fails are somewhat accidental -- trying to fix them
does not really help as hits are already chosen. So we only do rollback in barrel.

We can probably remove "can-reach-radius" check in
MkBuilder::fund_track_unroll_candidates()

Note that Config::useTrigApprox = true -- reconsider.

Retry backward-search with non-swapped first/second layers in the plan?
  b-tagging efficiency / resolution / number of hits

3. In propagateToR reduce the dot-product correction for ingoing tracks to prevent
them from overshooting.

4. Consolidate debug printouts, introduce mkfit::g_debug for additional, top-level
debug output control.

5. Changes in standalone/

Clarify --help and --all-seeds in usage, exit on --help.

Implement mkFit::Shell to allow targeted, interactive debugging through ROOT prompt.
Includes:
- event navigation, iteration selection, processing flag setting;
- options for processing of a single seed (or a sequence of seeds for seed
  indices) based on label or seed position before/after cleaning;
- some track comparison functions.
@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 3, 2023

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-40681/34031

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 3, 2023

A new Pull Request was created by @osschar (Matevž Tadel) for master.

It involves the following packages:

  • RecoTracker/MkFitCMS (reconstruction)
  • RecoTracker/MkFitCore (reconstruction)

@cmsbuild, @mandrenguyen, @clacaputo can you please review it and eventually sign? Thanks.
@VourMa, @makortel, @felicepantaleo, @GiacomoSguazzoni, @JanFSchulte, @rovere, @VinInn, @missirol, @gpetruc, @mmusich, @mtosi, @dgulhan this is something you requested to watch as well.
@perrotta, @dpiparo, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@slava77
Copy link
Contributor

slava77 commented Feb 3, 2023

@cmsbuild please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 3, 2023

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-477216/30372/summary.html
COMMIT: eef793e
CMSSW: CMSSW_13_0_X_2023-02-02-2300/el8_amd64_gcc11
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/40681/30372/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 59 lines to the logs
  • Reco comparison results: 7360 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3555495
  • DQMHistoTests: Total failures: 28379
  • DQMHistoTests: Total nulls: 2
  • DQMHistoTests: Total successes: 3527092
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 48 files compared)
  • Checked 213 log files, 164 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

@slava77
Copy link
Contributor

slava77 commented Feb 3, 2023

Reco comparison results: 7360 differences found in the comparisons

somewhat random check (in a folder with more differences) for 136.793
image
most differences appear in the detachedTriplet (and are still pretty tiny/small) as expected

@slava77
Copy link
Contributor

slava77 commented Feb 6, 2023

@cms-sw/reconstruction-l2
please clarify on the status/plans of the review, in particular if it can converge in time for building pre4
Thank you.

@clacaputo
Copy link
Contributor

Hi @slava77 , I'm not sure I'll be able to review all the new code by tomorrow

@osschar
Copy link
Contributor Author

osschar commented Feb 6, 2023

Yes, there's quite a few of changes in there, let me break them up a little:

  1. A lot of changes (most by volume) are in standalone/ subdirectories, eg., the new Shell class.

  2. Changes in the signature of propagation functions to get the propagation fail-flag out are spread out a bit but purely technical.

  3. Changes to processing of candidate filtering are localized to class MkBuilder and files TrackStructures.h/.cc and runFunctions.cc (for running in cmsRun, there are corresponding changes in standalone/buildtestMPlex.cc).

@perrotta
Copy link
Contributor

perrotta commented Feb 8, 2023

urgent

  • marking as "urgent" what was requested for 13_0_0_pre4 during the ORP meeting of Feb 7

@cmsbuild cmsbuild added the urgent label Feb 8, 2023
@clacaputo
Copy link
Contributor

Yes, there's quite a few of changes in there, let me break them up a little:

  1. A lot of changes (most by volume) are in standalone/ subdirectories, eg., the new Shell class.
  2. Changes in the signature of propagation functions to get the propagation fail-flag out are spread out a bit but purely technical.
  3. Changes to processing of candidate filtering are localized to class MkBuilder and files TrackStructures.h/.cc and runFunctions.cc (for running in cmsRun, there are corresponding changes in standalone/buildtestMPlex.cc).

Hi @osschar , thanks a lot for the walkthrough. I've checked the code and I don't have any comments.

@clacaputo
Copy link
Contributor

+reconstruction

  • PS is mostly technical with some bugfixes and improvements of corner cases that were potentially leading to failed propagations
  • reco changes in generalTracks are rather small and percolate to final products, but no pathological behaviour

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 8, 2023

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@perrotta
Copy link
Contributor

perrotta commented Feb 9, 2023

+1

  • Most of the updates are in "standalone", and for the rest the modifications in reco outputs are widespread but tiny, as expected

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants