MTD reconstruction: speed up TrackExtenderWithMTD #36793

fabiocos · 2022-01-25T09:38:39Z

PR description:

This PR proposes an update of TrackExtenderWithMTD following the presentation at the TRK POG meeting https://indico.cern.ch/event/1108413/contributions/4662751/attachments/2373693/4054455/TRK_MTD_20220117.pdf and the following discussion.

About 1/3 of the time used by this module is taken by retrieving the Trajectory for a given general track using the TrackTransformer, see https://cms-reco-profiling.web.cern.ch/cms-reco-profiling/cgi-bin/igprof-navigator/releases/CMSSW_12_2_0_pre3/34834.21/step3/cpu_endjob/92 . This can be avoided by reusing the Trajectory optionally stored in a transient form into the event by the iterative tracking, an option by default disabled.
Of course this comes at some cost in memory usage, which should be balanced with the CPU gain benefit.

The activation of the flags to save the trajectories in the event if made only for the selection of modules entering into the definition of the generalTracks output, which at present constitutes the input of the TrackExtenderWithMTD

PR validation:

Tests have been done using the benchmark workflow 34834.21 (D76 scenario TTbar with PU, prod-like), using a home-made MinBias pile-up sample of 10k events. A simple memory and CPU profiling (single stream mode) on 100 events shows a gain in CPU:

TimeReport> Time report complete in 5809.45 seconds
 Time Summary: 
 - Min event:   34.1598
 - Max event:   83.5143
 - Avg event:   55.9624
 - Total loop:  5753.35
 - Total init:  55.8676
 - Total job:   5809.45
 - EventSetup Lock: 0
 - EventSetup Get:  0
 Event Throughput: 0.0173812 ev/s
 CPU Summary: 
 - Total loop:     5633.37
 - Total init:     30.6902
 - Total extra:    0
 - Total children: 0.184089
 - Total job:      5664.15
 Processing Summary: 
 - Number of Events:  100
 - Number of Global Begin Lumi Calls:  1
 - Number of Global Begin Run Calls: 1

vs previous

TimeReport> Time report complete in 6020.16 seconds
 Time Summary: 
 - Min event:   32.6102
 - Max event:   85.8682
 - Avg event:   57.9717
 - Total loop:  5935.13
 - Total init:  84.8239
 - Total job:   6020.16
 - EventSetup Lock: 0
 - EventSetup Get:  0
 Event Throughput: 0.0168488 ev/s
 CPU Summary: 
 - Total loop:     5861.23
 - Total init:     34.564
 - Total extra:    0
 - Total children: 0.168424
 - Total job:      5895.85
 Processing Summary: 
 - Number of Events:  100
 - Number of Global Begin Lumi Calls:  1
 - Number of Global Begin Run Calls: 1

with similar CPU load in both tests, and an extra memory load quantified in about 5%:

MemoryReport> Peak virtual size 6426.03 Mbytes
 Key events increasing vsize: 
[0] run: 0 lumi: 0 event: 0  vsize = 0 deltaVsize = 0 rss = 0 delta = 0
[1] run: 1 lumi: 1 event: 1  vsize = 5458.03 deltaVsize = 0 rss = 3278.55 delta = 0
[2] run: 1 lumi: 1 event: 2  vsize = 5970.03 deltaVsize = 512 rss = 3375.28 delta = 96.7305
[7] run: 1 lumi: 1 event: 7  vsize = 6426.03 deltaVsize = 456 rss = 3375.51 delta = 0.234375
[0] run: 0 lumi: 0 event: 0  vsize = 0 deltaVsize = 0 rss = 0 delta = 0
[9] run: 1 lumi: 1 event: 9  vsize = 6426.03 deltaVsize = 0 rss = 3393.05 delta = 17.5352
[8] run: 1 lumi: 1 event: 8  vsize = 6426.03 deltaVsize = 0 rss = 3178.45 delta = -197.059
[7] run: 1 lumi: 1 event: 7  vsize = 6426.03 deltaVsize = 456 rss = 3375.51 delta = 0.234375

vs previous

MemoryReport> Peak virtual size 6418.28 Mbytes
 Key events increasing vsize: 
[1] run: 1 lumi: 1 event: 1  vsize = 5458.28 deltaVsize = 0 rss = 3093.7 delta = 0
[7] run: 1 lumi: 1 event: 7  vsize = 5906.28 deltaVsize = 448 rss = 3178.19 delta = 84.4922
[19] run: 1 lumi: 1 event: 19  vsize = 6418.28 deltaVsize = 512 rss = 3202.4 delta = 24.207
[0] run: 0 lumi: 0 event: 0  vsize = 0 deltaVsize = 0 rss = 0 delta = 0
[0] run: 0 lumi: 0 event: 0  vsize = 0 deltaVsize = 0 rss = 0 delta = 0
[21] run: 1 lumi: 1 event: 21  vsize = 6418.28 deltaVsize = 0 rss = 2888.15 delta = -314.25
[20] run: 1 lumi: 1 event: 20  vsize = 6418.28 deltaVsize = 0 rss = 3077.96 delta = -124.441
[19] run: 1 lumi: 1 event: 19  vsize = 6418.28 deltaVsize = 512 rss = 3202.4 delta = 24.207

An igprof PERF_TICKS profiling on 10 events shows the expected variation in the internal module CPU use, with the modue weighting now for 2.2% of the total CPU (vs more than 3% previously):

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
            2.2  .........     119.17 / 4'386.72     edm::stream::EDProducerAdaptorBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) [16]
[118]       2.2     119.17       1.68 / 117.49     TrackExtenderWithMTDT<std::vector<reco::Track, std::allocator<reco::Track> > >::produce(edm::Event&, edm::EventSetup const&)
            0.8  .........      44.98 / 59.01        TrackTransformer::transform(reco::TransientTrack const&, std::vector<std::shared_ptr<TrackingRecHit const>, std::allocator<std::shared_ptr<TrackingRecHit const> > >&) const [187]
            0.5  .........      29.13 / 29.13        TrackExtenderWithMTDT<std::vector<reco::Track, std::allocator<reco::Track> > >::tryETLLayers(TrajectoryStateOnSurface const&, Trajectory const&, float, float, (anonymous namespace)::TrackSegments const&, edmNew::DetSetVector<MTDTrackingRecHit> const&, MTDDetLayerGeometry const*, MagneticField const*, Propagator const*, reco::BeamSpot const&, float, bool, (anonymous namespace)::MTDHitMatchingInfo&) const [clone .constprop.0] [344]
            0.3  .........      15.16 / 15.16        TrackExtenderWithMTDT<std::vector<reco::Track, std::allocator<reco::Track> > >::buildTrack(reco::Track const&, Trajectory const&, Trajectory const&, reco::BeamSpot const&, MagneticField const*, Propagator const*, bool, float&, float&, float&, float&, float&, float&) const [926]
            0.2  .........       9.94 / 15.63        (anonymous namespace)::trackPathLength(Trajectory const&, TrajectoryStateClosestToBeamLine const&, Propagator const*, float&, (anonymous namespace)::TrackSegments&) [833]
            0.1  .........       7.55 / 16.30        (anonymous namespace)::getTrajectoryStateClosestToBeamLine(Trajectory const&, reco::BeamSpot const&, Propagator const*, TrajectoryStateClosestToBeamLine&) [676]
            0.0  .........       2.27 / 31.25        TrackExtenderWithMTDT<std::vector<reco::Track, std::allocator<reco::Track> > >::fillMatchingHits(DetLayer const*, TrajectoryStateOnSurface const&, Trajectory const&, float, float, (anonymous namespace)::TrackSegments const&, edmNew::DetSetVector<MTDTrackingRecHit> const&, Propagator const*, reco::BeamSpot const&, float const&, bool, std::vector<std::shared_ptr<TrackingRecHit const>, std::allocator<std::shared_ptr<TrackingRecHit const> > >&, (anonymous namespace)::MTDHitMatchingInfo&) const [325]
            0.0  .........       1.91 / 2.11         TrackTransformer::getTransientRecHits(reco::TransientTrack const&) const [2341]
            0.0  .........       0.90 / 58.40        std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() [190]
            0.0  .........       0.88 / 0.91         Traj2TrackHits::operator()(Trajectory const&, edm::OwnVector<TrackingRecHit, edm::ClonePolicy<TrackingRecHit> >&, std::vector<LocalTrajectoryParameters, std::allocator<LocalTrajectoryParameters> >&, std::vector<unsigned char, std::allocator<unsigned char> >&) const [3214]
            0.0  .........       0.54 / 0.54         TrackExtenderWithMTDT<std::vector<reco::Track, std::allocator<reco::Track> > >::buildTrackExtra(Trajectory const&) const [3891]
            0.0  .........       0.52 / 1.04         edm::AssociationMap<edm::OneToOne<std::vector<Trajectory, std::allocator<Trajectory> >, std::vector<reco::Track, std::allocator<reco::Track> >, unsigned short> >::get(unsigned long) const [3050]
            0.0  .........       0.49 / 0.49         void std::vector<reco::TrackExtra, std::allocator<reco::TrackExtra> >::_M_realloc_insert<reco::TrackExtra>(__gnu_cxx::__normal_iterator<reco::TrackExtra*, std::vector<reco::TrackExtra, std::allocator<reco::TrackExtra> > >, reco::TrackExtra&&) [4043]
            0.0  .........       0.40 / 0.95         reco::HitPattern::appendHit(TrackingRecHit const&, TrackerTopology const&) [3147]
            0.0  .........       0.33 / 5.01         reco::TransientTrack::TransientTrack(edm::Ref<std::vector<reco::Track, std::allocator<reco::Track> >, reco::Track, edm::refhelper::FindUsingAdvance<std::vector<reco::Track, std::allocator<reco::Track> >, reco::Track> > const&, MagneticField const*, edm::ESHandle<GlobalTrackingGeometry> const&) [1721]
            0.0  .........       0.33 / 0.34         non-virtual thunk to reco::TrackTransientTrack::outermostMeasurementState() const [4570]
            0.0  .........       0.30 / 0.30         std::vector<std::shared_ptr<TrackingRecHit const>, std::allocator<std::shared_ptr<TrackingRecHit const> > >::~vector() [4781]
            0.0  .........       0.28 / 114.60       operator delete(void*, unsigned long) [121]
            0.0  .........       0.27 / 0.27         edm::refitem::GetRefPtrImpl<std::vector<Trajectory, std::allocator<Trajectory> >, Trajectory, edm::refhelper::FindUsingAdvance<std::vector<Trajectory, std::allocator<Trajectory> >, Trajectory>, unsigned int>::getRefPtr_(edm::RefCore const&, unsigned int) [4916]
            0.0  .........       0.26 / 2.39         TransientTrackBuilder::build(edm::Ref<std::vector<reco::Track, std::allocator<reco::Track> >, reco::Track, edm::refhelper::FindUsingAdvance<std::vector<reco::Track, std::allocator<reco::Track> >, reco::Track> > const&) const [2248]
(...)

The DQM histogram comparison shows a very moderate difference in the output of MTD tracks and vertices tests, and 4D vertices. Since the code should just replace the Trajectory from the original track fit with its reconstruction from the TrackTransformer I would conclude that some possible small difference might be expected there. In any case the overall behaviour of the code looks very well consistent between the two versions, even though not strictly identical.

…ed TrajectoryInEvent

cmsbuild · 2022-01-25T09:45:59Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-36793/27938

This PR adds an extra 48KB to repository

cmsbuild · 2022-01-25T09:46:13Z

A new Pull Request was created by @fabiocos (Fabio Cossutti) for master.

It involves the following packages:

RecoMTD/TrackExtender (upgrade, reconstruction)
RecoTracker/FinalTrackSelectors (reconstruction)
RecoTracker/IterativeTracking (reconstruction)

@clacaputo, @cmsbuild, @AdrianoDee, @srimanob, @slava77, @jpata can you please review it and eventually sign? Thanks.
@felicepantaleo, @GiacomoSguazzoni, @JanFSchulte, @rovere, @VinInn, @ebrondol, @gpetruc, @mmusich, @mtosi, @dgulhan this is something you requested to watch as well.
@perrotta, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

fabiocos · 2022-01-25T09:51:32Z

please test

cmsbuild · 2022-01-25T13:44:20Z

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-996284/21973/summary.html
COMMIT: 42b9d10
CMSSW: CMSSW_12_3_X_2022-01-24-2300/slc7_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/36793/21973/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

No significant changes to the logs found
Reco comparison results: 104 differences found in the comparisons
DQMHistoTests: Total files compared: 43
DQMHistoTests: Total histograms compared: 3449324
DQMHistoTests: Total failures: 59
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 3449243
DQMHistoTests: Total skipped: 22
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 42 files compared)
Checked 181 log files, 42 edm output root files, 43 DQM output files
TriggerResults: no differences found

slava77 · 2022-01-25T15:47:51Z

related to the memory use:
It would be better to also check the memory use with MEM_LIVE from igprof.

IIRC, the trajectory data is fully copied down to the generalTracks. This gives an opportunity to use the framework early delete feature (hooks are in place in Configuration/StandardSequences/python/earlyDeleteSettings_cff.py).

slava77 · 2022-01-25T15:48:06Z

enable profiling

slava77 · 2022-01-25T15:48:16Z

@cmsbuild please test

fabiocos · 2022-01-25T15:57:31Z

@slava77 I had given a try to MEM_LIVE checks in igprof, but I was getting a ridiculously large output of kind of warning messages...

cmsbuild · 2022-01-25T21:58:20Z

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-996284/21987/summary.html
COMMIT: 42b9d10
CMSSW: CMSSW_12_3_X_2022-01-25-1100/slc7_amd64_gcc10
Additional Tests: PROFILING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/36793/21987/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

No significant changes to the logs found
Reco comparison results: 111 differences found in the comparisons
DQMHistoTests: Total files compared: 43
DQMHistoTests: Total histograms compared: 3449324
DQMHistoTests: Total failures: 71
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 3449231
DQMHistoTests: Total skipped: 22
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 42 files compared)
Checked 181 log files, 42 edm output root files, 43 DQM output files
TriggerResults: no differences found

jpata · 2022-01-26T10:49:46Z

Looks like a fairly significant increase in memory:

https://cmssdt.cern.ch/SDT/cgi-bin/igprof-navigator/CMSSW_12_3_X_2022-01-25-1100/slc7_amd64_gcc10/profiling/23434.21/PR-996284/21987/igprofMEM_step3.19

https://cmssdt.cern.ch/SDT/cgi-bin/igprof-navigator/CMSSW_12_3_X_2022-01-25-1100/slc7_amd64_gcc10/profiling/23434.21/igprofMEM_step3.19

fabiocos · 2022-01-26T15:17:07Z

not a big surprise, the bulk of the difference is in the KF code as far as I can see, as expected
https://cmssdt.cern.ch/SDT/cgi-bin/igprof-navigator/CMSSW_12_3_X_2022-01-25-1100/slc7_amd64_gcc10/profiling/23434.21/PR-996284/21987/igprofMEM_step3.19/103

It is a matter of balance, we may squeeze the overall time but at the price of having some more stuff in memory. I selected only what strictly needed by the TrackExtenderWithMTD , there is a customisation fragment to enable the trajectory in event for every producer

https://github.com/cms-sw/cmssw/blob/master/RecoTracker/FinalTrackSelectors/python/customiseStoreTrajectoryInEvent.py

but I expect it would make the problem worse

perrotta · 2022-02-20T17:45:02Z

please test
(let refresh the nine day old tests before merging)

perrotta · 2022-02-20T17:46:33Z

please abort

perrotta · 2022-02-20T17:47:31Z

enable none
(try to speed up by disabling extra tests: profiling was already inspected)

perrotta · 2022-02-20T17:51:27Z

please test

perrotta · 2022-02-21T06:41:17Z

@smuzaffar , it seems that the bot is waiting to start the profiling: bust shouldn't I have prevented it with "enable none" in #36793 (comment) ?
In any case, tests seem stuck since a while. When I launched them lastly,my intention was simply to check that there were no counterindications in merging this PR coming from any of the other PRs merged in the last 9 days (when this one was tested last): by looking at the bot log I would say that everything works, besides that supposed attempt to start the profiling. Therefore; I could even merge this PR anyhow, but let wait for a few hours, just in case someone wants to check what's happening with the bot

cmsbuild · 2022-02-21T07:25:13Z

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-996284/22528/summary.html
COMMIT: dda4ecb
CMSSW: CMSSW_12_3_X_2022-02-20-1400/slc7_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/36793/22528/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

No significant changes to the logs found
Reco comparison results: 124 differences found in the comparisons
DQMHistoTests: Total files compared: 49
DQMHistoTests: Total histograms compared: 3965143
DQMHistoTests: Total failures: 65
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 3965056
DQMHistoTests: Total skipped: 22
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 48 files compared)
Checked 204 log files, 45 edm output root files, 49 DQM output files
TriggerResults: no differences found

smuzaffar · 2022-02-21T07:27:17Z

@perrotta , there is a known issue with bot that if a test was previously run for a commit then bot assumes that it will re-run again and waits for its results. I will fix this once find some time. For now I have manually set mark the profiling status successful

perrotta · 2022-02-21T08:19:52Z

Thank you @smuzaffar!

perrotta · 2022-02-21T08:20:07Z

+1

makortel · 2022-02-21T19:37:12Z

This PR might have caused an assertion failure in workflow 23234.9, see #37018.

fabiocos added 2 commits January 19, 2022 11:08

Fix chi2cut use in hit selection

c66e89f

Use externally produced trajectory in ExtenderWithMTD, customise need…

42b9d10

…ed TrajectoryInEvent

cmsbuild added this to the CMSSW_12_3_X milestone Jan 25, 2022

cmsbuild added code-checks-pending orp-pending pending-signatures reconstruction-pending tests-pending upgrade-pending labels Jan 25, 2022

cmsbuild added code-checks-approved and removed code-checks-pending labels Jan 25, 2022

cmsbuild added tests-started and removed tests-pending labels Jan 25, 2022

cmsbuild added tests-approved and removed tests-started labels Jan 25, 2022

cmsbuild added tests-started and removed tests-approved labels Jan 25, 2022

cmsbuild added tests-approved and removed tests-started labels Jan 25, 2022

cmsbuild added tests-started and removed requires-external tests-approved labels Feb 20, 2022

cmsbuild added tests-pending and removed tests-started labels Feb 20, 2022

cmsbuild added tests-started and removed tests-pending labels Feb 20, 2022

cmsbuild added tests-approved and removed tests-started labels Feb 21, 2022

cmsbuild added orp-approved and removed orp-pending labels Feb 21, 2022

cmsbuild merged commit b5958a1 into cms-sw:master Feb 21, 2022

fabiocos deleted the fc-extender branch February 21, 2022 09:26

This was referenced Feb 21, 2022

MillePedeAlignmentAlgorithm: update for new GBL version #37008

Merged

Update GBL (and move to git instead of svn) cms-sw/cmsdist#7638

Merged

[Do not merge] testing ROOT6 Ib cms-sw/cmsdist#7637

Closed

makortel mentioned this pull request Feb 21, 2022

Assertion failure in TrackCollectionCloner::Producer::operator() #37018

Closed

cmsbuild mentioned this pull request Feb 21, 2022

Implementation of beam halo tagger algorithm for the endcap photons #36901

Merged

jpata mentioned this pull request May 18, 2022

Inclusion of MTD in the tracking navigation #29267

Open

fabiocos mentioned this pull request Sep 27, 2022

Modified data product delete early system #38983

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MTD reconstruction: speed up TrackExtenderWithMTD #36793

MTD reconstruction: speed up TrackExtenderWithMTD #36793

fabiocos commented Jan 25, 2022 •

edited

Loading

cmsbuild commented Jan 25, 2022

cmsbuild commented Jan 25, 2022

fabiocos commented Jan 25, 2022

cmsbuild commented Jan 25, 2022

slava77 commented Jan 25, 2022

slava77 commented Jan 25, 2022

slava77 commented Jan 25, 2022

fabiocos commented Jan 25, 2022

cmsbuild commented Jan 25, 2022

jpata commented Jan 26, 2022 •

edited

Loading

fabiocos commented Jan 26, 2022

perrotta commented Feb 20, 2022

perrotta commented Feb 20, 2022

perrotta commented Feb 20, 2022

perrotta commented Feb 20, 2022

perrotta commented Feb 21, 2022

cmsbuild commented Feb 21, 2022

smuzaffar commented Feb 21, 2022

perrotta commented Feb 21, 2022

perrotta commented Feb 21, 2022

makortel commented Feb 21, 2022

MTD reconstruction: speed up TrackExtenderWithMTD #36793

MTD reconstruction: speed up TrackExtenderWithMTD #36793

Conversation

fabiocos commented Jan 25, 2022 • edited Loading

PR description:

PR validation:

cmsbuild commented Jan 25, 2022

cmsbuild commented Jan 25, 2022

fabiocos commented Jan 25, 2022

cmsbuild commented Jan 25, 2022

Comparison Summary

slava77 commented Jan 25, 2022

slava77 commented Jan 25, 2022

slava77 commented Jan 25, 2022

fabiocos commented Jan 25, 2022

cmsbuild commented Jan 25, 2022

Comparison Summary

jpata commented Jan 26, 2022 • edited Loading

fabiocos commented Jan 26, 2022

perrotta commented Feb 20, 2022

perrotta commented Feb 20, 2022

perrotta commented Feb 20, 2022

perrotta commented Feb 20, 2022

perrotta commented Feb 21, 2022

cmsbuild commented Feb 21, 2022

Comparison Summary

smuzaffar commented Feb 21, 2022

perrotta commented Feb 21, 2022

perrotta commented Feb 21, 2022

makortel commented Feb 21, 2022

fabiocos commented Jan 25, 2022 •

edited

Loading

jpata commented Jan 26, 2022 •

edited

Loading