Significant reduction of memory footprint for 2D templates, etc. #23263

Merged

Conversation

pmaksim1
Contributor

  1. Reduction of the memory footprint of the 2D templates. The average reduction is about a factor of 3. This was the main homework from Check and fix the code for SiPixelTemplate's #20950.
  2. Additionally, there are bug fixes in 3D reconstruction (used by ClusterRepair).
  3. Cosmetic fixes in SiPixelTemplate*2D.{hh,cc}. (Cosmetic fixes in other files are coming separately.)

Tests:

  • The TrackingValidation runTheMatrix workflow (TTbar 2018_realistic) ran fine.
  • Pixeltree running on data with Tamas's GT worked too.
  • runTheMatrix.py -e -n -l 136.831 (on data) with the tag that Tamas gave to us also runs fine.

As discussed previously, the plan is to have a separate FastSim PR and then another SiPixelRecHits + SiPixelDigitizer + pixel FastSim PR which involves moving templates + miscellanea to another package. So I'd like to defer all cosmetics to that PR. (But you can still give me suggestions :)

@perrotta @slava77 @fabiocos @makortel @tvami @tsusa @OzAmram @cmantill @schuetzepaul

@cmsbuild
Contributor

The code-checks are being triggered in jenkins.

@cmsbuild
Contributor

A new Pull Request was created by @pmaksim1 (Petar Maksimovic) for master.

It involves the following packages:

RecoLocalTracker/SiPixelRecHits

@perrotta, @cmsbuild, @slava77 can you please review it and eventually sign? Thanks.
@makortel, @felicepantaleo, @GiacomoSguazzoni, @rovere, @VinInn, @dkotlins, @gpetruc, @ebrondol, @threus this is something you requested to watch as well.
@davidlange6, @slava77, @fabiocos you are the release manager for this.

cms-bot commands are listed here

@slava77
Contributor

slava77 commented May 20, 2018

@cmsbuild please test

@cmsbuild
Contributor

cmsbuild commented May 20, 2018

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/28068/console Started: 2018/05/20 10:35

@cmsbuild
Contributor

Comparison job queued.

@cmsbuild
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-23263/28068/summary.html

@slava77 comparisons for the following workflows were not done due to missing matrix map:

  • /build/cmsbld/jenkins/workspace/compare-root-files-short-matrix/results/JR-comparison/PR-23263/136.85_RunEGamma2018A+RunEGamma2018A+HLTDR2_2018+RECODR2_2018reHLT_skimEGamma_Prompt_L1TEgDQM+HARVEST2018_L1TEgDQM

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 31
  • DQMHistoTests: Total histograms compared: 2901712
  • DQMHistoTests: Total failures: 2
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2901520
  • DQMHistoTests: Total skipped: 190
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 30 files compared)
  • Checked 128 log files, 14 edm output root files, 31 DQM output files

@pmaksim1
Contributor Author

Thanks. So I propose this PR gets integrated ASAP, and we immediately start putting together another one with a new location for the templates. I presume we are going to make one small package just for the low-level pixel reco. Where should it go, so that it can be referenced by both RecoLocalTracker and the simulation packages (both SiPixelDigitizer and FastSimulation/TrackingRecHitProducer)?

@perrotta
Contributor

@pmaksim1

Thank you for the pull request.

Please have also a look at the issues pointed out by the static analyzer:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-23263/28068/llvm-analysis/
They should also get fixed before merging this PR, and it would be better if you can fix them now before we start the review.

@pmaksim1
Contributor Author

This is pretty cool! (Is there a way for me to run it myself before making a PR? That may save everybody some time...)

The "dead assignments" are trivial. (That would have been a part of cosmetic cleanup in the next PR anyway, but we can start now.)

For the "memory leak"... I need to talk to Morris since the logic is a bit convoluted... Moreover, the job should really raise a fatal exception right there anyway, since there's no point proceeding. (So the problem is superficial, as at that point we have a much greater problem than just leaking memory.)

@cmsbuild
Contributor

Comparison job queued.

@cmsbuild
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-23263/28315/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 31
  • DQMHistoTests: Total histograms compared: 2902471
  • DQMHistoTests: Total failures: 2
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2902279
  • DQMHistoTests: Total skipped: 190
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 30 files compared)
  • Checked 128 log files, 14 edm output root files, 31 DQM output files
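The two DQMHistoTests summaries in this thread can be cross-checked with a few lines of arithmetic. The sketch below assumes the result categories (successes, failures, skipped, nulls) are mutually exclusive, which the bot output does not state explicitly; the function name is hypothetical.

```python
def summary_is_consistent(total, successes, failures, skipped, nulls=0):
    """Return True if the per-category counts add up to the total.

    Assumption: every compared histogram lands in exactly one category.
    """
    return successes + failures + skipped + nulls == total


# Counts copied from the two comparison summaries posted by cmsbuild.
first_run = dict(total=2901712, successes=2901520, failures=2, skipped=190, nulls=0)
second_run = dict(total=2902471, successes=2902279, failures=2, skipped=190, nulls=0)

for name, counts in (("first", first_run), ("second", second_run)):
    print(name, summary_is_consistent(**counts))
```

Under that assumption both summaries add up, and the number of failures (2) and skipped histograms (190) is the same in both runs.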

@perrotta
Contributor

perrotta commented Jun 2, 2018

The memory performance was inspected with igprof (MEM_LIVE)

With ClusterRepair and the pixel templates enabled (they are still not enabled by default in the config), the overall job memory in the 2018 TTbar MC workflow 10824 increases by some 53 MB with this PR. With the same setup before this PR, the increase amounts to some 169 MB.

Therefore, this pull request reduces the additional memory footprint of the 2D templates by a factor of 3.2, which is in line with what is written in the PR description. There is probably room for some further improvement, but this is definitely a good start.
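The quoted factor follows directly from the two measured increases. A minimal sketch of the arithmetic, using only the 169 MB and 53 MB figures given above (the function and variable names are made up for illustration):

```python
def reduction_factor(before, after):
    """Ratio of the extra memory before the change to the extra memory after it."""
    return before / after


# Additional MEM_LIVE with ClusterRepair and 2D templates enabled, in MB,
# as reported in this comment.
extra_before_mb = 169.0  # before this PR
extra_after_mb = 53.0    # with this PR

factor = reduction_factor(extra_before_mb, extra_after_mb)
print(f"overall reduction factor: {factor:.1f}")  # 3.2
```

The same helper can be applied to any pair of per-function MEM_LIVE byte counts from the igprof listing, keeping in mind that igprof cumulative counts for nested calls overlap and should not be summed.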

Going into the details, the memory reduction is mostly visible in the following modules and methods:

  • MEM_LIVE CMSSW_10_2_0_pre4 (2D templates enabled):
    4.3    137'989'206             12  PixelCPEClusterRepair::PixelCPEClusterRepair(edm::ParameterSet const&, MagneticField const*, TrackerGeometry const&, TrackerTopology const&, SiPixelLorentzAngle const*, SiPixelTemplateDBObject const*, SiPixel2DTemplateDBObject const*) [144]
    4.2    133'703'906              5  SiPixelTemplate2D::pushfile(SiPixel2DTemplateDBObject const&, std::vector<SiPixelTemplateStore2D, std::allocator<SiPixelTemplateStore2D> >&) [151]
    3.7    118'847'712              1  void std::vector<SiPixelTemplateStore2D, std::allocator<SiPixelTemplateStore2D> >::_M_emplace_back_aux<SiPixelTemplateStore2D const&>(SiPixelTemplateStore2D const&) [171]
    1.2     37'732'669          1'864  SiPixel2DTemplateDBObjectESProducer::produce(SiPixel2DTemplateDBObjectESProducerRcd const&) [578]
  • MEM_LIVE PR 23263 (2D templates enabled):
    0.7     23'298'714 / 23'298'714                27 / 27               PixelCPEClusterRepair::PixelCPEClusterRepair(edm::ParameterSet const&, MagneticField const*, TrackerGeometry const&, TrackerTopology const&, SiPixelLorentzAngle const*, SiPixelTemplateDBObject const*, SiPixel2DTemplateDBObject const*) [810]
    0.6     19'011'712 / 19'013'414                16 / 20               SiPixelTemplate2D::pushfile(SiPixel2DTemplateDBObject const&, std::vector<SiPixelTemplateStore2D, std::allocator<SiPixelTemplateStore2D> >&) [883]
    0.0          1'472 / 1'472                      1 / 1                void std::vector<SiPixelTemplateStore2D, std::allocator<SiPixelTemplateStore2D> >::_M_emplace_back_aux<SiPixelTemplateStore2D const&>(SiPixelTemplateStore2D const&) [14851]
    1.2     37'732'669          1'864  SiPixel2DTemplateDBObjectESProducer::produce(SiPixel2DTemplateDBObjectESProducerRcd const&) [568]

With 2D templates enabled, the largest remaining additional memory consumer is now SiPixel2DTemplateDBObjectESProducer, which is not touched by this PR.

@perrotta
Contributor

perrotta commented Jun 2, 2018

With the same igprof run, I don't see any sign of memory leaks in the 2D templates.

Still, I believe that the procedure of newing the SiPixelTemplateEntry/SiPixelTemplateEntry2D and destroying them later, when the ClusterRepair is deleted, is rather dangerous: for now, all possible throw paths seem to be correctly taken into account, but mistakes could happen in future updates of the code (as was the case here before the fix integrated with the last commit).

I would leave the move to smart pointers for those SiPixelTemplateEntry/SiPixelTemplateEntry2D in the template code as a possible improvement for a future PR.

@perrotta
Contributor

perrotta commented Jun 2, 2018

A few comparisons with the current version of the 2D templates (i.e. enabling those templates in CMSSW_10_2_0_pre4) show some non-negligible differences in the track outputs:

There is some change in the track outputs, with some 2% increase in the number of generalTracks:

image

Quite noticeable is the reduction of electron seeds in a given DPhi bin:

image

The whole set of comparisons can be seen in /afs/cern.ch/work/a/aperrott/public/TESTPR23263

@pmaksim1 : are these differences understood/acceptable due to the bug fixes and adjustments included in this PR?

@VinInn
Contributor

VinInn commented Jun 2, 2018

I do not understand the effect on the electronSeeds at all: they should NOT use templates.
This must be understood.

@perrotta
Contributor

perrotta commented Jun 2, 2018

@VinInn @pmaksim1 (Just to make clear what I did for the comparisons posted in #23263 (comment))

I compared baseline CMSSW_10_2_X_2018-05-30-2300 and the same baseline plus this PR where IN BOTH CASES I made the following modifications (as it was suggested in the initial 2D template PR):

  • Uncomment in RecoTracker/TransientTrackingRecHit/python/TTRHBuilderWithTemplate_cfi.py the following two lines
# from Configuration.Eras.Modifier_phase1Pixel_cff import phase1Pixel
# phase1Pixel.toModify(TTRHBuilderAngleAndTemplate, PixelCPE = cms.string('PixelCPEClusterRepair'))
  • Replace in CalibTracker/SiPixelESProducers/plugins/SiPixel2DTemplateDBObjectESProducer.cc
-       //  std::string label = "denominator"; // &&& Temporary: matches Barrel Layer1 fullsim MC
-       std::string label = "";      // the correct default

with

+       std::string label = "denominator"; // &&& Temporary: matches Barrel Layer1 fullsim MC
+       //  std::string label = "";      // the correct default
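The first modification above relies on the era-modifier pattern: an override is applied to a configuration object only when the corresponding era is active. The sketch below illustrates that idea with a hypothetical stand-in class and a plain dict. It is not the real FWCore `Modifier` implementation, and the default `PixelCPE` value shown is an assumption rather than something stated in this thread.

```python
class Modifier:
    """Hypothetical stand-in for an era modifier (NOT the real CMSSW class)."""

    def __init__(self, active):
        self.active = active

    def toModify(self, obj, **overrides):
        # Apply the overrides only if the era this modifier represents is active.
        if self.active:
            obj.update(overrides)


def make_builder():
    # Plain-dict stand-in for TTRHBuilderAngleAndTemplate.
    # The default PixelCPE value here is assumed for illustration.
    return {"ComponentName": "WithAngleAndTemplate", "PixelCPE": "PixelCPETemplateReco"}


# Era active: the builder is switched to the ClusterRepair CPE.
phase1_builder = make_builder()
Modifier(active=True).toModify(phase1_builder, PixelCPE="PixelCPEClusterRepair")

# Era inactive: the builder keeps its default.
other_builder = make_builder()
Modifier(active=False).toModify(other_builder, PixelCPE="PixelCPEClusterRepair")

print(phase1_builder["PixelCPE"], other_builder["PixelCPE"])
```

In the actual configuration the same effect comes from uncommenting the two `phase1Pixel` lines quoted above, so that the override is picked up only in phase-1 pixel eras.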

@pmaksim1
Contributor Author

pmaksim1 commented Jun 2, 2018 via email

@perrotta
Contributor

perrotta commented Jun 3, 2018

@pmaksim1 (et al.): you can access the plots with the differences in reco and miniAOD from
http://perrotta.web.cern.ch/perrotta/TESTPR23263

@pmaksim1
Contributor Author

pmaksim1 commented Jun 3, 2018

Awesome, thanks! I can see them all now.

I went through a bunch of plots, and I didn't see much of a difference. The only differences seem to be in the number of hits, e.g.
http://perrotta.web.cern.ch/perrotta/TESTPR23263/all_testPR23263VSorig_TTbar13TeV2018wf10824p0/all_testPR23263VSorig_TTbar13TeV2018wf10824p0c_recoElectronSeeds_electronMergedSeeds__RECO_obj_nHits.png
where there seem to be fewer seeds with 3-4 hits. This may be expected: what we did previously was to assign aircraft-hangar-sized errors to the hits on the edges of modules (which are not infrequent in barrel, at a slightly larger eta). I presume those hits would be slurped by many tracks (or the first track that picks them up). In the new scheme, the errors are much smaller, so even when the edge hits are now reconstructed more correctly, the pulls from wrong tracks are now larger. So this goes in the direction I would expect.

However, it's hard to say whether the size of the effect is appropriate. One would really need to understand how the hits are picked up, and, if some of the hits get artificially blown-up errors, how many more hits would be picked up.
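The argument about edge-hit errors and pulls can be made concrete with a toy calculation. The sketch below uses entirely hypothetical numbers (residual, hit errors, and the pull cut are not taken from the CMSSW code) and a simplified one-dimensional pull; it only illustrates why an inflated hit error makes a hit look compatible with a track it does not belong to.

```python
import math


def pull(residual, hit_sigma, track_sigma=0.0):
    """Residual divided by the combined uncertainty (errors added in quadrature)."""
    return residual / math.hypot(hit_sigma, track_sigma)


def is_compatible(residual, hit_sigma, max_pull=3.0):
    """Toy hit-to-track compatibility test; max_pull is a made-up cut."""
    return abs(pull(residual, hit_sigma)) <= max_pull


# Hypothetical values, all in microns.
residual = 200.0          # distance between an edge hit and a wrong track
inflated_sigma = 1000.0   # very large error assigned to the edge hit
realistic_sigma = 30.0    # smaller, more realistic error

print(pull(residual, inflated_sigma), is_compatible(residual, inflated_sigma))
print(pull(residual, realistic_sigma), is_compatible(residual, realistic_sigma))
```

With the inflated error the pull is small and the wrong track accepts the hit; with the smaller error the same residual gives a large pull and the hit is rejected. How large the net effect on seeds and tracks should be depends on the actual cuts and error values, which this toy does not model.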

I will crowd-source the examination of these plots :) and we'll try to identify all that visibly differ.

That being said, we may need to involve other experts, esp. from tracking, and people who understand all of these objects. Given that we are still parasitic (i.e. I don't think we are turning this on by default just yet, right?), what is the level of scrutiny you'd like to see?

@perrotta
Contributor

perrotta commented Jun 4, 2018

@pmaksim1 : I would like all of us to be convinced that this PR does not introduce new bugs or mistakes in the code. Besides that, I understand that the 2D templates and ClusterRepair are not enabled yet in CMSSW; if some tuning of the parameters is needed, it can be applied when they are enabled, and I don't expect it to be integrated already now.

@OzAmram
Contributor

OzAmram commented Jun 4, 2018

OK, after going through all the plots, here are the differences I noticed.

Fewer lost tracks:
http://perrotta.web.cern.ch/perrotta/TESTPR23263/all_mini_testPR23263VSorig_TTbar13TeV2018wf10824p0/all_mini_testPR23263VSorig_TTbar13TeV2018wf10824p0c_patPackedCandidates_lostTracks__RECO_objAT_size.png
http://perrotta.web.cern.ch/perrotta/TESTPR23263/all_mini_testPR23263VSorig_TTbar13TeV2018wf10824p0/all_mini_testPR23263VSorig_TTbar13TeV2018wf10824p0c_patPackedCandidates_lostTracks__RECO_obj_normalizedChi2_.png
http://perrotta.web.cern.ch/perrotta/TESTPR23263/all_mini_testPR23263VSorig_TTbar13TeV2018wf10824p0/all_mini_testPR23263VSorig_TTbar13TeV2018wf10824p0c_patPackedCandidates_lostTracks__RECO_obj_numberOfHits.png

Fewer ECAL rec hits:
http://perrotta.web.cern.ch/perrotta/TESTPR23263/all_mini_testPR23263VSorig_TTbar13TeV2018wf10824p0/all_mini_testPR23263VSorig_TTbar13TeV2018wf10824p0c_EcalRecHitsSorted_reducedEgamma_reducedESRecHits_RECO_obj_obj_recoFlag.png
http://perrotta.web.cern.ch/perrotta/TESTPR23263/all_mini_testPR23263VSorig_TTbar13TeV2018wf10824p0/all_mini_testPR23263VSorig_TTbar13TeV2018wf10824p0c_log2maxEcalRecHitsSorted_reducedEgamma_reducedESRecHits_RECO_obj_obj_flagBits_,0_5.png

Interesting feature in phi for ECal cluster?:
http://perrotta.web.cern.ch/perrotta/TESTPR23263/all_mini_testPR23263VSorig_TTbar13TeV2018wf10824p0/all_mini_testPR23263VSorig_TTbar13TeV2018wf10824p0c_recoCaloClusters_reducedEgamma_reducedESClusters_RECO_obj_phi.png
http://perrotta.web.cern.ch/perrotta/TESTPR23263/all_mini_testPR23263VSorig_TTbar13TeV2018wf10824p0/all_mini_testPR23263VSorig_TTbar13TeV2018wf10824p0c_recoConversions_reducedEgamma_reducedConversions_RECO_obj_nTracks.png

Possible difference in number of muon matched stations (maybe just low statistics?):
http://perrotta.web.cern.ch/perrotta/TESTPR23263/all_mini_testPR23263VSorig_TTbar13TeV2018wf10824p0/all_mini_testPR23263VSorig_TTbar13TeV2018wf10824p0c_patMuons_slimmedMuons__RECO_obj_muMatches__detector.png
http://perrotta.web.cern.ch/perrotta/TESTPR23263/all_mini_testPR23263VSorig_TTbar13TeV2018wf10824p0/all_mini_testPR23263VSorig_TTbar13TeV2018wf10824p0c_patMuons_slimmedMuons__RECO_obj_muMatches__station.png

Fewer errors (from our own testing, this makes sense because we removed one of the error codes for Template2D and significantly reduced the number of times another one occurs):
http://perrotta.web.cern.ch/perrotta/TESTPR23263/all_testPR23263VSorig_TTbar13TeV2018wf10824p0/all_testPR23263VSorig_TTbar13TeV2018wf10824p0c_edmErrorSummaryEntrys_logErrorHarvester__RECO_objAT_size.png
http://perrotta.web.cern.ch/perrotta/TESTPR23263/all_testPR23263VSorig_TTbar13TeV2018wf10824p0/all_testPR23263VSorig_TTbar13TeV2018wf10824p0c_edmErrorSummaryEntrys_logErrorHarvester__RECO_obj_category_size.png

Electron Seeds (Has been pointed out already):
http://perrotta.web.cern.ch/perrotta/TESTPR23263/all_testPR23263VSorig_TTbar13TeV2018wf10824p0/all_testPR23263VSorig_TTbar13TeV2018wf10824p0c_recoElectronSeeds_electronMergedSeeds__RECO_obj_dPhi1.png
http://perrotta.web.cern.ch/perrotta/TESTPR23263/all_testPR23263VSorig_TTbar13TeV2018wf10824p0/all_testPR23263VSorig_TTbar13TeV2018wf10824p0c_recoElectronSeeds_electronMergedSeeds__RECO_obj_dRz1.png
http://perrotta.web.cern.ch/perrotta/TESTPR23263/all_testPR23263VSorig_TTbar13TeV2018wf10824p0/all_testPR23263VSorig_TTbar13TeV2018wf10824p0c_recoElectronSeeds_electronMergedSeeds__RECO_obj_nHits.png

@pmaksim1
Contributor Author

pmaksim1 commented Jun 4, 2018

@perrotta @VinInn @slava77
So @OzAmram and I discussed this, and we think it would be better to compare:

  • vanilla 10.2.0pre4, and
  • 10.2.0pre4 with this PR, with ClusterRepair enabled.

The problem with using 10.2.0pre4 with ClusterRepair enabled is that we are using code which has a couple of known bugs fixed by this PR. So if we see differences, I'm not sure what we should conclude from that.

So, can we run this ourselves? Or have these histograms already been produced for the vanilla 10.2.0pre4, so that we just need to compare them? (If you know how to do it and it's easy, we'd appreciate help with it :)

Thanks!

@perrotta
Contributor

perrotta commented Jun 5, 2018

@pmaksim1 @OzAmram @VinInn

You can find the differences between this PR with cluster repair enabled and the baseline (no cluster repair) at http://perrotta.web.cern.ch/perrotta/TESTPR23263vsBASELINE

The differences look healthier to me, and they go in the direction of your explanation based on the effect of the bug fixes implemented here.

I plan to sign this PR in time for this afternoon's ORP, but I'll let you have a look at the new comparison in case you notice anything worth reporting or investigating further.

Please let me know if you have any comments.

@pmaksim1
Contributor Author

pmaksim1 commented Jun 5, 2018

Hi @perrotta ,

Thanks, this was very helpful! So as we suspected, the issues arose mostly from the old 2D reco; the new one is much better. @OzAmram just summarized at the pixel offline meeting:
https://indico.cern.ch/event/688866/contributions/3028237/attachments/1661570/2663447/tracking_jun4.pdf

There are a couple of small things we are still looking into, but since the feature is still parasitic, we agree with you that you can sign off on it.

Thanks for all the help!

@perrotta
Contributor

perrotta commented Jun 5, 2018

+1

  • The memory footprint reduction is in line with what was declared in the PR description.
  • The additional bug fixes integrated here correct a few misbehaviours that were present in the first version of the ClusterRepair and 2D templates.
  • There are a few other points to be followed up, as pointed out in Significant reduction of memory footprint for 2D templates, etc. #23263 (comment); however, this PR already represents an improvement over what is currently in the release, and in any case the features touched here are not enabled by default in the standard configurations. Therefore, I suggest merging this pull request to speed up further updates.
  • No effect on standard reco outputs as tested by jenkins, which is expected since the new features are not enabled by default.

@cmsbuild
Contributor

cmsbuild commented Jun 5, 2018

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @davidlange6, @slava77, @smuzaffar, @fabiocos (and backports should be raised in the release meeting by the corresponding L2)

@fabiocos
Contributor

fabiocos commented Jun 5, 2018

+1

9 participants