Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of Cellular Automaton for regional tracking at HLT #17373

Merged
merged 8 commits into from Feb 14, 2017
Merged

Improve performance of Cellular Automaton for regional tracking at HLT #17373

merged 8 commits into from Feb 14, 2017

Conversation

JanFSchulte
Copy link
Contributor

@JanFSchulte JanFSchulte commented Feb 1, 2017

So far, for each tracking region at the HLT, a new CA Graph structure is created. This update intends to improve performance by re-using the data structure and re-filling it for each of the regions. For this, some code is rearranged, but the algorithm itself is not changed at all.

Some details and performance tests can be found here: https://cernbox.cern.ch/index.php/s/UWvE7a7itx6VGBs

The executive summary is:

  • Checked physics results both offline and online -> 100% identical
  • Checked timing of modules: Modules using regional tracking speed up between 3 and 40%. The gain strongly depends on the size of the regions, for many small regions the reduction in overhead naturally matters more than for few large ones.

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 1, 2017

A new Pull Request was created by @JanFSchulte (Jan-Frederik Schulte) for CMSSW_9_0_X.

It involves the following packages:

RecoPixelVertexing/PixelTriplets

@cmsbuild, @cvuosalo, @slava77, @davidlange6 can you please review it and eventually sign? Thanks.
@makortel, @felicepantaleo, @GiacomoSguazzoni, @rovere, @VinInn, @dgulhan this is something you requested to watch as well.
@davidlange6, @smuzaffar you are the release manager for this.

cms-bot commands are listed here #13028

@makortel
Copy link
Contributor

makortel commented Feb 1, 2017

@cmsbuild, please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 1, 2017

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/17569/console Started: 2017/02/01 19:45

@makortel
Copy link
Contributor

makortel commented Feb 1, 2017

Thanks Jan. Could you consider replacing "efficiency" with e.g. "performance" (or "speed") in PR title and description? Just to avoid a possible misconception that physics performance would be altered (which was my first interpretation of "improve efficiency").

@JanFSchulte JanFSchulte changed the title Improve efficiency of Cellular Automaton for regional tracking at HLT Improve performance of Cellular Automaton for regional tracking at HLT Feb 1, 2017
@JanFSchulte
Copy link
Contributor Author

Very good point. Fixed.

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 1, 2017

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 1, 2017

Comparison job queued.

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 1, 2017

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-17373/17569/summary.html

@slava77 comparisons for the following workflows were not done due to missing matrix map:

  • 20034.0_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2023D7_GenSimHLBeamSpotFull14+DigiFullTrigger_2023D7+RecoFullGlobal_2023D7+HARVESTFullGlobal_2023D7
  • 21234.0_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2023D4_GenSimHLBeamSpotFull14+DigiFullTrigger_2023D4+RecoFullGlobal_2023D4+HARVESTFullGlobal_2023D4
  • 23234.0_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2023D8_GenSimHLBeamSpotFull14+DigiFullTrigger_2023D8+RecoFullGlobal_2023D8+HARVESTFullGlobal_2023D8

@JanFSchulte
Copy link
Contributor Author

Could someone please review this?

I have performed additional performance studies using igprof on a reduced HLT menu running a couple of TkMu paths. Shows 14% improvement for the quadruplet building and 6% for the overall event.

@slava77
Copy link
Contributor

slava77 commented Feb 8, 2017 via email

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 9, 2017

Pull request #17373 was updated. @cmsbuild, @cvuosalo, @slava77, @davidlange6 can you please check and sign again.

@JanFSchulte
Copy link
Contributor Author

Thanks for reviewing, @slava77 . The new commit should address both your comments.

@slava77
Copy link
Contributor

slava77 commented Feb 9, 2017

@cmsbuild please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 9, 2017

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/17721/console Started: 2017/02/09 22:35

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 9, 2017

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 9, 2017

Comparison job queued.

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 9, 2017

@slava77
Copy link
Contributor

slava77 commented Feb 10, 2017

I ran 10224 (ttbar PU35 with CA era Run2_2017_trackingPhase1CA)
compared to commit 48d82fe, the latest one has a small increase in CPU

"detachedQuad\|detachedTriplet\|highPtTriplet\|initialStep\|lowPtQuad\|lowPtTriplet" |  grep -v Tracks
   +0.119492      +0.01 %        48.47 ms/ev ->        54.63 ms/ev initialStepHitQuadruplets
   +0.117181      +0.01 %        48.58 ms/ev ->        54.63 ms/ev initialStepHitQuadrupletsPreSplitting
   +0.096095      +0.02 %        76.45 ms/ev ->        84.17 ms/ev detachedQuadStepHitQuadruplets
   +0.085959      +0.03 %       144.01 ms/ev ->       156.95 ms/ev lowPtQuadStepHitQuadruplets
   +0.081517      +0.01 %        29.49 ms/ev ->        31.99 ms/ev highPtTripletStepHitTriplets
   +0.080369      +0.02 %        99.79 ms/ev ->       108.15 ms/ev lowPtTripletStepHitTriplets
   +0.074207      +0.01 %        49.95 ms/ev ->        53.80 ms/ev detachedTripletStepHitTriplets
Todal: 496.74 => 544.32	 delta 47.58

@JanFSchulte please check your HLT tests so that timing improvements remain
and were not blown away by the latest update which is just a one symbol (&) added in for(auto& ntuplet : ntuplets) ntuplet.reserve(localRA_.upper());

@slava77
Copy link
Contributor

slava77 commented Feb 13, 2017

@JanFSchulte
did you have a chance to follow up on my comment in #17373 (comment)
?
I'm holding off signing this PR until it's cleared.

@JanFSchulte
Copy link
Contributor Author

@slava77 What I can tell you so far is that I had checked the change in performance using igprof and the numbers for the produce() method did not change. I also asked @makortel for feedback, as he knows the code much better. He suggested a couple of checks that I will run now. Expect a conclusion later today.

@slava77
Copy link
Contributor

slava77 commented Feb 13, 2017

+1

for #17373 77f9a41

  • code changes are in line with the description and the follow up review; the change is not yet affecting the standard workflows
  • jenkins tests pass and comparisons show no differences
  • local test with Run2_2017_trackingPhase1CA era shows that there are no changes in results of step3 (step2/HLT is not modified by this era).

@JanFSchulte for me your check with igprof is good enough.

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next CMSSW_9_0_X IBs (tests are also fine). This pull request requires discussion in the ORP meeting before it's merged. @davidlange6, @smuzaffar

@davidlange6
Copy link
Contributor

+1

@cmsbuild cmsbuild merged commit ad5e08f into cms-sw:CMSSW_9_0_X Feb 14, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants