Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Da clustering speedup #29626

Merged
merged 24 commits into from
May 19, 2020
Merged

Da clustering speedup #29626

merged 24 commits into from
May 19, 2020

Conversation

werdmann
Copy link
Contributor

@werdmann werdmann commented May 4, 2020

PR description:

Modified DA clustering code to speed up clustering in Z and ZT.
A new parameter "zrange" controls the size of a window around each track and only clusters within this window are considered for updates.
The convergence requirements are now configurable separately for high and low temperature through parameters "delta_lowT" and "delta_highT". Optionally an alternative scheme with termperature dependent requirements can be selected ("convergence_mode").
Two previously hardcoded cuts in the trackselection have been made configurable ("maxDzError", "maxD0Error") in order to be more flexible with high eta tracks used for phase 2.

related presentations:
https://indico.cern.ch/event/894865/contributions/3773737/attachments/2000277/3338756/2020-03-09-clustering-speed.pdf
https://indico.cern.ch/event/903822/contributions/3814735/attachments/2015768/3369185/2020-04-06-clustering-update.pdf

PR validation:

The code has been tested locally with 85PU run3 MC (minbias) and 200PU run4 MC (minbias and ttbar) in CMSSW_11_0_pre6 as described in the presentations linked above.
Small event-by-event changes on individual vertices are found, but no significant difference is found in histograms of typical quality indicators.

runTheMatrix.py -l limited -i all --ibeos
34 33 32 26 16 4 1 1 1 tests passed, 0 0 0 0 0 0 0 0 0 failed

@cmsbuild
Copy link
Contributor

cmsbuild commented May 4, 2020

The code-checks are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

cmsbuild commented May 4, 2020

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-29626/14998

  • This PR adds an extra 52KB to repository

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@perrotta
Copy link
Contributor

perrotta commented May 4, 2020

  • code-checks and code-format issues have to be addressed before other tests can run
  • Please provide some description in the PR description field: this is always required, even more for a >2k lines code changes

@cmsbuild
Copy link
Contributor

cmsbuild commented May 4, 2020

The code-checks are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

cmsbuild commented May 4, 2020

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-29626/15002

  • This PR adds an extra 60KB to repository

@cmsbuild
Copy link
Contributor

cmsbuild commented May 4, 2020

A new Pull Request was created by @werdmann for master.

It involves the following packages:

Alignment/OfflineValidation
RecoVertex/PrimaryVertexProducer

@perrotta, @slava77, @christopheralanwest, @tocheng, @cmsbuild, @tlampen, @pohsun can you please review it and eventually sign? Thanks.
@adewit, @makortel, @GiacomoSguazzoni, @rovere, @VinInn, @ebrondol, @tocheng, @tlampen, @mschrode, @mmusich, @dgulhan this is something you requested to watch as well.
@silviodonato, @dpiparo you are the release manager for this.

cms-bot commands are listed here

@perrotta
Copy link
Contributor

perrotta commented May 4, 2020

please test

@cmsbuild
Copy link
Contributor

cmsbuild commented May 4, 2020

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-run-pr-tests/5974/console Started: 2020/05/04 17:21

@cmsbuild
Copy link
Contributor

cmsbuild commented May 4, 2020

-1

Tested at: e8e9725

CMSSW: CMSSW_11_1_X_2020-05-04-1100
SCRAM_ARCH: slc7_amd64_gcc820
You can see the results of the tests here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-2fbd74/5974/summary.html

I found follow errors while testing this PR

Failed tests: ClangBuild

  • Clang:

I found compilation warning while trying to compile with clang. Command used:

USER_CUDA_FLAGS='--expt-relaxed-constexpr' USER_CXXFLAGS='-Wno-register -fsyntax-only' scram build -k -j 32 COMPILER='llvm compile'

See details on the summary page.

@cmsbuild
Copy link
Contributor

cmsbuild commented May 4, 2020

Comparison not run due to Build errors/Fireworks only changes/No short matrix requested (RelVals and Igprof tests were also skipped)

@perrotta
Copy link
Contributor

please test

@cmsbuild
Copy link
Contributor

cmsbuild commented May 18, 2020

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-run-pr-tests/6368/console Started: 2020/05/18 11:12

@cmsbuild
Copy link
Contributor

+1
Tested at: eb543f0
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-2fbd74/6368/summary.html
CMSSW: CMSSW_11_1_X_2020-05-17-2300
SCRAM_ARCH: slc7_amd64_gcc820

@cmsbuild
Copy link
Contributor

Comparison job queued.

@cmsbuild
Copy link
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-2fbd74/6368/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 5319 differences found in the comparisons
  • DQMHistoTests: Total files compared: 35
  • DQMHistoTests: Total histograms compared: 2694466
  • DQMHistoTests: Total failures: 27789
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2666628
  • DQMHistoTests: Total skipped: 49
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 34 files compared)
  • Checked 150 log files, 16 edm output root files, 35 DQM output files

@perrotta
Copy link
Contributor

In the results of the last automatic comparisons there are apparently changed distributions of the differences with the baseline wrt the previous comparisons.
However, it looks like that also the baseline distributions have changed in between, and also the new differences are compatible with the "small event-by-event changes on individual vertices" which are mentioned in the PR description.
Taking also into account that the last commits were not expected to modify the overall behaviour of the code, I don't think that a new detailed validation is needed at this point. However, if you can quickly reproduce with the latest version of the code some of the validation plots already included in your slides, it would be nice to post them here for reference.

@perrotta
Copy link
Contributor

The data members of the struct's track_t and vertex_t do not fulfill the CMS naming code rules: single character names should be avoided even for local variables, except for loop indices; upper case initials should be avoided in class/struct data members.
I understand that this was already as such even before this PR. It could be profited of this PR to fix them once. Given we are already late in the release cycle, for me it can be ok both fixing those names here now, or just wait that this PR is merged and submit another cleaning one with it, Just let us know what you'd prefere: in case you prefere preparing a cleaning PR after this one is merged I think this can be even signed by reco since now.

@werdmann
Copy link
Contributor Author

I would prefer to fix the names later. I'm planning further development for the next release cycle.
I have launched jobs to reproduce the results from last Friday with this PR on top of CMSSW_11_1_0_pre7 and will post the results when available.

@werdmann
Copy link
Contributor Author

results for ttbar 200 PU equivalent to those presented in https://indico.cern.ch/event/918058/contributions/3859212/attachments/2039815/3415871/20200515_TrackingPOG_RecoAT_v2.pdf

good agreement between the old reconstruction and the new default (red symbols) and also
a less conservative parameter choice (green symbols).

merging-offlinePrimaryVertices
offlinePrimaryVertices-ttbar-splitting

clustering-time-offlinePrimaryVertices
clustering-time-offlinePrimaryVertices4D

@perrotta
Copy link
Contributor

+1

  • It improves the DA clustering time, which is evident in the attached performace reports and in the summary above, especially at large PU
  • Jenkins tests, pass and the differences shown there are compatible with small event-by-event changes on individual vertices
  • Validation posted above confirms that the overall quality of the reconstructed vertices and derived quantities is not affected

@tlampen
Copy link
Contributor

tlampen commented May 19, 2020

+1

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo (and backports should be raised in the release meeting by the corresponding L2)

@silviodonato
Copy link
Contributor

+1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants