avoid single letter names and code cleanup #32764

werdmann · 2021-01-28T15:53:54Z

PR description:

This pull request is following up a request that arose in the context of PR #29626 to rename single letter variables which was deferred to later. It is done now in preparation of further development for cmssw_11_3.
May other variables were renamed as well to improve clarity and readability. No functionality changes have been introduced by this.
There were also some other cosmetic changes concerning comments and white-spaces.
Two unused/obsolete methods were removed from the structs in the header files.
A missing initialization of the variable betastop_ has been added. This only affects the results when Tmin was initialized with an illegal value, which of course should never happen.

PR validation:

The code was tested in CMSSW_11_3_0_pre2 against relvals and no difference was found to the original code.
runTheMatrix resulted in no failures.

cmsbuild · 2021-01-28T16:03:58Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-32764/20922

This PR adds an extra 32KB to repository

cmsbuild · 2021-01-28T16:04:21Z

A new Pull Request was created by @werdmann for master.

It involves the following packages:

RecoVertex/PrimaryVertexProducer

@perrotta, @jpata, @cmsbuild, @slava77 can you please review it and eventually sign? Thanks.
@makortel, @GiacomoSguazzoni, @JanFSchulte, @rovere, @VinInn, @ebrondol, @mtosi, @dgulhan this is something you requested to watch as well.
@silviodonato, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

perrotta · 2021-01-28T16:18:16Z

please test

cmsbuild · 2021-01-28T18:52:42Z

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-6948be/12589/summary.html
COMMIT: 0184912
CMSSW: CMSSW_11_3_X_2021-01-28-1100/slc7_amd64_gcc900
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/32764/12589/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

No significant changes to the logs found
Reco comparison results: 4 differences found in the comparisons
DQMHistoTests: Total files compared: 37
DQMHistoTests: Total histograms compared: 2716596
DQMHistoTests: Total failures: 6
DQMHistoTests: Total nulls: 1
DQMHistoTests: Total successes: 2716567
DQMHistoTests: Total skipped: 22
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: -0.004 KiB( 36 files compared)
DQMHistoSizes: changed ( 312.0 ): -0.004 KiB MessageLogger/Warnings
Checked 156 log files, 37 edm output root files, 37 DQM output files

slava77 · 2021-01-28T19:04:41Z

@werdmann
I see that in a few places __restrict__ was added.
Did it have any positive input on vectorization of the code? Did you check the CPU time used by vertexing?

werdmann · 2021-01-28T19:39:41Z

@werdmann
I see that in a few places __restrict__ was added.
Did it have any positive input on vectorization of the code? Did you check the CPU time used by vertexing?
@slava77 I only noticed in the process of synchronizing Z and ZT clustering that one of them used restrict and the other did not.
I will address CPU time in another PR soon.

perrotta

Since you are cleaning this code, please also take what follows into account.

Moreover:

The evalF method defined in

cmssw/RecoVertex/PrimaryVertexProducer/interface/DAClusterizerInZ_vect.h

Line 201 in 0184912

double evalF(const double beta, track_t const &tks, vertex_t const &v) const;

is never used: can it be removed, then? Otherwise, please at least get rid of the thread-unsafe cout in it
Even though I can't imagine that classes with similar names will exist somewhere else in CMSSW, the include guards should better contain the whole file path name, i.e. (e.g.)
- RecoVertex_PrimaryVertexProducer_DAClusterizerInZT_vect_h
- RecoVertex_PrimaryVertexProducer_DAClusterizerInZ_vect_h

perrotta · 2021-02-02T09:19:44Z

RecoVertex/PrimaryVertexProducer/interface/DAClusterizerInZT_vect.h

+    std::vector<double> tpca_vec;                  // t-coordinate at point of closest approach to the beamline
+    std::vector<double> dz2_vec;                   // square of the error of z(pca)
+    std::vector<double> dt2_vec;                   // square of the error of t(pca)
+    std::vector<double> Z_sum_vec;                 // track contribution to the partition function, Z


lower case initials for data members

perrotta · 2021-02-02T09:43:23Z

RecoVertex/PrimaryVertexProducer/interface/DAClusterizerInZT_vect.h

          break;
      }
-      insertItem(k, z, t, pk, tks);
+      insertItem(k, zvtx, tvtx, rho, tks);
      return k;
    }

    void debugOut() {


Also this debugOut() method doesn't look being used anywhere

perrotta · 2021-02-02T09:48:43Z

RecoVertex/PrimaryVertexProducer/interface/DAClusterizerInZ_vect.h

-    }
+    std::vector<double> zpca_vec;                  // z-coordinate at point of closest approach to the beamline
+    std::vector<double> dz2_vec;                   // square of the error of z(pca)
+    std::vector<double> Z_sum_vec;                 // track contribution to the partition function, Z


lower case initials for data members

perrotta · 2021-02-02T09:48:58Z

RecoVertex/PrimaryVertexProducer/interface/DAClusterizerInZ_vect.h

+    double *__restrict__ zpca;
+    double *__restrict__ dz2;
+    double *__restrict__ tkwt;
+    double *__restrict__ Z_sum;


lower case initials for data members

perrotta · 2021-02-02T09:49:21Z

RecoVertex/PrimaryVertexProducer/interface/DAClusterizerInZT_vect.h

+    double *__restrict__ dz2;
+    double *__restrict__ dt2;
+    double *__restrict__ tkwt;
+    double *__restrict__ Z_sum;


lower case initials for data members

perrotta · 2021-02-02T10:10:44Z

RecoVertex/PrimaryVertexProducer/src/DAClusterizerInZ_vect.cc

@@ -24,7 +24,7 @@ DAClusterizerInZ_vect::DAClusterizerInZ_vect(const edm::ParameterSet& conf) {
  maxIterations_ = 1000;
  mintrkweight_ = 0.5;

-  // configurable debug outptut debug output
+  // configurable debug outptut
  verbose_ = conf.getUntrackedParameter<bool>("verbose", false);


verbose_ is only used at

cmssw/RecoVertex/PrimaryVertexProducer/src/DAClusterizerInZT_vect.cc

Lines 50 to 68 in 0184912

if (verbose_) {

std::cout << "DAClusterizerInZT_vect: mintrkweight = " << mintrkweight_ << std::endl;

std::cout << "DAClusterizerInZT_vect: uniquetrkweight = " << uniquetrkweight_ << std::endl;

std::cout << "DAClusterizerInZT_vect: zmerge = " << zmerge_ << std::endl;

std::cout << "DAClusterizerInZT_vect: tmerge = " << tmerge_ << std::endl;

std::cout << "DAClusterizerInZT_vect: Tmin = " << minT << std::endl;

std::cout << "DAClusterizerInZT_vect: Tpurge = " << purgeT << std::endl;

std::cout << "DAClusterizerInZT_vect: Tstop = " << stopT << std::endl;

std::cout << "DAClusterizerInZT_vect: vertexSize = " << vertexSize_ << std::endl;

std::cout << "DAClusterizerInZT_vect: vertexSizeTime = " << vertexSizeTime_ << std::endl;

std::cout << "DAClusterizerInZT_vect: coolingFactor = " << coolingFactor_ << std::endl;

std::cout << "DAClusterizerInZT_vect: d0CutOff = " << d0CutOff_ << std::endl;

std::cout << "DAClusterizerInZT_vect: dzCutOff = " << dzCutOff_ << std::endl;

std::cout << "DAClusterizerInZT_vect: dtCutoff = " << dtCutOff_ << std::endl;

std::cout << "DAClusterizerInZT_vect: zrange = " << sel_zrange_ << std::endl;

std::cout << "DAClusterizerinZT_vect: convergence mode = " << convergence_mode_ << std::endl;

std::cout << "DAClusterizerinZT_vect: delta_highT = " << delta_highT_ << std::endl;

std::cout << "DAClusterizerinZT_vect: delta_lowT = " << delta_lowT_ << std::endl;

}

The same functionality could be replaced, in a more thread compliant way, by using the DEBUG / DEBUG_LEVEL structure which is used everywhere else in this code

The very same apply to the other file DAClusterizerInZT_vect.cc

perrotta · 2021-02-02T10:24:26Z

RecoVertex/PrimaryVertexProducer/src/DAClusterizerInZT_vect.cc

+      t_tkwt = 1. / (1. + local_exp(std::pow(atIP.value() / atIP.error(), 2) -
+                                    std::pow(d0CutOff_, 2)));  // reduce weight for high ip tracks
+      if (edm::isNotFinite(t_tkwt) || t_tkwt < std::numeric_limits<double>::epsilon()) {
+        std::cout << "DAClusterizerInZT_vect.fill rejected track t_tkwt " << t_tkwt << std::endl;


Remove all (multi-thread unsafe) cout's from this fill() method: alternatively either replace them with some LogMessage's, or protect them with a DEBUG flag

perrotta · 2021-02-02T15:58:25Z

The two (longish) methods DAClusterizerInZ_vect::updateTc and DAClusterizerInZ_vect::update are pretty similar, only differing for updating swE in the "Tc" case, and not in the other one.
Some simplification that avoids copy/pasting, and all related risk of making mistakes during possible updates, could be obtained by having a single method with an additional (bool?) parameter that rules the two-lines update of swE

werdmann · 2021-02-02T18:23:58Z

The two (longish) methods DAClusterizerInZ_vect::updateTc and DAClusterizerInZ_vect::update are pretty similar, only differing for updating swE in the "Tc" case, and not in the other one.
Some simplification that avoids copy/pasting, and all related risk of making mistakes during possible updates, could be obtained by having a single method with an additional (bool?) parameter that rules the two-lines update of swE

That is of course the better solution for the code. The if statement would be inside the loops. Can someone comment on whether this will somehow impact vectorization?
Thanks

makortel · 2021-02-02T19:56:07Z

The if statement would be inside the loops. Can someone comment on whether this will somehow impact vectorization?

In leading order adding a runtime if inside a loop can prevent vectorization. In sub-leading order compiler may be able to implement the if with mask and continue to vectorize, but it would have to be checked from the generated code. By quick look in this case the "whether or not to update swE" probably could be given as a template argument, in which case one could use compile-time if constexpr that should lead to same behavior as now with explicit two functions.

perrotta · 2021-02-03T07:26:58Z

The if statement would be inside the loops. Can someone comment on whether this will somehow impact vectorization?

In leading order adding a runtime if inside a loop can prevent vectorization. In sub-leading order compiler may be able to implement the if with mask and continue to vectorize, but it would have to be checked from the generated code. By quick look in this case the "whether or not to update swE" probably could be given as a template argument, in which case one could use compile-time if constexpr that should lead to same behavior as now with explicit two functions.

Thank you @makortel!

Wouldn't that work, perhaps another possibility could be to move the if outside the loop, for example:

    // auto-vectorized
    if (Tc) {
      for (unsigned int k = kmin; k < kmax; ++k) {
        vertices.se[k] += vertices.exp[k] * (tmp_trk_tkwt * o_trk_Z_sum);
        auto w = vertices.rho[k] * vertices.exp[k] * (tmp_trk_tkwt * o_trk_Z_sum * o_trk_dz2);
        vertices.sw[k] += w;
        vertices.swz[k] += w * tmp_trk_z;
      }
   else  {
      for (unsigned int k = kmin; k < kmax; ++k) {
        vertices.se[k] += vertices.exp[k] * (tmp_trk_tkwt * o_trk_Z_sum);
        auto w = vertices.rho[k] * vertices.exp[k] * (tmp_trk_tkwt * o_trk_Z_sum * o_trk_dz2);
        vertices.sw[k] += w;
        vertices.swz[k] += w * tmp_trk_z;
        vertices.swE[k] += w * vertices.exp_arg[k] * obeta;
      }
   }

Could this also work (even if more naif)?

makortel · 2021-02-03T14:12:21Z

Could this also work (even if more naif)?

Yes, that should also work (with the price of some code duplication).

werdmann · 2021-02-08T10:30:06Z

Thank you for the feedback and apologies for the long silence. I verified that Andreas proposal incurs no measurable cpu time penalty and will implement all suggestions asap.

cmsbuild · 2021-02-12T18:03:05Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-32764/21130

This PR adds an extra 20KB to repository

cmsbuild · 2021-02-12T18:03:31Z

Pull request #32764 was updated. @perrotta, @jpata, @cmsbuild, @slava77 can you please check and sign again.

perrotta · 2021-02-12T18:32:15Z

please test
(Thank you Wolfran... even if the updates applied here does not seem to correspond to the diff you posted yesterday... but let see how do the test look like)

werdmann · 2021-02-12T20:59:38Z

@perotta you are right. in addition to what I posted yesterday it addresses your comment #32764 (comment) by moving the multiplication with that factor to a different loop where it needs to be executed only nvertex times instead of ntrack times.

cmsbuild · 2021-02-12T21:06:04Z

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-6948be/12869/summary.html
COMMIT: fec4cbd
CMSSW: CMSSW_11_3_X_2021-02-12-1100/slc7_amd64_gcc900
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/32764/12869/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

@slava77 comparisons for the following workflows were not done due to missing matrix map:

/build/cmsbld/jenkins/workspace/compare-root-files-short-matrix/data/PR-6948be/11634.911_TTbar_14TeV+2021_DD4hep+TTbar_14TeV_TuneCP5_GenSim+Digi+Reco+HARVEST+ALCA

Summary:

No significant changes to the logs found
Reco comparison results: 5 differences found in the comparisons
DQMHistoTests: Total files compared: 37
DQMHistoTests: Total histograms compared: 2750846
DQMHistoTests: Total failures: 11
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 2750813
DQMHistoTests: Total skipped: 22
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 36 files compared)
Checked 156 log files, 37 edm output root files, 37 DQM output files

perrotta · 2021-02-13T08:00:08Z

In DAClusterizerInZT_vect.cc the new update(...) function corresponds to the previous one when updateTc=true. Therefore, all calls should add as argument the corresponding boolean parameter as true. However, I see that still L662, as well as L1211 and L1231, use the default (false) argument: was it done on purpose, or just forgotten?

At the contrary in DAClusterizerInZ_vect.cc all new update() calls now correspond to what is in the version of the code currently in CMSSW

perrotta · 2021-02-13T08:05:38Z

There are a couple of differences in DQM plots for wfs 23234 and 28234. They look like just as possible numerical instabilities, but they could still be correlated with what implemented here. Eg. for wf 28234:

werdmann · 2021-02-13T09:53:46Z

In DAClusterizerInZT_vect.cc the new update(...) function corresponds to the previous one when updateTc=true. Therefore, all calls should add as argument the corresponding boolean parameter as true. However, I see that still L662, as well as L1211 and L1231, use the default (false) argument: was it done on purpose, or just forgotten?

This is done on purpose to avoid unnecessary arithmetic when an updated Tc is not needed. The ZT clusterizer only uses Tc for splitting, so it only needs to be done when a subsequent split() is possible.

At the contrary in DAClusterizerInZ_vect.cc all new update() calls now correspond to what is in the version of the code currently in CMSSW

Yes. In contrast to the ZT clusterizer, which before this PR did not distinguish between update and updateTc and simply effectively did updateTc all the time, the Z clusterizer already had two versions of update and the only change was to replace update() with update(..,false) and updateTc() with update(..., true). The complication was, that there was a place where strictly speaking updateTc should have been called, but wasn't. The last commits go back to that behaviour, although it could be considered a bug (albeit one that has only small consequences if the swE sum is not reset during the update(...,false)).

perrotta · 2021-02-14T19:46:10Z

+1

Technical cleanup of the c++ code
Jenkins tests pass

cmsbuild · 2021-02-14T19:46:35Z

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

qliphy · 2021-02-15T01:17:33Z

+1

perrotta · 2021-02-15T06:40:33Z

@werdmann : since the "technical" PR is now merged, please prepare the "fix" one based on the first IB that will contain this PR (quite likely this afternoon CMSSW_11_3_X_2021-02-15-1100)

avoid single letter names and code cleanup

0184912

cmsbuild added this to the CMSSW_11_3_X milestone Jan 28, 2021

cmsbuild added code-checks-pending orp-pending pending-signatures reconstruction-pending tests-pending labels Jan 28, 2021

cmsbuild added code-checks-approved and removed code-checks-pending labels Jan 28, 2021

cmsbuild added tests-started and removed tests-pending labels Jan 28, 2021

cmsbuild added tests-approved and removed tests-started labels Jan 28, 2021

perrotta reviewed Feb 2, 2021

View reviewed changes

implemented pr recommendations

aa450b5

cmsbuild added code-checks-pending and removed code-checks-approved tests-approved labels Feb 9, 2021

cmsbuild added code-checks-pending tests-pending labels Feb 12, 2021

cmsbuild added code-checks-approved and removed code-checks-pending labels Feb 12, 2021

cmsbuild added tests-started and removed tests-pending labels Feb 12, 2021

cmsbuild added tests-approved and removed tests-started labels Feb 12, 2021

cmsbuild added fully-signed reconstruction-approved and removed pending-signatures reconstruction-pending labels Feb 14, 2021

cmsbuild added orp-approved and removed orp-pending labels Feb 15, 2021

cmsbuild merged commit 37bbe7c into cms-sw:master Feb 15, 2021

werdmann mentioned this pull request Feb 16, 2021

add missing Tc updates before cluster merging #32924

Merged

werdmann deleted the 2021-daclusterizer-code-cosmetics branch February 17, 2021 13:31

perrotta mentioned this pull request Feb 18, 2021

merge or unify DAClusterizerInZT_vect and DAClusterizerInZ_vect #20417

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

avoid single letter names and code cleanup #32764

avoid single letter names and code cleanup #32764

werdmann commented Jan 28, 2021

cmsbuild commented Jan 28, 2021

cmsbuild commented Jan 28, 2021

perrotta commented Jan 28, 2021

cmsbuild commented Jan 28, 2021

slava77 commented Jan 28, 2021

werdmann commented Jan 28, 2021

perrotta left a comment

perrotta Feb 2, 2021

perrotta Feb 2, 2021

perrotta Feb 2, 2021

perrotta Feb 2, 2021

perrotta Feb 2, 2021

perrotta Feb 2, 2021

perrotta Feb 2, 2021

perrotta Feb 2, 2021

perrotta commented Feb 2, 2021 •

edited

werdmann commented Feb 2, 2021

makortel commented Feb 2, 2021

perrotta commented Feb 3, 2021

makortel commented Feb 3, 2021

werdmann commented Feb 8, 2021

cmsbuild commented Feb 12, 2021

cmsbuild commented Feb 12, 2021

perrotta commented Feb 12, 2021

werdmann commented Feb 12, 2021

cmsbuild commented Feb 12, 2021

perrotta commented Feb 13, 2021

perrotta commented Feb 13, 2021

werdmann commented Feb 13, 2021 •

edited

perrotta commented Feb 14, 2021

cmsbuild commented Feb 14, 2021

qliphy commented Feb 15, 2021

perrotta commented Feb 15, 2021

	if (verbose_) {
	std::cout << "DAClusterizerInZT_vect: mintrkweight = " << mintrkweight_ << std::endl;
	std::cout << "DAClusterizerInZT_vect: uniquetrkweight = " << uniquetrkweight_ << std::endl;
	std::cout << "DAClusterizerInZT_vect: zmerge = " << zmerge_ << std::endl;
	std::cout << "DAClusterizerInZT_vect: tmerge = " << tmerge_ << std::endl;
	std::cout << "DAClusterizerInZT_vect: Tmin = " << minT << std::endl;
	std::cout << "DAClusterizerInZT_vect: Tpurge = " << purgeT << std::endl;
	std::cout << "DAClusterizerInZT_vect: Tstop = " << stopT << std::endl;
	std::cout << "DAClusterizerInZT_vect: vertexSize = " << vertexSize_ << std::endl;
	std::cout << "DAClusterizerInZT_vect: vertexSizeTime = " << vertexSizeTime_ << std::endl;
	std::cout << "DAClusterizerInZT_vect: coolingFactor = " << coolingFactor_ << std::endl;
	std::cout << "DAClusterizerInZT_vect: d0CutOff = " << d0CutOff_ << std::endl;
	std::cout << "DAClusterizerInZT_vect: dzCutOff = " << dzCutOff_ << std::endl;
	std::cout << "DAClusterizerInZT_vect: dtCutoff = " << dtCutOff_ << std::endl;
	std::cout << "DAClusterizerInZT_vect: zrange = " << sel_zrange_ << std::endl;
	std::cout << "DAClusterizerinZT_vect: convergence mode = " << convergence_mode_ << std::endl;
	std::cout << "DAClusterizerinZT_vect: delta_highT = " << delta_highT_ << std::endl;
	std::cout << "DAClusterizerinZT_vect: delta_lowT = " << delta_lowT_ << std::endl;
	}

avoid single letter names and code cleanup #32764

avoid single letter names and code cleanup #32764

Conversation

werdmann commented Jan 28, 2021

PR description:

PR validation:

cmsbuild commented Jan 28, 2021

cmsbuild commented Jan 28, 2021

perrotta commented Jan 28, 2021

cmsbuild commented Jan 28, 2021

Comparison Summary

slava77 commented Jan 28, 2021

werdmann commented Jan 28, 2021

perrotta left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

perrotta commented Feb 2, 2021 • edited

werdmann commented Feb 2, 2021

makortel commented Feb 2, 2021

perrotta commented Feb 3, 2021

makortel commented Feb 3, 2021

werdmann commented Feb 8, 2021

cmsbuild commented Feb 12, 2021

cmsbuild commented Feb 12, 2021

perrotta commented Feb 12, 2021

werdmann commented Feb 12, 2021

cmsbuild commented Feb 12, 2021

Comparison Summary

perrotta commented Feb 13, 2021

perrotta commented Feb 13, 2021

werdmann commented Feb 13, 2021 • edited

perrotta commented Feb 14, 2021

cmsbuild commented Feb 14, 2021

qliphy commented Feb 15, 2021

perrotta commented Feb 15, 2021

perrotta commented Feb 2, 2021 •

edited

werdmann commented Feb 13, 2021 •

edited