Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

avoid single letter names and code cleanup #32764

Merged
merged 4 commits into from Feb 15, 2021

Conversation

werdmann
Copy link
Contributor

PR description:

This pull request is following up a request that arose in the context of PR #29626 to rename single letter variables which was deferred to later. It is done now in preparation of further development for cmssw_11_3.
May other variables were renamed as well to improve clarity and readability. No functionality changes have been introduced by this.
There were also some other cosmetic changes concerning comments and white-spaces.
Two unused/obsolete methods were removed from the structs in the header files.
A missing initialization of the variable betastop_ has been added. This only affects the results when Tmin was initialized with an illegal value, which of course should never happen.

PR validation:

The code was tested in CMSSW_11_3_0_pre2 against relvals and no difference was found to the original code.
runTheMatrix resulted in no failures.

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-32764/20922

  • This PR adds an extra 32KB to repository

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @werdmann for master.

It involves the following packages:

RecoVertex/PrimaryVertexProducer

@perrotta, @jpata, @cmsbuild, @slava77 can you please review it and eventually sign? Thanks.
@makortel, @GiacomoSguazzoni, @JanFSchulte, @rovere, @VinInn, @ebrondol, @mtosi, @dgulhan this is something you requested to watch as well.
@silviodonato, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@perrotta
Copy link
Contributor

please test

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-6948be/12589/summary.html
COMMIT: 0184912
CMSSW: CMSSW_11_3_X_2021-01-28-1100/slc7_amd64_gcc900
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/32764/12589/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 37
  • DQMHistoTests: Total histograms compared: 2716596
  • DQMHistoTests: Total failures: 6
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 2716567
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: -0.004 KiB( 36 files compared)
  • DQMHistoSizes: changed ( 312.0 ): -0.004 KiB MessageLogger/Warnings
  • Checked 156 log files, 37 edm output root files, 37 DQM output files

@slava77
Copy link
Contributor

slava77 commented Jan 28, 2021

@werdmann
I see that in a few places __restrict__ was added.
Did it have any positive input on vectorization of the code? Did you check the CPU time used by vertexing?

@werdmann
Copy link
Contributor Author

@werdmann
I see that in a few places __restrict__ was added.
Did it have any positive input on vectorization of the code? Did you check the CPU time used by vertexing?
@slava77 I only noticed in the process of synchronizing Z and ZT clustering that one of them used restrict and the other did not.
I will address CPU time in another PR soon.

Copy link
Contributor

@perrotta perrotta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you are cleaning this code, please also take what follows into account.

Moreover:

  • The evalF method defined in
    double evalF(const double beta, track_t const &tks, vertex_t const &v) const;
    is never used: can it be removed, then? Otherwise, please at least get rid of the thread-unsafe cout in it
  • Even though I can't imagine that classes with similar names will exist somewhere else in CMSSW, the include guards should better contain the whole file path name, i.e. (e.g.)
    • RecoVertex_PrimaryVertexProducer_DAClusterizerInZT_vect_h
    • RecoVertex_PrimaryVertexProducer_DAClusterizerInZ_vect_h

std::vector<double> tpca_vec; // t-coordinate at point of closest approach to the beamline
std::vector<double> dz2_vec; // square of the error of z(pca)
std::vector<double> dt2_vec; // square of the error of t(pca)
std::vector<double> Z_sum_vec; // track contribution to the partition function, Z
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lower case initials for data members

break;
}
insertItem(k, z, t, pk, tks);
insertItem(k, zvtx, tvtx, rho, tks);
return k;
}

void debugOut() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also this debugOut() method doesn't look being used anywhere

}
std::vector<double> zpca_vec; // z-coordinate at point of closest approach to the beamline
std::vector<double> dz2_vec; // square of the error of z(pca)
std::vector<double> Z_sum_vec; // track contribution to the partition function, Z
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lower case initials for data members

double *__restrict__ zpca;
double *__restrict__ dz2;
double *__restrict__ tkwt;
double *__restrict__ Z_sum;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lower case initials for data members

double *__restrict__ dz2;
double *__restrict__ dt2;
double *__restrict__ tkwt;
double *__restrict__ Z_sum;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lower case initials for data members

@@ -24,7 +24,7 @@ DAClusterizerInZ_vect::DAClusterizerInZ_vect(const edm::ParameterSet& conf) {
maxIterations_ = 1000;
mintrkweight_ = 0.5;

// configurable debug outptut debug output
// configurable debug outptut
verbose_ = conf.getUntrackedParameter<bool>("verbose", false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

verbose_ is only used at

if (verbose_) {
std::cout << "DAClusterizerInZT_vect: mintrkweight = " << mintrkweight_ << std::endl;
std::cout << "DAClusterizerInZT_vect: uniquetrkweight = " << uniquetrkweight_ << std::endl;
std::cout << "DAClusterizerInZT_vect: zmerge = " << zmerge_ << std::endl;
std::cout << "DAClusterizerInZT_vect: tmerge = " << tmerge_ << std::endl;
std::cout << "DAClusterizerInZT_vect: Tmin = " << minT << std::endl;
std::cout << "DAClusterizerInZT_vect: Tpurge = " << purgeT << std::endl;
std::cout << "DAClusterizerInZT_vect: Tstop = " << stopT << std::endl;
std::cout << "DAClusterizerInZT_vect: vertexSize = " << vertexSize_ << std::endl;
std::cout << "DAClusterizerInZT_vect: vertexSizeTime = " << vertexSizeTime_ << std::endl;
std::cout << "DAClusterizerInZT_vect: coolingFactor = " << coolingFactor_ << std::endl;
std::cout << "DAClusterizerInZT_vect: d0CutOff = " << d0CutOff_ << std::endl;
std::cout << "DAClusterizerInZT_vect: dzCutOff = " << dzCutOff_ << std::endl;
std::cout << "DAClusterizerInZT_vect: dtCutoff = " << dtCutOff_ << std::endl;
std::cout << "DAClusterizerInZT_vect: zrange = " << sel_zrange_ << std::endl;
std::cout << "DAClusterizerinZT_vect: convergence mode = " << convergence_mode_ << std::endl;
std::cout << "DAClusterizerinZT_vect: delta_highT = " << delta_highT_ << std::endl;
std::cout << "DAClusterizerinZT_vect: delta_lowT = " << delta_lowT_ << std::endl;
}

The same functionality could be replaced, in a more thread compliant way, by using the DEBUG / DEBUG_LEVEL structure which is used everywhere else in this code

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The very same apply to the other file DAClusterizerInZT_vect.cc

t_tkwt = 1. / (1. + local_exp(std::pow(atIP.value() / atIP.error(), 2) -
std::pow(d0CutOff_, 2))); // reduce weight for high ip tracks
if (edm::isNotFinite(t_tkwt) || t_tkwt < std::numeric_limits<double>::epsilon()) {
std::cout << "DAClusterizerInZT_vect.fill rejected track t_tkwt " << t_tkwt << std::endl;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove all (multi-thread unsafe) cout's from this fill() method: alternatively either replace them with some LogMessage's, or protect them with a DEBUG flag

@perrotta
Copy link
Contributor

perrotta commented Feb 2, 2021

The two (longish) methods DAClusterizerInZ_vect::updateTc and DAClusterizerInZ_vect::update are pretty similar, only differing for updating swE in the "Tc" case, and not in the other one.
Some simplification that avoids copy/pasting, and all related risk of making mistakes during possible updates, could be obtained by having a single method with an additional (bool?) parameter that rules the two-lines update of swE

@werdmann
Copy link
Contributor Author

werdmann commented Feb 2, 2021

The two (longish) methods DAClusterizerInZ_vect::updateTc and DAClusterizerInZ_vect::update are pretty similar, only differing for updating swE in the "Tc" case, and not in the other one.
Some simplification that avoids copy/pasting, and all related risk of making mistakes during possible updates, could be obtained by having a single method with an additional (bool?) parameter that rules the two-lines update of swE

That is of course the better solution for the code. The if statement would be inside the loops. Can someone comment on whether this will somehow impact vectorization?
Thanks

@makortel
Copy link
Contributor

makortel commented Feb 2, 2021

The if statement would be inside the loops. Can someone comment on whether this will somehow impact vectorization?

In leading order adding a runtime if inside a loop can prevent vectorization. In sub-leading order compiler may be able to implement the if with mask and continue to vectorize, but it would have to be checked from the generated code. By quick look in this case the "whether or not to update swE" probably could be given as a template argument, in which case one could use compile-time if constexpr that should lead to same behavior as now with explicit two functions.

@perrotta
Copy link
Contributor

perrotta commented Feb 3, 2021

The if statement would be inside the loops. Can someone comment on whether this will somehow impact vectorization?

In leading order adding a runtime if inside a loop can prevent vectorization. In sub-leading order compiler may be able to implement the if with mask and continue to vectorize, but it would have to be checked from the generated code. By quick look in this case the "whether or not to update swE" probably could be given as a template argument, in which case one could use compile-time if constexpr that should lead to same behavior as now with explicit two functions.

Thank you @makortel!

Wouldn't that work, perhaps another possibility could be to move the if outside the loop, for example:

    // auto-vectorized
    if (Tc) {
      for (unsigned int k = kmin; k < kmax; ++k) {
        vertices.se[k] += vertices.exp[k] * (tmp_trk_tkwt * o_trk_Z_sum);
        auto w = vertices.rho[k] * vertices.exp[k] * (tmp_trk_tkwt * o_trk_Z_sum * o_trk_dz2);
        vertices.sw[k] += w;
        vertices.swz[k] += w * tmp_trk_z;
      }
   else  {
      for (unsigned int k = kmin; k < kmax; ++k) {
        vertices.se[k] += vertices.exp[k] * (tmp_trk_tkwt * o_trk_Z_sum);
        auto w = vertices.rho[k] * vertices.exp[k] * (tmp_trk_tkwt * o_trk_Z_sum * o_trk_dz2);
        vertices.sw[k] += w;
        vertices.swz[k] += w * tmp_trk_z;
        vertices.swE[k] += w * vertices.exp_arg[k] * obeta;
      }
   }

Could this also work (even if more naif)?

@makortel
Copy link
Contributor

makortel commented Feb 3, 2021

Could this also work (even if more naif)?

Yes, that should also work (with the price of some code duplication).

@werdmann
Copy link
Contributor Author

werdmann commented Feb 8, 2021

Thank you for the feedback and apologies for the long silence. I verified that Andreas proposal incurs no measurable cpu time penalty and will implement all suggestions asap.

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-32764/21130

  • This PR adds an extra 20KB to repository

@cmsbuild
Copy link
Contributor

Pull request #32764 was updated. @perrotta, @jpata, @cmsbuild, @slava77 can you please check and sign again.

@perrotta
Copy link
Contributor

please test
(Thank you Wolfran... even if the updates applied here does not seem to correspond to the diff you posted yesterday... but let see how do the test look like)

@werdmann
Copy link
Contributor Author

@perotta you are right. in addition to what I posted yesterday it addresses your comment #32764 (comment) by moving the multiplication with that factor to a different loop where it needs to be executed only nvertex times instead of ntrack times.

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-6948be/12869/summary.html
COMMIT: fec4cbd
CMSSW: CMSSW_11_3_X_2021-02-12-1100/slc7_amd64_gcc900
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/32764/12869/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

@slava77 comparisons for the following workflows were not done due to missing matrix map:

  • /build/cmsbld/jenkins/workspace/compare-root-files-short-matrix/data/PR-6948be/11634.911_TTbar_14TeV+2021_DD4hep+TTbar_14TeV_TuneCP5_GenSim+Digi+Reco+HARVEST+ALCA

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 5 differences found in the comparisons
  • DQMHistoTests: Total files compared: 37
  • DQMHistoTests: Total histograms compared: 2750846
  • DQMHistoTests: Total failures: 11
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2750813
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 36 files compared)
  • Checked 156 log files, 37 edm output root files, 37 DQM output files

@perrotta
Copy link
Contributor

In DAClusterizerInZT_vect.cc the new update(...) function corresponds to the previous one when updateTc=true. Therefore, all calls should add as argument the corresponding boolean parameter as true. However, I see that still L662, as well as L1211 and L1231, use the default (false) argument: was it done on purpose, or just forgotten?

At the contrary in DAClusterizerInZ_vect.cc all new update() calls now correspond to what is in the version of the code currently in CMSSW

@perrotta
Copy link
Contributor

There are a couple of differences in DQM plots for wfs 23234 and 28234. They look like just as possible numerical instabilities, but they could still be correlated with what implemented here. Eg. for wf 28234:
image
image

@werdmann
Copy link
Contributor Author

werdmann commented Feb 13, 2021

In DAClusterizerInZT_vect.cc the new update(...) function corresponds to the previous one when updateTc=true. Therefore, all calls should add as argument the corresponding boolean parameter as true. However, I see that still L662, as well as L1211 and L1231, use the default (false) argument: was it done on purpose, or just forgotten?

This is done on purpose to avoid unnecessary arithmetic when an updated Tc is not needed. The ZT clusterizer only uses Tc for splitting, so it only needs to be done when a subsequent split() is possible.

At the contrary in DAClusterizerInZ_vect.cc all new update() calls now correspond to what is in the version of the code currently in CMSSW

Yes. In contrast to the ZT clusterizer, which before this PR did not distinguish between update and updateTc and simply effectively did updateTc all the time, the Z clusterizer already had two versions of update and the only change was to replace update() with update(..,false) and updateTc() with update(..., true). The complication was, that there was a place where strictly speaking updateTc should have been called, but wasn't. The last commits go back to that behaviour, although it could be considered a bug (albeit one that has only small consequences if the swE sum is not reset during the update(...,false)).

@perrotta
Copy link
Contributor

+1

  • Technical cleanup of the c++ code
  • Jenkins tests pass

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@qliphy
Copy link
Contributor

qliphy commented Feb 15, 2021

+1

@cmsbuild cmsbuild merged commit 37bbe7c into cms-sw:master Feb 15, 2021
@perrotta
Copy link
Contributor

@werdmann : since the "technical" PR is now merged, please prepare the "fix" one based on the first IB that will contain this PR (quite likely this afternoon CMSSW_11_3_X_2021-02-15-1100)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants