Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid deref of Ref in copy constructor of PackedCandidate #32964

Merged
merged 2 commits into from Feb 25, 2021

Conversation

Dr15Jones
Copy link
Contributor

PR description:

Previously if the vertex had not been read from the original, a call to a copy constructor would cause the deref to happen.

The copy constructor was being called by ROOT during read back from a file. This meant while we were reading the branch holding the PackedCandidate we were trigger reading the branch holding the Vertexes.

PR validation:

The code compiles and the extended unit test passes.

Previously if the vertex had not been read from the original, a
call to a copy constructor would cause the deref to happen.
@Dr15Jones
Copy link
Contributor Author

Please test

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-32964/21199

  • This PR adds an extra 24KB to repository

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @Dr15Jones (Chris Jones) for master.

It involves the following packages:

DataFormats/PatCandidates

@perrotta, @jpata, @slava77 can you please review it and eventually sign? Thanks.
@gpetruc, @cbernet, @gouskos, @rovere, @hatakeyamak this is something you requested to watch as well.
@silviodonato, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-10b87b/13016/summary.html
COMMIT: 22f5521
CMSSW: CMSSW_11_3_X_2021-02-22-1100/slc7_amd64_gcc900
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/32964/13016/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 37
  • DQMHistoTests: Total histograms compared: 2750983
  • DQMHistoTests: Total failures: 7
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2750954
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 36 files compared)
  • Checked 156 log files, 37 edm output root files, 37 DQM output files

@slava77
Copy link
Contributor

slava77 commented Feb 24, 2021

Previously if the vertex had not been read from the original, a call to a copy constructor would cause the deref to happen.

The copy constructor was being called by ROOT during read back from a file. This meant while we were reading the branch holding the PackedCandidate we were trigger reading the branch holding the Vertexes.

is there a chance that a PackedCandidate(PackedCandidate &&iOther) is called or any of the assignment operators?

Comment on lines +208 to +212
dxy_(vertex_ ? iOther.dxy_ : 0),
dz_(vertex_ ? iOther.dz_ : 0),
dphi_(vertex_ ? iOther.dphi_ : 0),
deta_(vertex_ ? iOther.deta_ : 0),
dtrkpt_(vertex_ ? iOther.dtrkpt_ : 0),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doesn't look right; having a null vertex does not imply zero values in these parameters

void pat::PackedCandidate::unpackVtx() const {
reco::VertexRef pvRef = vertexRef();
dphi_ = int16_t(packedDPhi_) * 3.2f / std::numeric_limits<int16_t>::max(),
deta_ = MiniFloatConverter::float16to32(packedDEta_);
dtrkpt_ = MiniFloatConverter::float16to32(packedDTrkPt_);
dxy_ = MiniFloatConverter::float16to32(packedDxy_) / 100.;
dz_ = pvRef.isNonnull() ? MiniFloatConverter::float16to32(packedDz_) / 100.
: int16_t(packedDz_) * 40.f / std::numeric_limits<int16_t>::max();
Point pv = pvRef.isNonnull() ? pvRef->position() : Point();
float phi = p4_.load()->Phi() + dphi_, s = std::sin(phi), c = std::cos(phi);
auto vertex = std::make_unique<Point>(pv.X() - dxy_ * s,

I think that the unpacking of parameters not dependent on the value of the vertexRef should still happen.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uhm, on a second thought, perhaps not and the changes are OK, but that becomes a bit more obvious only after seeing that all cases to access the dxy_ through dtrkpt_ values rely on having unpackVtx called.

I see a path after this PR change that a candidate that comes out of this PackedCandidate(const PackedCandidate &iOther) with iOther.vertex_ == nullptr can get to e.g. a call to ::packVtx without a preceding unpackVtx (e.g. via ::setTrackProperties) and lead to a crash in

void pat::PackedCandidate::packVtx(bool unpackAfterwards) {
reco::VertexRef pvRef = vertexRef();
Point pv = pvRef.isNonnull() ? pvRef->position() : Point();
float dxPV = vertex_.load()->X() - pv.X(),

This looks like a maybeUnpackBoth() call is needed in ::setTrackProperties

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I made the change.

@Dr15Jones
Copy link
Contributor Author

is there a chance that a PackedCandidate(PackedCandidate &&iOther) is called or any of the assignment operators?

Those methods either already avoid reading the vertex_ or only do so if iOther.vertex_ is not null. Now they do read such values as iOther.dphi_ which are a race condition if iOther.vertex_ == nullptr. However, those values (which are not guaranteed to be reasonable in that case) will never be used as the object will have vertex_ == nullptr which will trigger unpacking.

This protects the call to pack() which requires the variables
have already been unpacked.
@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-32964/21238

  • This PR adds an extra 28KB to repository

@cmsbuild
Copy link
Contributor

Pull request #32964 was updated. @perrotta, @jpata, @cmsbuild, @slava77 can you please check and sign again.

@Dr15Jones
Copy link
Contributor Author

please test

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-10b87b/13064/summary.html
COMMIT: 85dfd14
CMSSW: CMSSW_11_3_X_2021-02-23-2300/slc7_amd64_gcc900
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/32964/13064/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 20 differences found in the comparisons
  • DQMHistoTests: Total files compared: 37
  • DQMHistoTests: Total histograms compared: 2750983
  • DQMHistoTests: Total failures: 12
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2750949
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 36 files compared)
  • Checked 156 log files, 37 edm output root files, 37 DQM output files

@slava77
Copy link
Contributor

slava77 commented Feb 25, 2021

+1

for #32964 85dfd14

  • code changes are in line with the PR description and the follow up review; the physics outputs should not change
  • jenkins tests pass and comparisons with the baseline show no relevant differences as expected
    • a few workflows like 140.56 have ExcessiveTime LogWarning in the PR tests for the same module in all 4 cases that I checked, with the message looking like
%MSG-e ExcessiveTime:  SiPixelClusterProducer:siPixelClustersPreSplitting@cpu  24-Feb-2021 16:51:38 CET Run: 326479 Event: 1394020
ExcessiveTime: Module used 614.385 seconds of time which exceeds the error threshold configured in the Timing Service of 600 seconds.

As I recall in the past similar symptoms were an indication of some issues in the Frontier infrastructure

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@silviodonato
Copy link
Contributor

+1

@cmsbuild cmsbuild merged commit 8943836 into cms-sw:master Feb 25, 2021
@Dr15Jones Dr15Jones deleted the fixCopyCtrPackedCandidate branch March 8, 2021 14:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants