Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initialize member data in PerigeeTrajectoryError #36895

Closed

Conversation

Dr15Jones
Copy link
Contributor

PR description:

This should fix the UBSAN warnings.

PR validation:

Code compiles.

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 4, 2022

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-36895/28169

  • This PR adds an extra 12KB to repository

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 4, 2022

A new Pull Request was created by @Dr15Jones (Chris Jones) for master.

It involves the following packages:

  • TrackingTools/TrajectoryParametrization (reconstruction)

@jpata, @cmsbuild, @clacaputo, @slava77 can you please review it and eventually sign? Thanks.
@felicepantaleo, @GiacomoSguazzoni, @JanFSchulte, @rovere, @VinInn, @bellan, @ebrondol, @lecriste, @gpetruc, @mmusich, @mtosi, @dgulhan this is something you requested to watch as well.
@perrotta, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@Dr15Jones
Copy link
Contributor Author

please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 5, 2022

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-7fd39e/22237/summary.html
COMMIT: 0e4b742
CMSSW: CMSSW_12_3_X_2022-02-04-1100/slc7_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/36895/22237/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 46
  • DQMHistoTests: Total histograms compared: 3766018
  • DQMHistoTests: Total failures: 7
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 3765988
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: -0.004 KiB( 45 files compared)
  • DQMHistoSizes: changed ( 312.0 ): -0.004 KiB MessageLogger/Warnings
  • Checked 193 log files, 42 edm output root files, 46 DQM output files
  • TriggerResults: no differences found

@jpata
Copy link
Contributor

jpata commented Feb 7, 2022

+reconstruction

  • technical

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 7, 2022

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

This should fix the UBSAN warnings.
@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 7, 2022

Pull request #36895 was updated. @jpata, @clacaputo, @slava77 can you please check and sign again.

@Dr15Jones
Copy link
Contributor Author

please test

@VinInn
Copy link
Contributor

VinInn commented Feb 7, 2022

Do not really understand this one.
Is this not a false positive?
and Why initialized to zero?

@Dr15Jones
Copy link
Contributor Author

This stems from the UBSAN IB

Is this not a false positive?

https://cmssdt.cern.ch/SDT/jenkins-artifacts/ubsan_logs/CMSSW_12_3_X_2022-02-05-1100/logs/5b/5b0b48448feee12e3fd3166c553fb58f/log

e.g.

1329.0/step3:TrackingTools/TrajectoryParametrization/interface/PerigeeTrajectoryError.h:14:7: runtime error: load of value 173, which is not a valid value for type 'bool'

Based on all the reports I've looked at, it appears the UBSAN only detects when bools are not properly initialized and doesn't do so for any other builtin types. The few cases I've more closely looked at happen when either the copy constructor or operator= are being called with the right hand side being a instance of the object which was default constructed. I assume that is what is happening here.

and Why initialized to zero?

I've found that people who do not know that C++ doesn't initialize builtin type member data assume it will be initialized to 0 (so I was just following that chain of thought).
You are free to suggest an alternative value.

@VinInn
Copy link
Contributor

VinInn commented Feb 7, 2022

The few cases I've more closely looked at happen when either the copy constructor or operator= are being called with the right hand side being a instance of the object which was default constructed. I assume that is what is happening here.

This is what I would call a "false positive" and I object to fix correct code only because a tool does not understand the logic.

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 7, 2022

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-7fd39e/22268/summary.html
COMMIT: b17a78f
CMSSW: CMSSW_12_3_X_2022-02-06-2300/slc7_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/36895/22268/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 46
  • DQMHistoTests: Total histograms compared: 3766018
  • DQMHistoTests: Total failures: 8
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3765988
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 45 files compared)
  • Checked 193 log files, 42 edm output root files, 46 DQM output files
  • TriggerResults: no differences found

@jpata
Copy link
Contributor

jpata commented Feb 8, 2022

So how do we want to proceed here?

Do tracking experts confirm the behavior is intended (and UBSAN should be suppressed here), or should this be investigated further?

Does UBSAN give some more information about where the uninitialized use is happening?

@Dr15Jones
Copy link
Contributor Author

Does UBSAN give some more information about where the uninitialized use is happening?

The actual UBSAN log does give a traceback. Unfortunately, only the addresses inside the library are given, not the actual symbol names. @smuzaffar any ideas on getting the symbols?

@smuzaffar
Copy link
Contributor

@Dr15Jones , I will enable -g option for UBSAN IBs to too if that gives more detail.

@smuzaffar
Copy link
Contributor

cms-sw/cmsdist#7608 should build UBSAN in debug mode now

@Dr15Jones
Copy link
Contributor Author

I did a bit of poking about.

An invalid PerigeeTrajectoryError is created in this constructor

TrajectoryStateClosestToPoint(const PerigeeTrajectoryParameters& perigeeParameters,
double pt,
const GlobalPoint& referencePoint,
const MagneticField* field)
: theField(field),
theRefPoint(referencePoint),
theParameters(perigeeParameters),
thePt(pt),
valid(true),
theFTSavailable(false),
errorIsAvailable(false) {}

then a call to

inline AlgebraicSymMatrix55 PerigeeLinearizedTrackState::predictedStateWeight(int& error) const {
if UNLIKELY (!jacobiansAvailable)
computeJacobians();
if (!thePredState.isValid()) {
error = 1;
return AlgebraicSymMatrix55();
}
return thePredState.perigeeError().weightMatrix(error);
}

can be made where thePredState has the uninitialized member data (since call to computeJacobians() which fills thePredState ultimately traced back to here

TrajectoryStateClosestToPoint::TrajectoryStateClosestToPoint(const FTS& originalFTS, const GlobalPoint& referencePoint)
: theFTS(originalFTS), theRefPoint(referencePoint), valid(true), theFTSavailable(true) {
try {
theParameters = PerigeeConversions::ftsToPerigeeParameters(originalFTS, referencePoint, thePt);
if (theFTS.hasError()) {
thePerigeeError = PerigeeConversions::ftsToPerigeeError(originalFTS);
errorIsAvailable = true;
} else {
errorIsAvailable = false;
}

)

which mean the returned info from the call to weightMatrix will be unknown.

This does seem to mean that inverseError should probably be initialized to 1 to match what was done above.

@jpata
Copy link
Contributor

jpata commented Feb 9, 2022

thanks, let's see what the tracking experts suggest here.
also tagging @cms-sw/tracking-pog-l2

@jpata
Copy link
Contributor

jpata commented Feb 16, 2022

So should we instead use CMS_SA_ALLOW here?

@makortel
Copy link
Contributor

So should we instead use CMS_SA_ALLOW here?

CMS_SA_ALLOW is recognized only by static analyzer, so won't do anything for UBSAN.

@jpata
Copy link
Contributor

jpata commented Feb 17, 2022

@VinInn would you have some time to suggest a fix here? I was not able to dig up if just replacing by 1 is fine here or not, sorry.

@VinInn
Copy link
Contributor

VinInn commented Feb 17, 2022

I would just leave it as it was...

@VinInn
Copy link
Contributor

VinInn commented Feb 17, 2022

or fix the logic of the user of TrajectoryStateClosestToPoint
if error is not available it should not be asked for...

@Dr15Jones
Copy link
Contributor Author

or fix the logic of the user of TrajectoryStateClosestToPoint if error is not available it should not be asked for...

I just happened to spot that problem. There could be addition cases where the uninitialized data is read and there could be more in the future as people change/add code.

@VinInn
Copy link
Contributor

VinInn commented Feb 17, 2022

How the class shall be used is clear: it has a method to ask if the error is available.
If people misuse it they get UB. By design.

@jpata
Copy link
Contributor

jpata commented Feb 17, 2022

It looks to me based on the above discussion that

  1. some action is needed to either remove or prevent the possibility of the UB (just thinking out loud, could it be better to abort the job rather than have undefined behaviour?)
  2. it's not trivial how to do it, and the tracking-pog should have a say in it
    so it would make sense perhaps to close this PR, and open an issue.

Any objections?

@VinInn
Copy link
Contributor

VinInn commented Feb 17, 2022 via email

@jpata
Copy link
Contributor

jpata commented Feb 18, 2022

-reconstruction

As I understand, the fix proposed here by Chris is not universally accepted, so the tracking POG may be in a position decide here if/how this is to be fixed. I found an older issue for this #35036 (comment), where I updated the status. There are a few more cases in tracking packages.

@Dr15Jones
Copy link
Contributor Author

under protest I will close this

@Dr15Jones Dr15Jones closed this Feb 22, 2022
@Dr15Jones Dr15Jones deleted the ubsanPerigeeTrajectoryError branch February 21, 2024 19:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants