New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
avoid redundant computations in StripCPE #14344
Conversation
A new Pull Request was created by @VinInn (Vincenzo Innocente) for CMSSW_8_1_X. It involves the following packages: DataFormats/Common @smuzaffar, @Dr15Jones, @cvuosalo, @cmsbuild, @slava77, @davidlange6 can you please review it and eventually sign? Thanks. cms-bot commands are list here #13028 |
@cmsbuild , please test |
The tests are being triggered in jenkins. |
@@ -96,6 +96,9 @@ namespace edmNew { | |||
return edm::Ref<typename HandleT::element_type, typename HandleT::element_type::value_type::value_type>( handle.id(), ci, ci - &(container().front()) ); | |||
} | |||
|
|||
unsigned int makeKeyOf(const_iterator ci) const { | |||
return ci - &(container().front()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not ci - container().begin();
? That would seem to me to be easier for a person to understand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you have to use front
because const_iterator
in this class is a pointer and not a std::vector<...>::const_iterator
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to make sure I am using pointers and not iterators.
it is consistent to the implementation few lines above
+1 |
Timing test in progress... |
SiStripDetId::SubDetector loc = SiStripDetId( det.geographicalId() ).subDetector(); | ||
|
||
LocalVector track = ltp.momentum(); | ||
track *= -p.thickness/track.z(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this division safe? Could track.z()
be zero? We've just been fixing some bugs caused by an invalid TrajectoryStateOnSurface. There might be more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was like this before.
In any case things should be checked in advance, not so deep.
Could please remind me which bug you refers to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@VinInn: I was thinking of #14306, which is not related to the code in this PR, but fixes a bug with a bad TrajectoryStateOnSurface. I have a faint recollection of seeing similar issues previously.
How many paths through the code lead to line 61 above? If you have an idea of how to find them all, we could check that the proper validation is being done in advance. But I would prefer bullet-proof code that never has a possibility to propagate NANs and nonsense values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to have a more in depth discussion about "FPE" and NaN particularly in the context of vectorization and the future move to vector hardware. We cannot afford to protect each single operation. The current accepted wisdom is to let NaN to propagate and trap it at very high level. Maybe we need to invite an expert and give us a lecture on how one manage this type of issues in HPC.
btw #14306 has nothing to do with NaN or "FPE" is just a trajectory not reaching the target.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On 5/6/16 8:44 AM, Vincenzo Innocente wrote:
In RecoLocalTracker/SiStripRecHitConverter/interface/StripCPE.h
#14344 (comment):
- SiStripDetId::SubDetector loc; float afullProjection; float corr;
- };
- virtual StripClusterParameterEstimator::LocalValues
- localParameters( const SiStripCluster& cl, AlgoParam const & ap) const {
- return std::make_pair(LocalPoint(), LocalError());
- }
- AlgoParam getAlgoParam(const GeomDetUnit& det, const LocalTrajectoryParameters & ltp) const {
- StripCPE::Param const & p = param(det);
- SiStripDetId::SubDetector loc = SiStripDetId( det.geographicalId() ).subDetector();
- LocalVector track = ltp.momentum();
- track *= -p.thickness/track.z();
I think we need to have a more in depth discussion about "FPE" and NaN
particularly in the context of vectorization and the future move to
vector hardware. We cannot afford to protect each single operation. The
current accepted wisdom is to let NaN to propagate and trap it at very
high level.
"A very high level" better be the output of the module or at worst
output of a sequence of modules
that has consumable output downstream.
The problem is more complex for utility/tools which are used in many places.
A fraction of users of the code may not care about execution speed much.
Maybe for that case it's practical to add the checks to avoid FPE
(template to hand over FPE-safe and unsafe interface?).
Maybe we need to invite an expert and give us a lecture on
how one manage this type of issues in HPC.
Do you have a name in mind?
Maybe send a suggestion by email.
—
You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub
https://github.com/cms-sw/cmssw/pull/14344/files/e77be1aadab9d37f1baf957e190b179f95a1fc39#r62294072
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very High level is before storing in the event.
whatever happen in the meantime is irrelevant (including FPEs)
In any case this code was like this before, so this is not the place to argue about it.
We need to investigate. Somebody from Intel or NVidia or a supercomputing center...
In any case it is something that goes beyond CMS
@VinInn: You said no regressions expected, but my test shows numerous small differences. I ran workflow 50202.0_TTbar_13+TTbar_13INPUT+DIGIUP15_PU50 with 70 events against baseline CMSSW_8_1_X_2016-04-29-2300. The most notable differences I found are below. Are they to be expected? |
CPU timing measurements confirm the expected 10% improvement in timing. A test of 50202.0 as described above shows reduced times for tracking modules:
(Times exclude the first event.) By another measure:
|
The Jenkins test results show very numerous tiny differences for both the DQM and alternative comparisons for many workflows. |
For what concern the fraction of true pt I wil check with @makortel |
@cvuosalo , I can reproduce the observed differences with this small change in the current IB
we may even claim that the modified code is more precise than the original one |
The test for workflow 25202 (described above) also shows a roughly 5-10% improvement for tracking modules in the HLT step:
|
+1 Speeding up Strip CPE computations. The code changes are satisfactory. Jenkins tests against baseline CMSSW_8_1_X_2016-05-02-2300 show no significant differences but do show numerous tiny differences expected for the code changes. Extended tests of workflows 25202 and 50202 discussed above show similar tiny changes and confirm that a 10% speed-up of affected tracking modules has been achieved. |
This pull request is fully signed and it will be integrated in one of the next CMSSW_8_1_X IBs (tests are also fine). This pull request requires discussion in the ORP meeting before it's merged. @slava77, @davidlange6, @Degano, @smuzaffar |
+1 |
Major speed up in the computation of Strip CPE during pattern-recognition by avoiding redundant computations depending on track and detector only.
The net result is about 10% speed up for tracking in TTBAR PU35
No regression expected.