Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Branchless MVA #13143

Merged
merged 2 commits into from Feb 2, 2016
Merged

Branchless MVA #13143

merged 2 commits into from Feb 2, 2016

Conversation

VinInn
Copy link
Contributor

@VinInn VinInn commented Feb 1, 2016

avoid conditional loads to allow the compiler to emit branchless code
for the tracking MVA is a 35% speedup
regression tested with the code in 8c2964a

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 1, 2016

A new Pull Request was created by @VinInn (Vincenzo Innocente) for CMSSW_8_0_X.

It involves the following packages:

CondFormats/EgammaObjects

@diguida, @cerminar, @cmsbuild, @franzoni, @ggovi, @mmusich, @davidlange6 can you please review it and eventually sign? Thanks.
@istaslis, @apfeiffer1, @tocheng this is something you requested to watch as well.
@slava77, @Degano, @smuzaffar you are the release manager for this.

cms-bot commands are list here #13028

@VinInn
Copy link
Contributor Author

VinInn commented Feb 1, 2016

@cmsbuild, please test

purely technical. no regression expected of any sort
@bendavid please have a look

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 1, 2016

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/10882/console

do {
auto r = fRightIndices[index];
auto l = fLeftIndices[index];
index = vector[fCutIndices[index]] > fCutVals[index] ? r : l;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the compiler really produce worse code if these three lines were replaced by
index = vector[fCutIndices[index]] > fCutVals[index] ? fRightIndices[index] : fLeftIndices[index];

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes,
because no where is explicitly said that if
vector[fCutIndices[index]] > fCutVals[index] is true fLeftIndices[index] exists

obvious in
a = p ? p->k : 0
here you cannot evaluate the two options as if p il null the first will segfault

indeed a tree could be implemented in some strange way that not all right and left node exists for a given index...

@bendavid
Copy link
Contributor

bendavid commented Feb 1, 2016

Alright looks good to me.

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 1, 2016

@mmusich
Copy link
Contributor

mmusich commented Feb 1, 2016

+1

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 1, 2016

@VinInn
Copy link
Contributor Author

VinInn commented Feb 1, 2016

Jenkins seems to confirm absence of regression

@VinInn
Copy link
Contributor Author

VinInn commented Feb 2, 2016

@slava77 , did not notice you were absent from the official subscribers
Hope we can merge in pre6: it is a not negligible speedup in reco

p.s.
It is not normal that a major reco algorithm is inside CondFormats:
Formats should contain only data, not algos
(this is pretty common, many calib algos are hidden inside calib-objects)
@Dr15Jones , @davidlange6, @slava77 : maybe is NOT to late to act?

@davidlange6
Copy link
Contributor

is not waiting on @slava77 in any case.. @ggovi - any comments?

@VinInn
Copy link
Contributor Author

VinInn commented Feb 2, 2016

the point is that it SHOULD wait on @slava77 ....

@ggovi
Copy link
Contributor

ggovi commented Feb 2, 2016

+1

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 2, 2016

This pull request is fully signed and it will be integrated in one of the next CMSSW_8_0_X IBs (tests are also fine). This pull request requires discussion in the ORP meeting before it's merged. @slava77, @davidlange6, @Degano, @smuzaffar

@slava77
Copy link
Contributor

slava77 commented Feb 2, 2016

On 2/2/16 7:18 AM, Vincenzo Innocente wrote:

@slava77 https://github.com/slava77 , did not notice you were absent
from the official subscribers
Hope we can merge in pre6

I've seen this and the fact that there are no regressions.
As David mentioned, it's not waiting for me.


Reply to this email directly or view it on GitHub
#13143 (comment).

@ggovi
Copy link
Contributor

ggovi commented Feb 2, 2016

no change on persistent data attributes...

@slava77
Copy link
Contributor

slava77 commented Feb 2, 2016

@VinInn
How did you quantify the speedup and what was the denominator of that 35% (or, alternatively, what is the absolute decrease in CPU)?
All code here is inlined, so an igprof will not be too descriptive.

@VinInn
Copy link
Contributor Author

VinInn commented Feb 2, 2016

@slava77 , actually I used perf (I am writing a note on that)
in any case the 35% is from igprof on TrackMVAClassifierBase::produce
that was 3.7% of tracking (single muon runD) and now is 2.4%

@slava77
Copy link
Contributor

slava77 commented Feb 2, 2016

On 2/2/16 7:38 AM, Vincenzo Innocente wrote:

@slava77 https://github.com/slava77 , actually I used perf (I am
writing a note on that)
in any case the 35% is from igprof on TrackMVAClassifierBase::produce
that was 3.7% of tracking (single muon runD) and now is 2.4%

Is this a new version?
In 25202 in 800pre3 I see that TrackMVAClassifierBase is about 1% of the
total reco/aod part of event processing.


Reply to this email directly or view it on GitHub
#13143 (comment).

@VinInn
Copy link
Contributor Author

VinInn commented Feb 2, 2016

consistent: 3.7% of tracking at PU20, 1% of total at PU35

@davidlange6
Copy link
Contributor

+1

cmsbuild added a commit that referenced this pull request Feb 2, 2016
@cmsbuild cmsbuild merged commit 261b79a into cms-sw:CMSSW_8_0_X Feb 2, 2016
cmsbuild added a commit that referenced this pull request Feb 2, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants