RecoTracker/MkFitCore: factorize loops to allow gcc -msse3 to vectorize loops #37868
-code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37868/29816 ERROR: Build errors found during clang-tidy run.
-code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37868/29831 ERROR: Build errors found during clang-tidy run.
-code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37868/29833
Code check has found code style and quality issues which could be resolved by applying following patch(s)
please test
+1 Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-5bd708/24964/summary.html
+core
+reconstruction
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)
+1
Does this qualify for a backport to the production release for high-intensity pp collisions?
To first order, I would say yes.
please backport
Let me suggest again to try vdt
V.
On May 20, 2022, at 14:58, Dan Riley ***@***.***> wrote:
@gartung, just for my understanding, so the high number of differences that we saw was related to the vectorization of the trig functions?
@clacaputo -ffast-math also enables optimizations that can reduce precision, like reordering operations using associativity and using reciprocal approximations. Kalman fitters in single precision are known to have marginal stability so could be affected by those changes. It would be interesting to isolate the effect of just the trig vectorization, but looks to be non-trivial with gcc.
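To illustrate the reordering point: single-precision addition is not associative, so a compiler that is allowed to reassociate can change a result. The following is only a minimal sketch with hypothetical function names, assuming strict IEEE-754 single-precision evaluation (that is, compiled without -ffast-math); it is not taken from mkFit.

```cpp
// Hypothetical helpers: the same three-term sum, grouped two different ways.
float sumLeftFirst(float a, float b, float c) { return (a + b) + c; }
float sumRightFirst(float a, float b, float c) { return a + (b + c); }

// With a = 1e8f, b = -1e8f, c = 1.0f under strict IEEE-754 single precision:
//   sumLeftFirst  gives 1.0f, because a + b cancels exactly before c is added;
//   sumRightFirst gives 0.0f, because b + c rounds back to -1e8f
//   (the spacing between adjacent floats near 1e8 is 8).
```

A fitter that accumulates many such terms in single precision can therefore produce different output when the compiler is permitted to regroup them.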
See discussion in trackreco#93
Split large loops in PropagationMPlex into smaller loops, converting loop-local temporary floats into function-scope temporary arrays of floats. This allows gcc -msse3 to convert loops with floating-point operations into SIMD instructions.
Under EL8, use the system-provided mvec library, which implements vectorized trig functions. Under sl7, Intel's svml library must be linked to get the same performance.
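The loop splitting described above can be sketched roughly as follows. This is an illustration only, not the actual PropagationMPlex code: the names rotateFused, rotateSplit and kN are hypothetical, and whether a given loop is actually vectorized depends on the compiler version and flags.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical batch width, chosen for illustration only.
constexpr std::size_t kN = 8;

// Before: one loop body with scalar temporaries and trig calls mixed with arithmetic.
void rotateFused(const float* x, const float* y, const float* phi, float* outX, float* outY) {
  for (std::size_t n = 0; n < kN; ++n) {
    const float s = std::sin(phi[n]);
    const float c = std::cos(phi[n]);
    outX[n] = c * x[n] - s * y[n];
    outY[n] = s * x[n] + c * y[n];
  }
}

// After: the temporaries become function-scope arrays and the work is split
// into short loops, each simple enough to be a candidate for auto-vectorization.
void rotateSplit(const float* x, const float* y, const float* phi, float* outX, float* outY) {
  float s[kN];
  float c[kN];
  for (std::size_t n = 0; n < kN; ++n) {
    s[n] = std::sin(phi[n]);
  }
  for (std::size_t n = 0; n < kN; ++n) {
    c[n] = std::cos(phi[n]);
  }
  for (std::size_t n = 0; n < kN; ++n) {
    outX[n] = c[n] * x[n] - s[n] * y[n];
    outY[n] = s[n] * x[n] + c[n] * y[n];
  }
}
```

Both functions compute the same rotation; the split form trades a little stack space for loop bodies that contain either only a math-library call or only arithmetic.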
The data quoted below come from running the standalone mkFit executable. They show the improvements from this PR, the improvements from using the AVX and AVX2 instruction sets, and the improvements from using libmvec on el8.
No changes on sl7: gcc -msse3
No changes on sl7: gcc -mavx
No changes on sl7: gcc -mavx2 -mfma
With changes on sl7: gcc -msse3 -ffast-math -mveclibabi=svml -lsvml
With changes on sl7: gcc -mavx -ffast-math -mveclibabi=svml -lsvml
With changes on sl7: gcc -mavx2 -mfma
With changes on sl7: gcc -mavx2 -mfma -ffast-math
With changes on sl7: gcc -mavx2 -mfma -ffast-math -mveclibabi=svml -lsvml
With changes on el8: gcc -mavx2 -mfma -ffast-math
With changes on el8: gcc -mavx -ffast-math
With changes on el8: gcc -msse3 -ffast-math
With changes on el8: gcc -msse3 (no fast-math, no attribute(math-errno))
No changes on slc7: icc -mAVX2, which adds -lsvml
With changes on slc7: icc -mAVX2, which adds -lsvml