perf: Remove x87 elliptic integral in solenoid B-field #1469

stephenswat · 2022-08-26T14:40:58Z

As it stands, the solenoid B-field code uses the boost::math::ellint_1 and boost::math::ellint_2 methods. For reasons which I do not understand, these functions use 80-bit x87 floating point numbers, which leads to a lot of conversion between x87 and SSE floating points. It also has the downside that x87 operations are generally slower then their SSE counterparts on modern CPUs.

The interesting thing is that Boost's method for deciding on the floating point precision seems to be quite arbitrary; even when the
functions are explicitly templated to use 64-bit double types, the internals of these functions still use 80-bit floating points.

The best solution I have found is to use a compiler preprocessor flag, BOOST_MATH_NO_LONG_DOUBLE_MATH_FUNCTIONS, which tells Boost not to use x87 floating points at all. Enabling this flag only in the solenoid B-field translation unit should be safe and improve performance. Very unscientifically benchmarked on my Intel i5-7300U CPU, the solenoid B-field benchmark with the old code takes 615.7±7.3µs per iteration, while the new code takes 353.7±7.0µs, a speedup of about 75%.

As it stands, the solenoid B-field code uses the `boost::math::ellint_1` and `boost::math::ellint_2` methods. For reasons which I do not understand, these functions use 80-bit x87 floating point numbers, which leads to a lot of conversion between x87 and SSE floating points. It also has the downside that x87 operations are generally slower then their SSE counterparts on modern CPUs. The interesting thing is that Boost's method for deciding on the floating point precision seems to be quite arbitrary; even when the functions are explicitly templated to use 64-bit `double` types, the internals of these functions still use 80-bit floating points. The best solution I have found is to use a compiler preprocessor flag, `BOOST_MATH_NO_LONG_DOUBLE_MATH_FUNCTIONS`, which tells Boost not to use x87 floating points at all. Enabling this flag only in the solenoid B-field translation unit should be safe and improve performance. Very unscientifically benchmarked on my Intel i5-7300U CPU, the solenoid B-field benchmark with the old code takes 615.7±7.3µs per iteration, while the new code takes 353.7±7.0µs, a speedup of about 75%.

codecov · 2022-08-26T15:18:52Z

Codecov Report

Merging #1469 (a05f694) into main (fc75a54) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main    #1469   +/-   ##
=======================================
  Coverage   48.59%   48.59%           
=======================================
  Files         380      380           
  Lines       20589    20589           
  Branches     9431     9431           
=======================================
  Hits        10005    10005           
  Misses       4097     4097           
  Partials     6487     6487

Impacted Files	Coverage Δ
Core/src/MagneticField/SolenoidBField.cpp	`69.64% <ø> (ø)`

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

AJPfleger

nice! lgtm

paulgessinger · 2022-08-31T14:14:22Z

Seeing this only now, I'm guessing you checked that the solenoid field output is approximately compatible @stephenswat? (not identical of course)

acts-project-service · 2022-09-02T07:59:14Z

The backport to develop/v19.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-develop/v19.x develop/v19.x
# Navigate to the new working tree
cd .worktrees/backport-develop/v19.x
# Create a new branch
git switch --create backport-1469-to-develop/v19.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 3795148974ec2682f84218b7f8c14d1f3be121c1
# Push it to GitHub
git push --set-upstream origin backport-1469-to-develop/v19.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-develop/v19.x

Then, create a pull request where the base branch is develop/v19.x and the compare/head branch is backport-1469-to-develop/v19.x.

…1469) As it stands, the solenoid B-field code uses the `boost::math::ellint_1` and `boost::math::ellint_2` methods. For reasons which I do not understand, these functions use 80-bit x87 floating point numbers, which leads to a lot of conversion between x87 and SSE floating points. It also has the downside that x87 operations are generally slower then their SSE counterparts on modern CPUs. The interesting thing is that Boost's method for deciding on the floating point precision seems to be quite arbitrary; even when the functions are explicitly templated to use 64-bit `double` types, the internals of these functions still use 80-bit floating points. The best solution I have found is to use a compiler preprocessor flag, `BOOST_MATH_NO_LONG_DOUBLE_MATH_FUNCTIONS`, which tells Boost not to use x87 floating points at all. Enabling this flag only in the solenoid B-field translation unit should be safe and improve performance. Very unscientifically benchmarked on my Intel i5-7300U CPU, the solenoid B-field benchmark with the old code takes 615.7±7.3µs per iteration, while the new code takes 353.7±7.0µs, a speedup of about 75%.

…to develop/v19.x] (#1487) Backport 3795148 from #1469 --- As it stands, the solenoid B-field code uses the `boost::math::ellint_1` and `boost::math::ellint_2` methods. For reasons which I do not understand, these functions use 80-bit x87 floating point numbers, which leads to a lot of conversion between x87 and SSE floating points. It also has the downside that x87 operations are generally slower then their SSE counterparts on modern CPUs. The interesting thing is that Boost's method for deciding on the floating point precision seems to be quite arbitrary; even when the functions are explicitly templated to use 64-bit `double` types, the internals of these functions still use 80-bit floating points. The best solution I have found is to use a compiler preprocessor flag, `BOOST_MATH_NO_LONG_DOUBLE_MATH_FUNCTIONS`, which tells Boost not to use x87 floating points at all. Enabling this flag only in the solenoid B-field translation unit should be safe and improve performance. Very unscientifically benchmarked on my Intel i5-7300U CPU, the solenoid B-field benchmark with the old code takes 615.7±7.3µs per iteration, while the new code takes 353.7±7.0µs, a speedup of about 75%.

stephenswat added Component - Core Affects the Core module Improvement Changes to an existing feature Impact - Minor Nuissance bug and/or affects only a single module labels Aug 26, 2022

stephenswat added this to the next milestone Aug 26, 2022

stephenswat changed the title ~~Remove x87 elliptic integral in solenoid B-field~~ perf: Remove x87 elliptic integral in solenoid B-field Aug 26, 2022

Update solenoid writing hashes

a05f694

stephenswat self-assigned this Aug 26, 2022

AJPfleger approved these changes Aug 29, 2022

View reviewed changes

AJPfleger added the automerge label Aug 29, 2022

kodiakhq bot merged commit 3795148 into acts-project:main Aug 29, 2022

github-actions bot removed the automerge label Aug 29, 2022

paulgessinger added the backport develop/v19.x Backport this PR to the v19.x series label Sep 2, 2022

paulgessinger mentioned this pull request Sep 2, 2022

perf: Remove x87 elliptic integral in solenoid B-field [backport #1469 to develop/v19.x] #1487

Merged

paulgessinger modified the milestones: next, v20.1.0 Sep 2, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf: Remove x87 elliptic integral in solenoid B-field #1469

perf: Remove x87 elliptic integral in solenoid B-field #1469

stephenswat commented Aug 26, 2022

codecov bot commented Aug 26, 2022 •

edited

AJPfleger left a comment

paulgessinger commented Aug 31, 2022

acts-project-service commented Sep 2, 2022

perf: Remove x87 elliptic integral in solenoid B-field #1469

perf: Remove x87 elliptic integral in solenoid B-field #1469

Conversation

stephenswat commented Aug 26, 2022

codecov bot commented Aug 26, 2022 • edited

Codecov Report

AJPfleger left a comment

Choose a reason for hiding this comment

paulgessinger commented Aug 31, 2022

acts-project-service commented Sep 2, 2022

codecov bot commented Aug 26, 2022 •

edited