Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: Remove x87 elliptic integral in solenoid B-field #1469

Merged
merged 2 commits into from
Aug 29, 2022

Conversation

stephenswat
Copy link
Member

As it stands, the solenoid B-field code uses the boost::math::ellint_1 and boost::math::ellint_2 methods. For reasons which I do not understand, these functions use 80-bit x87 floating point numbers, which leads to a lot of conversion between x87 and SSE floating points. It also has the downside that x87 operations are generally slower then their SSE counterparts on modern CPUs.

The interesting thing is that Boost's method for deciding on the floating point precision seems to be quite arbitrary; even when the
functions are explicitly templated to use 64-bit double types, the internals of these functions still use 80-bit floating points.

The best solution I have found is to use a compiler preprocessor flag, BOOST_MATH_NO_LONG_DOUBLE_MATH_FUNCTIONS, which tells Boost not to use x87 floating points at all. Enabling this flag only in the solenoid B-field translation unit should be safe and improve performance. Very unscientifically benchmarked on my Intel i5-7300U CPU, the solenoid B-field benchmark with the old code takes 615.7±7.3µs per iteration, while the new code takes 353.7±7.0µs, a speedup of about 75%.

As it stands, the solenoid B-field code uses the `boost::math::ellint_1`
and `boost::math::ellint_2` methods. For reasons which I do not
understand, these functions use 80-bit x87 floating point numbers, which
leads to a lot of conversion between x87 and SSE floating points. It
also has the downside that x87 operations are generally slower then
their SSE counterparts on modern CPUs.

The interesting thing is that Boost's method for deciding on the
floating point precision seems to be quite arbitrary; even when the
functions are explicitly templated to use 64-bit `double` types, the
internals of these functions still use 80-bit floating points.

The best solution I have found is to use a compiler preprocessor flag,
`BOOST_MATH_NO_LONG_DOUBLE_MATH_FUNCTIONS`, which tells Boost not to use
x87 floating points at all. Enabling this flag only in the solenoid
B-field translation unit should be safe and improve performance. Very
unscientifically benchmarked on my Intel i5-7300U CPU, the solenoid
B-field benchmark with the old code takes 615.7±7.3µs per iteration,
while the new code takes 353.7±7.0µs, a speedup of about 75%.
@stephenswat stephenswat added Component - Core Affects the Core module Improvement Changes to an existing feature Impact - Minor Nuissance bug and/or affects only a single module labels Aug 26, 2022
@stephenswat stephenswat added this to the next milestone Aug 26, 2022
@stephenswat stephenswat changed the title Remove x87 elliptic integral in solenoid B-field perf: Remove x87 elliptic integral in solenoid B-field Aug 26, 2022
@codecov
Copy link

codecov bot commented Aug 26, 2022

Codecov Report

Merging #1469 (a05f694) into main (fc75a54) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main    #1469   +/-   ##
=======================================
  Coverage   48.59%   48.59%           
=======================================
  Files         380      380           
  Lines       20589    20589           
  Branches     9431     9431           
=======================================
  Hits        10005    10005           
  Misses       4097     4097           
  Partials     6487     6487           
Impacted Files Coverage Δ
Core/src/MagneticField/SolenoidBField.cpp 69.64% <ø> (ø)

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@stephenswat stephenswat self-assigned this Aug 26, 2022
Copy link
Contributor

@AJPfleger AJPfleger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice! lgtm

@kodiakhq kodiakhq bot merged commit 3795148 into acts-project:main Aug 29, 2022
@paulgessinger
Copy link
Member

Seeing this only now, I'm guessing you checked that the solenoid field output is approximately compatible @stephenswat? (not identical of course)

@paulgessinger paulgessinger added the backport develop/v19.x Backport this PR to the v19.x series label Sep 2, 2022
@acts-project-service
Copy link
Collaborator

The backport to develop/v19.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-develop/v19.x develop/v19.x
# Navigate to the new working tree
cd .worktrees/backport-develop/v19.x
# Create a new branch
git switch --create backport-1469-to-develop/v19.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 3795148974ec2682f84218b7f8c14d1f3be121c1
# Push it to GitHub
git push --set-upstream origin backport-1469-to-develop/v19.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-develop/v19.x

Then, create a pull request where the base branch is develop/v19.x and the compare/head branch is backport-1469-to-develop/v19.x.

paulgessinger pushed a commit to paulgessinger/acts that referenced this pull request Sep 2, 2022
…1469)

As it stands, the solenoid B-field code uses the `boost::math::ellint_1` and `boost::math::ellint_2` methods. For reasons which I do not understand, these functions use 80-bit x87 floating point numbers, which leads to a lot of conversion between x87 and SSE floating points. It also has the downside that x87 operations are generally slower then their SSE counterparts on modern CPUs.

The interesting thing is that Boost's method for deciding on the floating point precision seems to be quite arbitrary; even when the
functions are explicitly templated to use 64-bit `double` types, the internals of these functions still use 80-bit floating points.

The best solution I have found is to use a compiler preprocessor flag, `BOOST_MATH_NO_LONG_DOUBLE_MATH_FUNCTIONS`, which tells Boost not to use x87 floating points at all. Enabling this flag only in the solenoid B-field translation unit should be safe and improve performance. Very unscientifically benchmarked on my Intel i5-7300U CPU, the solenoid B-field benchmark with the old code takes 615.7±7.3µs per iteration, while the new code takes 353.7±7.0µs, a speedup of about 75%.
@paulgessinger paulgessinger modified the milestones: next, v20.1.0 Sep 2, 2022
kodiakhq bot pushed a commit that referenced this pull request Sep 22, 2022
…to develop/v19.x] (#1487)

Backport 3795148 from #1469
---
As it stands, the solenoid B-field code uses the `boost::math::ellint_1` and `boost::math::ellint_2` methods. For reasons which I do not understand, these functions use 80-bit x87 floating point numbers, which leads to a lot of conversion between x87 and SSE floating points. It also has the downside that x87 operations are generally slower then their SSE counterparts on modern CPUs.

The interesting thing is that Boost's method for deciding on the floating point precision seems to be quite arbitrary; even when the
functions are explicitly templated to use 64-bit `double` types, the internals of these functions still use 80-bit floating points.

The best solution I have found is to use a compiler preprocessor flag, `BOOST_MATH_NO_LONG_DOUBLE_MATH_FUNCTIONS`, which tells Boost not to use x87 floating points at all. Enabling this flag only in the solenoid B-field translation unit should be safe and improve performance. Very unscientifically benchmarked on my Intel i5-7300U CPU, the solenoid B-field benchmark with the old code takes 615.7±7.3µs per iteration, while the new code takes 353.7±7.0µs, a speedup of about 75%.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport develop/v19.x Backport this PR to the v19.x series Component - Core Affects the Core module Impact - Minor Nuissance bug and/or affects only a single module Improvement Changes to an existing feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants