Allow SCAMP binaries to be optionally redistributable #99

Merged on Jun 6, 2022 (35 commits)

Conversation

zpzim
Owner

@zpzim zpzim commented Jun 2, 2022

Since the beginning of the project, SCAMP has used -march=native and similar flags to maximize CPU performance on the host system. Unfortunately, this also means we can't redistribute SCAMP binaries, pyscamp wheels, etc.

This PR adds an optional flag, SCAMP_ENABLE_BINARY_DISTRIBUTION, which removes the compiler flags that prevent redistribution.

To retain some semblance of performance in this mode, I have added code paths in the CPU kernel library which are selected at runtime based on checks of the host CPU architecture.

In particular, there is a code path for AVX and AVX2. These paths will only be triggered if the FMA instruction is also available (this should be true the vast majority of the time).
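For readers unfamiliar with the technique, here is a minimal sketch of runtime CPU dispatch. It is not taken from the SCAMP source: the kernel (a plain dot product), the names `KernelFn`, `dot_baseline`, `dot_avx`, `dot_avx2`, and `select_kernel` are all hypothetical, and it assumes GCC or Clang on x86, where `__builtin_cpu_supports` and the `target` function attribute are available. On other compilers or architectures it falls back to the baseline loop.

```cpp
#include <cstddef>

// Hypothetical kernel signature for illustration; SCAMP's real kernels differ.
using KernelFn = double (*)(const double*, const double*, std::size_t);

// Portable baseline, compiled with the default (redistributable) flags.
static double dot_baseline(const double* a, const double* b, std::size_t n) {
  double acc = 0.0;
  for (std::size_t i = 0; i < n; ++i) acc += a[i] * b[i];
  return acc;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HAVE_X86_DISPATCH 1

// Same loop, but only this function may be compiled with AVX2 + FMA.
__attribute__((target("avx2,fma")))
static double dot_avx2(const double* a, const double* b, std::size_t n) {
  double acc = 0.0;
  for (std::size_t i = 0; i < n; ++i) acc += a[i] * b[i];
  return acc;
}

// Same loop, but only this function may be compiled with AVX + FMA.
__attribute__((target("avx,fma")))
static double dot_avx(const double* a, const double* b, std::size_t n) {
  double acc = 0.0;
  for (std::size_t i = 0; i < n; ++i) acc += a[i] * b[i];
  return acc;
}
#endif

// Pick the widest path the host CPU supports, requiring FMA for both
// SIMD paths, and fall back to the baseline otherwise.
KernelFn select_kernel() {
#ifdef HAVE_X86_DISPATCH
  __builtin_cpu_init();
  const bool has_fma = __builtin_cpu_supports("fma");
  if (has_fma && __builtin_cpu_supports("avx2")) return dot_avx2;
  if (has_fma && __builtin_cpu_supports("avx")) return dot_avx;
#endif
  return dot_baseline;
}
```

A caller would typically resolve the function pointer once (for example `static const KernelFn k = select_kernel();`) and reuse it, so the CPU check is not repeated on every call. Note that the FMA and non-FMA paths can differ in the last bits of a floating-point result.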

I did not add an AVX512 path as I don't have a way to test that locally. It might be useful to add one in the future.

zpzim and others added 30 commits May 30, 2022 15:38
This will allow SCAMP to avoid using march=native for performance, allowing SCAMP binaries to be distributed.
…led.

Also adds some flags to increase the chance a compiler will use FMA.
@zpzim zpzim merged commit ae79670 into master Jun 6, 2022
@zpzim zpzim deleted the enable-binary-distribution branch June 8, 2022 17:27
zpzim added a commit that referenced this pull request Jun 18, 2022
* Add runtime dispatch of AVX/AVX2-based CPU kernels. These are conditionally compiled only if they are needed to produce a redistributable binary.

* Add option to disable -march=native configurations and make the SCAMP binary redistributable. This is specified via the environment variable SCAMP_ENABLE_BINARY_DISTRIBUTION=ON.

* Add some flags to increase the chance a compiler will use FMA instructions when they are available.

* Add testing coverage for redistributable binary builds, including emulation tests with Intel SDE to verify that SIMD dispatch runs on various CPU configurations.

* Update main CMakeLists.txt to better specify global compile flags for different build types.

* Update docker container to build in a redistributable way.

* Update CUDA build tests to use updated action to build on windows-latest.

* Minor performance tuning of CPU kernel unroll widths.

* Prevent unnecessary files from being packaged with pyscamp.
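
As background on the FMA bullet above: a fused multiply-add rounds `a * b + c` once instead of twice, so a compiler is only free to turn a plain `a * b + c` into an FMA when its floating-point contraction settings allow it (GCC and Clang expose this through `-ffp-contract`). Which flags this commit actually added is not stated here. The sketch below, with hypothetical function names `fused` and `maybe_fused`, shows the difference between an explicit `std::fma` call and an expression the compiler may or may not contract.

```cpp
#include <cmath>

// Explicitly fused: the product is not rounded before the addition.
double fused(double a, double b, double c) { return std::fma(a, b, c); }

// Written as separate multiply and add. Depending on compiler flags and the
// target instruction set, this may or may not be contracted into one FMA.
double maybe_fused(double a, double b, double c) { return a * b + c; }
```

With `a = 1 + 2^-27`, `b = 1 - 2^-27`, and `c = -1`, the exact product is `1 - 2^-54`, which rounds to `1.0` in double precision. The fused version therefore returns `-2^-54`, while the unfused version returns `0.0` unless the compiler contracted it.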