GPU auto-detect capability for kernel builds #341
Fixes to CI -should work in both environments
If it's for compiling for the local architecture, why not just use -arch=native? Also, note that the system could have more than one GPU. See https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/#gpu-architecture-arch
I think -arch=native is a relatively new option.
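For the multi-GPU case raised above, one possible approach is to query every GPU and build for the lowest capability found. The sketch below is not taken from this PR: min_capability is a hypothetical helper name, and it assumes the input has one capability per line in the form 8.0 or 8.6, which is what `nvidia-smi --query-gpu=compute_cap --format=csv,noheader` prints on drivers recent enough to support that query.

```shell
# Hypothetical helper (not from the PR): read one compute capability per line
# (e.g. "8.0", "8.6") on stdin and print the lowest one with the dot removed.
min_capability() {
  sed 's/\.//g' | sort -n | head -n 1
}

# Canned input standing in for nvidia-smi output on a two-GPU machine.
printf '8.6\n8.0\n' | min_capability   # prints 80
```

On a real machine the canned input would be replaced by the nvidia-smi query, and the result could be passed on as make GPU_COMPUTE_CAPABILITY=<value>.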
Any reason we don't do this in the main Makefile too?
huh
Okay, the extra space at the end of 80 is fixed, and so is the command-line override. Tested all 3 cases on Ubuntu. One strange thing: the = after --generate-code appears to be superfluous. Leaving it in or removing it made no difference. So, these two below appear to run fine even though there's an extra = in there. What's the right syntax?
Added check for command line override
The main Makefile GPU auto-detect change is here: #371
This is a proposal in case there is interest for kernel builds.
Usage:
Auto detect GPU capability:
make
(e.g. if your GPU capability type is 80 then --generate-code=arch=compute_80,code=[compute_80,sm_80] is used with CFLAGS)
Do not specify capability:
make GPU_COMPUTE_CAPABILITY=
(CFLAGS = -O3 --use_fast_math)
Override capability:
make GPU_COMPUTE_CAPABILITY=86
(e.g. even if your GPU capability type is 80 then --generate-code=arch=compute_86,code=[compute_86,sm_86] is used with CFLAGS)
Tested on Linux Ubuntu 22.04 only.
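The three usage cases above can be sketched as a small shell function that maps a capability value to the resulting flags. This is an illustration only, not the PR's Makefile logic: nvcc_flags is a hypothetical name, and the sketch assumes the --generate-code option is simply appended to the base flags -O3 --use_fast_math.

```shell
# Hypothetical helper (not from the PR): print the nvcc flags for a given
# capability value. An empty value means no --generate-code option is added.
nvcc_flags() {
  cap="$1"
  flags="-O3 --use_fast_math"
  if [ -n "$cap" ]; then
    flags="$flags --generate-code=arch=compute_${cap},code=[compute_${cap},sm_${cap}]"
  fi
  printf '%s\n' "$flags"
}

nvcc_flags 80   # auto-detected capability 80
nvcc_flags ""   # make GPU_COMPUTE_CAPABILITY=
nvcc_flags 86   # make GPU_COMPUTE_CAPABILITY=86
```

When the flags are used on a real shell command line, the bracketed code= list should be quoted so the shell does not treat it as a glob pattern.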