GPU auto-detect capability for kernel builds #341

Open
wants to merge 14 commits into
base: master

Contributor

@rosslwheeler rosslwheeler commented May 3, 2024

Fixes to CI - should work in both environments

This is a proposal in case there is interest for kernel builds.

Usage:

Auto detect GPU capability:

make
(e.g. if your GPU's compute capability is 80, then --generate-code=arch=compute_80,code=[compute_80,sm_80] is added to CFLAGS)

Do not specify capability:

make GPU_COMPUTE_CAPABILITY=
(CFLAGS = -O3 --use_fast_math)

Override capability:

make GPU_COMPUTE_CAPABILITY=86
(e.g. even if your GPU's compute capability is 80, --generate-code=arch=compute_86,code=[compute_86,sm_86] is added to CFLAGS)

Tested on Linux Ubuntu 22.04 only.
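
For readers who want to see the three cases side by side, here is a minimal shell sketch of the flag selection described above. It is not the Makefile code from this PR: the function names `detect_cap` and `nvcc_flags` are hypothetical, and the detection step assumes an `nvidia-smi` recent enough to support the `compute_cap` query field.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not taken from the PR.

# Query the first visible GPU's compute capability, e.g. "8.0".
# Assumes nvidia-smi supports the compute_cap query field.
detect_cap() {
    nvidia-smi --query-gpu=compute_cap --format=csv,noheader 2>/dev/null | head -n 1
}

# Build the nvcc flags for a capability given as "8.0", "80", or "" (none).
nvcc_flags() {
    # Drop the dot and any stray whitespace so "8.0 " becomes "80".
    cap=$(printf '%s' "$1" | tr -d '. \t\r\n')
    if [ -z "$cap" ]; then
        # No capability: plain optimization flags only.
        printf '%s\n' "-O3 --use_fast_math"
    else
        printf '%s\n' "-O3 --use_fast_math --generate-code=arch=compute_${cap},code=[compute_${cap},sm_${cap}]"
    fi
}
```

With these definitions, `nvcc_flags "$(detect_cap)"` corresponds to the auto-detect case, `nvcc_flags ""` to the empty `GPU_COMPUTE_CAPABILITY=` case, and `nvcc_flags 86` to the override case.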

@rosslwheeler rosslwheeler changed the title GPU auto-detect capability GPU auto-detect capability for kernel builds May 3, 2024
@rosslwheeler rosslwheeler marked this pull request as ready for review May 3, 2024 07:04

alecco commented May 3, 2024

If it's for compiling for the local architecture, why not just use -arch=native?

Also, note the system could have more than one GPU and -arch=native will compile for all GPUs present:

https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/#gpu-architecture-arch

When -arch=native is specified, nvcc detects the visible GPUs on the system and generates codes for them, no PTX program will be generated for this option. It is a warning if there are no visible supported GPU on the system, and the default architecture will be used.

Contributor

ngc92 commented May 3, 2024

I think -arch=native is a relatively new option.
The nvidia-cuda-toolkit package in Ubuntu 22.04 ships CUDA 11.5, which doesn't support this option yet.
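
One way to reconcile the two comments above is to probe whether the installed nvcc accepts -arch=native and fall back to an explicit --generate-code flag when it does not. The sketch below is an assumption about how that could look, not something proposed in this thread; `arch_flags` is a hypothetical name, and the probe simply compiles a trivial file and checks the exit status.

```shell
#!/bin/sh
# Sketch only: arch_flags is a hypothetical helper, not part of the PR.

# Print -arch=native if the installed nvcc accepts it; otherwise print an
# explicit --generate-code flag for the capability passed as $1 (e.g. 80).
arch_flags() {
    cap=$1
    tmp=$(mktemp -d) || return 1
    printf 'int main(void) { return 0; }\n' > "$tmp/probe.cu"
    # An nvcc that does not know -arch=native exits non-zero here.
    if nvcc -arch=native -c "$tmp/probe.cu" -o "$tmp/probe.o" >/dev/null 2>&1; then
        flags="-arch=native"
    elif [ -n "$cap" ]; then
        flags="--generate-code=arch=compute_${cap},code=[compute_${cap},sm_${cap}]"
    else
        flags=""
    fi
    rm -rf "$tmp"
    printf '%s\n' "$flags"
}
```

The probe costs one extra compile per invocation, so a real build would likely cache the result rather than run it for every target.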

Owner

karpathy commented May 5, 2024

any reason we don't do this in main makefile too?

Owner

karpathy commented May 5, 2024

~/llm.c/dev/cuda$ make gelu_backward
/usr/bin/nvcc -O3 --use_fast_math --generate-code=arch=compute_80 ,code=[compute_80 ,sm_80 ] -lcublas -lcublasLt gelu_backward.cu -o gelu_backward
nvcc fatal   : Option '--generate-code arch=compute_80', missing code
make: *** [Makefile:27: gelu_backward] Error 1

huh
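
The log above shows a space after each 80. The following shell sketch illustrates why that is enough to break the flag: when the command line is split into words, a capability value with a trailing space turns the single --generate-code argument into several. The helper names `count_args` and `split_demo` are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# Print how many arguments were received.
count_args() { printf '%s\n' "$#"; }

# Expand the flag the way a shell command line would, for capability $1.
split_demo() {
    cap=$1
    set -f  # disable globbing so the [...] part is taken literally
    # Deliberately unquoted: whitespace inside $cap splits the word.
    count_args --generate-code=arch=compute_${cap},code=[compute_${cap},sm_${cap}]
    set +f
}
```

Here `split_demo "80 "` reports 4 arguments, the first being just `--generate-code=arch=compute_80`, which matches the "missing code" complaint from nvcc, while `split_demo "80"` reports 1.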

Contributor Author

rosslwheeler commented May 6, 2024

Okay, the extra space after 80 is fixed. I also fixed the command-line override. Tested all 3 cases on Ubuntu.

One strange thing: the = after --generate-code appears to be optional. Leaving it in or removing it made no difference.

So the two lines below both appear to work, with and without that =. What's the right syntax?

NVCC_FLAGS = -O3 -t=0 --use_fast_math --generate-code=arch=compute_80,code=[compute_80,sm_80]
NVCC_FLAGS = -O3 -t=0 --use_fast_math --generate-code arch=compute_80,code=[compute_80,sm_80]

Contributor Author

rosslwheeler commented May 6, 2024

Main Makefile GPU auto-detect change is here: #371

4 participants