Detect gpu architecture in build.rs and pass --gpu-architecture flag to nvcc #438

Closed
Tracked by #278
coreylowman opened this issue Feb 8, 2023 · 4 comments · Fixed by #474

Comments

@coreylowman
Owner

Unsure how to detect this as of now, so part of this issue is to figure that out!

@coreylowman coreylowman mentioned this issue Feb 10, 2023
47 tasks
@Narsil
Contributor

Narsil commented Feb 19, 2023

nvidia-smi --query-gpu=compute_cap --format=csv

Or

cudaDeviceProp deviceProp;
cudaGetDeviceProperties(&deviceProp, dev);  // dev: index of the device to query
std::printf("%d.%d\n", deviceProp.major, deviceProp.minor);

The second option might be better, though, since nvidia-smi is not necessarily installed:
https://stackoverflow.com/questions/48283009/nvcc-get-device-compute-capability-in-runtime
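
For the nvidia-smi route, here is a minimal Rust sketch of how a build script might turn that command's CSV output into an nvcc flag. The function names are hypothetical, it assumes the output is a `compute_cap` header followed by one `major.minor` line per GPU, and it simply takes the first GPU listed. It is not what the project actually does.

```rust
use std::process::Command;

/// Convert the text printed by
/// `nvidia-smi --query-gpu=compute_cap --format=csv`
/// into an nvcc architecture flag such as `--gpu-architecture=sm_86`.
/// Assumption: one "major.minor" line per GPU; the first one found wins.
fn arch_flag_from_smi_output(output: &str) -> Option<String> {
    output.lines().find_map(|line| {
        // The header line ("compute_cap") contains no '.', so it is skipped.
        let (major, minor) = line.trim().split_once('.')?;
        let major: u32 = major.parse().ok()?;
        let minor: u32 = minor.parse().ok()?;
        Some(format!("--gpu-architecture=sm_{}{}", major, minor))
    })
}

/// Run nvidia-smi and parse its output.
/// Returns None if the tool is missing, fails, or prints nothing usable.
fn detect_arch_flag() -> Option<String> {
    let output = Command::new("nvidia-smi")
        .args(&["--query-gpu=compute_cap", "--format=csv"])
        .output()
        .ok()?;
    if !output.status.success() {
        return None;
    }
    arch_flag_from_smi_output(&String::from_utf8_lossy(&output.stdout))
}
```

Keeping the parsing separate from the process call means the parsing can be checked on machines without a GPU, and a `None` result lets the build script fall back to some other strategy when nvidia-smi is not installed.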

@JoeOsborn

JoeOsborn commented Feb 19, 2023

Two questions on this:

  1. Is -arch=native what is being asked for here? (Per https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#options-for-steering-gpu-code-generation )
  2. Would -arch=all or -arch=all-major be a better choice for release builds?

Otherwise you might need to get the device name(s) and check them against this list or something: https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#gpu-feature-list

@Narsil
Contributor

Narsil commented Feb 20, 2023

> Would -arch=all or -arch=all-major be a better choice for release builds?

I think not. One asset of being a compiled framework is that we can optimize heavily for the highest compute capability available, producing small and efficient binaries for the specific platform they are supposed to run on.

Making this overridable for users that want to create shared binaries would be good. But I don't think it should be the default.
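
As a rough sketch of what "overridable, but not the default" could look like in a build script: the environment variable name `CUDA_ARCH_OVERRIDE` and both function names below are hypothetical, and the `native` default is the value suggested earlier in this thread rather than anything confirmed here.

```rust
use std::env;

/// Build the nvcc architecture flag.
/// `override_value` is whatever the user supplied (if anything);
/// an absent or blank value falls back to `native`.
fn gpu_arch_flag(override_value: Option<&str>) -> String {
    let arch = match override_value.map(str::trim) {
        Some(v) if !v.is_empty() => v,
        _ => "native",
    };
    format!("--gpu-architecture={}", arch)
}

/// Read the override from the environment.
/// CUDA_ARCH_OVERRIDE is a made-up name used only for illustration.
fn gpu_arch_flag_from_env() -> String {
    // Ask cargo to rerun the build script when the variable changes.
    println!("cargo:rerun-if-env-changed=CUDA_ARCH_OVERRIDE");
    let value = env::var("CUDA_ARCH_OVERRIDE").ok();
    gpu_arch_flag(value.as_deref())
}
```

With something like this, a user who wants a binary that runs on several GPU generations could set the variable to `all-major`, while everyone else gets a build tuned for the local GPU.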

@coreylowman
Owner Author

Oh -arch=native seems exactly what we want! Easy change too, thanks for sharing that

coreylowman added a commit that referenced this issue Feb 22, 2023
* #438 using --gpu-architecture native for nvcc

* Revert adding cuda to default features

* Fixing ci-check for cuda