
make fails to autodetect GPU compute capability #387

Open
akulchik opened this issue May 8, 2024 · 2 comments · May be fixed by #388


akulchik commented May 8, 2024

Running make (e.g., make test_gpt2) on my PC outputs the following:

make: __nvcc_device_query: No such file or directory
"Detected GPU compute capability: "
---------------------------------------------
→ cuDNN is manually disabled by default, run make with `USE_CUDNN=1` to try to enable
✓ OpenMP found
✓ OpenMPI found, OK to train with multiple GPUs
✓ nvcc found, including GPU/CUDA support
---------------------------------------------
cc -Ofast -Wno-unused-result -Wno-ignored-pragmas -Wno-unknown-attributes -march=native -fopenmp -DOMP   test_gpt2.c -lm -lgomp -o test_gpt2

My PC has an RTX 4090 and, as can be seen, nvcc is found, yet the compute capability is not detected. I have already found a solution that relies on nvidia-smi rather than __nvcc_device_query (which looks suspiciously like an intentionally hidden or temporary file), and with it the problem is gone. With this change, make stops complaining about __nvcc_device_query:

---------------------------------------------
→ cuDNN is manually disabled by default, run make with `USE_CUDNN=1` to try to enable
✓ OpenMP found
✓ OpenMPI found, OK to train with multiple GPUs
✓ nvcc found, including GPU/CUDA support
---------------------------------------------
cc -Ofast -Wno-unused-result -Wno-ignored-pragmas -Wno-unknown-attributes -march=native -fopenmp -DOMP   test_gpt2.c -lm -lgomp -o test_gpt2
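
For readers who want to see the idea in concrete form, here is a minimal shell sketch of an nvidia-smi based lookup. It is an illustration only and not necessarily what the linked pull request does. The function names `normalize_cc` and `detect_compute_cap` are hypothetical, and the sketch assumes the installed driver's nvidia-smi supports the `compute_cap` query field (older drivers may not).

```shell
#!/bin/sh
# Hypothetical helper: drop the dot and any whitespace, e.g. "8.9" -> "89".
normalize_cc() {
    printf '%s' "$1" | tr -d '. \r\n'
}

# Hypothetical helper: print the compute capability of the first GPU as
# plain digits, or print nothing if it cannot be determined.
detect_compute_cap() {
    cc=""
    if command -v nvidia-smi >/dev/null 2>&1; then
        # Assumption: the driver's nvidia-smi understands the compute_cap field.
        cc=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader 2>/dev/null | head -n 1)
    fi
    normalize_cc "$cc"
}
```

Calling `detect_compute_cap` after sourcing the snippet prints the digits for the first GPU only; an empty result means nvidia-smi was missing or the query failed, so a caller should treat that case as "unknown" rather than pass it on to the compiler.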

akulchik linked a pull request May 8, 2024 that will close this issue

rosslwheeler (Contributor) commented

@akulchik - what toolkit version are you using and what OS?


rosslwheeler commented May 9, 2024

I just checked an older CUDA 11.7 SDK and the file is there, so I think your installation might have a problem. I do like your change, but I'm not sure if it's urgent unless it's critical that we support the older SDKs with the auto-detect. Can you try either reinstalling or using the latest 12.4.1 SDK? The file is supposed to be installed with nvcc.
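
A quick way to check the installation might look like the sketch below. It assumes that `__nvcc_device_query` sits in the same directory as the `nvcc` binary found on `PATH`; the function names `helper_path_for` and `check_device_query` are hypothetical, and symlinks are not resolved, so a symlinked `nvcc` can produce a false "missing" result.

```shell
#!/bin/sh
# Hypothetical helper: build the path where the query tool is expected,
# given the path of an nvcc binary.
helper_path_for() {
    printf '%s/__nvcc_device_query\n' "$(dirname "$1")"
}

# Hypothetical helper: report whether the query tool exists beside nvcc.
check_device_query() {
    nvcc_path=$(command -v nvcc) || { echo "nvcc not on PATH"; return 1; }
    helper=$(helper_path_for "$nvcc_path")
    if [ -x "$helper" ]; then
        echo "found: $helper"
    else
        echo "missing: $helper"
        return 1
    fi
}
```

Running `check_device_query` together with `nvcc --version` should give enough information to answer the toolkit version question and to tell whether a reinstall is needed.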
