[RFC] Handle sm_* which is no longer supported by CUDA / ptxas exec check or configure check? #30

Open
tob2 opened this issue Mar 3, 2022 · 2 comments · Fixed by #33

Comments

@tob2 (Contributor) commented Mar 3, 2022

@tschwinge @vries
From https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591154.html

In PR97348, we ran into the problem that recent CUDA dropped support for
sm_30, which inhibited the build when CUDA's bin directory was in the PATH,
because the nvptx-tools assembler uses CUDA's ptxas to do PTX verification.
...
Deal with PR97348 in the simplest way possible: when calling the assembler for
sm_30, specify --no-verify.

This has the unfortunate effect that after fixing PR104758 by building
libraries with sm_30, the libraries are no longer verified. This can be
improved upon by:

  • adding a configure test in gcc that tests if CUDA supports sm_30, and
    if so disabling this patch
  • dealing with this in nvptx-tools somehow, either:
    • detect at ptxas execution time that it doesn't support sm_30 (see the
      sketch after this list), or
    • detect this at nvptx-tools configure time.
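A hedged sketch of the exec-time variant (not what nvptx-tools currently does): the `as` wrapper could probe whether the ptxas it found accepts a given sm_* target by asking it to compile a trivial PTX module, and skip verification (or substitute a newer target) if the probe fails. The probe file path and function name below are made up for the sketch; the ptxas options used (--gpu-name, -o) are the documented ones.

```c
/* Sketch: probe at "as" execution time whether the ptxas in PATH still
   accepts a given sm_* target, by compiling a trivial PTX module.
   Returns 1 if ptxas accepted the target, 0 otherwise.  */
#include <stdio.h>
#include <stdlib.h>

static int
ptxas_supports_target (const char *sm)
{
  char ptx[128], cmd[256];
  FILE *f;

  /* An empty module is enough: it compiles iff the target is known.  */
  snprintf (ptx, sizeof ptx,
            ".version 3.1\n.target %s\n.address_size 64\n", sm);

  f = fopen ("/tmp/ptxas-probe.ptx", "w");  /* hypothetical temp file */
  if (f == NULL)
    return 0;
  fputs (ptx, f);
  fclose (f);

  snprintf (cmd, sizeof cmd,
            "ptxas --gpu-name %s -o /dev/null /tmp/ptxas-probe.ptx"
            " > /dev/null 2>&1", sm);
  return system (cmd) == 0;
}

int
main (void)
{
  printf ("sm_30: %s\n",
          ptxas_supports_target ("sm_30") ? "supported" : "not supported");
  return 0;
}
```

A ptxas that has dropped the target rejects the unknown --gpu-name value with a non-zero exit status, so the exit status alone is enough for the decision.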
xionghul pushed a commit to xionghul/gcc that referenced this issue Mar 31, 2022
Newer versions of CUDA no longer support sm_30, and the nvptx-tools 'as'
currently doesn't handle that gracefully when verifying
(SourceryTools/nvptx-tools#30).

There's a --no-verify work-around in place in ASM_SPEC, but that one doesn't
work when using -Wa,--verify on the command line.

Use a more robust workaround: verify using sm_35 when misa=sm_30 is specified
(either implicitly or explicitly).

Tested on nvptx.

gcc/ChangeLog:

2022-03-30  Tom de Vries  <tdevries@suse.de>

	* config/nvptx/nvptx.h (ASM_SPEC): Use "-m sm_35" for -misa=sm_30.
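For illustration only, a spec along these lines could implement that mapping; this is a sketch in GCC spec language, not the exact string in config/nvptx/nvptx.h:

```c
/* Sketch, not the committed nvptx.h definition: pass -misa=<value> through
   to the assembler as "-m <value>", except that sm_30 is rewritten to
   sm_35 so that ptxas-based verification keeps working with CUDA versions
   that have dropped sm_30.  */
#define ASM_SPEC "%{misa=sm_30:-m sm_35; misa=*:-m %*}"
```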
tschwinge added a commit that referenced this issue Apr 7, 2022
…itecture based products is dropped"

This resolves #30 "[RFC] Handle sm_* which is no longer supported by
CUDA / ptxas exec check or configure check?".

Suggested-by: Tom de Vries <tdevries@suse.de>
@tschwinge (Member) commented

Another thing that we could do: instead of invoking command-line ptxas for verification, we could dlopen the CUDA Driver (libcuda), and use that one to load and thus verify the PTX code (just like libgomp nvptx plugin and nvptx-tools nvptx-run are doing): https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__MODULE.html, etc.
That way, we could first check which .target values are actually supported by the current CUDA installation, and only adjust if actually necessary. (Well, or probably not at all, since all current versions of the CUDA Driver actually still do load .target sm_30 code?)
We'd avoid using the "narrow" ptxas command-line interface, and instead get the "full" API. (The latter hopefully doesn't bring its own set of compatibility concerns? Not expecting any, given that the libgomp nvptx plugin and nvptx-tools nvptx-run have been stable for a long time in that regard.)
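A hedged sketch of that idea, with minimal error handling; the symbols resolved are the documented CUDA Driver API entry points (cuInit, cuDeviceGet, cuCtxCreate, cuModuleLoadData), while the function name and return convention are made up for the sketch:

```c
/* Sketch: verify PTX by loading it through the CUDA Driver API, resolved
   at run time with dlopen -- roughly what the libgomp nvptx plugin and
   nvptx-tools' nvptx-run already do.  Unlike ptxas, this needs a working
   driver (libcuda) and a GPU to be present.  */
#include <dlfcn.h>
#include <stddef.h>

typedef int CUresult;           /* 0 == CUDA_SUCCESS */
typedef int CUdevice;
typedef void *CUcontext;
typedef void *CUmodule;

typedef CUresult (*cuInit_t) (unsigned);
typedef CUresult (*cuDeviceGet_t) (CUdevice *, int);
typedef CUresult (*cuCtxCreate_t) (CUcontext *, unsigned, CUdevice);
typedef CUresult (*cuModuleLoadData_t) (CUmodule *, const void *);

/* Return 0 if the PTX loads (and thus verifies), 1 if the driver rejects
   it, -1 if no usable driver/GPU is available.  */
int
verify_ptx_with_driver (const char *ptx)
{
  void *h = dlopen ("libcuda.so.1", RTLD_LAZY);
  if (h == NULL)
    return -1;

  cuInit_t init = (cuInit_t) dlsym (h, "cuInit");
  cuDeviceGet_t device_get = (cuDeviceGet_t) dlsym (h, "cuDeviceGet");
  cuCtxCreate_t ctx_create = (cuCtxCreate_t) dlsym (h, "cuCtxCreate_v2");
  cuModuleLoadData_t module_load
    = (cuModuleLoadData_t) dlsym (h, "cuModuleLoadData");
  if (!init || !device_get || !ctx_create || !module_load)
    return -1;

  CUdevice dev;
  CUcontext ctx;
  CUmodule mod;
  if (init (0) != 0 || device_get (&dev, 0) != 0
      || ctx_create (&ctx, 0, dev) != 0)
    return -1;

  /* The driver JIT-compiles the PTX here; a rejection means this driver
     does not accept the code (e.g. an unsupported .target).  */
  return module_load (&mod, ptx) == 0 ? 0 : 1;
}
```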

How does the CUDA Driver API differ/relate to the PTX Compiler APIs, https://docs.nvidia.com/cuda/ptx-compiler-api/?

@tob2, @vries

@vries (Contributor) commented Apr 12, 2022

Another thing that we could do: instead of invoking command-line ptxas for verification, we could dlopen the CUDA Driver (libcuda), and use that one to load and thus verify the PTX code (just like libgomp nvptx plugin and nvptx-tools nvptx-run are doing): https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__MODULE.html, etc.
That way, we could first check which .target values are actually supported by the current CUDA installation, and only adjust if actually necessary. (Well, or probably not at all, since all current versions of the CUDA Driver actually still do load .target sm_30 code?) We'd avoid using the "narrow" ptxas command-line interface, and instead get the "full" API. (The latter hopefully doesn't bring its own set of compatibility concerns? Not expecting any, given that the libgomp nvptx plugin and nvptx-tools nvptx-run have been stable for a long time in that regard.)

I think I follow your thinking: using the CUDA driver API means you're using a defined API rather than the de facto ptxas interface. Note that what you describe checks which .target values are supported by the CUDA driver API, not by the CUDA runtime API (in other words, ptxas and friends). Those are separate things.

How does the CUDA Driver API differ/relate to the PTX Compiler APIs, https://docs.nvidia.com/cuda/ptx-compiler-api/?

Ah, I didn't know that one. Well, after reading the introduction, this mainly looks like the CUDA runtime API equivalent of the part of the CUDA driver API that does the PTX compilation.

At first glance, it sounds like the part you could use instead of ptxas. But it's also a recent addition, meaning you'd also require a modern CUDA installation, which could somewhat defeat the purpose, or leave only a very narrow range of CUDA versions for which this setup would actually be useful.
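For reference, a hedged sketch of what verification through the PTX Compiler API might look like; it needs neither a driver nor a GPU, but only exists in recent CUDA releases, and the "--gpu-name=..." option string below is an assumption based on the ptxas command-line option:

```c
/* Sketch: verify PTX with the PTX Compiler API (nvPTXCompiler).  Build
   against a recent CUDA toolkit, e.g. linking -lnvptxcompiler_static.  */
#include <stdio.h>
#include <string.h>
#include <nvPTXCompiler.h>

/* Return 0 if the PTX compiles for target SM, 1 if it is rejected,
   -1 on API failure.  */
int
verify_ptx_with_compiler_api (const char *ptx, const char *sm)
{
  nvPTXCompilerHandle compiler;
  char option[64];
  snprintf (option, sizeof option, "--gpu-name=%s", sm);
  const char *options[] = { option };

  if (nvPTXCompilerCreate (&compiler, strlen (ptx), ptx)
      != NVPTXCOMPILE_SUCCESS)
    return -1;

  int ok
    = nvPTXCompilerCompile (compiler, 1, options) == NVPTXCOMPILE_SUCCESS;
  if (!ok)
    {
      /* Print the compiler's error log, if it fits a small buffer.  */
      size_t size;
      char log[1024];
      if (nvPTXCompilerGetErrorLogSize (compiler, &size)
            == NVPTXCOMPILE_SUCCESS
          && size > 0 && size < sizeof log
          && nvPTXCompilerGetErrorLog (compiler, log) == NVPTXCOMPILE_SUCCESS)
        fprintf (stderr, "%s\n", log);
    }
  nvPTXCompilerDestroy (&compiler);
  return ok ? 0 : 1;
}
```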

Note that it's not a good idea to rely on the driver API, because that requires having a driver installed, which doesn't work in a scenario where you use, say, a server-class machine to do heavy toolchain and/or application builds, and then execute on a separate machine with bulky video cards. We don't want to require (for verification purposes) installing the driver on the server-class machine.

tschwinge reopened this Apr 12, 2022
tschwinge added a commit that referenced this issue Sep 4, 2023
..., so that things keep working with CUDA 12.0+.