Support for sm_90a in <nv/target> API #1270

jrhemstad · 2024-01-10T20:22:41Z

Summary:

Currently, <nv/target> does not support sm_90a. The problem with sm_90a is not its non-numeric nature but the fact that it includes features that might not be supported in future architectures, which breaks the assumptions <nv/target> was built around. Adding support for sm_90a would require a significant redesign of the API.

Requested Feature:

<nv/target> should support sm_90a.

It's important to differentiate between numerical architecture values and feature-specific checks.

Suggestions include introducing new macros like NV_HAS_FEATURE_SM90A or NV_HAS_FEATURE_SM100FOO for feature-specific checks.

The goal is to make it clear and meaningful when writing code for specific architectures and their features.
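To illustrate the difference between a numeric architecture check and a feature-specific check, here is a minimal sketch. It is not the <nv/target> implementation. The SKETCH_* macros and the two helper functions are hypothetical names made up for this example. The sketch assumes nvcc's documented behavior of defining __CUDA_ARCH__ during device compilation and __CUDA_ARCH_FEAT_SM90_ALL when compiling for sm_90a.

```cpp
// Hypothetical sketch only; these SKETCH_* names are not part of <nv/target>.

// Numeric check: "at least SM 9.0". This is monotonic, so it stays true on
// newer architectures. __CUDA_ARCH__ is only defined in device compilation.
#if defined(__CUDA_ARCH__)
#  define SKETCH_PROVIDES_SM_90 (__CUDA_ARCH__ >= 900)
#else
#  define SKETCH_PROVIDES_SM_90 0
#endif

// Feature check: true only when compiling for sm_90a. This is NOT monotonic:
// a later architecture does not automatically satisfy it.
// Assumption: nvcc defines __CUDA_ARCH_FEAT_SM90_ALL for -arch sm_90a.
#if defined(__CUDA_ARCH_FEAT_SM90_ALL)
#  define SKETCH_HAS_FEATURE_SM_90a 1
#else
#  define SKETCH_HAS_FEATURE_SM_90a 0
#endif

constexpr bool sketch_provides_sm_90()     { return SKETCH_PROVIDES_SM_90 != 0; }
constexpr bool sketch_has_feature_sm_90a() { return SKETCH_HAS_FEATURE_SM_90a != 0; }
```

In a host-only compilation both helpers return false. The point of keeping the two spellings apart is that the first answers "is this architecture new enough?" while the second answers "was this exact feature set requested?".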

Next Steps:

Discuss the possibility of introducing a new API or mechanism for feature-specific checks.
Explore the implementation of NV_HAS_FEATURE_SM90A or similar macros.
Define the behavior of IS_EXACTLY and HAS_FEATURE in the context of feature-specific SMs.


ahendriksen · 2024-01-11T08:45:00Z

Define the behavior of IS_EXACTLY and HAS_FEATURE in the context of feature-specific SMs.

The current behavior of NV_IS_EXACTLY_SM90 is that it is true when compiling for SM90a (godbolt). To avoid breaking existing code (if any exists), NV_IS_EXACTLY_SM90 should continue to be true for SM90a.

It would be confusing if, when -arch sm90a is specified, NV_IS_EXACTLY_SM90 and NV_IS_EXACTLY_SM90a were both true at the same time. In the code example below, the SM90a-specific code would never run:

NV_DISPATCH_TARGET(
  NV_IS_EXACTLY_SM_90, (...), 
  NV_IS_EXACTLY_SM_90a, (/* this will never run */),
)

The proposal by @dkolsen-pgi to instead change the syntax for checking architecture-specific features therefore makes a lot of sense. It would introduce the NV_HAS_FEATURE_SM90A macro, which is true only when compiling for SM90a. It would not fix the code example above, but it would make it less confusing for users why it doesn't work. We would get:

NV_DISPATCH_TARGET(
  NV_IS_EXACTLY_SM_90, (...),
  NV_HAS_FEATURE_SM_90a, (/* this will still not run */),
)
// The fix is to reorder the dispatch targets:
NV_DISPATCH_TARGET(
  NV_HAS_FEATURE_SM_90a, (/* this will run on -arch sm90a */),
  NV_IS_EXACTLY_SM_90, (/* this will run on -arch sm90 */),
)
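The ordering problem above comes from the first matching condition winning. The following is a small host-side model of that behavior, assuming first-match semantics as described in this comment. Target, Branch, and all function names are hypothetical and exist only for this illustration. They are not <nv/target> APIs.

```cpp
// Hypothetical model of a compilation target; not part of <nv/target>.
struct Target {
  int  sm;             // numeric architecture, e.g. 90 for sm_90 and sm_90a
  bool arch_specific;  // true for "a" targets such as sm_90a
};

// Models the behavior described above: true for sm_90 AND for sm_90a.
constexpr bool is_exactly_sm_90(Target t) { return t.sm == 90; }

// Models the proposed feature check: true only for sm_90a.
constexpr bool has_feature_sm_90a(Target t) { return t.sm == 90 && t.arch_specific; }

enum class Branch { Generic90, Specific90a, None };

// First match wins. With the exact check listed first, the sm_90a branch
// can never be selected.
constexpr Branch dispatch_exact_first(Target t) {
  return is_exactly_sm_90(t)   ? Branch::Generic90
       : has_feature_sm_90a(t) ? Branch::Specific90a
                               : Branch::None;
}

// Reordered: the more specific condition is tested first.
constexpr Branch dispatch_feature_first(Target t) {
  return has_feature_sm_90a(t) ? Branch::Specific90a
       : is_exactly_sm_90(t)   ? Branch::Generic90
                               : Branch::None;
}

static_assert(dispatch_exact_first(Target{90, true}) == Branch::Generic90,
              "sm_90a falls into the generic branch when it is listed first");
static_assert(dispatch_feature_first(Target{90, true}) == Branch::Specific90a,
              "sm_90a reaches its own branch once that branch is listed first");
```

The same rule of thumb applies to any first-match chain: list the most specific condition first.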

As noted in internal discussion by @wmaxey, NV_HAS_FEATURE_SM90A has exactly the same behavior as a hypothetical NV_IS_EXACTLY_SM90a macro would have, just with different syntax.

Tagging @gonzalobg

ahendriksen · 2024-01-11T09:00:40Z

One thing that I see coming up is a feature that is arch-specific in multiple architectures (I can name at least one PTX instruction). To use such a feature, I don't want to have to write:

NV_DISPATCH_TARGET(
  NV_HAS_FEATURE_SM_90a, (/* Code block X. This will run on -arch sm90a */),
  NV_HAS_FEATURE_SM_100a, (/* Repeat of code block X. This will run on -arch sm100a */),
  NV_PROVIDES_SM_90, (/* Code block Y. This will run on -arch sm90 and -arch sm100 */),
)

It would be great if we could have something like this:

NV_DISPATCH_TARGET(
  NV_HAS_FEATURE_SM_90a || NV_HAS_FEATURE_SM_100a, (
         /* Code block X. This will run on -arch sm90a and on -arch sm100a */
  ),
  NV_PROVIDES_SM_90, (/* Code block Y. This will run on -arch sm90 and -arch sm100 */),
)
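The intended selection logic of the combined condition can be modeled on the host as follows. This is only a sketch of the desired behavior, not a claim about what <nv/target> supports. Target, Block, and the function names are hypothetical.

```cpp
// Hypothetical model of a compilation target; not part of <nv/target>.
struct Target {
  int  sm;             // numeric architecture, e.g. 90 or 100
  bool arch_specific;  // true for "a" targets such as sm_90a or sm_100a
};

constexpr bool has_feature_sm_90a(Target t)  { return t.sm == 90  && t.arch_specific; }
constexpr bool has_feature_sm_100a(Target t) { return t.sm == 100 && t.arch_specific; }
constexpr bool provides_sm_90(Target t)      { return t.sm >= 90; }

enum class Block { X, Y, None };

// Code block X is written once and selected for either arch-specific target;
// everything else that provides SM 9.0 falls through to code block Y.
constexpr Block select_block(Target t) {
  return (has_feature_sm_90a(t) || has_feature_sm_100a(t)) ? Block::X
       : provides_sm_90(t)                                 ? Block::Y
                                                           : Block::None;
}

static_assert(select_block(Target{90, true})   == Block::X, "sm_90a runs block X");
static_assert(select_block(Target{100, true})  == Block::X, "sm_100a runs block X");
static_assert(select_block(Target{90, false})  == Block::Y, "sm_90 runs block Y");
static_assert(select_block(Target{100, false}) == Block::Y, "sm_100 runs block Y");
```

The benefit over the repeated form is that code block X exists in one place, so the two arch-specific paths cannot drift apart.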

This was referenced Feb 20, 2024

[EPIC]: Add Hopper features to cuda::ptx #1340

Open

Add support for sm_90a in <nv/target> API #1411

Merged

miscco closed this as completed in #1411 Feb 23, 2024

drisspg mentioned this issue Jun 20, 2024

Update sm90 to -> sm90a pytorch/builder#1878

Open
