[CUDA][HIP] Fix CTAD for host/device constructors #168711
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -287,6 +287,46 @@ Example Usage | |
| basePtr->virtualFunction(); // Allowed since obj is constructed in device code | ||
| } | ||
|
|
||
| C++17 Class Template Argument Deduction (CTAD) Support | ||
| ====================================================== | ||
|
|
||
| Clang supports C++17 Class Template Argument Deduction (CTAD) in both host and device code for HIP. | ||
| This allows you to omit template arguments when creating class template instances, letting the compiler | ||
| deduce them from constructor arguments. | ||
|
|
||
| .. code-block:: c++ | ||
|
|
||
| #include <tuple> | ||
|
|
||
| __host__ __device__ void func() { | ||
| std::tuple<int, int> t = std::tuple(1, 1); | ||
| } | ||
|
|
||
| In the above example, ``std::tuple(1, 1)`` automatically deduces the type to be ``std::tuple<int, int>``. | ||
|
|
||
| Deduction Guides | ||
| ---------------- | ||
|
|
||
| User-defined deduction guides are also supported. Since deduction guides are not executable code and only | ||
| participate in type deduction, they are treated as ``__host__ __device__`` by the compiler, regardless of | ||
| explicit target attributes. This ensures they are available for deduction in both host and device contexts. | ||
|
|
||
| .. code-block:: c++ | ||
|
|
||
| template <typename T> | ||
| struct MyType { | ||
| T value; | ||
| MyType(T v) : value(v) {} | ||
|
Member
The constructor should be invocable on the GPU; otherwise the sample does not compile: https://godbolt.org/z/PYo3n5aGK
Collaborator
Author
You are right, this example does not work. I will add a device attribute to the constructor. |
||
| }; | ||
|
|
||
| // User-defined deduction guide | ||
| template <typename T> | ||
| MyType(T) -> MyType<T>; | ||
|
Member
To reproduce the same issue you demonstrated with std::tuple, deviceFunc must also be HD. Or perhaps the example should be a bit more complex.

In any case, the immediate issue we need to deal with is deduction failures on attempts to use host-side deduction guides supplied by the standard library from D or HD functions. The big-picture issue is to figure out a consistent way to mix target attributes and deduction guides.

The proposed solution is to treat the guides as HD and ignore explicit target attributes. It allows the code to compile, but I am not sure that ignoring target attributes is the best idea. If we do not want to use them for the guides, then we should diagnose them.

I'm not sure whether any existing CUDA code relies on deduction guides with attributes. @miscco, you may be the best person to answer this. Would target-specific deduction guides be useful in practice?

My preference would be, in order:
Collaborator
Author
It seems nvcc just treats deduction guides as host+device and silently ignores host/device attributes on them (https://godbolt.org/z/8s74G7vvh). If we emit warnings for host/device attributes on deduction guides, CUDA programs that nvcc compiles may fail with clang under -Wall.
Member
Fair point, but we should still consider the big picture. NVCC's current way of handling this may have been driven by convenience more than by sound principles.

For deduction guides we do not have to worry much about existing code being compilable. The fix to make code compatible with either approach is also simple: just remove the attributes, or make them HD.

I don't have a good picture of where we're going to end up with this, so I'd rather keep our options open. It's generally easier to relax restrictions than to tighten them. If we accept attributes but ignore them, they will proliferate. Then, if we want to make attribute-based guides work in the future, the change will break existing users. On the other hand, if we start with a strict "no H or D" requirement, allowing H or D attributes in the future will be much easier, as it would only apply to new code.

Come to think of it, I'm starting to think that H or D use should be an error, not a warning. At least for now.

A second-order issue with accepting but ignoring H/D without diagnostics is that it makes handling of H/D inconsistent. Whoever reads the code will have to be on guard -- if they see |
||
|
|
||
| __device__ void deviceFunc() { | ||
| MyType m(10); // Deduces MyType<int> | ||
| } | ||
|
|
||
| Host and Device Attributes of Default Destructors | ||
| =================================================== | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,47 @@ | ||
| // RUN: %clang_cc1 -std=c++17 -triple nvptx64-nvidia-cuda -fsyntax-only \ | ||
| // RUN: -fcuda-is-device -verify=expected,dev %s | ||
| // RUN: %clang_cc1 -std=c++17 -triple nvptx64-nvidia-cuda -fsyntax-only \ | ||
| // RUN: -verify %s | ||
|
|
||
| #include "Inputs/cuda.h" | ||
|
|
||
| template <class T> | ||
| struct CTADType { // expected-note 2{{candidate constructor (the implicit copy constructor) not viable: requires 1 argument, but 3 were provided}} | ||
| // expected-note@-1 2{{candidate constructor (the implicit move constructor) not viable: requires 1 argument, but 3 were provided}} | ||
| T first; | ||
| T second; | ||
|
|
||
| CTADType(T x) : first(x), second(x) {} // expected-note 2{{candidate constructor not viable: requires single argument 'x', but 3 arguments were provided}} | ||
| __device__ CTADType(T x) : first(x), second(x) {} // expected-note 2{{candidate constructor not viable: requires single argument 'x', but 3 arguments were provided}} | ||
| __host__ __device__ CTADType(T x, T y) : first(x), second(y) {} // expected-note 2{{candidate constructor not viable: requires 2 arguments, but 3 were provided}} | ||
| CTADType(T x, T y, T z) : first(x), second(z) {} // dev-note {{'CTADType' declared here}} | ||
| // expected-note@-1 {{candidate constructor not viable: call to __host__ function from __device__ function}} | ||
| // expected-note@-2 {{candidate constructor not viable: call to __host__ function from __global__ function}} | ||
| }; | ||
|
|
||
| template <class T> | ||
| CTADType(T, T) -> CTADType<T>; | ||
|
|
||
| __host__ __device__ void use_ctad_host_device() { | ||
| CTADType ctad_from_two_args(1, 1); | ||
| CTADType ctad_from_one_arg(1); | ||
| CTADType ctad_from_three_args(1, 2, 3); // dev-error {{reference to __host__ function 'CTADType' in __host__ __device__ function}} | ||
| } | ||
|
|
||
| __host__ void use_ctad_host() { | ||
| CTADType ctad_from_two_args(1, 1); | ||
| CTADType ctad_from_one_arg(1); | ||
| CTADType ctad_from_three_args(1, 2, 3); | ||
| } | ||
|
|
||
| __device__ void use_ctad_device() { | ||
| CTADType ctad_from_two_args(1, 1); | ||
| CTADType ctad_from_one_arg(1); | ||
| CTADType<int> ctad_from_three_args(1, 2, 3); // expected-error {{no matching constructor for initialization of 'CTADType<int>'}} | ||
| } | ||
|
|
||
| __global__ void use_ctad_global() { | ||
| CTADType ctad_from_two_args(1, 1); | ||
| CTADType ctad_from_one_arg(1); | ||
| CTADType<int> ctad_from_three_args(1, 2, 3); // expected-error {{no matching constructor for initialization of 'CTADType<int>'}} | ||
| } |
I'm not convinced that ignoring the target attribute is the best approach here.
Perhaps we should require deduction guides to always be host/device, either explicitly or implicitly.
We could emit a warning if a deduction guide has a host or device attribute, saying that deduction guides are always host+device and that the explicit host/device attribute is ignored.
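
For reference, here is a minimal sketch of the documentation example in a form that should work under either approach discussed in this thread. The constructor is marked `__host__ __device__`, as requested in the first review comment, and the deduction guide carries no target attribute. This is an illustration only, not the final text of the patch. The `#ifndef` fallback macros and the helper name `deduced_value` are assumptions added so the snippet also builds as plain C++17 outside a CUDA/HIP compilation.

```cpp
#include <type_traits>

// Assumption: outside a CUDA/HIP compilation the target attributes are not
// defined, so define them away to let the sketch build as plain C++17.
#ifndef __host__
#define __host__
#endif
#ifndef __device__
#define __device__
#endif

template <typename T>
struct MyType {
  T value;
  // Host+device constructor: callable from host and device functions alike.
  __host__ __device__ MyType(T v) : value(v) {}
};

// User-defined deduction guide with no target attribute, so it reads the same
// whether the compiler ignores, warns about, or rejects H/D attributes on guides.
template <typename T>
MyType(T) -> MyType<T>;

// Hypothetical helper: uses CTAD from a host+device function.
__host__ __device__ int deduced_value() {
  MyType m(10);  // CTAD selects MyType<int>
  static_assert(std::is_same_v<decltype(m), MyType<int>>,
                "CTAD should deduce MyType<int>");
  return m.value;
}
```

Leaving the guide unattributed keeps the example independent of whichever diagnostic policy is eventually chosen for target attributes on deduction guides.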