40 changes: 40 additions & 0 deletions clang/docs/HIPSupport.rst
@@ -287,6 +287,46 @@ Example Usage
basePtr->virtualFunction(); // Allowed since obj is constructed in device code
}

+C++17 Class Template Argument Deduction (CTAD) Support
+======================================================
+
+Clang supports C++17 Class Template Argument Deduction (CTAD) in both host and device code for HIP.
+This allows you to omit template arguments when creating class template instances, letting the compiler
+deduce them from constructor arguments.
+
+.. code-block:: c++
+
+   #include <tuple>
+
+   __host__ __device__ void func() {
+     std::tuple<int, int> t = std::tuple(1, 1);
+   }
+
+In the above example, ``std::tuple(1, 1)`` automatically deduces the type to be ``std::tuple<int, int>``.
+
+Deduction Guides
+----------------
+
+User-defined deduction guides are also supported. Since deduction guides are not executable code and only
+participate in type deduction, they are treated as ``__host__ __device__`` by the compiler, regardless of
+explicit target attributes. This ensures they are available for deduction in both host and device contexts.
Comment on lines +311 to +312
Member

I'm not convinced that ignoring the target attribute is the best approach here.

Perhaps we should require deduction guides to always be host/device, either explicitly or implicitly.

Collaborator Author

We could emit a warning if a deduction guide has a host or device attribute, saying that deduction guides are always host+device and that the explicit host/device attribute is ignored.


+.. code-block:: c++
+
+   template <typename T>
+   struct MyType {
+     T value;
+     MyType(T v) : value(v) {}
Member

The constructor should be invocable on the GPU; otherwise the sample does not compile: https://godbolt.org/z/PYo3n5aGK

Collaborator Author

You are right, this example does not work. I will add a device attribute to the constructor.

+   };
+
+   // User-defined deduction guide
+   template <typename T>
+   MyType(T) -> MyType<T>;
Member

With a __device__ constructor, this example compiles regardless of the target attributes of the guide. https://godbolt.org/z/nxhGjh7dr

To reproduce the same issue you demonstrated with std::tuple, deviceFunc must also be HD. Or perhaps the example should be a bit more complex.

In any case, the immediate issue we need to deal with is deduction failures on attempts to use host-side deduction guides supplied by the standard library from D or HD functions. A big-picture issue is to figure out a consistent way to mix target attributes and deduction guides.

The proposed solution is to treat the guides as HD and ignore explicit target attributes. It allows the code to compile, but ignoring target attributes is not the best idea. If we do not want to use them for the guides, then we should diagnose their use.

I'm not sure if there's any existing CUDA code that may rely on the deduction guides with attributes. @miscco -- you may be the best person to answer this. Would target-specific deduction guides be useful in practice?

My preference would be, in order:
a) do not allow target attributes, and treat guides as HD. It's simple and unambiguous.
b) treat guides as regular functions, so target attributes apply. Treat guides in system headers as HD. I think we already have pragmas for doing that for other standard library parts. This allows standard library to work, but keeps target attribute use in the user code consistent between guides and functions.
c) always treat guides as HD, but diagnose as a warning when an explicit host or device attribute is used.

Collaborator Author

It seems nvcc just treats deduction guides as host+device and silently ignores host/device attributes on them (https://godbolt.org/z/8s74G7vvh). If we emit warnings for host/device attributes on deduction guides, CUDA programs that can be compiled by nvcc may fail with clang under -Wall.

Member

Fair point, but we should still consider the big picture. NVCC's current way of handling this may have been driven by convenience more than by sound principles. For deduction guides we do not have to worry much about existing code being compilable. The fix to make code compatible with either approach is also simple: just remove the attributes, or make them HD.

I don't have a good picture where we're going to end up with this, so I'd rather keep our options open. It's generally easier to relax restrictions than to tighten them. If we accept attributes, but ignore them, they will proliferate. Then, if we want to make attribute-based guides work in the future, the change will break existing users. On the other hand, if we start with a strict "no H or D" requirement, allowing H or D attributes in the future will be much easier, as it would only apply to the new code.

Come to think of it, I'm starting to think that H or D use should be an error, not a warning. At least for now.

A second-order issue with accepting but ignoring H/D without diagnostics is that it makes the handling of H/D inconsistent. Whoever reads the code will have to be on guard: if they see __device__, they would need to keep in mind that if it's applied to a deduction guide, then it's not real. I'd rather keep target attribute semantics unambiguous: if the target attribute is used, it does make a meaningful difference. While it is technically an attribute, semantically it's closer to a language keyword. Dropping or ignoring it when it's written by the user is not something we should do.
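
For reference, here is a minimal sketch of how the documentation example might look once the constructor is callable from device code, as agreed earlier in this thread. The fallback macro definitions are an assumption added purely for illustration, so that the snippet also builds as plain C++17 outside a HIP/CUDA compilation; they are not part of the patch. The guide is written without a target attribute, which matches the patch's behaviour of treating every guide as host+device.

```cpp
#include <type_traits>

// Assumption for illustration only: outside a HIP/CUDA compilation, define the
// target attributes away so the sketch builds as plain C++17.
#if !defined(__HIP__) && !defined(__CUDA__) && !defined(__CUDACC__)
#define __host__
#define __device__
#endif

template <typename T>
struct MyType {
  T value;
  // Marked __device__ so that it can be invoked from device code.
  __device__ MyType(T v) : value(v) {}
};

// User-defined deduction guide, written without a target attribute.
template <typename T>
MyType(T) -> MyType<T>;

__device__ inline void deviceFunc() {
  MyType m(10);  // CTAD deduces MyType<int>
  static_assert(std::is_same_v<decltype(m), MyType<int>>,
                "CTAD should deduce MyType<int>");
  (void)m;  // silence unused-variable warnings
}
```

As noted above, with a `__device__` constructor this compiles whatever attributes the guide carries, so it illustrates the documented usage rather than the original std::tuple failure.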


+   __device__ void deviceFunc() {
+     MyType m(10); // Deduces MyType<int>
+   }

Host and Device Attributes of Default Destructors
===================================================

12 changes: 12 additions & 0 deletions clang/lib/Sema/SemaCUDA.cpp
@@ -137,6 +137,12 @@ CUDAFunctionTarget SemaCUDA::IdentifyTarget(const FunctionDecl *D,
   if (D == nullptr)
     return CurCUDATargetCtx.Target;
 
+  // C++ deduction guides are never codegen'ed and only participate in template
+  // argument deduction. Treat them as if they were always host+device so that
+  // CUDA/HIP target checking never rejects their use based solely on target.
+  if (isa<CXXDeductionGuideDecl>(D))
+    return CUDAFunctionTarget::HostDevice;
+
   if (D->hasAttr<CUDAInvalidTargetAttr>())
     return CUDAFunctionTarget::InvalidTarget;

@@ -907,6 +913,12 @@ bool SemaCUDA::CheckCall(SourceLocation Loc, FunctionDecl *Callee) {
   if (ExprEvalCtx.isUnevaluated() || ExprEvalCtx.isConstantEvaluated())
     return true;
 
+  // C++ deduction guides participate in overload resolution but are not
+  // callable functions and are never codegen'ed. Treat them as always
+  // allowed for CUDA/HIP compatibility checking.
+  if (isa<CXXDeductionGuideDecl>(Callee))
+    return true;
+
   // FIXME: Is bailing out early correct here? Should we instead assume that
   // the caller is a global initializer?
   FunctionDecl *Caller = SemaRef.getCurFunctionDecl(/*AllowLambda=*/true);
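
The effect of the two early returns above can be sketched with a toy model. Everything in it (`Target`, `ToyDecl`, `identifyTarget`, `callAllowed`) is a hypothetical name invented for illustration and is not Clang's API. The compatibility rule is deliberately cruder than Clang's real call-preference logic, and only models a caller that is strictly host or strictly device.

```cpp
// Toy model only: these types and functions are hypothetical, not Clang's API.
enum class Target { Host, Device, HostDevice };

struct ToyDecl {
  Target declaredTarget;  // target implied by the attributes as written
  bool isDeductionGuide;  // true when the declaration is a deduction guide
};

// Mirrors the early return added to IdentifyTarget: a deduction guide is
// reported as host+device whatever attributes were written on it.
constexpr Target identifyTarget(const ToyDecl &d) {
  if (d.isDeductionGuide)
    return Target::HostDevice;
  return d.declaredTarget;
}

// Crude rule for a host-only or device-only caller: a callee is usable when it
// is host+device or lives on the caller's side. Clang's real check has more
// cases (for example, host+device callers) that are not modelled here.
constexpr bool callAllowed(Target caller, const ToyDecl &callee) {
  const Target t = identifyTarget(callee);
  return t == Target::HostDevice || t == caller;
}

// A guide written with a host attribute is still usable from device code,
// while an ordinary host function is not.
constexpr ToyDecl kHostGuide{Target::Host, /*isDeductionGuide=*/true};
constexpr ToyDecl kHostFunction{Target::Host, /*isDeductionGuide=*/false};
static_assert(callAllowed(Target::Device, kHostGuide), "guide is always usable");
static_assert(!callAllowed(Target::Device, kHostFunction), "wrong-side call");
```

The model also makes the reviewer's concern visible: under this rule, the attribute written on a guide has no effect on the outcome.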
28 changes: 26 additions & 2 deletions clang/lib/Sema/SemaTemplateDeductionGuide.cpp
@@ -218,9 +218,33 @@ buildDeductionGuide(Sema &SemaRef, TemplateDecl *OriginalTemplate,
       TInfo->getTypeLoc().castAs<FunctionProtoTypeLoc>().getParams();
 
   // Build the implicit deduction guide template.
+  QualType GuideType = TInfo->getType();
+
+  // In CUDA/HIP mode, avoid creating duplicate implicit deduction guides with
+  // identical function types. This can happen when there are separate
+  // __host__ and __device__ constructors with the same signature; each would
+  // otherwise synthesize its own implicit deduction guide, leading to
+  // ambiguous CTAD purely due to target attributes. For such cases we keep the
+  // first guide we created and skip building another one.
+  if (IsImplicit && Ctor && SemaRef.getLangOpts().CUDA) {
+    for (NamedDecl *Existing : DC->lookup(DeductionGuideName)) {
+      auto *ExistingFT = dyn_cast<FunctionTemplateDecl>(Existing);
+      auto *ExistingGuide =
+          ExistingFT
+              ? dyn_cast<CXXDeductionGuideDecl>(ExistingFT->getTemplatedDecl())
+              : dyn_cast<CXXDeductionGuideDecl>(Existing);
+      if (!ExistingGuide)
+        continue;
+
+      if (SemaRef.Context.hasSameType(ExistingGuide->getType(), GuideType)) {
+        return Existing;
+      }
+    }
+  }
+
   auto *Guide = CXXDeductionGuideDecl::Create(
-      SemaRef.Context, DC, LocStart, ES, Name, TInfo->getType(), TInfo, LocEnd,
-      Ctor, DeductionCandidate::Normal, FunctionTrailingRC);
+      SemaRef.Context, DC, LocStart, ES, Name, GuideType, TInfo, LocEnd, Ctor,
+      DeductionCandidate::Normal, FunctionTrailingRC);
   Guide->setImplicit(IsImplicit);
   Guide->setParams(Params);
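
The duplicate check in this hunk can be illustrated with a small toy model. It assumes that a guide's function type can be represented by a plain signature string, whereas the patch compares `QualType`s with `hasSameType`. `ToyCtor`, `countImplicitGuides` and `kCtadTypeCtors` are hypothetical names. The model only counts guides synthesized from constructors and ignores the other implicit candidates a real compiler adds.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// Toy model only: ToyCtor and countImplicitGuides are hypothetical names.
struct ToyCtor {
  std::string_view signature;  // stands in for the guide's function type
  std::string_view target;     // "host", "device" or "host device"
};

// Counts the guides that would be synthesized when a constructor whose
// signature was already seen reuses the first guide instead of adding one.
// The target is intentionally not consulted, since it is not part of the type.
template <std::size_t N>
constexpr std::size_t countImplicitGuides(const std::array<ToyCtor, N> &ctors) {
  std::array<std::string_view, N> seen{};
  std::size_t count = 0;
  for (const ToyCtor &c : ctors) {
    bool duplicate = false;
    for (std::size_t i = 0; i < count; ++i) {
      if (seen[i] == c.signature) {
        duplicate = true;
        break;
      }
    }
    if (!duplicate)
      seen[count++] = c.signature;  // keep the first guide for this signature
  }
  return count;
}

// The four constructors of CTADType from the new test in this PR.
inline constexpr std::array<ToyCtor, 4> kCtadTypeCtors{{
    {"CTADType<T>(T)", "host"},
    {"CTADType<T>(T)", "device"},
    {"CTADType<T>(T, T)", "host device"},
    {"CTADType<T>(T, T, T)", "host"},
}};

// The host and device one-argument constructors share a single guide.
static_assert(countImplicitGuides(kCtadTypeCtors) == 3,
              "same-signature constructors collapse to one guide");
```

Without the check, the two one-argument constructors would each contribute a guide with an identical type, which is the ambiguity the comment in the hunk describes.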

47 changes: 47 additions & 0 deletions clang/test/SemaCUDA/deduction-guide.cu
@@ -0,0 +1,47 @@
+// RUN: %clang_cc1 -std=c++17 -triple nvptx64-nvidia-cuda -fsyntax-only \
+// RUN: -fcuda-is-device -verify=expected,dev %s
+// RUN: %clang_cc1 -std=c++17 -triple nvptx64-nvidia-cuda -fsyntax-only \
+// RUN: -verify %s
+
+#include "Inputs/cuda.h"
+
+template <class T>
+struct CTADType { // expected-note 2{{candidate constructor (the implicit copy constructor) not viable: requires 1 argument, but 3 were provided}}
+                  // expected-note@-1 2{{candidate constructor (the implicit move constructor) not viable: requires 1 argument, but 3 were provided}}
+  T first;
+  T second;
+
+  CTADType(T x) : first(x), second(x) {} // expected-note 2{{candidate constructor not viable: requires single argument 'x', but 3 arguments were provided}}
+  __device__ CTADType(T x) : first(x), second(x) {} // expected-note 2{{candidate constructor not viable: requires single argument 'x', but 3 arguments were provided}}
+  __host__ __device__ CTADType(T x, T y) : first(x), second(y) {} // expected-note 2{{candidate constructor not viable: requires 2 arguments, but 3 were provided}}
+  CTADType(T x, T y, T z) : first(x), second(z) {} // dev-note {{'CTADType' declared here}}
+  // expected-note@-1 {{candidate constructor not viable: call to __host__ function from __device__ function}}
+  // expected-note@-2 {{candidate constructor not viable: call to __host__ function from __global__ function}}
+};
+
+template <class T>
+CTADType(T, T) -> CTADType<T>;
+
+__host__ __device__ void use_ctad_host_device() {
+  CTADType ctad_from_two_args(1, 1);
+  CTADType ctad_from_one_arg(1);
+  CTADType ctad_from_three_args(1, 2, 3); // dev-error {{reference to __host__ function 'CTADType' in __host__ __device__ function}}
+}
+
+__host__ void use_ctad_host() {
+  CTADType ctad_from_two_args(1, 1);
+  CTADType ctad_from_one_arg(1);
+  CTADType ctad_from_three_args(1, 2, 3);
+}
+
+__device__ void use_ctad_device() {
+  CTADType ctad_from_two_args(1, 1);
+  CTADType ctad_from_one_arg(1);
+  CTADType<int> ctad_from_three_args(1, 2, 3); // expected-error {{no matching constructor for initialization of 'CTADType<int>'}}
+}
+
+__global__ void use_ctad_global() {
+  CTADType ctad_from_two_args(1, 1);
+  CTADType ctad_from_one_arg(1);
+  CTADType<int> ctad_from_three_args(1, 2, 3); // expected-error {{no matching constructor for initialization of 'CTADType<int>'}}
+}