174 changes: 174 additions & 0 deletions clang/include/clang/Basic/AttrDocs.td
@@ -455,6 +455,180 @@ The SYCL kernel in the previous code sample meets these expectations.
}];
}

def SYCLKernelEntryPointDocs : Documentation {
let Category = DocCatFunction;
let Content = [{
The ``sycl_kernel_entry_point`` attribute facilitates the generation of an
offload kernel entry point, sometimes called a SYCL kernel caller function,
suitable for invoking a SYCL kernel on an offload device. The attribute is
intended for use in the implementation of SYCL kernel invocation functions
like the ``single_task`` and ``parallel_for`` member functions of the
``sycl::handler`` class specified in section 4.9.4, "Command group ``handler``
class", of the SYCL 2020 specification.

The attribute requires a single type argument that specifies a class type that
meets the requirements for a SYCL kernel name as described in section 5.2,
"Naming of kernels", of the SYCL 2020 specification. A unique kernel name type
is required for each function declared with the attribute. The attribute may
not first appear on a declaration that follows a definition of the function.

The attribute only appertains to functions and only those that meet the
following requirements.

* Has a ``void`` return type.
* Is not a non-static member function, constructor, or destructor.
* Is not a C variadic function.
* Is not a coroutine.
* Is not defined as deleted or as defaulted.
* Is not declared with the ``constexpr`` or ``consteval`` specifiers.
* Is not declared with the ``[[noreturn]]`` attribute.

Use in the implementation of a SYCL kernel invocation function might look as
follows.

.. code-block:: c++

namespace sycl {
class handler {
template<typename KernelNameType, typename KernelType>
[[ clang::sycl_kernel_entry_point(KernelNameType) ]]
static void kernel_entry_point(KernelType kernel) {
kernel();
}

public:
template<typename KernelNameType, typename KernelType>
void single_task(KernelType kernel) {
// Call kernel_entry_point() to trigger generation of an offload
// kernel entry point.
kernel_entry_point<KernelNameType>(kernel);
// Call functions appropriate for the desired offload backend
// (OpenCL, CUDA, HIP, Level Zero, etc...).
}
};
} // namespace sycl

A SYCL kernel is a callable object of class type that is constructed on a host,
often via a lambda expression, and then passed to a SYCL kernel invocation
function to be executed on an offload device. A SYCL kernel invocation function
is responsible for copying the provided SYCL kernel object to an offload
device and initiating a call to it. The SYCL kernel object and its data members
constitute the parameters of an offload kernel.

A SYCL kernel type is required to satisfy the device copyability requirements
specified in section 3.13.1, "Device copyable", of the SYCL 2020 specification.
Additionally, any data members of the kernel object type are required to satisfy
section 4.12.4, "Rules for parameter passing to kernels". For most types, these
rules require that the type is trivially copyable. However, the SYCL
specification mandates that certain special SYCL types, such as
``sycl::accessor`` and ``sycl::stream``, be device copyable even if they are not
trivially copyable. These types require special handling because they cannot
be copied to device memory as if by ``memcpy()``. Additionally, some offload
backends, OpenCL for example, require objects of some of these types to be
passed as individual arguments to the offload kernel.
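
The trivially copyable baseline mentioned above can be checked in plain C++.
The following is a minimal sketch that does not use any SYCL headers;
``PlainData``, ``OwnsString``, and ``copyable_as_if_by_memcpy`` are hypothetical
names introduced only for illustration. A real SYCL implementation would consult
the ``sycl::is_device_copyable`` trait instead, which also covers the special
types that are not trivially copyable.

```cpp
#include <string>
#include <type_traits>

// Hypothetical capture types, used only for illustration.
struct PlainData {
  int i;
  float f;
};

struct OwnsString {
  std::string s; // std::string has a non-trivial copy constructor.
};

// True when objects of type T can be copied as if by memcpy(). This models
// only the baseline rule for ordinary types, not the SYCL special types.
template <typename T>
constexpr bool copyable_as_if_by_memcpy() {
  return std::is_trivially_copyable_v<T>;
}

// A simple aggregate of scalars qualifies; a type that owns a string does not.
static_assert(copyable_as_if_by_memcpy<PlainData>());
static_assert(!copyable_as_if_by_memcpy<OwnsString>());
```

A lambda that captures a ``PlainData`` by value would satisfy this baseline,
while one that captures an ``OwnsString`` would not.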

An offload kernel consists of an entry point function that declares the
parameters of the offload kernel and the set of all functions and variables that
are directly or indirectly used by the entry point function.

A SYCL kernel invocation function invokes a SYCL kernel on a device by
performing the following tasks (likely with the help of an offload backend
like OpenCL):

#. Identifying the offload kernel entry point to be used for the SYCL kernel.

#. Deconstructing the SYCL kernel object, if necessary, to produce the set of
offload kernel arguments required by the offload kernel entry point.

#. Copying the offload kernel arguments to device memory.

#. Initiating execution of the offload kernel entry point.

The offload kernel entry point for a SYCL kernel performs the following tasks:

#. Reconstituting the SYCL kernel object, if necessary, using the offload
kernel parameters.

#. Calling the ``operator()`` member function of the (reconstituted) SYCL kernel
object.
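
The two task lists above can be modeled in plain C++ without any SYCL headers.
The sketch below is only an illustration under stated assumptions:
``FakeStream``, ``Kernel``, ``KernelArgs``, ``decompose``, ``entry_point``, and
``launch`` are hypothetical names, the "device" is an ordinary function call,
and the entry point actually generated by the compiler will differ.

```cpp
// Hypothetical stand-in for a SYCL special type; not the real sycl::stream.
struct FakeStream {
  int handle; // pretend device-side resource identifier
};

struct S {
  int i;
};

// Stand-in for the SYCL kernel object (a lambda closure in real code).
struct Kernel {
  FakeStream sout;
  S s;
  int operator()() const { return sout.handle + s.i; }
};

// The flattened arguments that would be copied to device memory.
struct KernelArgs {
  int stream_handle; // the special type is passed as an individual argument
  S s;               // an ordinary device copyable type is passed whole
};

// Host side: deconstruct the kernel object into offload kernel arguments.
KernelArgs decompose(const Kernel &k) { return {k.sout.handle, k.s}; }

// Device side: reconstitute the kernel object and call its operator().
int entry_point(int stream_handle, S s) {
  Kernel kernel{FakeStream{stream_handle}, s};
  return kernel();
}

// Host side: identify the entry point, decompose, "copy", and initiate.
int launch(const Kernel &k) {
  KernelArgs args = decompose(k);
  return entry_point(args.stream_handle, args.s);
}
```

In this model, calling ``launch`` on a kernel object yields the same result as
calling the kernel object directly, which is the property the decomposition and
reconstitution steps are meant to preserve.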

The ``sycl_kernel_entry_point`` attribute automates generation of an offload
kernel entry point that performs those latter tasks. The parameters and body of
a function declared with the ``sycl_kernel_entry_point`` attribute specify a
pattern from which the parameters and body of the entry point function are
derived. Consider the following call to a SYCL kernel invocation function.

.. code-block:: c++

struct S { int i; };
void f(sycl::handler &handler, sycl::stream &sout, S s) {
handler.single_task<struct KN>([=] {
sout << "The value of s.i is " << s.i << "\n";
});
}

The SYCL kernel object is the result of the lambda expression. It has two
data members corresponding to the captures of ``sout`` and ``s``. Since one
of these data members corresponds to a special SYCL type that must be passed
individually as an offload kernel parameter, it is necessary to decompose the
SYCL kernel object into its constituent parts; the offload kernel will have
two kernel parameters. Given a SYCL implementation that uses a
``sycl_kernel_entry_point`` attributed function like the one shown above, an
offload kernel entry point function will be generated that looks approximately
as follows.

.. code-block:: c++

void sycl-kernel-caller-for-KN(sycl::stream sout, S s) {
kernel-type kernel = { sout, s };
kernel();
}

There are a few items worthy of note:

#. The name of the generated function incorporates the SYCL kernel name,
``KN``, that was passed as the ``KernelNameType`` template parameter to
``kernel_entry_point()`` and provided as the argument to the
``sycl_kernel_entry_point`` attribute. There is a one-to-one correspondence
between SYCL kernel names and offload kernel entry points.

#. The SYCL kernel is a lambda closure type and therefore has no name;
``kernel-type`` is substituted above and corresponds to the ``KernelType``
template parameter deduced in the call to ``kernel_entry_point()``.
Lambda types cannot be declared and initialized using the aggregate
initialization syntax used above, but the intended behavior should be clear.

#. ``S`` is a device copyable type that does not directly or indirectly contain
a data member of a SYCL special type. It therefore does not need to be
decomposed into its constituent members to be passed as a kernel argument.

#. The depiction of the ``sycl::stream`` parameter as a single self-contained
kernel parameter is an oversimplification. SYCL special types may require
additional decomposition such that the generated function might have three
or more parameters depending on how the SYCL library implementation defines
these types.

#. The call to ``kernel_entry_point()`` has no effect other than to trigger
emission of the entry point function. The statements that make up the body
of the function are not executed when the function is called; they are
only used in the generation of the entry point function.
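
The one-to-one correspondence between kernel names and entry points noted in
the first item can be pictured as a map keyed by the kernel name type. The
following is a rough model only, not the data structure Clang uses:
``EntryPointRegistry``, ``KN1``, and ``KN2`` are hypothetical names. It accepts
a repeated registration of the same pairing and rejects a second, different
entry point for an already registered kernel name.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <typeindex>
#include <typeinfo>

// Hypothetical kernel name types. They must be complete for typeid().
struct KN1 {};
struct KN2 {};

// Maps each kernel name type to the name of exactly one entry point.
class EntryPointRegistry {
  std::map<std::type_index, std::string> entries_;

public:
  // Returns true if the pairing is new or repeats an identical registration,
  // and false if the kernel name is already bound to a different entry point.
  template <typename KernelName>
  bool add(const std::string &entry_point) {
    auto [it, inserted] =
        entries_.emplace(std::type_index(typeid(KernelName)), entry_point);
    return inserted || it->second == entry_point;
  }

  std::size_t size() const { return entries_.size(); }
};
```

A conflicting registration leaves the original binding untouched, so each
kernel name continues to identify a single entry point.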

It is not necessary for a function declared with the ``sycl_kernel_entry_point``
attribute to be called for the offload kernel entry point to be emitted. For
inline functions and function templates, any ODR-use will suffice. For other
functions, an ODR-use is not required; the offload kernel entry point will be
emitted if the function is defined.

Functions declared with the ``sycl_kernel_entry_point`` attribute are not
limited to the simple example shown above. They may have additional template
parameters, declare additional function parameters, and have complex control
flow in the function body. Function parameter decomposition and reconstitution
are performed for all function parameters. The function must abide by the
language feature restrictions described in section 5.4, "Language restrictions
for device functions", of the SYCL 2020 specification.
}];
}

def SYCLSpecialClassDocs : Documentation {
let Category = DocCatStmt;
let Content = [{
13 changes: 13 additions & 0 deletions clang/include/clang/Lex/Preprocessor.h
@@ -2617,6 +2617,19 @@ class Preprocessor {
/// \#pragma GCC poison/system_header/dependency and \#pragma once.
void RegisterBuiltinPragmas();

/// RegisterBuiltinMacro - Register the specified identifier in the identifier
/// table and mark it as a builtin macro to be expanded.
IdentifierInfo *RegisterBuiltinMacro(const char *Name) {
// Get the identifier.
IdentifierInfo *Id = getIdentifierInfo(Name);

// Mark it as being a macro that is builtin.
MacroInfo *MI = AllocateMacroInfo(SourceLocation());
MI->setIsBuiltinMacro();
appendDefMacroDirective(Id, MI);
return Id;
}

/// Register builtin macros such as __LINE__ with the identifier table.
void RegisterBuiltinMacros();

1 change: 1 addition & 0 deletions clang/include/clang/Sema/SemaSYCL.h
@@ -62,6 +62,7 @@ class SemaSYCL : public SemaBase {
ParsedType ParsedTy);

void handleKernelAttr(Decl *D, const ParsedAttr &AL);
void handleKernelEntryPointAttr(Decl *D, const ParsedAttr &AL);
};

} // namespace clang
26 changes: 26 additions & 0 deletions clang/lib/AST/ASTContext.cpp
@@ -14411,6 +14411,32 @@ void ASTContext::getFunctionFeatureMap(llvm::StringMap<bool> &FeatureMap,
}
}

static SYCLKernelInfo BuildSYCLKernelInfo(CanQualType KernelNameType,
const FunctionDecl *FD) {
return {KernelNameType, FD};
}

void ASTContext::registerSYCLEntryPointFunction(FunctionDecl *FD) {
// If the function declaration to register is invalid or dependent, the
// registration attempt is ignored.
if (FD->isInvalidDecl() || FD->isTemplated())
return;

const auto *SKEPAttr = FD->getAttr<SYCLKernelEntryPointAttr>();
assert(SKEPAttr && "Missing sycl_kernel_entry_point attribute");

// Be tolerant of multiple registration attempts so long as each attempt
// is for the same entity. Callers are obligated to detect and diagnose
// conflicting kernel names prior to calling this function.
CanQualType KernelNameType = getCanonicalType(SKEPAttr->getKernelName());
auto IT = SYCLKernels.find(KernelNameType);
assert((IT == SYCLKernels.end() ||
declaresSameEntity(FD, IT->second.getKernelEntryPointDecl())) &&
"SYCL kernel name conflict");
SYCLKernels.insert(
std::make_pair(KernelNameType, BuildSYCLKernelInfo(KernelNameType, FD)));
}

OMPTraitInfo &ASTContext::getNewOMPTraitInfo() {
OMPTraitInfoVector.emplace_back(new OMPTraitInfo());
return *OMPTraitInfoVector.back();
18 changes: 14 additions & 4 deletions clang/lib/AST/ByteCode/Compiler.cpp
@@ -6446,8 +6446,6 @@ bool Compiler<Emitter>::emitBuiltinBitCast(const CastExpr *E) {
QualType ToType = E->getType();
std::optional<PrimType> ToT = classify(ToType);

assert(!DiscardResult && "Implement DiscardResult mode for bitcasts.");

if (ToType->isNullPtrType()) {
if (!this->discard(SubExpr))
return false;
@@ -6463,12 +6461,24 @@ bool Compiler<Emitter>::emitBuiltinBitCast(const CastExpr *E) {
}
assert(!ToType->isReferenceType());

// Prepare storage for the result in case we discard.
if (DiscardResult && !Initializing && !ToT) {
std::optional<unsigned> LocalIndex = allocateLocal(E);
if (!LocalIndex)
return false;
if (!this->emitGetPtrLocal(*LocalIndex, E))
return false;
}

// Get a pointer to the value-to-cast on the stack.
if (!this->visit(SubExpr))
return false;

if (!ToT || ToT == PT_Ptr)
return this->emitBitCastPtr(E);
if (!ToT || ToT == PT_Ptr) {
if (!this->emitBitCastPtr(E))
return false;
return DiscardResult ? this->emitPopPtr(E) : true;
}
assert(ToT);

const llvm::fltSemantics *TargetSemantics = nullptr;
18 changes: 12 additions & 6 deletions clang/lib/AST/ByteCode/InterpBuiltinBitCast.cpp
@@ -26,8 +26,8 @@ using namespace clang;
using namespace clang::interp;

/// Used to iterate over pointer fields.
using DataFunc =
llvm::function_ref<bool(const Pointer &P, PrimType Ty, size_t BitOffset)>;
using DataFunc = llvm::function_ref<bool(const Pointer &P, PrimType Ty,
size_t BitOffset, bool PackedBools)>;

#define BITCAST_TYPE_SWITCH(Expr, B) \
do { \
@@ -89,6 +89,7 @@ struct BitcastBuffer {

std::byte *getBytes(unsigned BitOffset) const {
assert(BitOffset % 8 == 0);
assert(BitOffset < SizeInBits);
return const_cast<std::byte *>(data() + (BitOffset / 8));
}

@@ -147,18 +148,20 @@ static bool enumerateData(const Pointer &P, const Context &Ctx, size_t Offset,

// Primitives.
if (FieldDesc->isPrimitive())
return F(P, FieldDesc->getPrimType(), Offset);
return F(P, FieldDesc->getPrimType(), Offset, false);

// Primitive arrays.
if (FieldDesc->isPrimitiveArray()) {
bool BigEndianTarget = Ctx.getASTContext().getTargetInfo().isBigEndian();
QualType ElemType = FieldDesc->getElemQualType();
size_t ElemSizeInBits = Ctx.getASTContext().getTypeSize(ElemType);
PrimType ElemT = *Ctx.classify(ElemType);
// Special case, since the bools here are packed.
bool PackedBools = FieldDesc->getType()->isExtVectorBoolType();
bool Ok = true;
for (unsigned I = 0; I != FieldDesc->getNumElems(); ++I) {
unsigned Index = BigEndianTarget ? (FieldDesc->getNumElems() - 1 - I) : I;
Ok = Ok && F(P.atIndex(Index), ElemT, Offset);
Ok = Ok && F(P.atIndex(Index), ElemT, Offset, PackedBools);
Offset += ElemSizeInBits;
}
return Ok;
@@ -302,7 +305,8 @@ static bool readPointerToBuffer(const Context &Ctx, const Pointer &FromPtr,

return enumeratePointerFields(
FromPtr, Ctx,
[&](const Pointer &P, PrimType T, size_t BitOffset) -> bool {
[&](const Pointer &P, PrimType T, size_t BitOffset,
bool PackedBools) -> bool {
if (!P.isInitialized()) {
assert(false && "Implement uninitialized value tracking");
return ReturnOnUninit;
@@ -334,6 +338,8 @@ static bool readPointerToBuffer(const Context &Ctx, const Pointer &FromPtr,
} else {
if (const FieldDecl *FD = P.getField(); FD && FD->isBitField())
BitWidth = FD->getBitWidthValue(ASTCtx);
else if (T == PT_Bool && PackedBools)
BitWidth = 1;

BITCAST_TYPE_SWITCH(T, {
T Val = P.deref<T>();
@@ -401,7 +407,7 @@ bool clang::interp::DoBitCastPtr(InterpState &S, CodePtr OpPC,
size_t BitOffset = 0;
bool Success = enumeratePointerFields(
ToPtr, S.getContext(),
[&](const Pointer &P, PrimType T, size_t _) -> bool {
[&](const Pointer &P, PrimType T, size_t _, bool PackedBools) -> bool {
if (T == PT_Float) {
CharUnits ObjectReprChars = ASTCtx.getTypeSizeInChars(P.getType());
const auto &Semantics = ASTCtx.getFloatTypeSemantics(P.getType());
16 changes: 8 additions & 8 deletions clang/lib/Basic/Targets/SPIR.h
@@ -310,8 +310,8 @@ class LLVM_LIBRARY_VISIBILITY SPIRVTargetInfo : public BaseSPIRVTargetInfo {

// SPIR-V IDs are represented with a single 32-bit word.
SizeType = TargetInfo::UnsignedInt;
resetDataLayout("e-i64:64-v16:16-v24:32-v32:32-v48:64-"
"v96:128-v192:256-v256:256-v512:512-v1024:1024-G1");
resetDataLayout("e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-"
"v256:256-v512:512-v1024:1024-n8:16:32:64-G1");
}

void getTargetDefines(const LangOptions &Opts,
@@ -334,8 +334,8 @@ class LLVM_LIBRARY_VISIBILITY SPIRV32TargetInfo : public BaseSPIRVTargetInfo {
// SPIR-V has core support for atomic ops, and Int32 is always available;
// we take the maximum because it's possible the Host supports wider types.
MaxAtomicInlineWidth = std::max<unsigned char>(MaxAtomicInlineWidth, 32);
resetDataLayout("e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-"
"v96:128-v192:256-v256:256-v512:512-v1024:1024-G1");
resetDataLayout("e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-"
"v192:256-v256:256-v512:512-v1024:1024-n8:16:32:64-G1");
}

void getTargetDefines(const LangOptions &Opts,
@@ -358,8 +358,8 @@ class LLVM_LIBRARY_VISIBILITY SPIRV64TargetInfo : public BaseSPIRVTargetInfo {
// SPIR-V has core support for atomic ops, and Int64 is always available;
// we take the maximum because it's possible the Host supports wider types.
MaxAtomicInlineWidth = std::max<unsigned char>(MaxAtomicInlineWidth, 64);
resetDataLayout("e-i64:64-v16:16-v24:32-v32:32-v48:64-"
"v96:128-v192:256-v256:256-v512:512-v1024:1024-G1");
resetDataLayout("e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-"
"v256:256-v512:512-v1024:1024-n8:16:32:64-G1");
}

void getTargetDefines(const LangOptions &Opts,
@@ -384,8 +384,8 @@ class LLVM_LIBRARY_VISIBILITY SPIRV64AMDGCNTargetInfo final
PtrDiffType = IntPtrType = TargetInfo::SignedLong;
AddrSpaceMap = &SPIRDefIsGenMap;

resetDataLayout("e-i64:64-v16:16-v24:32-v32:32-v48:64-"
"v96:128-v192:256-v256:256-v512:512-v1024:1024-G1-P4-A0");
resetDataLayout("e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-"
"v256:256-v512:512-v1024:1024-n32:64-S32-G1-P4-A0");

BFloat16Width = BFloat16Align = 16;
BFloat16Format = &llvm::APFloat::BFloat();
28 changes: 12 additions & 16 deletions clang/lib/CodeGen/CGObjCMac.cpp
@@ -365,7 +365,7 @@ class ObjCCommonTypesHelper {
/// GcReadWeakFn -- LLVM objc_read_weak (id *src) function.
llvm::FunctionCallee getGcReadWeakFn() {
// id objc_read_weak (id *)
llvm::Type *args[] = { ObjectPtrTy->getPointerTo() };
llvm::Type *args[] = {CGM.UnqualPtrTy};
llvm::FunctionType *FTy =
llvm::FunctionType::get(ObjectPtrTy, args, false);
return CGM.CreateRuntimeFunction(FTy, "objc_read_weak");
@@ -374,7 +374,7 @@ class ObjCCommonTypesHelper {
/// GcAssignWeakFn -- LLVM objc_assign_weak function.
llvm::FunctionCallee getGcAssignWeakFn() {
// id objc_assign_weak (id, id *)
llvm::Type *args[] = { ObjectPtrTy, ObjectPtrTy->getPointerTo() };
llvm::Type *args[] = {ObjectPtrTy, CGM.UnqualPtrTy};
llvm::FunctionType *FTy =
llvm::FunctionType::get(ObjectPtrTy, args, false);
return CGM.CreateRuntimeFunction(FTy, "objc_assign_weak");
@@ -383,7 +383,7 @@ class ObjCCommonTypesHelper {
/// GcAssignGlobalFn -- LLVM objc_assign_global function.
llvm::FunctionCallee getGcAssignGlobalFn() {
// id objc_assign_global(id, id *)
llvm::Type *args[] = { ObjectPtrTy, ObjectPtrTy->getPointerTo() };
llvm::Type *args[] = {ObjectPtrTy, CGM.UnqualPtrTy};
llvm::FunctionType *FTy =
llvm::FunctionType::get(ObjectPtrTy, args, false);
return CGM.CreateRuntimeFunction(FTy, "objc_assign_global");
@@ -392,7 +392,7 @@ class ObjCCommonTypesHelper {
/// GcAssignThreadLocalFn -- LLVM objc_assign_threadlocal function.
llvm::FunctionCallee getGcAssignThreadLocalFn() {
// id objc_assign_threadlocal(id src, id * dest)
llvm::Type *args[] = { ObjectPtrTy, ObjectPtrTy->getPointerTo() };
llvm::Type *args[] = {ObjectPtrTy, CGM.UnqualPtrTy};
llvm::FunctionType *FTy =
llvm::FunctionType::get(ObjectPtrTy, args, false);
return CGM.CreateRuntimeFunction(FTy, "objc_assign_threadlocal");
@@ -401,8 +401,7 @@ class ObjCCommonTypesHelper {
/// GcAssignIvarFn -- LLVM objc_assign_ivar function.
llvm::FunctionCallee getGcAssignIvarFn() {
// id objc_assign_ivar(id, id *, ptrdiff_t)
llvm::Type *args[] = { ObjectPtrTy, ObjectPtrTy->getPointerTo(),
CGM.PtrDiffTy };
llvm::Type *args[] = {ObjectPtrTy, CGM.UnqualPtrTy, CGM.PtrDiffTy};
llvm::FunctionType *FTy =
llvm::FunctionType::get(ObjectPtrTy, args, false);
return CGM.CreateRuntimeFunction(FTy, "objc_assign_ivar");
@@ -419,7 +418,7 @@ class ObjCCommonTypesHelper {
/// GcAssignStrongCastFn -- LLVM objc_assign_strongCast function.
llvm::FunctionCallee getGcAssignStrongCastFn() {
// id objc_assign_strongCast(id, id *)
llvm::Type *args[] = { ObjectPtrTy, ObjectPtrTy->getPointerTo() };
llvm::Type *args[] = {ObjectPtrTy, CGM.UnqualPtrTy};
llvm::FunctionType *FTy =
llvm::FunctionType::get(ObjectPtrTy, args, false);
return CGM.CreateRuntimeFunction(FTy, "objc_assign_strongCast");
@@ -554,23 +553,23 @@ class ObjCTypesHelper : public ObjCCommonTypesHelper {

/// ExceptionTryEnterFn - LLVM objc_exception_try_enter function.
llvm::FunctionCallee getExceptionTryEnterFn() {
llvm::Type *params[] = { ExceptionDataTy->getPointerTo() };
llvm::Type *params[] = {CGM.UnqualPtrTy};
return CGM.CreateRuntimeFunction(
llvm::FunctionType::get(CGM.VoidTy, params, false),
"objc_exception_try_enter");
}

/// ExceptionTryExitFn - LLVM objc_exception_try_exit function.
llvm::FunctionCallee getExceptionTryExitFn() {
llvm::Type *params[] = { ExceptionDataTy->getPointerTo() };
llvm::Type *params[] = {CGM.UnqualPtrTy};
return CGM.CreateRuntimeFunction(
llvm::FunctionType::get(CGM.VoidTy, params, false),
"objc_exception_try_exit");
}

/// ExceptionExtractFn - LLVM objc_exception_extract function.
llvm::FunctionCallee getExceptionExtractFn() {
llvm::Type *params[] = { ExceptionDataTy->getPointerTo() };
llvm::Type *params[] = {CGM.UnqualPtrTy};
return CGM.CreateRuntimeFunction(llvm::FunctionType::get(ObjectPtrTy,
params, false),
"objc_exception_extract");
@@ -587,7 +586,7 @@ class ObjCTypesHelper : public ObjCCommonTypesHelper {
/// SetJmpFn - LLVM _setjmp function.
llvm::FunctionCallee getSetJmpFn() {
// This is specifically the prototype for x86.
llvm::Type *params[] = { CGM.Int32Ty->getPointerTo() };
llvm::Type *params[] = {CGM.UnqualPtrTy};
return CGM.CreateRuntimeFunction(
llvm::FunctionType::get(CGM.Int32Ty, params, false), "_setjmp",
llvm::AttributeList::get(CGM.getLLVMContext(),
@@ -6051,9 +6050,7 @@ ObjCNonFragileABITypesHelper::ObjCNonFragileABITypesHelper(CodeGen::CodeGenModul
Int8PtrTy, PropertyListPtrTy);

// ImpnfABITy - LLVM for id (*)(id, SEL, ...)
llvm::Type *params[] = { ObjectPtrTy, SelectorPtrTy };
ImpnfABITy = llvm::FunctionType::get(ObjectPtrTy, params, false)
->getPointerTo();
ImpnfABITy = CGM.UnqualPtrTy;

// struct _class_t {
// struct _class_t *isa;
@@ -6469,8 +6466,7 @@ void CGObjCNonFragileABIMac::GenerateClass(const ObjCImplementationDecl *ID) {
llvm::GlobalValue::ExternalLinkage, nullptr,
"_objc_empty_vtable");
else
ObjCEmptyVtableVar =
llvm::ConstantPointerNull::get(ObjCTypes.ImpnfABITy->getPointerTo());
ObjCEmptyVtableVar = llvm::ConstantPointerNull::get(CGM.UnqualPtrTy);
}

// FIXME: Is this correct (that meta class size is never computed)?
21 changes: 10 additions & 11 deletions clang/lib/Driver/Driver.cpp
@@ -149,13 +149,9 @@ static std::optional<llvm::Triple>
getHIPOffloadTargetTriple(const Driver &D, const ArgList &Args) {
if (!Args.hasArg(options::OPT_offload_EQ)) {
auto OffloadArchs = Args.getAllArgValues(options::OPT_offload_arch_EQ);
if (llvm::is_contained(OffloadArchs, "amdgcnspirv")) {
if (OffloadArchs.size() == 1)
return llvm::Triple("spirv64-amd-amdhsa");
// Mixing specific & SPIR-V compilation is not supported for now.
D.Diag(diag::err_drv_only_one_offload_target_supported);
return std::nullopt;
}
if (llvm::is_contained(OffloadArchs, "amdgcnspirv") &&
OffloadArchs.size() == 1)
return llvm::Triple("spirv64-amd-amdhsa");
return llvm::Triple("amdgcn-amd-amdhsa"); // Default HIP triple.
}
auto TT = getOffloadTargetTriple(D, Args);
@@ -458,6 +454,7 @@ DerivedArgList *Driver::TranslateInputArgs(const InputArgList &Args) const {
// some build systems. We don't try to be complete here because we don't
// care to encourage this usage model.
if (A->getOption().matches(options::OPT_Wp_COMMA) &&
A->getNumValues() > 0 &&
(A->getValue(0) == StringRef("-MD") ||
A->getValue(0) == StringRef("-MMD"))) {
// Rewrite to -MD/-MMD along with -MF.
@@ -3477,9 +3474,11 @@ class OffloadingActionBuilder final {
llvm::StringMap<bool> Features;
// getHIPOffloadTargetTriple() is known to return valid value as it has
// been called successfully in the CreateOffloadingDeviceToolChains().
auto ArchStr = parseTargetID(
*getHIPOffloadTargetTriple(C.getDriver(), C.getInputArgs()), IdStr,
&Features);
auto T =
(IdStr == "amdgcnspirv")
? llvm::Triple("spirv64-amd-amdhsa")
: *getHIPOffloadTargetTriple(C.getDriver(), C.getInputArgs());
auto ArchStr = parseTargetID(T, IdStr, &Features);
if (!ArchStr) {
C.getDriver().Diag(clang::diag::err_drv_bad_target_id) << IdStr;
C.setContainsError();
@@ -5755,7 +5754,7 @@ InputInfoList Driver::BuildJobsForActionNoCache(
// We only have to generate a prefix for the host if this is not a top-level
// action.
std::string OffloadingPrefix = Action::GetOffloadingFileNamePrefix(
A->getOffloadingDeviceKind(), TC->getTriple().normalize(),
A->getOffloadingDeviceKind(), EffectiveTriple.normalize(),
/*CreatePrefixForHost=*/isa<OffloadPackagerJobAction>(A) ||
!(A->getOffloadingHostActiveKinds() == Action::OFK_None ||
AtTopLevel));
6 changes: 6 additions & 0 deletions clang/lib/Driver/ToolChain.cpp
@@ -1099,6 +1099,12 @@ std::string ToolChain::ComputeLLVMTriple(const ArgList &Args,
}
case llvm::Triple::aarch64_32:
return getTripleString();
case llvm::Triple::amdgcn: {
llvm::Triple Triple = getTriple();
if (Args.getLastArgValue(options::OPT_mcpu_EQ) == "amdgcnspirv")
Triple.setArch(llvm::Triple::ArchType::spirv64);
return Triple.getTriple();
}
case llvm::Triple::arm:
case llvm::Triple::armeb:
case llvm::Triple::thumb:
23 changes: 12 additions & 11 deletions clang/lib/Driver/ToolChains/HIPAMD.cpp
@@ -205,7 +205,7 @@ void AMDGCN::Linker::ConstructJob(Compilation &C, const JobAction &JA,
if (JA.getType() == types::TY_LLVM_BC)
return constructLlvmLinkCommand(C, JA, Inputs, Output, Args);

if (getToolChain().getTriple().isSPIRV())
if (getToolChain().getEffectiveTriple().isSPIRV())
return constructLinkAndEmitSpirvCommand(C, JA, Inputs, Output, Args);

return constructLldCommand(C, JA, Inputs, Output, Args);
@@ -264,12 +264,14 @@ void HIPAMDToolChain::addClangTargetOptions(
CC1Args.push_back("-fapply-global-visibility-to-externs");
}

// For SPIR-V we embed the command-line into the generated binary, in order to
// retrieve it at JIT time and be able to do target specific compilation with
// options that match the user-supplied ones.
if (getTriple().isSPIRV() &&
!DriverArgs.hasArg(options::OPT_fembed_bitcode_marker))
CC1Args.push_back("-fembed-bitcode=marker");
if (getEffectiveTriple().isSPIRV()) {
// For SPIR-V we embed the command-line into the generated binary, in order
// to retrieve it at JIT time and be able to do target specific compilation
// with options that match the user-supplied ones.
if (!DriverArgs.hasArg(options::OPT_fembed_bitcode_marker))
CC1Args.push_back("-fembed-bitcode=marker");
return; // No DeviceLibs for SPIR-V.
}

for (auto BCFile : getDeviceLibs(DriverArgs)) {
CC1Args.push_back(BCFile.ShouldInternalize ? "-mlink-builtin-bitcode"
@@ -361,8 +363,7 @@ llvm::SmallVector<ToolChain::BitCodeLibraryInfo, 12>
HIPAMDToolChain::getDeviceLibs(const llvm::opt::ArgList &DriverArgs) const {
llvm::SmallVector<BitCodeLibraryInfo, 12> BCLibs;
if (DriverArgs.hasArg(options::OPT_nogpulib) ||
(getTriple().getArch() == llvm::Triple::spirv64 &&
getTriple().getVendor() == llvm::Triple::AMD))
getGPUArch(DriverArgs) == "amdgcnspirv")
return {};
ArgStringList LibraryPaths;

@@ -437,8 +438,8 @@ HIPAMDToolChain::getDeviceLibs(const llvm::opt::ArgList &DriverArgs) const {
void HIPAMDToolChain::checkTargetID(
const llvm::opt::ArgList &DriverArgs) const {
auto PTID = getParsedTargetID(DriverArgs);
if (PTID.OptionalTargetID && !PTID.OptionalGPUArch) {
if (PTID.OptionalTargetID && !PTID.OptionalGPUArch &&
PTID.OptionalTargetID != "amdgcnspirv")
getDriver().Diag(clang::diag::err_drv_bad_target_id)
<< *PTID.OptionalTargetID;
}
}
10 changes: 7 additions & 3 deletions clang/lib/Driver/ToolChains/HIPUtility.cpp
@@ -304,10 +304,14 @@ void HIP::constructHIPFatbinCommand(Compilation &C, const JobAction &JA,
for (const auto &II : Inputs) {
const auto *A = II.getAction();
auto ArchStr = llvm::StringRef(A->getOffloadingArch());
BundlerTargetArg +=
"," + OffloadKind + "-" + normalizeForBundler(TT, !ArchStr.empty());
BundlerTargetArg += ',' + OffloadKind + '-';
if (ArchStr == "amdgcnspirv")
BundlerTargetArg +=
normalizeForBundler(llvm::Triple("spirv64-amd-amdhsa"), true);
else
BundlerTargetArg += normalizeForBundler(TT, !ArchStr.empty());
if (!ArchStr.empty())
BundlerTargetArg += "-" + ArchStr.str();
BundlerTargetArg += '-' + ArchStr.str();
}
BundlerArgs.push_back(Args.MakeArgString(BundlerTargetArg));

83 changes: 34 additions & 49 deletions clang/lib/Lex/PPMacroExpansion.cpp
@@ -323,84 +323,69 @@ void Preprocessor::dumpMacroInfo(const IdentifierInfo *II) {
}
}

/// RegisterBuiltinMacro - Register the specified identifier in the identifier
/// table and mark it as a builtin macro to be expanded.
static IdentifierInfo *RegisterBuiltinMacro(Preprocessor &PP, const char *Name){
// Get the identifier.
IdentifierInfo *Id = PP.getIdentifierInfo(Name);

// Mark it as being a macro that is builtin.
MacroInfo *MI = PP.AllocateMacroInfo(SourceLocation());
MI->setIsBuiltinMacro();
PP.appendDefMacroDirective(Id, MI);
return Id;
}

/// RegisterBuiltinMacros - Register builtin macros, such as __LINE__ with the
/// identifier table.
void Preprocessor::RegisterBuiltinMacros() {
Ident__LINE__ = RegisterBuiltinMacro(*this, "__LINE__");
Ident__FILE__ = RegisterBuiltinMacro(*this, "__FILE__");
Ident__DATE__ = RegisterBuiltinMacro(*this, "__DATE__");
Ident__TIME__ = RegisterBuiltinMacro(*this, "__TIME__");
Ident__COUNTER__ = RegisterBuiltinMacro(*this, "__COUNTER__");
Ident_Pragma = RegisterBuiltinMacro(*this, "_Pragma");
Ident__FLT_EVAL_METHOD__ = RegisterBuiltinMacro(*this, "__FLT_EVAL_METHOD__");
Ident__LINE__ = RegisterBuiltinMacro("__LINE__");
Ident__FILE__ = RegisterBuiltinMacro("__FILE__");
Ident__DATE__ = RegisterBuiltinMacro("__DATE__");
Ident__TIME__ = RegisterBuiltinMacro("__TIME__");
Ident__COUNTER__ = RegisterBuiltinMacro("__COUNTER__");
Ident_Pragma = RegisterBuiltinMacro("_Pragma");
Ident__FLT_EVAL_METHOD__ = RegisterBuiltinMacro("__FLT_EVAL_METHOD__");

// C++ Standing Document Extensions.
if (getLangOpts().CPlusPlus)
Ident__has_cpp_attribute =
RegisterBuiltinMacro(*this, "__has_cpp_attribute");
Ident__has_cpp_attribute = RegisterBuiltinMacro("__has_cpp_attribute");
else
Ident__has_cpp_attribute = nullptr;

// GCC Extensions.
Ident__BASE_FILE__ = RegisterBuiltinMacro(*this, "__BASE_FILE__");
Ident__INCLUDE_LEVEL__ = RegisterBuiltinMacro(*this, "__INCLUDE_LEVEL__");
Ident__TIMESTAMP__ = RegisterBuiltinMacro(*this, "__TIMESTAMP__");
Ident__BASE_FILE__ = RegisterBuiltinMacro("__BASE_FILE__");
Ident__INCLUDE_LEVEL__ = RegisterBuiltinMacro("__INCLUDE_LEVEL__");
Ident__TIMESTAMP__ = RegisterBuiltinMacro("__TIMESTAMP__");

// Microsoft Extensions.
if (getLangOpts().MicrosoftExt) {
Ident__identifier = RegisterBuiltinMacro(*this, "__identifier");
Ident__pragma = RegisterBuiltinMacro(*this, "__pragma");
Ident__identifier = RegisterBuiltinMacro("__identifier");
Ident__pragma = RegisterBuiltinMacro("__pragma");
} else {
Ident__identifier = nullptr;
Ident__pragma = nullptr;
}

// Clang Extensions.
Ident__FILE_NAME__ = RegisterBuiltinMacro(*this, "__FILE_NAME__");
Ident__has_feature = RegisterBuiltinMacro(*this, "__has_feature");
Ident__has_extension = RegisterBuiltinMacro(*this, "__has_extension");
Ident__has_builtin = RegisterBuiltinMacro(*this, "__has_builtin");
Ident__FILE_NAME__ = RegisterBuiltinMacro("__FILE_NAME__");
Ident__has_feature = RegisterBuiltinMacro("__has_feature");
Ident__has_extension = RegisterBuiltinMacro("__has_extension");
Ident__has_builtin = RegisterBuiltinMacro("__has_builtin");
Ident__has_constexpr_builtin =
RegisterBuiltinMacro(*this, "__has_constexpr_builtin");
Ident__has_attribute = RegisterBuiltinMacro(*this, "__has_attribute");
RegisterBuiltinMacro("__has_constexpr_builtin");
Ident__has_attribute = RegisterBuiltinMacro("__has_attribute");
if (!getLangOpts().CPlusPlus)
Ident__has_c_attribute = RegisterBuiltinMacro(*this, "__has_c_attribute");
Ident__has_c_attribute = RegisterBuiltinMacro("__has_c_attribute");
else
Ident__has_c_attribute = nullptr;

Ident__has_declspec = RegisterBuiltinMacro(*this, "__has_declspec_attribute");
Ident__has_embed = RegisterBuiltinMacro(*this, "__has_embed");
Ident__has_include = RegisterBuiltinMacro(*this, "__has_include");
Ident__has_include_next = RegisterBuiltinMacro(*this, "__has_include_next");
Ident__has_warning = RegisterBuiltinMacro(*this, "__has_warning");
Ident__is_identifier = RegisterBuiltinMacro(*this, "__is_identifier");
Ident__is_target_arch = RegisterBuiltinMacro(*this, "__is_target_arch");
Ident__is_target_vendor = RegisterBuiltinMacro(*this, "__is_target_vendor");
Ident__is_target_os = RegisterBuiltinMacro(*this, "__is_target_os");
Ident__has_declspec = RegisterBuiltinMacro("__has_declspec_attribute");
Ident__has_embed = RegisterBuiltinMacro("__has_embed");
Ident__has_include = RegisterBuiltinMacro("__has_include");
Ident__has_include_next = RegisterBuiltinMacro("__has_include_next");
Ident__has_warning = RegisterBuiltinMacro("__has_warning");
Ident__is_identifier = RegisterBuiltinMacro("__is_identifier");
Ident__is_target_arch = RegisterBuiltinMacro("__is_target_arch");
Ident__is_target_vendor = RegisterBuiltinMacro("__is_target_vendor");
Ident__is_target_os = RegisterBuiltinMacro("__is_target_os");
Ident__is_target_environment =
RegisterBuiltinMacro(*this, "__is_target_environment");
Ident__is_target_variant_os =
RegisterBuiltinMacro(*this, "__is_target_variant_os");
RegisterBuiltinMacro("__is_target_environment");
Ident__is_target_variant_os = RegisterBuiltinMacro("__is_target_variant_os");
Ident__is_target_variant_environment =
RegisterBuiltinMacro(*this, "__is_target_variant_environment");
RegisterBuiltinMacro("__is_target_variant_environment");

// Modules.
Ident__building_module = RegisterBuiltinMacro(*this, "__building_module");
Ident__building_module = RegisterBuiltinMacro("__building_module");
if (!getLangOpts().CurrentModule.empty())
Ident__MODULE__ = RegisterBuiltinMacro(*this, "__MODULE__");
Ident__MODULE__ = RegisterBuiltinMacro("__MODULE__");
else
Ident__MODULE__ = nullptr;
}
3 changes: 3 additions & 0 deletions clang/lib/Sema/SemaDecl.cpp
@@ -12094,6 +12094,9 @@ bool Sema::CheckFunctionDeclaration(Scope *S, FunctionDecl *NewFD,
if (LangOpts.OpenMP)
OpenMP().ActOnFinishedFunctionDefinitionInOpenMPAssumeScope(NewFD);

if (LangOpts.isSYCL() && NewFD->hasAttr<SYCLKernelEntryPointAttr>())
getASTContext().registerSYCLEntryPointFunction(NewFD);

// Semantic checking for this function declaration (in isolation).

if (getLangOpts().CPlusPlus) {
3 changes: 3 additions & 0 deletions clang/lib/Sema/SemaDeclAttr.cpp
@@ -6620,6 +6620,9 @@ ProcessDeclAttribute(Sema &S, Scope *scope, Decl *D, const ParsedAttr &AL,
case ParsedAttr::AT_SYCLKernel:
S.SYCL().handleKernelAttr(D, AL);
break;
case ParsedAttr::AT_SYCLKernelEntryPoint:
S.SYCL().handleKernelEntryPointAttr(D, AL);
break;
case ParsedAttr::AT_SYCLSpecialClass:
handleSimpleAttribute<SYCLSpecialClassAttr>(S, D, AL);
break;
16 changes: 16 additions & 0 deletions clang/lib/Sema/SemaFunctionEffects.cpp
@@ -971,6 +971,7 @@ class Analyzer {
PendingFunctionAnalysis &CurrentFunction;
CallableInfo &CurrentCaller;
ViolationSite VSite;
const Expr *TrailingRequiresClause = nullptr;

FunctionBodyASTVisitor(Analyzer &Outer,
PendingFunctionAnalysis &CurrentFunction,
@@ -985,6 +986,9 @@ class Analyzer {
if (auto *Dtor = dyn_cast<CXXDestructorDecl>(CurrentCaller.CDecl))
followDestructor(dyn_cast<CXXRecordDecl>(Dtor->getParent()), Dtor);

if (auto *FD = dyn_cast<FunctionDecl>(CurrentCaller.CDecl))
TrailingRequiresClause = FD->getTrailingRequiresClause();

// Do an AST traversal of the function/block body
TraverseDecl(const_cast<Decl *>(CurrentCaller.CDecl));
}
@@ -1259,6 +1263,17 @@ class Analyzer {
return true;
}

bool TraverseStmt(Stmt *Statement) {
// If this statement is a `requires` clause from the top-level function
// being traversed, ignore it, since it's not generating runtime code.
// We skip the traversal of lambdas (beyond their captures, see
// TraverseLambdaExpr below), so just caching this from our constructor
// should suffice.
if (Statement != TrailingRequiresClause)
return Base::TraverseStmt(Statement);
return true;
}

bool TraverseConstructorInitializer(CXXCtorInitializer *Init) {
ViolationSite PrevVS = VSite;
if (Init->isAnyMemberInitializer())
@@ -1297,6 +1312,7 @@ class Analyzer {
}

bool TraverseBlockExpr(BlockExpr * /*unused*/) {
// As with lambdas, don't traverse the block's body.
// TODO: are the capture expressions (ctor call?) safe?
return true;
}
9 changes: 9 additions & 0 deletions clang/lib/Sema/SemaSYCL.cpp
@@ -198,3 +198,12 @@ void SemaSYCL::handleKernelAttr(Decl *D, const ParsedAttr &AL) {

handleSimpleAttribute<SYCLKernelAttr>(*this, D, AL);
}

void SemaSYCL::handleKernelEntryPointAttr(Decl *D, const ParsedAttr &AL) {
ParsedType PT = AL.getTypeArg();
TypeSourceInfo *TSI = nullptr;
(void)SemaRef.GetTypeFromParser(PT, &TSI);
assert(TSI && "no type source info for attribute argument");
D->addAttr(::new (SemaRef.Context)
SYCLKernelEntryPointAttr(SemaRef.Context, AL, TSI));
}
8 changes: 8 additions & 0 deletions clang/lib/Serialization/ASTReaderDecl.cpp
@@ -1155,6 +1155,14 @@ void ASTDeclReader::VisitFunctionDecl(FunctionDecl *FD) {
for (unsigned I = 0; I != NumParams; ++I)
Params.push_back(readDeclAs<ParmVarDecl>());
FD->setParams(Reader.getContext(), Params);

// If the declaration is a SYCL kernel entry point function as indicated by
// the presence of a sycl_kernel_entry_point attribute, register it so that
// associated metadata is recreated.
if (FD->hasAttr<SYCLKernelEntryPointAttr>()) {
ASTContext &C = Reader.getContext();
C.registerSYCLEntryPointFunction(FD);
}
}

void ASTDeclReader::VisitObjCMethodDecl(ObjCMethodDecl *MD) {
51 changes: 51 additions & 0 deletions clang/test/AST/ByteCode/builtin-bit-cast.cpp
@@ -38,6 +38,13 @@ constexpr Init round_trip(const Init &init) {
return bit_cast<Init>(bit_cast<Intermediate>(init));
}


namespace Discarding {
struct S { int a; };
constexpr int f = (__builtin_bit_cast(int, 2), 0);
constexpr int f2 = (__builtin_bit_cast(S, 2), 0);
}

namespace std {
enum byte : unsigned char {};
} // namespace std
@@ -468,8 +475,52 @@ struct ref_mem {
// both-note@+1 {{bit_cast from a type with a reference member is not allowed in a constant expression}}
constexpr intptr_t run_ref_mem = __builtin_bit_cast(intptr_t, ref_mem{global_int});

namespace test_vector {

typedef unsigned uint2 __attribute__((vector_size(2 * sizeof(unsigned))));
typedef char byte8 __attribute__((vector_size(sizeof(unsigned long long))));

constexpr uint2 test_vector = { 0x0C05FEFE, 0xCAFEBABE };

static_assert(bit_cast<unsigned long long>(test_vector) == (LITTLE_END
? 0xCAFEBABE0C05FEFE
: 0x0C05FEFECAFEBABE), "");
static_assert(check_round_trip<uint2>(0xCAFEBABE0C05FEFEULL), "");
static_assert(check_round_trip<byte8>(0xCAFEBABE0C05FEFEULL), "");

typedef bool bool8 __attribute__((ext_vector_type(8)));
typedef bool bool9 __attribute__((ext_vector_type(9)));
typedef bool bool16 __attribute__((ext_vector_type(16)));
typedef bool bool17 __attribute__((ext_vector_type(17)));
typedef bool bool32 __attribute__((ext_vector_type(32)));
typedef bool bool128 __attribute__((ext_vector_type(128)));

static_assert(bit_cast<unsigned char>(bool8{1,0,1,0,1,0,1,0}) == (LITTLE_END ? 0x55 : 0xAA), "");
constexpr bool8 b8 = __builtin_bit_cast(bool8, 0x55); // both-error {{__builtin_bit_cast source size does not equal destination size (4 vs 1)}}
#if 0
static_assert(check_round_trip<bool8>(static_cast<unsigned char>(0)), "");
static_assert(check_round_trip<bool8>(static_cast<unsigned char>(1)), "");
static_assert(check_round_trip<bool8>(static_cast<unsigned char>(0x55)), "");

static_assert(bit_cast<unsigned short>(bool16{1,1,1,1,1,0,0,0, 1,1,1,1,0,1,0,0}) == (LITTLE_END ? 0x2F1F : 0xF8F4), "");

static_assert(check_round_trip<bool16>(static_cast<short>(0xCAFE)), "");
static_assert(check_round_trip<bool32>(static_cast<int>(0xCAFEBABE)), "");
static_assert(check_round_trip<bool128>(static_cast<__int128_t>(0xCAFEBABE0C05FEFEULL)), "");
#endif

#if 0
// expected-error@+2 {{constexpr variable 'bad_bool9_to_short' must be initialized by a constant expression}}
// expected-note@+1 {{bit_cast involving type 'bool __attribute__((ext_vector_type(9)))' (vector of 9 'bool' values) is not allowed in a constant expression; element size 1 * element count 9 is not a multiple of the byte size 8}}
constexpr unsigned short bad_bool9_to_short = __builtin_bit_cast(unsigned short, bool9{1,1,0,1,0,1,0,1,0});
// expected-error@+2 {{constexpr variable 'bad_short_to_bool9' must be initialized by a constant expression}}
// expected-note@+1 {{bit_cast involving type 'bool __attribute__((ext_vector_type(9)))' (vector of 9 'bool' values) is not allowed in a constant expression; element size 1 * element count 9 is not a multiple of the byte size 8}}
constexpr bool9 bad_short_to_bool9 = __builtin_bit_cast(bool9, static_cast<unsigned short>(0));
// expected-error@+2 {{constexpr variable 'bad_int_to_bool17' must be initialized by a constant expression}}
// expected-note@+1 {{bit_cast involving type 'bool __attribute__((ext_vector_type(17)))' (vector of 17 'bool' values) is not allowed in a constant expression; element size 1 * element count 17 is not a multiple of the byte size 8}}
constexpr bool17 bad_int_to_bool17 = __builtin_bit_cast(bool17, 0x0001CAFEU);
#endif
}

namespace test_complex {
constexpr _Complex unsigned test_int_complex = { 0x0C05FEFE, 0xCAFEBABE };
144 changes: 144 additions & 0 deletions clang/test/ASTSYCL/ast-dump-sycl-kernel-entry-point.cpp
@@ -0,0 +1,144 @@
// Tests without serialization:
// RUN: %clang_cc1 -std=c++17 -triple x86_64-unknown-unknown -fsycl-is-device \
// RUN: -ast-dump %s \
// RUN: | FileCheck --match-full-lines %s
// RUN: %clang_cc1 -std=c++17 -triple x86_64-unknown-unknown -fsycl-is-host \
// RUN: -ast-dump %s \
// RUN: | FileCheck --match-full-lines %s
//
// Tests with serialization:
// RUN: %clang_cc1 -std=c++17 -triple x86_64-unknown-unknown -fsycl-is-device \
// RUN: -emit-pch -o %t %s
// RUN: %clang_cc1 -x c++ -std=c++17 -triple x86_64-unknown-unknown -fsycl-is-device \
// RUN: -include-pch %t -ast-dump-all /dev/null \
// RUN: | sed -e "s/ <undeserialized declarations>//" -e "s/ imported//" \
// RUN: | FileCheck --match-full-lines %s
// RUN: %clang_cc1 -std=c++17 -triple x86_64-unknown-unknown -fsycl-is-host \
// RUN: -emit-pch -o %t %s
// RUN: %clang_cc1 -x c++ -std=c++17 -triple x86_64-unknown-unknown -fsycl-is-host \
// RUN: -include-pch %t -ast-dump-all /dev/null \
// RUN: | sed -e "s/ <undeserialized declarations>//" -e "s/ imported//" \
// RUN: | FileCheck --match-full-lines %s

// These tests validate the AST produced for functions declared with the
// sycl_kernel_entry_point attribute.

// CHECK: TranslationUnitDecl {{.*}}

// A unique kernel name type is required for each declared kernel entry point.
template<int, int=0> struct KN;

__attribute__((sycl_kernel_entry_point(KN<1>)))
void skep1() {
}
// CHECK: |-FunctionDecl {{.*}} skep1 'void ()'
// CHECK: | `-SYCLKernelEntryPointAttr {{.*}} KN<1>

using KN2 = KN<2>;
__attribute__((sycl_kernel_entry_point(KN2)))
void skep2() {
}
// CHECK: |-FunctionDecl {{.*}} skep2 'void ()'
// CHECK: | `-SYCLKernelEntryPointAttr {{.*}} KN2

template<int I> using KNT = KN<I>;
__attribute__((sycl_kernel_entry_point(KNT<3>)))
void skep3() {
}
// CHECK: |-FunctionDecl {{.*}} skep3 'void ()'
// CHECK: | `-SYCLKernelEntryPointAttr {{.*}} KNT<3>

template<typename KNT, typename F>
[[clang::sycl_kernel_entry_point(KNT)]]
void skep4(F f) {
f();
}
// CHECK: |-FunctionTemplateDecl {{.*}} skep4
// CHECK-NEXT: | |-TemplateTypeParmDecl {{.*}} KNT
// CHECK-NEXT: | |-TemplateTypeParmDecl {{.*}} F
// CHECK-NEXT: | |-FunctionDecl {{.*}} skep4 'void (F)'
// CHECK: | | `-SYCLKernelEntryPointAttr {{.*}} KNT

void test_skep4() {
skep4<KNT<4>>([]{});
}
// CHECK: | `-FunctionDecl {{.*}} used skep4 'void ((lambda at {{.*}}))' implicit_instantiation
// CHECK-NEXT: | |-TemplateArgument type 'KN<4>'
// CHECK: | |-TemplateArgument type '(lambda at {{.*}})'
// CHECK: | `-SYCLKernelEntryPointAttr {{.*}} struct KN<4>
// CHECK-NEXT: |-FunctionDecl {{.*}} test_skep4 'void ()'

template<typename KNT, typename T>
[[clang::sycl_kernel_entry_point(KNT)]]
void skep5(T) {
}
// CHECK: |-FunctionTemplateDecl {{.*}} skep5
// CHECK-NEXT: | |-TemplateTypeParmDecl {{.*}} KNT
// CHECK-NEXT: | |-TemplateTypeParmDecl {{.*}} T
// CHECK-NEXT: | |-FunctionDecl {{.*}} skep5 'void (T)'
// CHECK: | | `-SYCLKernelEntryPointAttr {{.*}} KNT

// Checks for the explicit template instantiation definition below.
// CHECK: | `-FunctionDecl {{.*}} skep5 'void (int)' explicit_instantiation_definition
// CHECK-NEXT: | |-TemplateArgument type 'KN<5, 4>'
// CHECK: | |-TemplateArgument type 'int'
// CHECK: | `-SYCLKernelEntryPointAttr {{.*}} KN<5, 4>

// FIXME: C++23 [temp.expl.spec]p12 states:
// FIXME: ... Similarly, attributes appearing in the declaration of a template
// FIXME: have no effect on an explicit specialization of that template.
// FIXME: Clang currently instantiates and propagates attributes from a function
// FIXME: template to its explicit specializations resulting in the following
// FIXME: explicit specialization having an attribute incorrectly attached.
template<>
void skep5<KN<5,1>>(short) {
}
// CHECK: |-FunctionDecl {{.*}} prev {{.*}} skep5 'void (short)' explicit_specialization
// CHECK-NEXT: | |-TemplateArgument type 'KN<5, 1>'
// CHECK: | |-TemplateArgument type 'short'
// CHECK: | `-SYCLKernelEntryPointAttr {{.*}} Inherited struct KN<5, 1>

template<>
[[clang::sycl_kernel_entry_point(KN<5,2>)]]
void skep5<KN<5,2>>(long) {
}
// CHECK: |-FunctionDecl {{.*}} prev {{.*}} skep5 'void (long)' explicit_specialization
// CHECK-NEXT: | |-TemplateArgument type 'KN<5, 2>'
// CHECK: | |-TemplateArgument type 'long'
// CHECK: | `-SYCLKernelEntryPointAttr {{.*}} KN<5, 2>

template<>
[[clang::sycl_kernel_entry_point(KN<5,3>)]]
void skep5<KN<5,-1>>(long long) {
}
// CHECK: |-FunctionDecl {{.*}} prev {{.*}} skep5 'void (long long)' explicit_specialization
// CHECK-NEXT: | |-TemplateArgument type 'KN<5, -1>'
// CHECK: | |-TemplateArgument type 'long long'
// CHECK: | `-SYCLKernelEntryPointAttr {{.*}} KN<5, 3>

template void skep5<KN<5,4>>(int);
// Checks are located with the primary template declaration above.

// Ensure that matching attributes from multiple declarations are ok.
[[clang::sycl_kernel_entry_point(KN<6>)]]
void skep6();
[[clang::sycl_kernel_entry_point(KN<6>)]]
void skep6() {
}
// CHECK: |-FunctionDecl {{.*}} skep6 'void ()'
// CHECK-NEXT: | `-SYCLKernelEntryPointAttr {{.*}} KN<6>
// CHECK-NEXT: |-FunctionDecl {{.*}} prev {{.*}} skep6 'void ()'
// CHECK-NEXT: | |-CompoundStmt {{.*}}
// CHECK-NEXT: | `-SYCLKernelEntryPointAttr {{.*}} KN<6>

// Ensure that matching attributes from the same declaration are ok.
[[clang::sycl_kernel_entry_point(KN<7>), clang::sycl_kernel_entry_point(KN<7>)]]
void skep7() {
}
// CHECK: |-FunctionDecl {{.*}} skep7 'void ()'
// CHECK-NEXT: | |-CompoundStmt {{.*}}
// CHECK-NEXT: | |-SYCLKernelEntryPointAttr {{.*}} KN<7>
// CHECK-NEXT: | `-SYCLKernelEntryPointAttr {{.*}} KN<7>

void the_end() {}
// CHECK: `-FunctionDecl {{.*}} the_end 'void ()'
2 changes: 1 addition & 1 deletion clang/test/CodeGen/target-data.c
@@ -271,4 +271,4 @@

// RUN: %clang_cc1 -triple spirv64-amd-amdhsa -o - -emit-llvm %s | \
// RUN: FileCheck %s -check-prefix=AMDGPUSPIRV64
// AMDGPUSPIRV64: target datalayout = "e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-G1-P4-A0"
// AMDGPUSPIRV64: target datalayout = "e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-n32:64-S32-G1-P4-A0"
2 changes: 1 addition & 1 deletion clang/test/CodeGenOpenCL/builtins-amdgcn.cl
@@ -638,7 +638,7 @@ void test_get_workgroup_size(int d, global int *out)

// CHECK-LABEL: @test_get_grid_size(
// CHECK: {{.*}}call align 4 dereferenceable(64){{.*}} ptr addrspace(4) @llvm.amdgcn.dispatch.ptr()
// CHECK: getelementptr inbounds i8, ptr addrspace(4) %{{.*}}, i64 %.sink
// CHECK: getelementptr inbounds i8, ptr addrspace(4) %{{.*}}, i64 %{{.+}}
// CHECK: load i32, ptr addrspace(4) %{{.*}}, align 4, !invariant.load
void test_get_grid_size(int d, global int *out)
{
18 changes: 18 additions & 0 deletions clang/test/Driver/hip-toolchain-no-rdc.hip
@@ -36,6 +36,11 @@
// RUN: %t/a.o %t/b.o \
// RUN: 2>&1 | FileCheck -check-prefixes=LKONLY %s

// RUN: %clang -### --target=x86_64-linux-gnu \
// RUN: --offload-arch=amdgcnspirv --offload-arch=gfx900 \
// RUN: %s -nogpuinc -nogpulib \
// RUN: 2>&1 | FileCheck -check-prefixes=AMDGCNSPIRV %s

//
// Compile device code in a.cu to code object for gfx803.
//
@@ -177,3 +182,16 @@
// LKONLY-NOT: {{".*/llc"}}
// LKONLY: [[LD:".*ld.*"]] {{.*}} "{{.*/a.o}}" "{{.*/b.o}}"
// LKONLY-NOT: "-T" "{{.*}}.lk"

//
// Check mixed AMDGCNSPIRV and concrete GPU arch.
//

// AMDGCNSPIRV: "-cc1" "-triple" "spirv64-amd-amdhsa" {{.*}}"-emit-obj" {{.*}} "-o" "[[AMDGCNSPV_OBJ:.*o]]"
// AMDGCNSPIRV: {{".*llvm-link.*"}} "-o" "[[AMDGCNSPV_TMP:.*out]]" "[[AMDGCNSPV_OBJ]]"
// AMDGCNSPIRV: {{".*llvm-spirv.*"}} "--spirv-max-version=1.6" "--spirv-ext=+all" {{.*}} "[[AMDGCNSPV_TMP]]" {{.*}}"-o" "[[AMDGCNSPV_CO:.*out]]"
// AMDGCNSPIRV: "-cc1" "-triple" "amdgcn-amd-amdhsa" {{.*}}"-emit-obj" {{.*}}"-target-cpu" "gfx900"{{.*}} "-o" "[[GFX900_OBJ:.*o]]"
// AMDGCNSPIRV: {{".*lld.*"}} {{.*}}"-plugin-opt=mcpu=gfx900" {{.*}} "-o" "[[GFX900_CO:.*out]]" {{.*}}"[[GFX900_OBJ]]"
// AMDGCNSPIRV: {{".*clang-offload-bundler.*"}} "-type=o"
// AMDGCNSPIRV-SAME: "-targets={{.*}}hipv4-spirv64-amd-amdhsa--amdgcnspirv,hipv4-amdgcn-amd-amdhsa--gfx900"
// AMDGCNSPIRV-SAME: "-input=[[AMDGCNSPV_CO]]" "-input=[[GFX900_CO]]"
@@ -178,6 +178,7 @@
// CHECK-NEXT: ReturnTypestate (SubjectMatchRule_function, SubjectMatchRule_variable_is_parameter)
// CHECK-NEXT: ReturnsNonNull (SubjectMatchRule_objc_method, SubjectMatchRule_function)
// CHECK-NEXT: ReturnsTwice (SubjectMatchRule_function)
// CHECK-NEXT: SYCLKernelEntryPoint (SubjectMatchRule_function)
// CHECK-NEXT: SYCLSpecialClass (SubjectMatchRule_record)
// CHECK-NEXT: ScopedLockable (SubjectMatchRule_record)
// CHECK-NEXT: Section (SubjectMatchRule_function, SubjectMatchRule_variable_is_global, SubjectMatchRule_objc_method, SubjectMatchRule_objc_property)
45 changes: 45 additions & 0 deletions clang/test/Sema/attr-nonblocking-constraints.cpp
@@ -388,6 +388,51 @@ void nb26() [[clang::nonblocking]] {
abort_wrapper(); // no diagnostic
}

// --- Make sure we don't traverse a requires clause. ---

// Some requires clauses are apparently collapsed into a constant before the nonblocking
// analysis sees any function calls. This example (extracted from a real-world case in which
// an `operator&&` declared in <valarray> precedes the inclusion of <expected>) is sufficiently
// complex to look like it contains function calls. There may be simpler examples.

namespace ExpectedTest {

template <class _Tp>
inline constexpr bool is_copy_constructible_v = __is_constructible(_Tp, _Tp&);

template <bool, class _Tp = void>
struct enable_if {};
template <class _Tp>
struct enable_if<true, _Tp> {
typedef _Tp type;
};

template <bool _Bp, class _Tp = void>
using enable_if_t = typename enable_if<_Bp, _Tp>::type;

// Doesn't seem to matter whether the enable_if is true or false.
template <class E1, class E2, enable_if_t<is_copy_constructible_v<E1>> = 0>
inline bool operator&&(const E1& x, const E2& y);

template <class _Tp, class _Err>
class expected {
public:
constexpr expected()
{}

constexpr expected(const expected&)
requires(is_copy_constructible_v<_Tp> && is_copy_constructible_v<_Err>)
= default;
};

void test() [[clang::nonblocking]]
{
expected<int, int> a;
auto b = a;
}

} // namespace ExpectedTest

// --- nonblocking implies noexcept ---
#pragma clang diagnostic warning "-Wperf-constraint-implies-noexcept"

137 changes: 137 additions & 0 deletions clang/test/SemaSYCL/sycl-kernel-entry-point-attr-grammar.cpp
@@ -0,0 +1,137 @@
// RUN: %clang_cc1 -triple x86_64-linux-gnu -std=c++17 -fsyntax-only -fsycl-is-device -verify %s
// RUN: %clang_cc1 -triple x86_64-linux-gnu -std=c++20 -fsyntax-only -fsycl-is-device -verify %s

// These tests validate parsing of the sycl_kernel_entry_point argument list
// and that the single argument names a type.

// Templates used to exercise class template specializations.
template<int> struct ST; // #ST-decl
template<int N> using TTA = ST<N>; // #TTA-decl


////////////////////////////////////////////////////////////////////////////////
// Valid declarations.
////////////////////////////////////////////////////////////////////////////////

struct S1;
[[clang::sycl_kernel_entry_point(S1)]] void ok1();

typedef struct {} TA2;
[[clang::sycl_kernel_entry_point(TA2)]] void ok2();

using TA3 = struct {};
[[clang::sycl_kernel_entry_point(TA3)]] void ok3();

[[clang::sycl_kernel_entry_point(ST<4>)]] void ok4();

[[clang::sycl_kernel_entry_point(TTA<5>)]] void ok5();

namespace NS6 {
struct NSS;
}
[[clang::sycl_kernel_entry_point(NS6::NSS)]] void ok6();

namespace {
struct UNSS7;
}
[[clang::sycl_kernel_entry_point(UNSS7)]] void ok7();

struct {} s;
[[clang::sycl_kernel_entry_point(decltype(s))]] void ok8();

template<typename KN>
[[clang::sycl_kernel_entry_point(KN)]] void ok9();
void test_ok9() {
ok9<struct LS1>();
}

template<int, typename KN>
[[clang::sycl_kernel_entry_point(KN)]] void ok10();
void test_ok10() {
ok10<1, struct LS2>();
}

namespace NS11 {
struct NSS;
}
template<typename T>
[[clang::sycl_kernel_entry_point(T)]] void ok11() {}
template<>
[[clang::sycl_kernel_entry_point(NS11::NSS)]] void ok11<NS11::NSS>() {}

struct S12;
[[clang::sycl_kernel_entry_point(S12)]] void ok12();
[[clang::sycl_kernel_entry_point(S12)]] void ok12() {}

template<typename T>
[[clang::sycl_kernel_entry_point(T)]] void ok13(T k);
void test_ok13() {
ok13([]{});
}


////////////////////////////////////////////////////////////////////////////////
// Invalid declarations.
////////////////////////////////////////////////////////////////////////////////

// expected-error@+1 {{'sycl_kernel_entry_point' attribute takes one argument}}
[[clang::sycl_kernel_entry_point]] void bad1();

// expected-error@+1 {{'sycl_kernel_entry_point' attribute takes one argument}}
[[clang::sycl_kernel_entry_point()]] void bad2();

struct B3;
// expected-error@+2 {{expected ')'}}
// expected-error@+1 {{expected ']'}}
[[clang::sycl_kernel_entry_point(B3,)]] void bad3();

struct B4;
// expected-error@+3 {{expected ')'}}
// expected-error@+2 {{expected ','}}
// expected-warning@+1 {{unknown attribute 'X' ignored}}
[[clang::sycl_kernel_entry_point(B4, X)]] void bad4();

// expected-error@+1 {{expected a type}}
[[clang::sycl_kernel_entry_point(1)]] void bad5();

void f6();
// expected-error@+1 {{unknown type name 'f6'}}
[[clang::sycl_kernel_entry_point(f6)]] void bad6();

// expected-error@+2 {{use of class template 'ST' requires template arguments; argument deduction not allowed here}}
// expected-note@#ST-decl {{template is declared here}}
[[clang::sycl_kernel_entry_point(ST)]] void bad7();

// expected-error@+2 {{use of alias template 'TTA' requires template arguments; argument deduction not allowed here}}
// expected-note@#TTA-decl {{template is declared here}}
[[clang::sycl_kernel_entry_point(TTA)]] void bad8();

enum {
e9
};
// expected-error@+1 {{unknown type name 'e9'}}
[[clang::sycl_kernel_entry_point(e9)]] void bad9();

#if __cplusplus >= 202002L
template<typename> concept C = true;
// expected-error@+1 {{expected a type}}
[[clang::sycl_kernel_entry_point(C)]] void bad10();

// expected-error@+1 {{expected a type}}
[[clang::sycl_kernel_entry_point(C<int>)]] void bad11();
#endif

struct B12; // #B12-decl
// FIXME: C++23 [temp.expl.spec]p12 states:
// FIXME: ... Similarly, attributes appearing in the declaration of a template
// FIXME: have no effect on an explicit specialization of that template.
// FIXME: Clang currently instantiates and propagates attributes from a function
// FIXME: template to its explicit specializations resulting in the following
// FIXME: spurious error.
// expected-error@+4 {{incomplete type 'B12' named in nested name specifier}}
// expected-note@+5 {{in instantiation of function template specialization 'bad12<B12>' requested here}}
// expected-note@#B12-decl {{forward declaration of 'B12'}}
template<typename T>
[[clang::sycl_kernel_entry_point(typename T::not_found)]] void bad12() {}
template<>
void bad12<B12>() {}
17 changes: 17 additions & 0 deletions clang/test/SemaSYCL/sycl-kernel-entry-point-attr-ignored.cpp
@@ -0,0 +1,17 @@
// RUN: %clang_cc1 -triple x86_64-linux-gnu -std=c++17 -fsyntax-only -verify %s

// These tests validate that the sycl_kernel_entry_point attribute is ignored
// when SYCL support is not enabled.

// A unique kernel name type is required for each declared kernel entry point.
template<int> struct KN;

// expected-warning@+1 {{'sycl_kernel_entry_point' attribute ignored}}
[[clang::sycl_kernel_entry_point(KN<1>)]]
void ok1();

// expected-warning@+2 {{'sycl_kernel_entry_point' attribute ignored}}
template<typename KNT>
[[clang::sycl_kernel_entry_point(KNT)]]
void ok2() {}
template void ok2<KN<2>>();
6 changes: 4 additions & 2 deletions clang/utils/TableGen/ClangAttrEmitter.cpp
@@ -2727,7 +2727,8 @@ static void emitAttributes(const RecordKeeper &Records, raw_ostream &OS,
}

if (Header)
OS << "class " << R.getName() << "Attr : public " << SuperName << " {\n";
OS << "class CLANG_ABI " << R.getName() << "Attr : public " << SuperName
<< " {\n";
else
OS << "\n// " << R.getName() << "Attr implementation\n\n";

@@ -3185,7 +3186,8 @@ void clang::EmitClangAttrClass(const RecordKeeper &Records, raw_ostream &OS) {
emitSourceFileHeader("Attribute classes' definitions", OS, Records);

OS << "#ifndef LLVM_CLANG_ATTR_CLASSES_INC\n";
OS << "#define LLVM_CLANG_ATTR_CLASSES_INC\n\n";
OS << "#define LLVM_CLANG_ATTR_CLASSES_INC\n";
OS << "#include \"clang/Support/Compiler.h\"\n\n";

emitAttributes(Records, OS, true);

8 changes: 6 additions & 2 deletions compiler-rt/cmake/config-ix.cmake
@@ -37,7 +37,11 @@ check_c_compiler_flag(-nodefaultlibs C_SUPPORTS_NODEFAULTLIBS_FLAG)
if (C_SUPPORTS_NODEFAULTLIBS_FLAG)
set(CMAKE_REQUIRED_FLAGS "${CMAKE_REQUIRED_FLAGS} -nodefaultlibs")
if (COMPILER_RT_HAS_LIBC)
list(APPEND CMAKE_REQUIRED_LIBRARIES c)
if (HAIKU)
list(APPEND CMAKE_REQUIRED_LIBRARIES root)
else()
list(APPEND CMAKE_REQUIRED_LIBRARIES c)
endif()
endif ()
if (COMPILER_RT_USE_BUILTINS_LIBRARY)
# TODO: remove this check once we address PR51389.
@@ -826,7 +830,7 @@ else()
endif()

if (PROFILE_SUPPORTED_ARCH AND NOT LLVM_USE_SANITIZER AND
OS_NAME MATCHES "Darwin|Linux|FreeBSD|Windows|Android|Fuchsia|SunOS|NetBSD|AIX|WASI")
OS_NAME MATCHES "Darwin|Linux|FreeBSD|Windows|Android|Fuchsia|SunOS|NetBSD|AIX|WASI|Haiku")
set(COMPILER_RT_HAS_PROFILE TRUE)
else()
set(COMPILER_RT_HAS_PROFILE FALSE)
152 changes: 114 additions & 38 deletions compiler-rt/lib/interception/interception_win.cpp
@@ -187,8 +187,12 @@ static uptr GetMmapGranularity() {
return si.dwAllocationGranularity;
}

UNUSED static uptr RoundDownTo(uptr size, uptr boundary) {
return size & ~(boundary - 1);
}

UNUSED static uptr RoundUpTo(uptr size, uptr boundary) {
return (size + boundary - 1) & ~(boundary - 1);
return RoundDownTo(size + boundary - 1, boundary);
}
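
The rounding helpers in this hunk rely on `boundary` being a power of two, since the mask `~(boundary - 1)` only clears the low bits in that case. A minimal standalone sketch follows, assuming `uptr` is a pointer-sized unsigned integer; the `IsPowerOfTwo` helper and the `assert` are illustrative additions and are not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

using uptr = std::uintptr_t;  // assumption: stand-in for the sanitizer's uptr

// Hypothetical helper: true when x is a nonzero power of two.
constexpr bool IsPowerOfTwo(uptr x) { return x != 0 && (x & (x - 1)) == 0; }

// Rounds size down to a multiple of boundary by clearing the low bits.
inline uptr RoundDownTo(uptr size, uptr boundary) {
  assert(IsPowerOfTwo(boundary) && "mask trick needs a power-of-two boundary");
  return size & ~(boundary - 1);
}

// Rounds size up by biasing it to just below the next multiple, then
// rounding down. This is equivalent to the expression it replaces,
// (size + boundary - 1) & ~(boundary - 1).
inline uptr RoundUpTo(uptr size, uptr boundary) {
  return RoundDownTo(size + boundary - 1, boundary);
}
```

With a 64 KiB boundary (a typical Windows allocation granularity, assumed here only for illustration), an already aligned value is left unchanged by both helpers.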

// FIXME: internal_str* and internal_mem* functions should be moved from the
@@ -285,8 +289,11 @@ static void WriteJumpInstruction(uptr from, uptr target) {

static void WriteShortJumpInstruction(uptr from, uptr target) {
sptr offset = target - from - kShortJumpInstructionLength;
if (offset < -128 || offset > 127)
if (offset < -128 || offset > 127) {
ReportError("interception_win: cannot write short jmp from %p to %p\n",
(void *)from, (void *)target);
InterceptionFailed();
}
*(u8*)from = 0xEB;
*(u8*)(from + 1) = (u8)offset;
}
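
The range check above exists because an x86 short jump stores its displacement in a single signed byte, measured from the end of the instruction. The following sketch illustrates that computation in isolation. It assumes the short jump is two bytes long (opcode `0xEB` plus one displacement byte); the names `kShortJumpLength`, `ShortJumpDisplacement` and `EncodeShortJump` are hypothetical and do not appear in the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Assumption: a short jump is the opcode 0xEB followed by one rel8 byte.
constexpr std::intptr_t kShortJumpLength = 2;

// Returns the rel8 displacement for a short jump placed at `from` that
// lands on `target`, or nullopt when the target is out of reach.
inline std::optional<std::int8_t> ShortJumpDisplacement(std::uintptr_t from,
                                                        std::uintptr_t target) {
  // The displacement is relative to the address after the instruction.
  const std::intptr_t offset =
      static_cast<std::intptr_t>(target - from) - kShortJumpLength;
  if (offset < -128 || offset > 127)
    return std::nullopt;
  return static_cast<std::int8_t>(offset);
}

// Writes the two instruction bytes into `out`. Returns false, leaving
// `out` untouched, when the target cannot be reached.
inline bool EncodeShortJump(std::uintptr_t from, std::uintptr_t target,
                            std::uint8_t out[2]) {
  assert(out != nullptr);
  const auto disp = ShortJumpDisplacement(from, target);
  if (!disp)
    return false;
  out[0] = 0xEB;
  out[1] = static_cast<std::uint8_t>(*disp);
  return true;
}
```

Unlike the patched function, which reports an error and calls `InterceptionFailed()`, this sketch returns a failure value so the boundary cases are easy to exercise.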
@@ -340,32 +347,78 @@ struct TrampolineMemoryRegion {
uptr max_size;
};

UNUSED static const uptr kTrampolineScanLimitRange = 1ull << 31; // 2 gig
UNUSED static const uptr kTrampolineRangeLimit = 1ull << 31; // 2 gig
static const int kMaxTrampolineRegion = 1024;
static TrampolineMemoryRegion TrampolineRegions[kMaxTrampolineRegion];

static void *AllocateTrampolineRegion(uptr image_address, size_t granularity) {
#if SANITIZER_WINDOWS64
uptr address = image_address;
uptr scanned = 0;
while (scanned < kTrampolineScanLimitRange) {
static void *AllocateTrampolineRegion(uptr min_addr, uptr max_addr,
uptr func_addr, size_t granularity) {
# if SANITIZER_WINDOWS64
// Clamp {min,max}_addr to the accessible address space.
SYSTEM_INFO system_info;
::GetSystemInfo(&system_info);
uptr min_virtual_addr =
RoundUpTo((uptr)system_info.lpMinimumApplicationAddress, granularity);
uptr max_virtual_addr =
RoundDownTo((uptr)system_info.lpMaximumApplicationAddress, granularity);
if (min_addr < min_virtual_addr)
min_addr = min_virtual_addr;
if (max_addr > max_virtual_addr)
max_addr = max_virtual_addr;

// This loop probes the virtual address space to find free memory in the
// [min_addr, max_addr] interval. The search starts from func_addr and
// proceeds "outwards" towards the interval bounds using two probes, lo_addr
// and hi_addr, for addresses lower/higher than func_addr. At each step, it
// considers the probe closest to func_addr. If that address is not free, the
// probe is advanced (lower or higher depending on the probe) to the next
// memory block and the search continues.
uptr lo_addr = RoundDownTo(func_addr, granularity);
uptr hi_addr = RoundUpTo(func_addr, granularity);
while (lo_addr >= min_addr || hi_addr <= max_addr) {
// Consider the in-range address closest to func_addr.
uptr addr;
if (lo_addr < min_addr)
addr = hi_addr;
else if (hi_addr > max_addr)
addr = lo_addr;
else
addr = (hi_addr - func_addr < func_addr - lo_addr) ? hi_addr : lo_addr;

MEMORY_BASIC_INFORMATION info;
if (!::VirtualQuery((void*)address, &info, sizeof(info)))
if (!::VirtualQuery((void *)addr, &info, sizeof(info))) {
ReportError(
"interception_win: VirtualQuery in AllocateTrampolineRegion failed "
"for %p\n",
(void *)addr);
return nullptr;
}

// Check whether a region can be allocated at |address|.
// Check whether a region can be allocated at |addr|.
if (info.State == MEM_FREE && info.RegionSize >= granularity) {
void *page = ::VirtualAlloc((void*)RoundUpTo(address, granularity),
granularity,
MEM_RESERVE | MEM_COMMIT,
PAGE_EXECUTE_READWRITE);
void *page =
::VirtualAlloc((void *)addr, granularity, MEM_RESERVE | MEM_COMMIT,
PAGE_EXECUTE_READWRITE);
if (page == nullptr)
ReportError(
"interception_win: VirtualAlloc in AllocateTrampolineRegion failed "
"for %p\n",
(void *)addr);
return page;
}

// Move to the next region.
address = (uptr)info.BaseAddress + info.RegionSize;
scanned += info.RegionSize;
if (addr == lo_addr)
lo_addr =
RoundDownTo((uptr)info.AllocationBase - granularity, granularity);
if (addr == hi_addr)
hi_addr =
RoundUpTo((uptr)info.BaseAddress + info.RegionSize, granularity);
}

ReportError(
"interception_win: AllocateTrampolineRegion failed to find free memory; "
"min_addr: %p, max_addr: %p, func_addr: %p, granularity: %zu\n",
(void *)min_addr, (void *)max_addr, (void *)func_addr, granularity);
return nullptr;
#else
return ::VirtualAlloc(nullptr,
Expand All @@ -387,37 +440,50 @@ void TestOnlyReleaseTrampolineRegions() {
}

static uptr AllocateMemoryForTrampoline(uptr func_address, size_t size) {
uptr image_address = func_address;
# if SANITIZER_WINDOWS64
uptr min_addr = func_address - kTrampolineRangeLimit;
uptr max_addr = func_address + kTrampolineRangeLimit - size;

#if SANITIZER_WINDOWS64
// Allocate memory after the module (DLL or EXE file), but within 2GB
// of the start of the module so that any address within the module can be
// referenced with PC-relative operands.
// Allocate memory within 2GB of the module (DLL or EXE file) so that any
// address within the module can be referenced with PC-relative operands.
// This allows us to not just jump to the trampoline with a PC-relative
// offset, but to relocate any instructions that we copy to the trampoline
// which have references to the original module. If we can't find the base
// address of the module (e.g. if func_address is in mmap'ed memory), just
// use func_address as is.
// stay within 2GB of func_address.
HMODULE module;
if (::GetModuleHandleExW(GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS |
GET_MODULE_HANDLE_EX_FLAG_UNCHANGED_REFCOUNT,
(LPCWSTR)func_address, &module)) {
MODULEINFO module_info;
if (::GetModuleInformation(::GetCurrentProcess(), module,
&module_info, sizeof(module_info))) {
image_address = (uptr)module_info.lpBaseOfDll;
min_addr = (uptr)module_info.lpBaseOfDll + module_info.SizeOfImage -
kTrampolineRangeLimit;
max_addr = (uptr)module_info.lpBaseOfDll + kTrampolineRangeLimit - size;
}
}
#endif

// Find a region within 2G with enough space to allocate |size| bytes.
// Check for overflow.
if (min_addr > func_address)
min_addr = 0;
if (max_addr < func_address)
max_addr = ~(uptr)0;
# else
uptr min_addr = 0;
uptr max_addr = ~min_addr;
# endif

// Find a region within [min_addr,max_addr] with enough space to allocate
// |size| bytes.
TrampolineMemoryRegion *region = nullptr;
for (size_t bucket = 0; bucket < kMaxTrampolineRegion; ++bucket) {
TrampolineMemoryRegion* current = &TrampolineRegions[bucket];
if (current->content == 0) {
// No valid region found, allocate a new region.
size_t bucket_size = GetMmapGranularity();
void *content = AllocateTrampolineRegion(image_address, bucket_size);
void *content = AllocateTrampolineRegion(min_addr, max_addr, func_address,
bucket_size);
if (content == nullptr)
return 0U;

Expand All @@ -427,13 +493,9 @@ static uptr AllocateMemoryForTrampoline(uptr func_address, size_t size) {
region = current;
break;
} else if (current->max_size - current->allocated_size > size) {
#if SANITIZER_WINDOWS64
// In 64-bits, the memory space must be allocated within 2G boundary.
uptr next_address = current->content + current->allocated_size;
if (next_address < image_address ||
next_address - image_address >= 0x7FFF0000)
continue;
#endif
uptr next_address = current->content + current->allocated_size;
if (next_address < min_addr || next_address > max_addr)
continue;
// The space can be allocated in the current region.
region = current;
break;
Expand Down Expand Up @@ -872,8 +934,14 @@ static bool CopyInstructions(uptr to, uptr from, size_t size) {
// this will be untrue if relocated_offset \notin [-2**31, 2**31)
s64 delta = to - from;
s64 relocated_offset = *(s32 *)(to + cursor + rel_offset) - delta;
if (-0x8000'0000ll > relocated_offset || relocated_offset > 0x7FFF'FFFFll)
if (-0x8000'0000ll > relocated_offset ||
relocated_offset > 0x7FFF'FFFFll) {
ReportError(
"interception_win: CopyInstructions relocated_offset %lld outside "
"32-bit range\n",
(long long)relocated_offset);
return false;
}
# else
// on 32-bit, the relative offset will always be correct
s32 delta = to - from;
Expand Down Expand Up @@ -1167,19 +1235,27 @@ uptr InternalGetProcAddress(void *module, const char *func_name) {
// exported directory.
char function_name[256];
size_t funtion_name_length = _strlen(func);
if (funtion_name_length >= sizeof(function_name) - 1)
if (funtion_name_length >= sizeof(function_name) - 1) {
ReportError("interception_win: func too long: '%s'\n", func);
InterceptionFailed();
}

_memcpy(function_name, func, funtion_name_length);
function_name[funtion_name_length] = '\0';
char* separator = _strchr(function_name, '.');
if (!separator)
if (!separator) {
ReportError("interception_win: no separator in '%s'\n",
function_name);
InterceptionFailed();
}
*separator = '\0';

void* redirected_module = GetModuleHandleA(function_name);
if (!redirected_module)
if (!redirected_module) {
ReportError("interception_win: GetModuleHandleA failed for '%s'\n",
function_name);
InterceptionFailed();
}
return InternalGetProcAddress(redirected_module, separator + 1);
}
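
The outward probing search that the new `AllocateTrampolineRegion` comment describes can be illustrated in isolation. The following is a minimal sketch, not the code from this change: it assumes a hypothetical `Occupied` set in place of `VirtualQuery`, steps one granule at a time rather than skipping whole memory regions, and uses signed arithmetic so that a probe stepping below zero simply leaves the range.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <set>

// Hypothetical stand-in for the address space: the granule-aligned addresses
// that are already occupied. The real code queries VirtualQuery instead.
using Occupied = std::set<std::int64_t>;

// Returns the free granule-aligned address in [min_addr, max_addr] that is
// closest to func_addr, or std::nullopt when every candidate is occupied.
std::optional<std::int64_t> FindClosestFree(const Occupied &occupied,
                                            std::int64_t min_addr,
                                            std::int64_t max_addr,
                                            std::int64_t func_addr,
                                            std::int64_t granularity) {
  assert(granularity > 0 && 0 <= min_addr && min_addr <= func_addr &&
         func_addr <= max_addr);
  // Two probes start at func_addr, rounded down and up to the granularity.
  std::int64_t lo = func_addr / granularity * granularity;
  std::int64_t hi = (func_addr + granularity - 1) / granularity * granularity;

  while (lo >= min_addr || hi <= max_addr) {
    // Consider the in-range probe closest to func_addr.
    std::int64_t addr;
    if (lo < min_addr)
      addr = hi;
    else if (hi > max_addr)
      addr = lo;
    else
      addr = (hi - func_addr < func_addr - lo) ? hi : lo;

    if (occupied.count(addr) == 0)
      return addr;

    // Advance whichever probe was just rejected (both, if they coincide).
    if (addr == lo)
      lo -= granularity;
    if (addr == hi)
      hi += granularity;
  }
  return std::nullopt;
}
```

With granules 0x20000, 0x30000 and 0x40000 occupied, a granularity of 0x10000 and `func_addr` 0x30800, the sketch rejects 0x30000, 0x40000 and 0x20000 in that order and then returns 0x50000, which is closer to `func_addr` than the remaining lower candidate 0x10000.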

Expand Down
2 changes: 1 addition & 1 deletion compiler-rt/lib/profile/InstrProfilingPlatformLinux.c
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@

#if defined(__linux__) || defined(__FreeBSD__) || defined(__Fuchsia__) || \
(defined(__sun__) && defined(__svr4__)) || defined(__NetBSD__) || \
defined(_AIX) || defined(__wasm__)
defined(_AIX) || defined(__wasm__) || defined(__HAIKU__)

#if !defined(_AIX) && !defined(__wasm__)
#include <elf.h>
Expand Down
2 changes: 1 addition & 1 deletion compiler-rt/lib/profile/InstrProfilingPlatformOther.c
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@
#if !defined(__APPLE__) && !defined(__linux__) && !defined(__FreeBSD__) && \
!defined(__Fuchsia__) && !(defined(__sun__) && defined(__svr4__)) && \
!defined(__NetBSD__) && !defined(_WIN32) && !defined(_AIX) && \
!defined(__wasm__)
!defined(__wasm__) && !defined(__HAIKU__)

#include <stdlib.h>
#include <stdio.h>
Expand Down
6 changes: 2 additions & 4 deletions compiler-rt/lib/rtsan/rtsan_interceptors_posix.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -15,12 +15,10 @@

#include "interception/interception.h"
#include "sanitizer_common/sanitizer_allocator_dlsym.h"
#include "sanitizer_common/sanitizer_allocator_internal.h"
#include "sanitizer_common/sanitizer_platform_interceptors.h"

#include "interception/interception.h"
#include "rtsan/rtsan.h"
#include "rtsan/rtsan_context.h"

#if SANITIZER_APPLE

Expand All @@ -33,11 +31,11 @@ extern "C" {
typedef int32_t OSSpinLock;
void OSSpinLockLock(volatile OSSpinLock *__lock);
}
#endif
#endif // TARGET_OS_MAC

#include <libkern/OSAtomic.h>
#include <os/lock.h>
#endif
#endif // SANITIZER_APPLE

#if SANITIZER_INTERCEPT_MEMALIGN || SANITIZER_INTERCEPT_PVALLOC
#include <malloc.h>
Expand Down
44 changes: 13 additions & 31 deletions compiler-rt/lib/rtsan/tests/rtsan_test_interceptors_posix.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -28,11 +28,6 @@
#include <malloc.h>
#endif

#include <atomic>
#include <chrono>
#include <string>
#include <thread>

#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
Expand All @@ -41,23 +36,10 @@
#include <sys/uio.h>

#if _FILE_OFFSET_BITS == 64 && SANITIZER_GLIBC
const char *const kCreatFunctionName = "creat64";
const char *const kFcntlFunctionName = "fcntl64";
const char *const kFopenFunctionName = "fopen64";
const char *const kOpenAtFunctionName = "openat64";
const char *const kOpenFunctionName = "open64";
const char *const kPreadFunctionName = "pread64";
const char *const kPwriteFunctionName = "pwrite64";
const char *const kMmapFunctionName = "mmap64";
// Under these conditions, some system calls are `foo64` instead of `foo`
#define MAYBE_APPEND_64(func) func "64"
#else
const char *const kCreatFunctionName = "creat";
const char *const kFcntlFunctionName = "fcntl";
const char *const kFopenFunctionName = "fopen";
const char *const kOpenAtFunctionName = "openat";
const char *const kOpenFunctionName = "open";
const char *const kPreadFunctionName = "pread";
const char *const kPwriteFunctionName = "pwrite";
const char *const kMmapFunctionName = "mmap";
#define MAYBE_APPEND_64(func) func
#endif

using namespace testing;
Expand Down Expand Up @@ -187,7 +169,7 @@ TEST(TestRtsanInterceptors, MmapDiesWhenRealtime) {
void *_ = mmap(nullptr, 8, PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
};
ExpectRealtimeDeath(Func, kMmapFunctionName);
ExpectRealtimeDeath(Func, MAYBE_APPEND_64("mmap"));
ExpectNonRealtimeSurvival(Func);
}

Expand Down Expand Up @@ -244,13 +226,13 @@ TEST(TestRtsanInterceptors, NanosleepDiesWhenRealtime) {

TEST_F(RtsanFileTest, OpenDiesWhenRealtime) {
auto Func = [this]() { open(GetTemporaryFilePath(), O_RDONLY); };
ExpectRealtimeDeath(Func, kOpenFunctionName);
ExpectRealtimeDeath(Func, MAYBE_APPEND_64("open"));
ExpectNonRealtimeSurvival(Func);
}

TEST_F(RtsanFileTest, OpenatDiesWhenRealtime) {
auto Func = [this]() { openat(0, GetTemporaryFilePath(), O_RDONLY); };
ExpectRealtimeDeath(Func, kOpenAtFunctionName);
ExpectRealtimeDeath(Func, MAYBE_APPEND_64("openat"));
ExpectNonRealtimeSurvival(Func);
}

Expand All @@ -275,13 +257,13 @@ TEST_F(RtsanFileTest, OpenCreatesFileWithProperMode) {

TEST_F(RtsanFileTest, CreatDiesWhenRealtime) {
auto Func = [this]() { creat(GetTemporaryFilePath(), S_IWOTH | S_IROTH); };
ExpectRealtimeDeath(Func, kCreatFunctionName);
ExpectRealtimeDeath(Func, MAYBE_APPEND_64("creat"));
ExpectNonRealtimeSurvival(Func);
}

TEST(TestRtsanInterceptors, FcntlDiesWhenRealtime) {
auto Func = []() { fcntl(0, F_GETFL); };
ExpectRealtimeDeath(Func, kFcntlFunctionName);
ExpectRealtimeDeath(Func, MAYBE_APPEND_64("fcntl"));
ExpectNonRealtimeSurvival(Func);
}

Expand All @@ -300,7 +282,7 @@ TEST_F(RtsanFileTest, FcntlFlockDiesWhenRealtime) {
ASSERT_THAT(fcntl(fd, F_GETLK, &lock), Eq(0));
ASSERT_THAT(lock.l_type, F_UNLCK);
};
ExpectRealtimeDeath(Func, kFcntlFunctionName);
ExpectRealtimeDeath(Func, MAYBE_APPEND_64("fcntl"));
ExpectNonRealtimeSurvival(Func);

close(fd);
Expand All @@ -322,7 +304,7 @@ TEST_F(RtsanFileTest, FcntlSetFdDiesWhenRealtime) {
ASSERT_THAT(fcntl(fd, F_GETFD), Eq(old_flags));
};

ExpectRealtimeDeath(Func, kFcntlFunctionName);
ExpectRealtimeDeath(Func, MAYBE_APPEND_64("fcntl"));
ExpectNonRealtimeSurvival(Func);

close(fd);
Expand All @@ -340,7 +322,7 @@ TEST_F(RtsanFileTest, FopenDiesWhenRealtime) {
EXPECT_THAT(f, Ne(nullptr));
};

ExpectRealtimeDeath(Func, kFopenFunctionName);
ExpectRealtimeDeath(Func, MAYBE_APPEND_64("fopen"));
ExpectNonRealtimeSurvival(Func);
}

Expand Down Expand Up @@ -428,7 +410,7 @@ TEST_F(RtsanOpenedFileTest, PreadDiesWhenRealtime) {
char c{};
pread(GetOpenFd(), &c, 1, 0);
};
ExpectRealtimeDeath(Func, kPreadFunctionName);
ExpectRealtimeDeath(Func, MAYBE_APPEND_64("pread"));
ExpectNonRealtimeSurvival(Func);
}

Expand All @@ -447,7 +429,7 @@ TEST_F(RtsanOpenedFileTest, PwriteDiesWhenRealtime) {
char c = 'a';
pwrite(GetOpenFd(), &c, 1, 0);
};
ExpectRealtimeDeath(Func, kPwriteFunctionName);
ExpectRealtimeDeath(Func, MAYBE_APPEND_64("pwrite"));
ExpectNonRealtimeSurvival(Func);
}
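
The `MAYBE_APPEND_64` macro that replaces the `k*FunctionName` constants relies on compile-time concatenation of adjacent string literals. A minimal sketch of that mechanism follows; `SKETCH_USE_64_SUFFIX`, `ExpectedOpenName` and `CheckSuffix` are hypothetical names that stand in for the real condition (`_FILE_OFFSET_BITS == 64 && SANITIZER_GLIBC`) and for the test bodies.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical switch standing in for the condition used in the test file.
#define SKETCH_USE_64_SUFFIX 1

#if SKETCH_USE_64_SUFFIX
#define MAYBE_APPEND_64(func) func "64"
#else
#define MAYBE_APPEND_64(func) func
#endif

// Adjacent string literals are merged by the compiler, so the macro only
// works when its argument is itself a string literal.
inline const char *ExpectedOpenName() { return MAYBE_APPEND_64("open"); }

// sizeof counts the terminating NUL, so "open64" occupies 7 bytes.
static_assert(sizeof(MAYBE_APPEND_64("open")) == 7, "suffix was appended");

inline void CheckSuffix() {
  assert(std::strcmp(ExpectedOpenName(), "open64") == 0);
}
```

Because the concatenation happens in the preprocessor and compiler, passing a `const char *` variable instead of a literal would not compile when the suffix is active, which is one reason the macro is applied directly at each `ExpectRealtimeDeath` call site.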

Expand Down
2 changes: 1 addition & 1 deletion compiler-rt/test/builtins/Unit/ctor_dtor.c
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
// REQUIRES: crt

// RUN: %clang -fno-use-init-array -g -c %s -o %t.o
// RUN: %clang -o %t -no-pie -nostdlib %crt1 %crti %crtbegin %t.o -lc %libgcc %crtend %crtn
// RUN: %clang -o %t -no-pie -nostdlib %crt1 %crti %crtbegin %t.o %libc %libgcc %crtend %crtn
// RUN: %run %t 2>&1 | FileCheck %s

#include <stdio.h>
Expand Down
4 changes: 2 additions & 2 deletions compiler-rt/test/builtins/Unit/dso_handle.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2,8 +2,8 @@

// RUN: %clangxx -g -fno-exceptions -DCRT_SHARED -c %s -fPIC -o %tshared.o
// RUN: %clangxx -g -fno-exceptions -c %s -fPIC -o %t.o
// RUN: %clangxx -g -shared -o %t.so -nostdlib %crti %crtbegin %tshared.o %libstdcxx -lc -lm %libgcc %crtend %crtn
// RUN: %clangxx -g -o %t -fno-pic -no-pie -nostdlib %crt1 %crti %crtbegin %t.o %libstdcxx -lc -lm %libgcc %t.so %crtend %crtn
// RUN: %clangxx -g -shared -o %t.so -nostdlib %crti %crtbegin %tshared.o %libstdcxx %libc -lm %libgcc %crtend %crtn
// RUN: %clangxx -g -o %t -fno-pic -no-pie -nostdlib %crt1 %crti %crtbegin %t.o %libstdcxx %libc -lm %libgcc %t.so %crtend %crtn
// RUN: %run %t 2>&1 | FileCheck %s

// UNSUPPORTED: target={{(arm|aarch64).*}}
Expand Down
8 changes: 7 additions & 1 deletion compiler-rt/test/builtins/Unit/lit.cfg.py
Original file line number Diff line number Diff line change
Expand Up @@ -104,7 +104,10 @@ def get_libgcc_file_name():
if sys.platform in ["win32"] and execute_external:
# Don't pass dosish path separator to msys bash.exe.
base_lib = base_lib.replace("\\", "/")
config.substitutions.append(("%librt ", base_lib + " -lc -lm "))
if config.host_os == "Haiku":
config.substitutions.append(("%librt ", base_lib + " -lroot "))
else:
config.substitutions.append(("%librt ", base_lib + " -lc -lm "))

builtins_build_crt = get_required_attr(config, "builtins_build_crt")
if builtins_build_crt:
Expand All @@ -123,6 +126,9 @@ def get_libgcc_file_name():
config.substitutions.append(("%crtn", get_library_path("crtn.o")))

config.substitutions.append(("%libgcc", get_libgcc_file_name()))
config.substitutions.append(
("%libc", "-lroot" if sys.platform.startswith("haiku") else "-lc")
)

config.substitutions.append(
("%libstdcxx", "-l" + config.sanitizer_cxx_lib.lstrip("lib"))
Expand Down
3 changes: 2 additions & 1 deletion compiler-rt/test/lit.common.cfg.py
Original file line number Diff line number Diff line change
Expand Up @@ -82,6 +82,8 @@ def push_dynamic_library_lookup_path(config, new_path):
dynamic_library_lookup_var = "PATH"
elif platform.system() == "Darwin":
dynamic_library_lookup_var = "DYLD_LIBRARY_PATH"
elif platform.system() == "Haiku":
dynamic_library_lookup_var = "LIBRARY_PATH"
else:
dynamic_library_lookup_var = "LD_LIBRARY_PATH"

Expand Down Expand Up @@ -275,7 +277,6 @@ def push_dynamic_library_lookup_path(config, new_path):
"COMPILER_PATH",
"RC_DEBUG_OPTIONS",
"CINDEXTEST_PREAMBLE_FILE",
"LIBRARY_PATH",
"CPATH",
"C_INCLUDE_PATH",
"CPLUS_INCLUDE_PATH",
Expand Down
1 change: 1 addition & 0 deletions compiler-rt/test/profile/Posix/gcov-destructor.c
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
/// Test that destructors and destructors whose priorities are greater than 100 are tracked.
// XFAIL: target={{.*haiku.*}}
// RUN: mkdir -p %t.dir && cd %t.dir
// RUN: %clang --coverage %s -o %t -dumpdir ./
// RUN: rm -f gcov-destructor.gcda && %run %t
Expand Down
1 change: 1 addition & 0 deletions compiler-rt/test/profile/Posix/gcov-dlopen.c
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
/// atexit(3) not supported in dlopen(3)ed+dlclose(3)d DSO
// XFAIL: target={{.*netbsd.*}}
// XFAIL: target={{.*haiku.*}}

// RUN: mkdir -p %t.d && cd %t.d

Expand Down
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
XFAIL: target={{.*haiku.*}}
RUN: rm -rf %t && split-file %s %t && cd %t
RUN: %clang_pgogen -fprofile-update=atomic -fPIC foo.c -c -Xclang -fprofile-instrument-path="default_foo_%m.profraw"
RUN: %clang_pgogen -fprofile-update=atomic -fPIC foo2.c -c -Xclang -fprofile-instrument-path="default_foo2_%m.profraw"
Expand Down
1 change: 1 addition & 0 deletions compiler-rt/test/profile/instrprof-error.c
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
// XFAIL: target={{.*haiku.*}}
// RUN: %clang_profgen -o %t -O3 %s
// RUN: env LLVM_PROFILE_FILE=%t/ %run %t 1 2>&1 | FileCheck %s

Expand Down
1 change: 1 addition & 0 deletions compiler-rt/test/profile/lit.cfg.py
Original file line number Diff line number Diff line change
Expand Up @@ -162,6 +162,7 @@ def exclude_unsupported_files_for_aix(dirname):
"NetBSD",
"SunOS",
"AIX",
"Haiku",
]:
config.unsupported = True

Expand Down
3 changes: 2 additions & 1 deletion flang/include/flang/Parser/parse-tree.h
Original file line number Diff line number Diff line change
Expand Up @@ -3447,7 +3447,8 @@ WRAPPER_CLASS(OmpObjectList, std::list<OmpObject>);
// MUTEXINOUTSET | DEPOBJ | // since 5.0
// INOUTSET // since 5.2
struct OmpTaskDependenceType {
ENUM_CLASS(Type, In, Out, Inout, Source, Sink, Depobj)
ENUM_CLASS(
Type, In, Out, Inout, Inoutset, Mutexinoutset, Source, Sink, Depobj)
WRAPPER_CLASS_BOILERPLATE(OmpTaskDependenceType, Type);
};

Expand Down
55 changes: 23 additions & 32 deletions flang/lib/Lower/DirectivesCommon.h
Original file line number Diff line number Diff line change
Expand Up @@ -179,7 +179,11 @@ static inline void genOmpAccAtomicWriteStatement(
fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();

mlir::Type varType = fir::unwrapRefType(lhsAddr.getType());
// Create a conversion outside the capture block.
auto insertionPoint = firOpBuilder.saveInsertionPoint();
firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp());
rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr);
firOpBuilder.restoreInsertionPoint(insertionPoint);

processOmpAtomicTODO<AtomicListT>(varType, loc);

Expand Down Expand Up @@ -410,10 +414,6 @@ void genOmpAccAtomicRead(Fortran::lower::AbstractConverter &converter,
fir::getBase(converter.genExprAddr(fromExpr, stmtCtx));
mlir::Value toAddress = fir::getBase(converter.genExprAddr(
*Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx));
fir::FirOpBuilder &builder = converter.getFirOpBuilder();
if (fromAddress.getType() != toAddress.getType())
fromAddress =
builder.create<fir::ConvertOp>(loc, toAddress.getType(), fromAddress);
genOmpAccAtomicCaptureStatement(converter, fromAddress, toAddress,
leftHandClauseList, rightHandClauseList,
elementType, loc);
Expand Down Expand Up @@ -497,23 +497,12 @@ void genOmpAccAtomicCapture(Fortran::lower::AbstractConverter &converter,
// a `atomic.read`, `atomic.write`, or `atomic.update` operation
// inside `atomic.capture`
Fortran::lower::StatementContext stmtCtx;
mlir::Value stmt1LHSArg, stmt1RHSArg, stmt2LHSArg, stmt2RHSArg;
mlir::Type elementType;
// LHS evaluations are common to all combinations of `atomic.capture`
stmt1LHSArg = fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx));
stmt2LHSArg = fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx));
mlir::Value stmt1LHSArg =
fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx));
mlir::Value stmt2LHSArg =
fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx));

// Operation specific RHS evaluations
if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) {
// Atomic capture construct is of the form [capture-stmt, update-stmt] or
// of the form [capture-stmt, write-stmt]
stmt1RHSArg = fir::getBase(converter.genExprAddr(assign1.rhs, stmtCtx));
stmt2RHSArg = fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx));
} else {
// Atomic capture construct is of the form [update-stmt, capture-stmt]
stmt1RHSArg = fir::getBase(converter.genExprValue(assign1.rhs, stmtCtx));
stmt2RHSArg = fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx));
}
// Type information used in generation of `atomic.update` operation
mlir::Type stmt1VarType =
fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType();
Expand Down Expand Up @@ -545,44 +534,46 @@ void genOmpAccAtomicCapture(Fortran::lower::AbstractConverter &converter,
// Atomic capture construct is of the form [capture-stmt, update-stmt]
const Fortran::semantics::SomeExpr &fromExpr =
*Fortran::semantics::GetExpr(stmt1Expr);
elementType = converter.genType(fromExpr);
mlir::Type elementType = converter.genType(fromExpr);
genOmpAccAtomicCaptureStatement<AtomicListT>(
converter, stmt1RHSArg, stmt1LHSArg,
converter, stmt2LHSArg, stmt1LHSArg,
/*leftHandClauseList=*/nullptr,
/*rightHandClauseList=*/nullptr, elementType, loc);
genOmpAccAtomicUpdateStatement<AtomicListT>(
converter, stmt1RHSArg, stmt2VarType, stmt2Var, stmt2Expr,
converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr,
/*leftHandClauseList=*/nullptr,
/*rightHandClauseList=*/nullptr, loc, atomicCaptureOp);
} else {
// Atomic capture construct is of the form [capture-stmt, write-stmt]
firOpBuilder.setInsertionPoint(atomicCaptureOp);
mlir::Value stmt2RHSArg =
fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx));
firOpBuilder.setInsertionPointToStart(&block);
const Fortran::semantics::SomeExpr &fromExpr =
*Fortran::semantics::GetExpr(stmt1Expr);
elementType = converter.genType(fromExpr);
mlir::Type elementType = converter.genType(fromExpr);
genOmpAccAtomicCaptureStatement<AtomicListT>(
converter, stmt1RHSArg, stmt1LHSArg,
converter, stmt2LHSArg, stmt1LHSArg,
/*leftHandClauseList=*/nullptr,
/*rightHandClauseList=*/nullptr, elementType, loc);
genOmpAccAtomicWriteStatement<AtomicListT>(
converter, stmt1RHSArg, stmt2RHSArg,
converter, stmt2LHSArg, stmt2RHSArg,
/*leftHandClauseList=*/nullptr,
/*rightHandClauseList=*/nullptr, loc);
}
} else {
// Atomic capture construct is of the form [update-stmt, capture-stmt]
firOpBuilder.setInsertionPointToEnd(&block);
const Fortran::semantics::SomeExpr &fromExpr =
*Fortran::semantics::GetExpr(stmt2Expr);
elementType = converter.genType(fromExpr);
genOmpAccAtomicCaptureStatement<AtomicListT>(
converter, stmt1LHSArg, stmt2LHSArg,
/*leftHandClauseList=*/nullptr,
/*rightHandClauseList=*/nullptr, elementType, loc);
firOpBuilder.setInsertionPointToStart(&block);
mlir::Type elementType = converter.genType(fromExpr);
genOmpAccAtomicUpdateStatement<AtomicListT>(
converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr,
/*leftHandClauseList=*/nullptr,
/*rightHandClauseList=*/nullptr, loc, atomicCaptureOp);
genOmpAccAtomicCaptureStatement<AtomicListT>(
converter, stmt1LHSArg, stmt2LHSArg,
/*leftHandClauseList=*/nullptr,
/*rightHandClauseList=*/nullptr, elementType, loc);
}
firOpBuilder.setInsertionPointToEnd(&block);
if constexpr (std::is_same<AtomicListT,
Expand Down
11 changes: 7 additions & 4 deletions flang/lib/Lower/OpenMP/ClauseProcessor.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -121,8 +121,11 @@ genProcBindKindAttr(fir::FirOpBuilder &firOpBuilder,
}

static mlir::omp::ClauseTaskDependAttr
genDependKindAttr(fir::FirOpBuilder &firOpBuilder,
genDependKindAttr(lower::AbstractConverter &converter,
const omp::clause::Depend::TaskDependenceType kind) {
fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
mlir::Location currentLocation = converter.getCurrentLocation();

mlir::omp::ClauseTaskDepend pbKind;
switch (kind) {
case omp::clause::Depend::TaskDependenceType::In:
Expand All @@ -136,6 +139,8 @@ genDependKindAttr(fir::FirOpBuilder &firOpBuilder,
break;
case omp::clause::Depend::TaskDependenceType::Mutexinoutset:
case omp::clause::Depend::TaskDependenceType::Inoutset:
TODO(currentLocation, "INOUTSET and MUTEXINOUTSET are not supported yet");
break;
case omp::clause::Depend::TaskDependenceType::Depobj:
case omp::clause::Depend::TaskDependenceType::Sink:
case omp::clause::Depend::TaskDependenceType::Source:
Expand Down Expand Up @@ -795,8 +800,6 @@ bool ClauseProcessor::processCopyprivate(
}

bool ClauseProcessor::processDepend(mlir::omp::DependClauseOps &result) const {
fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();

auto process = [&](const omp::clause::Depend &clause,
const parser::CharBlock &) {
using Depend = omp::clause::Depend;
Expand All @@ -813,7 +816,7 @@ bool ClauseProcessor::processDepend(mlir::omp::DependClauseOps &result) const {
"Support for iterator modifiers is not implemented yet");
}
mlir::omp::ClauseTaskDependAttr dependTypeOperand =
genDependKindAttr(firOpBuilder, kind);
genDependKindAttr(converter, kind);
result.dependKinds.append(objects.size(), dependTypeOperand);

for (const omp::Object &object : objects) {
Expand Down
6 changes: 4 additions & 2 deletions flang/lib/Lower/OpenMP/Clauses.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -347,8 +347,10 @@ makeDepType(const parser::OmpTaskDependenceType &inp) {
return clause::TaskDependenceType::In;
case parser::OmpTaskDependenceType::Type::Inout:
return clause::TaskDependenceType::Inout;
// Inoutset // missing-in-parser
// Mutexinoutset // missing-in-parser
case parser::OmpTaskDependenceType::Type::Inoutset:
return clause::TaskDependenceType::Inoutset;
case parser::OmpTaskDependenceType::Type::Mutexinoutset:
return clause::TaskDependenceType::Mutexinoutset;
case parser::OmpTaskDependenceType::Type::Out:
return clause::TaskDependenceType::Out;
case parser::OmpTaskDependenceType::Type::Sink:
Expand Down
68 changes: 54 additions & 14 deletions flang/lib/Optimizer/Analysis/AliasAnalysis.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -372,6 +372,29 @@ getAttrsFromVariable(fir::FortranVariableOpInterface var) {
return attrs;
}

template <typename OMPTypeOp, typename DeclTypeOp>
static Value getPrivateArg(omp::BlockArgOpenMPOpInterface &argIface,
OMPTypeOp &op, DeclTypeOp &declOp) {
Value privateArg;
if (!op.getPrivateSyms().has_value())
return privateArg;
for (auto [opSym, blockArg] :
llvm::zip_equal(*op.getPrivateSyms(), argIface.getPrivateBlockArgs())) {
if (blockArg == declOp.getMemref()) {
omp::PrivateClauseOp privateOp =
SymbolTable::lookupNearestSymbolFrom<omp::PrivateClauseOp>(
op, cast<SymbolRefAttr>(opSym));
privateOp.walk([&](omp::YieldOp yieldOp) {
llvm::TypeSwitch<Operation *>(yieldOp.getResults()[0].getDefiningOp())
.template Case<fir::DeclareOp, hlfir::DeclareOp>(
[&](auto declOp) { privateArg = declOp.getMemref(); });
});
return privateArg;
}
}
return privateArg;
}

AliasAnalysis::Source AliasAnalysis::getSource(mlir::Value v,
bool getInstantiationPoint) {
auto *defOp = v.getDefiningOp();
Expand Down Expand Up @@ -470,20 +493,37 @@ AliasAnalysis::Source AliasAnalysis::getSource(mlir::Value v,
breakFromLoop = true;
})
.Case<hlfir::DeclareOp, fir::DeclareOp>([&](auto op) {
// If declare operation is inside omp target region,
// continue alias analysis outside the target region
if (auto targetOp =
llvm::dyn_cast<omp::TargetOp>(op->getParentOp())) {
auto argIface = cast<omp::BlockArgOpenMPOpInterface>(*targetOp);
for (auto [opArg, blockArg] : llvm::zip_equal(
targetOp.getMapVars(), argIface.getMapBlockArgs())) {
if (blockArg == op.getMemref()) {
omp::MapInfoOp mapInfo =
llvm::cast<omp::MapInfoOp>(opArg.getDefiningOp());
v = mapInfo.getVarPtr();
defOp = v.getDefiningOp();
return;
}
if (omp::BlockArgOpenMPOpInterface argIface =
dyn_cast<omp::BlockArgOpenMPOpInterface>(op->getParentOp())) {
Value ompValArg;
llvm::TypeSwitch<Operation *>(op->getParentOp())
.template Case<omp::TargetOp>([&](auto targetOp) {
// If declare operation is inside omp target region,
// continue alias analysis outside the target region
for (auto [opArg, blockArg] : llvm::zip_equal(
targetOp.getMapVars(), argIface.getMapBlockArgs())) {
if (blockArg == op.getMemref()) {
omp::MapInfoOp mapInfo =
llvm::cast<omp::MapInfoOp>(opArg.getDefiningOp());
ompValArg = mapInfo.getVarPtr();
break;
}
}
// If given operation does not reflect mapping item,
// check private clause
if (!ompValArg)
ompValArg = getPrivateArg(argIface, targetOp, op);
})
.template Case<omp::DistributeOp, omp::ParallelOp,
omp::SectionsOp, omp::SimdOp, omp::SingleOp,
omp::TaskloopOp, omp::TaskOp, omp::WsloopOp>(
[&](auto privateOp) {
ompValArg = getPrivateArg(argIface, privateOp, op);
});
if (ompValArg) {
v = ompValArg;
defOp = ompValArg.getDefiningOp();
return;
}
}
auto varIf = llvm::cast<fir::FortranVariableOpInterface>(defOp);
Expand Down
4 changes: 2 additions & 2 deletions flang/lib/Optimizer/Passes/Pipelines.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -264,10 +264,10 @@ void createDefaultFIRCodeGenPassPipeline(mlir::PassManager &pm,
addNestedPassToAllTopLevelOperations(pm, fir::createAbstractResultOpt);
fir::addCodeGenRewritePass(
pm, (config.DebugInfo != llvm::codegenoptions::NoDebugInfo));
fir::addTargetRewritePass(pm);
fir::addCompilerGeneratedNamesConversionPass(pm);
fir::addExternalNameConversionPass(pm, config.Underscoring);
fir::createDebugPasses(pm, config.DebugInfo, config.OptLevel, inputFilename);
fir::addTargetRewritePass(pm);
fir::addCompilerGeneratedNamesConversionPass(pm);

if (config.VScaleMin != 0)
pm.addPass(fir::createVScaleAttr({{config.VScaleMin, config.VScaleMax}}));
Expand Down
4 changes: 3 additions & 1 deletion flang/lib/Parser/openmp-parsers.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -402,7 +402,9 @@ TYPE_PARSER(
TYPE_PARSER(construct<OmpTaskDependenceType>(
"DEPOBJ" >> pure(OmpTaskDependenceType::Type::Depobj) ||
"IN"_id >> pure(OmpTaskDependenceType::Type::In) ||
"INOUT" >> pure(OmpTaskDependenceType::Type::Inout) ||
"INOUT"_id >> pure(OmpTaskDependenceType::Type::Inout) ||
"INOUTSET"_id >> pure(OmpTaskDependenceType::Type::Inoutset) ||
"MUTEXINOUTSET" >> pure(OmpTaskDependenceType::Type::Mutexinoutset) ||
"OUT" >> pure(OmpTaskDependenceType::Type::Out) ||
"SINK" >> pure(OmpTaskDependenceType::Type::Sink) ||
"SOURCE" >> pure(OmpTaskDependenceType::Type::Source)))
Expand Down
56 changes: 42 additions & 14 deletions flang/lib/Semantics/check-omp-structure.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1732,6 +1732,45 @@ void OmpStructureChecker::CheckTargetUpdate() {
}
}

void OmpStructureChecker::CheckTaskDependenceType(
const parser::OmpTaskDependenceType::Type &x) {
// Common checks for task-dependence-type (DEPEND and UPDATE clauses).
unsigned version{context_.langOptions().OpenMPVersion};
unsigned since{0}, deprecatedIn{~0u};

switch (x) {
case parser::OmpTaskDependenceType::Type::In:
case parser::OmpTaskDependenceType::Type::Out:
case parser::OmpTaskDependenceType::Type::Inout:
break;
case parser::OmpTaskDependenceType::Type::Source:
case parser::OmpTaskDependenceType::Type::Sink:
deprecatedIn = 52;
break;
case parser::OmpTaskDependenceType::Type::Mutexinoutset:
case parser::OmpTaskDependenceType::Type::Depobj:
since = 50;
break;
case parser::OmpTaskDependenceType::Type::Inoutset:
since = 52;
break;
}

if (version >= deprecatedIn) {
context_.Say(GetContext().clauseSource,
"%s task-dependence-type is deprecated in %s"_warn_en_US,
parser::ToUpperCaseLetters(
parser::OmpTaskDependenceType::EnumToString(x)),
ThisVersion(deprecatedIn));
} else if (version < since) {
context_.Say(GetContext().clauseSource,
"%s task-dependence-type is not supported in %s, %s"_warn_en_US,
parser::ToUpperCaseLetters(
parser::OmpTaskDependenceType::EnumToString(x)),
ThisVersion(version), TryVersion(since));
}
}
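
The version gating in `CheckTaskDependenceType` can be reduced to a small standalone function. The sketch below mirrors the `since` / `deprecatedIn` logic shown above under simplifying assumptions: `DepType` and `Diagnose` are hypothetical names, and the function returns a short label where the real checker emits a warning through `context_.Say`.

```cpp
#include <cassert>
#include <string>

// Hypothetical mirror of parser::OmpTaskDependenceType::Type.
enum class DepType {
  In, Out, Inout, Inoutset, Mutexinoutset, Source, Sink, Depobj
};

// Versions are encoded as major * 10 + minor (50 = 5.0, 52 = 5.2).
std::string Diagnose(DepType type, unsigned version) {
  unsigned since = 0, deprecatedIn = ~0u;

  switch (type) {
  case DepType::In:
  case DepType::Out:
  case DepType::Inout:
    break;
  case DepType::Source:
  case DepType::Sink:
    deprecatedIn = 52;
    break;
  case DepType::Mutexinoutset:
  case DepType::Depobj:
    since = 50;
    break;
  case DepType::Inoutset:
    since = 52;
    break;
  }

  // Deprecation is checked first, as in the change above.
  if (version >= deprecatedIn)
    return "deprecated";
  if (version < since)
    return "not supported";
  return "ok";
}
```

Under this encoding, `Diagnose(DepType::Sink, 52)` yields `"deprecated"`, `Diagnose(DepType::Inoutset, 50)` yields `"not supported"`, and `Diagnose(DepType::Inoutset, 52)` yields `"ok"`.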

void OmpStructureChecker::Enter(
const parser::OpenMPSimpleStandaloneConstruct &x) {
const auto &dir{std::get<parser::OmpSimpleStandaloneDirective>(x.t)};
Expand Down Expand Up @@ -3393,20 +3432,7 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Depend &x) {
using DepType = parser::OmpTaskDependenceType::Type;
DepType depType = x.v.GetDepType();

if (version >= 52) {
switch (depType) {
case DepType::Sink:
case DepType::Source:
context_.Say(GetContext().clauseSource,
"The %s task-dependence-type is deprecated in %s"_warn_en_US,
parser::ToUpperCaseLetters(
parser::OmpTaskDependenceType::EnumToString(depType)),
ThisVersion(version));
break;
default:
break;
}
}
CheckTaskDependenceType(depType);

if (directive == llvm::omp::OMPD_depobj) {
// [5.0:255:11], [5.1:288:3]
Expand Down Expand Up @@ -3593,6 +3619,8 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Update &x) {
llvm::omp::Directive directive{GetContext().directive};
unsigned version{context_.langOptions().OpenMPVersion};

CheckTaskDependenceType(x.v.v.v);

// [5.1:288:4-5]
// An update clause on a depobj construct must not have source, sink or depobj
// as dependence-type.
Expand Down
1 change: 1 addition & 0 deletions flang/lib/Semantics/check-omp-structure.h
Original file line number Diff line number Diff line change
Expand Up @@ -202,6 +202,7 @@ class OmpStructureChecker
void CheckSIMDNest(const parser::OpenMPConstruct &x);
void CheckTargetNest(const parser::OpenMPConstruct &x);
void CheckTargetUpdate();
void CheckTaskDependenceType(const parser::OmpTaskDependenceType::Type &x);
void CheckCancellationNest(
const parser::CharBlock &source, const parser::OmpCancelType::Type &type);
std::int64_t GetOrdCollapseLevel(const parser::OpenMPLoopConstruct &x);
Expand Down