Skip to content

Commit

Permalink
IR: Define byref parameter attribute
Browse files Browse the repository at this point in the history
This allows tracking the in-memory type of a pointer argument to a
function for ABI purposes. This is essentially a stripped down version
of byval to remove some of the stack-copy implications in its
definition.

This includes the base IR changes, and some tests for places where it
should be treated similarly to byval. Codegen support will be in a
future patch.

My original attempt at solving some of these problems was to repurpose
byval with a different address space from the stack. However, it is
technically permitted for the callee to introduce a write to the
argument, although nothing does this in reality. There is also talk of
removing and replacing the byval attribute, so a new attribute would
need to take its place anyway.

This is intended avoid some optimization issues with the current
handling of aggregate arguments, as well as fixes inflexibilty in how
frontends can specify the kernel ABI. The most honest representation
of the amdgpu_kernel convention is to expose all kernel arguments as
loads from constant memory. Today, these are raw, SSA Argument values
and codegen is responsible for turning these into loads.

Background:

There currently isn't a satisfactory way to represent how arguments
for the amdgpu_kernel calling convention are passed. In reality,
arguments are passed in a single, flat, constant memory buffer
implicitly passed to the function. It is also illegal to call this
function in the IR, and this is only ever invoked by a driver of some
kind.

It does not make sense to have a stack passed parameter in this
context as is implied by byval. It is never valid to write to the
kernel arguments, as this would corrupt the inputs seen by other
dispatches of the kernel. These argumets are also not in the same
address space as the stack, so a copy is needed to an alloca. From a
source C-like language, the kernel parameters are invisible.
Semantically, a copy is always required from the constant argument
memory to a mutable variable.

The current clang calling convention lowering emits raw values,
including aggregates into the function argument list, since using
byval would not make sense. This has some unfortunate consequences for
the optimizer. In the aggregate case, we end up with an aggregate
store to alloca, which both SROA and instcombine turn into a store of
each aggregate field. The optimizer never pieces this back together to
see that this is really just a copy from constant memory, so we end up
stuck with expensive stack usage.

This also means the backend dictates the alignment of arguments, and
arbitrarily picks the LLVM IR ABI type alignment. By allowing an
explicit alignment, frontends can make better decisions. For example,
there's real no advantage to an aligment higher than 4, so a frontend
could choose to compact the argument layout. Similarly, there is a
high penalty to using an alignment lower than 4, so a frontend could
opt into more padding for small arguments.

Another design consideration is when it is appropriate to expose the
fact that these arguments are all really passed in adjacent
memory. Currently we have a late IR optimization pass in codegen to
rewrite the kernel argument values into explicit loads to enable
vectorization. In most programs, unrelated argument loads can be
merged together. However, exposing this property directly from the
frontend has some disadvantages. We still need a way to track the
original argument sizes and alignments to report to the driver. I find
using some side-channel, metadata mechanism to track this
unappealing. If the kernel arguments were exposed as a single buffer
to begin with, alias analysis would be unaware that the padding bits
betewen arguments are meaningless. Another family of problems is there
are still some gaps in replacing all of the available parameter
attributes with metadata equivalents once lowered to loads.

The immediate plan is to start using this new attribute to handle all
aggregate argumets for kernels. Long term, it makes sense to migrate
all kernel arguments, including scalars, to be passed indirectly in
the same manner.

Additional context is in D79744.
  • Loading branch information
arsenm committed Jul 20, 2020
1 parent 017e5c9 commit 5e999cb
Show file tree
Hide file tree
Showing 38 changed files with 516 additions and 29 deletions.
24 changes: 24 additions & 0 deletions llvm/docs/LangRef.rst
Expand Up @@ -1066,6 +1066,30 @@ Currently, only the following parameter attributes are defined:
site. If the alignment is not specified, then the code generator
makes a target-specific assumption.

.. _attr_byref:

``byref(<ty>)``

The ``byref`` argument attribute allows specifying the pointee
memory type of an argument. This is similar to ``byval``, but does
not imply a copy is made anywhere, or that the argument is passed
on the stack. This implies the pointer is dereferenceable up to
the storage size of the type.

It is not generally permissible to introduce a write to an
``byref`` pointer. The pointer may have any address space and may
be read only.

This is not a valid attribute for return values.

The alignment for an ``byref`` parameter can be explicitly
specified by combining it with the ``align`` attribute, similar to
``byval``. If the alignment is not specified, then the code generator
makes a target-specific assumption.

This is intended for representing ABI constraints, and is not
intended to be inferred for optimization use.

.. _attr_preallocated:

``preallocated(<ty>)``
Expand Down
6 changes: 6 additions & 0 deletions llvm/docs/ReleaseNotes.rst
Expand Up @@ -59,6 +59,9 @@ Changes to the LLVM IR

* ...

* Added the ``byref`` attribute to better represent argument passing
for the `amdgpu_kernel` calling convention.

Changes to building LLVM
------------------------

Expand Down Expand Up @@ -88,6 +91,9 @@ Changes to the AMDGPU Target

During this release ...

* The new ``byref`` attribute is now the preferred method for
representing aggregate kernel arguments.

Changes to the AVR Target
-----------------------------

Expand Down
1 change: 1 addition & 0 deletions llvm/include/llvm/Bitcode/LLVMBitCodes.h
Expand Up @@ -644,6 +644,7 @@ enum AttributeKindCodes {
ATTR_KIND_NO_MERGE = 66,
ATTR_KIND_NULL_POINTER_IS_VALID = 67,
ATTR_KIND_NOUNDEF = 68,
ATTR_KIND_BYREF = 69,
};

enum ComdatSelectionKindCodes {
Expand Down
15 changes: 15 additions & 0 deletions llvm/include/llvm/IR/Argument.h
Expand Up @@ -65,6 +65,9 @@ class Argument final : public Value {
/// Return true if this argument has the byval attribute.
bool hasByValAttr() const;

/// Return true if this argument has the byref attribute.
bool hasByRefAttr() const;

/// Return true if this argument has the swiftself attribute.
bool hasSwiftSelfAttr() const;

Expand All @@ -80,6 +83,15 @@ class Argument final : public Value {
/// in-memory ABI size copied to the stack for the call. Otherwise, return 0.
uint64_t getPassPointeeByValueCopySize(const DataLayout &DL) const;

/// Return true if this argument has the byval, inalloca, preallocated, or
/// byref attribute. These attributes represent arguments being passed by
/// value (which may or may not involve a stack copy)
bool hasPointeeInMemoryValueAttr() const;

/// If hasPointeeInMemoryValueAttr returns true, the in-memory ABI type is
/// returned. Otherwise, nullptr.
Type *getPointeeInMemoryValueType() const;

/// If this is a byval or inalloca argument, return its alignment.
/// FIXME: Remove this function once transition to Align is over.
/// Use getParamAlign() instead.
Expand All @@ -91,6 +103,9 @@ class Argument final : public Value {
/// If this is a byval argument, return its type.
Type *getParamByValType() const;

/// If this is a byref argument, return its type.
Type *getParamByRefType() const;

/// Return true if this argument has the nest attribute.
bool hasNestAttr() const;

Expand Down
12 changes: 12 additions & 0 deletions llvm/include/llvm/IR/Attributes.h
Expand Up @@ -108,6 +108,7 @@ class Attribute {
unsigned ElemSizeArg,
const Optional<unsigned> &NumElemsArg);
static Attribute getWithByValType(LLVMContext &Context, Type *Ty);
static Attribute getWithByRefType(LLVMContext &Context, Type *Ty);
static Attribute getWithPreallocatedType(LLVMContext &Context, Type *Ty);

static Attribute::AttrKind getAttrKindFromName(StringRef AttrName);
Expand Down Expand Up @@ -303,6 +304,7 @@ class AttributeSet {
uint64_t getDereferenceableBytes() const;
uint64_t getDereferenceableOrNullBytes() const;
Type *getByValType() const;
Type *getByRefType() const;
Type *getPreallocatedType() const;
std::pair<unsigned, Optional<unsigned>> getAllocSizeArgs() const;
std::string getAsString(bool InAttrGrp = false) const;
Expand Down Expand Up @@ -626,6 +628,9 @@ class AttributeList {
/// Return the byval type for the specified function parameter.
Type *getParamByValType(unsigned ArgNo) const;

/// Return the byref type for the specified function parameter.
Type *getParamByRefType(unsigned ArgNo) const;

/// Return the preallocated type for the specified function parameter.
Type *getParamPreallocatedType(unsigned ArgNo) const;

Expand Down Expand Up @@ -729,6 +734,7 @@ class AttrBuilder {
uint64_t DerefOrNullBytes = 0;
uint64_t AllocSizeArgs = 0;
Type *ByValType = nullptr;
Type *ByRefType = nullptr;
Type *PreallocatedType = nullptr;

public:
Expand Down Expand Up @@ -808,6 +814,9 @@ class AttrBuilder {
/// Retrieve the byval type.
Type *getByValType() const { return ByValType; }

/// Retrieve the byref type.
Type *getByRefType() const { return ByRefType; }

/// Retrieve the preallocated type.
Type *getPreallocatedType() const { return PreallocatedType; }

Expand Down Expand Up @@ -854,6 +863,9 @@ class AttrBuilder {
/// This turns a byval type into the form used internally in Attribute.
AttrBuilder &addByValAttr(Type *Ty);

/// This turns a byref type into the form used internally in Attribute.
AttrBuilder &addByRefAttr(Type *Ty);

/// This turns a preallocated type into the form used internally in Attribute.
AttrBuilder &addPreallocatedAttr(Type *Ty);

Expand Down
3 changes: 3 additions & 0 deletions llvm/include/llvm/IR/Attributes.td
Expand Up @@ -39,6 +39,9 @@ def Builtin : EnumAttr<"builtin">;
/// Pass structure by value.
def ByVal : TypeAttr<"byval">;

/// Mark in-memory ABI type.
def ByRef : TypeAttr<"byref">;

/// Parameter or return value may not contain uninitialized or poison bits.
def NoUndef : EnumAttr<"noundef">;

Expand Down
5 changes: 5 additions & 0 deletions llvm/include/llvm/IR/Function.h
Expand Up @@ -467,6 +467,11 @@ class Function : public GlobalObject, public ilist_node<Function> {
return Ty ? Ty : (arg_begin() + ArgNo)->getType()->getPointerElementType();
}

/// Extract the byref type for a parameter.
Type *getParamByRefType(unsigned ArgNo) const {
return AttributeSets.getParamByRefType(ArgNo);
}

/// Extract the number of dereferenceable bytes for a call or
/// parameter (0=unknown).
/// @param i AttributeList index, referring to a return value or argument.
Expand Down
7 changes: 4 additions & 3 deletions llvm/lib/Analysis/MemoryBuiltins.cpp
Expand Up @@ -676,13 +676,14 @@ SizeOffsetType ObjectSizeOffsetVisitor::visitAllocaInst(AllocaInst &I) {
}

SizeOffsetType ObjectSizeOffsetVisitor::visitArgument(Argument &A) {
Type *MemoryTy = A.getPointeeInMemoryValueType();
// No interprocedural analysis is done at the moment.
if (!A.hasPassPointeeByValueCopyAttr()) {
if (!MemoryTy) {
++ObjectVisitorArgument;
return unknown();
}
PointerType *PT = cast<PointerType>(A.getType());
APInt Size(IntTyBits, DL.getTypeAllocSize(PT->getElementType()));

APInt Size(IntTyBits, DL.getTypeAllocSize(MemoryTy));
return std::make_pair(align(Size, A.getParamAlignment()), Zero);
}

Expand Down
1 change: 1 addition & 0 deletions llvm/lib/AsmParser/LLLexer.cpp
Expand Up @@ -697,6 +697,7 @@ lltok::Kind LLLexer::LexIdentifier() {
KEYWORD(writeonly);
KEYWORD(zeroext);
KEYWORD(immarg);
KEYWORD(byref);

KEYWORD(type);
KEYWORD(opaque);
Expand Down
29 changes: 25 additions & 4 deletions llvm/lib/AsmParser/LLParser.cpp
Expand Up @@ -1382,6 +1382,7 @@ bool LLParser::ParseFnAttributeValuePairs(AttrBuilder &B,
case lltok::kw_swifterror:
case lltok::kw_swiftself:
case lltok::kw_immarg:
case lltok::kw_byref:
HaveError |=
Error(Lex.getLoc(),
"invalid use of parameter-only attribute on a function");
Expand Down Expand Up @@ -1675,6 +1676,13 @@ bool LLParser::ParseOptionalParamAttrs(AttrBuilder &B) {
B.addDereferenceableOrNullAttr(Bytes);
continue;
}
case lltok::kw_byref: {
Type *Ty;
if (ParseByRef(Ty))
return true;
B.addByRefAttr(Ty);
continue;
}
case lltok::kw_inalloca: B.addAttribute(Attribute::InAlloca); break;
case lltok::kw_inreg: B.addAttribute(Attribute::InReg); break;
case lltok::kw_nest: B.addAttribute(Attribute::Nest); break;
Expand Down Expand Up @@ -1795,6 +1803,7 @@ bool LLParser::ParseOptionalReturnAttrs(AttrBuilder &B) {
case lltok::kw_swifterror:
case lltok::kw_swiftself:
case lltok::kw_immarg:
case lltok::kw_byref:
HaveError |= Error(Lex.getLoc(), "invalid use of parameter-only attribute");
break;

Expand Down Expand Up @@ -2568,11 +2577,11 @@ bool LLParser::ParseByValWithOptionalType(Type *&Result) {
return false;
}

/// ParsePreallocated
/// ::= preallocated(<ty>)
bool LLParser::ParsePreallocated(Type *&Result) {
/// ParseRequiredTypeAttr
/// ::= attrname(<ty>)
bool LLParser::ParseRequiredTypeAttr(Type *&Result, lltok::Kind AttrName) {
Result = nullptr;
if (!EatIfPresent(lltok::kw_preallocated))
if (!EatIfPresent(AttrName))
return true;
if (!EatIfPresent(lltok::lparen))
return Error(Lex.getLoc(), "expected '('");
Expand All @@ -2583,6 +2592,18 @@ bool LLParser::ParsePreallocated(Type *&Result) {
return false;
}

/// ParsePreallocated
/// ::= preallocated(<ty>)
bool LLParser::ParsePreallocated(Type *&Result) {
return ParseRequiredTypeAttr(Result, lltok::kw_preallocated);
}

/// ParseByRef
/// ::= byref(<type>)
bool LLParser::ParseByRef(Type *&Result) {
return ParseRequiredTypeAttr(Result, lltok::kw_byref);
}

/// ParseOptionalOperandBundles
/// ::= /*empty*/
/// ::= '[' OperandBundle [, OperandBundle ]* ']'
Expand Down
3 changes: 3 additions & 0 deletions llvm/lib/AsmParser/LLParser.h
Expand Up @@ -333,7 +333,10 @@ namespace llvm {
std::vector<unsigned> &FwdRefAttrGrps,
bool inAttrGrp, LocTy &BuiltinLoc);
bool ParseByValWithOptionalType(Type *&Result);

bool ParseRequiredTypeAttr(Type *&Result, lltok::Kind AttrName);
bool ParsePreallocated(Type *&Result);
bool ParseByRef(Type *&Result);

// Module Summary Index Parsing.
bool SkipModuleSummaryEntry();
Expand Down
1 change: 1 addition & 0 deletions llvm/lib/AsmParser/LLToken.h
Expand Up @@ -240,6 +240,7 @@ enum Kind {
kw_writeonly,
kw_zeroext,
kw_immarg,
kw_byref,

kw_type,
kw_opaque,
Expand Down
4 changes: 4 additions & 0 deletions llvm/lib/Bitcode/Reader/BitcodeReader.cpp
Expand Up @@ -1532,6 +1532,8 @@ static Attribute::AttrKind getAttrFromCode(uint64_t Code) {
return Attribute::Preallocated;
case bitc::ATTR_KIND_NOUNDEF:
return Attribute::NoUndef;
case bitc::ATTR_KIND_BYREF:
return Attribute::ByRef;
}
}

Expand Down Expand Up @@ -1649,6 +1651,8 @@ Error BitcodeReader::parseAttributeGroupBlock() {
return Err;
if (Kind == Attribute::ByVal) {
B.addByValAttr(HasType ? getTypeByID(Record[++i]) : nullptr);
} else if (Kind == Attribute::ByRef) {
B.addByRefAttr(getTypeByID(Record[++i]));
} else if (Kind == Attribute::Preallocated) {
B.addPreallocatedAttr(getTypeByID(Record[++i]));
}
Expand Down
2 changes: 2 additions & 0 deletions llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
Expand Up @@ -733,6 +733,8 @@ static uint64_t getAttrKindEncoding(Attribute::AttrKind Kind) {
return bitc::ATTR_KIND_PREALLOCATED;
case Attribute::NoUndef:
return bitc::ATTR_KIND_NOUNDEF;
case Attribute::ByRef:
return bitc::ATTR_KIND_BYREF;
case Attribute::EndAttrKinds:
llvm_unreachable("Can not encode end-attribute kinds marker.");
case Attribute::None:
Expand Down
3 changes: 3 additions & 0 deletions llvm/lib/IR/AsmWriter.cpp
Expand Up @@ -4269,11 +4269,14 @@ void AssemblyWriter::writeAttribute(const Attribute &Attr, bool InAttrGroup) {
}

assert((Attr.hasAttribute(Attribute::ByVal) ||
Attr.hasAttribute(Attribute::ByRef) ||
Attr.hasAttribute(Attribute::Preallocated)) &&
"unexpected type attr");

if (Attr.hasAttribute(Attribute::ByVal)) {
Out << "byval";
} else if (Attr.hasAttribute(Attribute::ByRef)) {
Out << "byref";
} else {
Out << "preallocated";
}
Expand Down
1 change: 1 addition & 0 deletions llvm/lib/IR/AttributeImpl.h
Expand Up @@ -251,6 +251,7 @@ class AttributeSetNode final
std::pair<unsigned, Optional<unsigned>> getAllocSizeArgs() const;
std::string getAsString(bool InAttrGrp) const;
Type *getByValType() const;
Type *getByRefType() const;
Type *getPreallocatedType() const;

using iterator = const Attribute *;
Expand Down

0 comments on commit 5e999cb

Please sign in to comment.