[Clang] Allow "ext_vector_type" applied to Booleans
This is the `ext_vector_type` alternative to D81083.

This patch extends Clang to allow 'bool' as a valid vector element type
(attribute ext_vector_type) in C/C++.

This is intended as the canonical type for SIMD masks and facilitates
clean vector intrinsic declarations.  Vectors of i1 are supported at the IR
level and all the way down to many SIMD ISAs, such as AVX512, ARM SVE (fixed
vector length) and the VE target (NEC SX-Aurora TSUBASA).

The RFC on cfe-dev: https://lists.llvm.org/pipermail/cfe-dev/2020-May/065434.html

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D88905
simoll committed Mar 16, 2022
1 parent 9e24f0f commit 0aab344
Showing 22 changed files with 485 additions and 72 deletions.
65 changes: 65 additions & 0 deletions clang/docs/LanguageExtensions.rst
@@ -445,6 +445,67 @@ NEON vector types are created using ``neon_vector_type`` and
return v;
}

GCC vector types are created using the ``vector_size(N)`` attribute. The
argument ``N`` specifies the number of bytes that will be allocated for an
object of this type. The size has to be multiple of the size of the vector
element type. For example:

.. code-block:: c++

// OK: This declares a vector type with four 'int' elements
typedef int int4 __attribute__((vector_size(4 * sizeof(int))));

// ERROR: '11' is not a multiple of sizeof(int)
typedef int int_impossible __attribute__((vector_size(11)));

int4 foo(int4 a) {
int4 v;
v = a;
return v;
}


Boolean Vectors
---------------

Clang also supports the ext_vector_type attribute with boolean element types in
C and C++. For example:

.. code-block:: c++

// legal for Clang, error for GCC:
typedef bool bool4 __attribute__((ext_vector_type(4)));
// Objects of bool4 type hold 8 bits, sizeof(bool4) == 1

bool4 foo(bool4 a) {
bool4 v;
v = a;
return v;
}

Boolean vectors are a Clang extension of the ext vector type. Boolean vectors
are intended, though not guaranteed, to map to vector mask registers. The size
parameter of a boolean vector type is the number of bits in the vector. The
boolean vector is dense and each bit in the boolean vector is one vector
element.

The semantics of boolean vectors borrows from C bit-fields with the following
differences:

* Distinct boolean vectors are always distinct memory objects (there is no
packing).
* Only the operators `?:`, `!`, `~`, `|`, `&`, `^` and comparison are allowed on
boolean vectors.
* Casting a scalar bool value to a boolean vector type means broadcasting the
scalar value onto all lanes (same as general ext_vector_type).
* It is not possible to access or swizzle elements of a boolean vector
(different than general ext_vector_type).

The size and alignment are both the number of bits rounded up to the next power
of two, but the alignment is at most the maximum vector alignment of the
target.
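
The size and alignment rule above can be sketched as plain arithmetic. This is a minimal standalone illustration, not Clang code: `BoolVecLayout`, `roundUpToPowerOfTwo` and `boolVectorLayout` are hypothetical names, and `MaxVectorAlignBits` stands in for the target's maximum vector alignment, which differs per target. It also assumes the one-byte minimum implied by `sizeof(bool4) == 1`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical result type for this sketch (not a Clang type).
struct BoolVecLayout {
  uint64_t SizeInBits;
  uint64_t AlignInBits;
};

// Hypothetical helper: smallest power of two that is >= Bits.
uint64_t roundUpToPowerOfTwo(uint64_t Bits) {
  uint64_t P = 1;
  while (P < Bits)
    P <<= 1;
  return P;
}

// Sketch of the documented rule: the bit count is padded to at least one
// byte and rounded up to a power of two. Only the alignment is capped by
// MaxVectorAlignBits (an assumed, target-dependent value).
BoolVecLayout boolVectorLayout(uint64_t NumElements,
                               uint64_t MaxVectorAlignBits) {
  assert(NumElements > 0 && "a vector needs at least one element");
  uint64_t Padded = roundUpToPowerOfTwo(std::max<uint64_t>(8, NumElements));
  return {Padded, std::min(Padded, MaxVectorAlignBits)};
}
```

Under these assumptions a 4-element boolean vector occupies 8 bits and a 12-element one occupies 16 bits.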


Vector Literals
---------------

@@ -496,6 +557,7 @@ C-style cast yes yes yes no
reinterpret_cast yes no yes no
static_cast yes no yes no
const_cast no no no no
address &v[i] no no no [#]_ no
============================== ======= ======= ============= =======

See also :ref:`langext-__builtin_shufflevector`, :ref:`langext-__builtin_convertvector`.
@@ -505,6 +567,9 @@ See also :ref:`langext-__builtin_shufflevector`, :ref:`langext-__builtin_convert
it's only available in C++ and uses normal bool conversions (that is, != 0).
If it's an extension (OpenCL) vector, it's only available in C and OpenCL C.
And it selects base on signedness of the condition operands (OpenCL v1.1 s6.3.9).
.. [#] Clang does not allow the address of an element to be taken while GCC
allows this. This is intentional for vectors with a boolean element type and
not implemented otherwise.
Vector Builtins
---------------
7 changes: 7 additions & 0 deletions clang/include/clang/AST/Type.h
@@ -2048,6 +2048,7 @@ class alignas(8) Type : public ExtQualsTypeCommonBase {
bool isComplexIntegerType() const; // GCC _Complex integer type.
bool isVectorType() const; // GCC vector type.
bool isExtVectorType() const; // Extended vector type.
bool isExtVectorBoolType() const; // Extended vector type with bool element.
bool isMatrixType() const; // Matrix type.
bool isConstantMatrixType() const; // Constant matrix type.
bool isDependentAddressSpaceType() const; // value-dependent address space qualifier
@@ -6809,6 +6810,12 @@ inline bool Type::isExtVectorType() const {
return isa<ExtVectorType>(CanonicalType);
}

inline bool Type::isExtVectorBoolType() const {
if (!isExtVectorType())
return false;
return cast<ExtVectorType>(CanonicalType)->getElementType()->isBooleanType();
}

inline bool Type::isMatrixType() const {
return isa<MatrixType>(CanonicalType);
}
3 changes: 2 additions & 1 deletion clang/include/clang/Sema/Sema.h
@@ -11932,7 +11932,8 @@ class Sema final {
/// type checking for vector binary operators.
QualType CheckVectorOperands(ExprResult &LHS, ExprResult &RHS,
SourceLocation Loc, bool IsCompAssign,
bool AllowBothBool, bool AllowBoolConversion);
bool AllowBothBool, bool AllowBoolConversion,
bool AllowBoolOperation, bool ReportInvalid);
QualType GetSignedVectorType(QualType V);
QualType CheckVectorCompareOperands(ExprResult &LHS, ExprResult &RHS,
SourceLocation Loc,
7 changes: 5 additions & 2 deletions clang/lib/AST/ASTContext.cpp
@@ -1982,8 +1982,11 @@ TypeInfo ASTContext::getTypeInfoImpl(const Type *T) const {
case Type::Vector: {
const auto *VT = cast<VectorType>(T);
TypeInfo EltInfo = getTypeInfo(VT->getElementType());
Width = EltInfo.Width * VT->getNumElements();
Align = Width;
Width = VT->isExtVectorBoolType() ? VT->getNumElements()
: EltInfo.Width * VT->getNumElements();
// Enforce at least byte alignment.
Align = std::max<unsigned>(8, Width);

// If the alignment is not a power of 2, round up to the next power of 2.
// This happens for non-power-of-2 length vectors.
if (Align & (Align-1)) {
17 changes: 17 additions & 0 deletions clang/lib/CodeGen/CGDebugInfo.cpp
@@ -3036,6 +3036,23 @@ llvm::DIType *CGDebugInfo::CreateTypeDefinition(const ObjCInterfaceType *Ty,

llvm::DIType *CGDebugInfo::CreateType(const VectorType *Ty,
llvm::DIFile *Unit) {
if (Ty->isExtVectorBoolType()) {
// Boolean ext_vector_type(N) are special because their real element type
// (bits of bit size) is not their Clang element type (_Bool of size byte).
// For now, we pretend the boolean vector were actually a vector of bytes
// (where each byte represents 8 bits of the actual vector).
// FIXME Debug info should actually represent this proper as a vector mask
// type.
auto &Ctx = CGM.getContext();
uint64_t Size = CGM.getContext().getTypeSize(Ty);
uint64_t NumVectorBytes = Size / Ctx.getCharWidth();

// Construct the vector of 'char' type.
QualType CharVecTy = Ctx.getVectorType(Ctx.CharTy, NumVectorBytes,
VectorType::GenericVector);
return CreateType(CharVecTy->getAs<VectorType>(), Unit);
}

llvm::DIType *ElementTy = getOrCreateType(Ty->getElementType(), Unit);
int64_t Count = Ty->getNumElements();

97 changes: 71 additions & 26 deletions clang/lib/CodeGen/CGExpr.cpp
@@ -1707,27 +1707,42 @@ llvm::Value *CodeGenFunction::EmitLoadOfScalar(Address Addr, bool Volatile,
LValueBaseInfo BaseInfo,
TBAAAccessInfo TBAAInfo,
bool isNontemporal) {
if (!CGM.getCodeGenOpts().PreserveVec3Type) {
// For better performance, handle vector loads differently.
if (Ty->isVectorType()) {
const llvm::Type *EltTy = Addr.getElementType();

const auto *VTy = cast<llvm::FixedVectorType>(EltTy);

// Handle vectors of size 3 like size 4 for better performance.
if (VTy->getNumElements() == 3) {

// Bitcast to vec4 type.
auto *vec4Ty = llvm::FixedVectorType::get(VTy->getElementType(), 4);
Address Cast = Builder.CreateElementBitCast(Addr, vec4Ty, "castToVec4");
// Now load value.
llvm::Value *V = Builder.CreateLoad(Cast, Volatile, "loadVec4");

// Shuffle vector to get vec3.
V = Builder.CreateShuffleVector(V, ArrayRef<int>{0, 1, 2},
"extractVec");
return EmitFromMemory(V, Ty);
}
if (const auto *ClangVecTy = Ty->getAs<VectorType>()) {
// Boolean vectors use `iN` as storage type.
if (ClangVecTy->isExtVectorBoolType()) {
llvm::Type *ValTy = ConvertType(Ty);
unsigned ValNumElems =
cast<llvm::FixedVectorType>(ValTy)->getNumElements();
// Load the `iP` storage object (P is the padded vector size).
auto *RawIntV = Builder.CreateLoad(Addr, Volatile, "load_bits");
const auto *RawIntTy = RawIntV->getType();
assert(RawIntTy->isIntegerTy() && "compressed iN storage for bitvectors");
// Bitcast iP --> <P x i1>.
auto *PaddedVecTy = llvm::FixedVectorType::get(
Builder.getInt1Ty(), RawIntTy->getPrimitiveSizeInBits());
llvm::Value *V = Builder.CreateBitCast(RawIntV, PaddedVecTy);
// Shuffle <P x i1> --> <N x i1> (N is the actual bit size).
V = emitBoolVecConversion(V, ValNumElems, "extractvec");

return EmitFromMemory(V, Ty);
}

// Handle vectors of size 3 like size 4 for better performance.
const llvm::Type *EltTy = Addr.getElementType();
const auto *VTy = cast<llvm::FixedVectorType>(EltTy);

if (!CGM.getCodeGenOpts().PreserveVec3Type && VTy->getNumElements() == 3) {

// Bitcast to vec4 type.
llvm::VectorType *vec4Ty =
llvm::FixedVectorType::get(VTy->getElementType(), 4);
Address Cast = Builder.CreateElementBitCast(Addr, vec4Ty, "castToVec4");
// Now load value.
llvm::Value *V = Builder.CreateLoad(Cast, Volatile, "loadVec4");

// Shuffle vector to get vec3.
V = Builder.CreateShuffleVector(V, ArrayRef<int>{0, 1, 2}, "extractVec");
return EmitFromMemory(V, Ty);
}
}

@@ -1778,6 +1793,17 @@ llvm::Value *CodeGenFunction::EmitFromMemory(llvm::Value *Value, QualType Ty) {
"wrong value rep of bool");
return Builder.CreateTrunc(Value, Builder.getInt1Ty(), "tobool");
}
if (Ty->isExtVectorBoolType()) {
const auto *RawIntTy = Value->getType();
// Bitcast iP --> <P x i1>.
auto *PaddedVecTy = llvm::FixedVectorType::get(
Builder.getInt1Ty(), RawIntTy->getPrimitiveSizeInBits());
auto *V = Builder.CreateBitCast(Value, PaddedVecTy);
// Shuffle <P x i1> --> <N x i1> (N is the actual bit size).
llvm::Type *ValTy = ConvertType(Ty);
unsigned ValNumElems = cast<llvm::FixedVectorType>(ValTy)->getNumElements();
return emitBoolVecConversion(V, ValNumElems, "extractvec");
}

return Value;
}
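
The load and store paths above move between a padded integer `iP` in memory and an `<N x i1>` value. The mapping between bits and lanes can be sketched in portable C++ as follows. This is an illustration under stated assumptions, not the patch's code: `packBoolLanes` and `unpackBoolLanes` are hypothetical names, lane `i` is assumed to correspond to bit `i` (the little-endian bitcast layout), and padding bits are written as zero here, whereas the patch leaves them undefined through `-1` shuffle indices.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Store direction: <N x i1> --> iP. Lane I becomes bit I of the storage
// integer (assumed little-endian lane order); unused high bits stay zero.
uint64_t packBoolLanes(const std::vector<bool> &Lanes) {
  assert(Lanes.size() <= 64 && "sketch only models up to 64 lanes");
  uint64_t Bits = 0;
  for (std::size_t I = 0; I < Lanes.size(); ++I)
    if (Lanes[I])
      Bits |= uint64_t{1} << I;
  return Bits;
}

// Load direction: iP --> <N x i1>. Only the low NumLanes bits are read,
// which mirrors dropping the padding lanes after the bitcast.
std::vector<bool> unpackBoolLanes(uint64_t Bits, unsigned NumLanes) {
  assert(NumLanes <= 64 && "sketch only models up to 64 lanes");
  std::vector<bool> Lanes(NumLanes);
  for (unsigned I = 0; I < NumLanes; ++I)
    Lanes[I] = ((Bits >> I) & 1) != 0;
  return Lanes;
}
```

A round trip through these two helpers returns the original lanes, which is the property the padded storage format has to preserve.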
@@ -1822,11 +1848,19 @@ void CodeGenFunction::EmitStoreOfScalar(llvm::Value *Value, Address Addr,
LValueBaseInfo BaseInfo,
TBAAAccessInfo TBAAInfo,
bool isInit, bool isNontemporal) {
if (!CGM.getCodeGenOpts().PreserveVec3Type) {
// Handle vectors differently to get better performance.
if (Ty->isVectorType()) {
llvm::Type *SrcTy = Value->getType();
auto *VecTy = dyn_cast<llvm::VectorType>(SrcTy);
llvm::Type *SrcTy = Value->getType();
if (const auto *ClangVecTy = Ty->getAs<VectorType>()) {
auto *VecTy = dyn_cast<llvm::FixedVectorType>(SrcTy);
if (VecTy && ClangVecTy->isExtVectorBoolType()) {
auto *MemIntTy =
cast<llvm::IntegerType>(Addr.getType()->getPointerElementType());
// Expand to the memory bit width.
unsigned MemNumElems = MemIntTy->getPrimitiveSizeInBits();
// <N x i1> --> <P x i1>.
Value = emitBoolVecConversion(Value, MemNumElems, "insertvec");
// <P x i1> --> iP.
Value = Builder.CreateBitCast(Value, MemIntTy);
} else if (!CGM.getCodeGenOpts().PreserveVec3Type) {
// Handle vec3 special.
if (VecTy && cast<llvm::FixedVectorType>(VecTy)->getNumElements() == 3) {
// Our source is a vec3, do a shuffle vector to make it a vec4.
@@ -2063,8 +2097,19 @@ void CodeGenFunction::EmitStoreThroughLValue(RValue Src, LValue Dst,
// Read/modify/write the vector, inserting the new element.
llvm::Value *Vec = Builder.CreateLoad(Dst.getVectorAddress(),
Dst.isVolatileQualified());
auto *IRStoreTy = dyn_cast<llvm::IntegerType>(Vec->getType());
if (IRStoreTy) {
auto *IRVecTy = llvm::FixedVectorType::get(
Builder.getInt1Ty(), IRStoreTy->getPrimitiveSizeInBits());
Vec = Builder.CreateBitCast(Vec, IRVecTy);
// iN --> <N x i1>.
}
Vec = Builder.CreateInsertElement(Vec, Src.getScalarVal(),
Dst.getVectorIdx(), "vecins");
if (IRStoreTy) {
// <N x i1> --> <iN>.
Vec = Builder.CreateBitCast(Vec, IRStoreTy);
}
Builder.CreateStore(Vec, Dst.getVectorAddress(),
Dst.isVolatileQualified());
return;
5 changes: 4 additions & 1 deletion clang/lib/CodeGen/CGExprScalar.cpp
@@ -2147,7 +2147,6 @@ Value *ScalarExprEmitter::VisitCastExpr(CastExpr *CE) {
DestLV.setTBAAInfo(TBAAAccessInfo::getMayAliasInfo());
return EmitLoadOfLValue(DestLV, CE->getExprLoc());
}

return Builder.CreateBitCast(Src, DstTy);
}
case CK_AddressSpaceConversion: {
@@ -4818,6 +4817,10 @@ Value *ScalarExprEmitter::VisitAsTypeExpr(AsTypeExpr *E) {
? cast<llvm::FixedVectorType>(DstTy)->getNumElements()
: 0;

// Use bit vector expansion for ext_vector_type boolean vectors.
if (E->getType()->isExtVectorBoolType())
return CGF.emitBoolVecConversion(Src, NumElementsDst, "astype");

// Going from vec3 to non-vec3 is a special case and requires a shuffle
// vector to get a vec4, then a bitcast if the target type is different.
if (NumElementsSrc == 3 && NumElementsDst != 3) {
16 changes: 16 additions & 0 deletions clang/lib/CodeGen/CodeGenFunction.cpp
@@ -2765,3 +2765,19 @@ CodeGenFunction::emitCondLikelihoodViaExpectIntrinsic(llvm::Value *Cond,
}
llvm_unreachable("Unknown Likelihood");
}

llvm::Value *CodeGenFunction::emitBoolVecConversion(llvm::Value *SrcVec,
unsigned NumElementsDst,
const llvm::Twine &Name) {
auto *SrcTy = cast<llvm::FixedVectorType>(SrcVec->getType());
unsigned NumElementsSrc = SrcTy->getNumElements();
if (NumElementsSrc == NumElementsDst)
return SrcVec;

std::vector<int> ShuffleMask(NumElementsDst, -1);
for (unsigned MaskIdx = 0;
MaskIdx < std::min<>(NumElementsDst, NumElementsSrc); ++MaskIdx)
ShuffleMask[MaskIdx] = MaskIdx;

return Builder.CreateShuffleVector(SrcVec, ShuffleMask, Name);
}
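
The mask that `emitBoolVecConversion` builds can be reproduced without LLVM. The sketch below is an illustration only: `boolVecShuffleMask` is a hypothetical name, and `-1` marks a destination lane that takes no source element, as in the function above. Unlike the original, it returns an identity mask when both sizes match, where the original skips the shuffle entirely.

```cpp
#include <algorithm>
#include <vector>

// Build the shuffle mask for resizing a boolean vector from NumElementsSrc
// lanes to NumElementsDst lanes. The leading lanes are copied one to one;
// any extra destination lanes are left at -1 (no defined source element).
std::vector<int> boolVecShuffleMask(unsigned NumElementsSrc,
                                    unsigned NumElementsDst) {
  std::vector<int> Mask(NumElementsDst, -1);
  unsigned Common = std::min(NumElementsSrc, NumElementsDst);
  for (unsigned I = 0; I < Common; ++I)
    Mask[I] = static_cast<int>(I);
  return Mask;
}
```

Widening 4 lanes to 8 yields four copied lanes followed by four `-1` entries, and narrowing 8 lanes to 4 keeps only the first four indices.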
5 changes: 5 additions & 0 deletions clang/lib/CodeGen/CodeGenFunction.h
@@ -4649,6 +4649,11 @@ class CodeGenFunction : public CodeGenTypeCache {
/// Set the codegen fast-math flags.
void SetFastMathFlags(FPOptions FPFeatures);

// Truncate or extend a boolean vector to the requested number of elements.
llvm::Value *emitBoolVecConversion(llvm::Value *SrcVec,
unsigned NumElementsDst,
const llvm::Twine &Name = "");

private:
llvm::MDNode *getRangeForLoadFromType(QualType Ty);
void EmitReturnOfRValue(RValue RV, QualType Ty);
17 changes: 14 additions & 3 deletions clang/lib/CodeGen/CodeGenTypes.cpp
@@ -98,6 +98,14 @@ llvm::Type *CodeGenTypes::ConvertTypeForMem(QualType T, bool ForBitField) {

llvm::Type *R = ConvertType(T);

// Check for the boolean vector case.
if (T->isExtVectorBoolType()) {
auto *FixedVT = cast<llvm::FixedVectorType>(R);
// Pad to at least one byte.
uint64_t BytePadded = std::max<uint64_t>(FixedVT->getNumElements(), 8);
return llvm::IntegerType::get(FixedVT->getContext(), BytePadded);
}

// If this is a bool type, or a bit-precise integer type in a bitfield
// representation, map this integer to the target-specified size.
if ((ForBitField && T->isBitIntType()) ||
@@ -701,9 +709,12 @@ llvm::Type *CodeGenTypes::ConvertType(QualType T) {
}
case Type::ExtVector:
case Type::Vector: {
const VectorType *VT = cast<VectorType>(Ty);
ResultType = llvm::FixedVectorType::get(ConvertType(VT->getElementType()),
VT->getNumElements());
const auto *VT = cast<VectorType>(Ty);
// An ext_vector_type of Bool is really a vector of bits.
llvm::Type *IRElemTy = VT->isExtVectorBoolType()
? llvm::Type::getInt1Ty(getLLVMContext())
: ConvertType(VT->getElementType());
ResultType = llvm::FixedVectorType::get(IRElemTy, VT->getNumElements());
break;
}
case Type::ConstantMatrix: {
