Rebase: [Facebook] [MC] Introduce NeverAlign fragment type
Summary:
Introduce NeverAlign fragment type.

This fragment is intended to be inserted before a pair of
macro-op fusion eligible instructions. The NeverAlign fragment ensures that
the next fragment (the first instruction in the pair) does not end at a
given alignment boundary, emitting a minimum-size nop if necessary.

In effect, it ensures that a pair of macro-fusible instructions is not
split by a given alignment boundary, which is a precondition for
macro-op fusion on modern Intel Core processors (64B = cache line size; see
the Intel Architecture Optimization Reference Manual, 2.3.2.1 Legacy Decode
Pipeline: Macro-Fusion).
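
To make the boundary condition concrete, the sketch below enumerates which start offsets inside one alignment window would make the first instruction end exactly on the boundary. It is illustrative only: `startsEndingAtBoundary` is a hypothetical helper, not part of the patch, and the 3-byte instruction size in the usage note is an assumed example.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not in the patch): for an instruction of FirstSize
// bytes, list every start offset in [0, Alignment) at which the instruction
// would end exactly on a multiple of Alignment.
std::vector<uint64_t> startsEndingAtBoundary(uint64_t FirstSize,
                                             uint64_t Alignment) {
  std::vector<uint64_t> Starts;
  for (uint64_t Start = 0; Start < Alignment; ++Start)
    if ((Start + FirstSize) % Alignment == 0)
      Starts.push_back(Start);
  return Starts;
}
```

For a 3-byte first instruction and a 64-byte boundary, only a start offset of 61 qualifies, so a nop is needed for just one placement out of every 64.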

This patch introduces functionality used by BOLT when emitting code with
MacroFusion alignment already in place.

The use case is different from BoundaryAlign and instruction bundling:
- BoundaryAlign can be extended to perform the desired alignment for the
first instruction in the macro-op fusion pair (D101817). However, this
approach has higher overhead because BoundaryAlign relies on relaxation
in the general case - see
https://reviews.llvm.org/D97982#2710638.
- Instruction bundling: the intent of the NeverAlign fragment is to prevent
the first instruction in a pair from ending at a given alignment boundary by
inserting at most one minimum-size nop. It's OK if either instruction
crosses the cache line. Padding both instructions using bundles so that
they do not cross the alignment boundary would result in excessive padding.
There's no straightforward way to ask instruction bundling to avoid a given
end alignment for the first instruction in the bundle.

LLVM: https://reviews.llvm.org/D97982

Manual rebase conflict history:
https://phabricator.intern.facebook.com/D30142613

Test Plan: sandcastle

Reviewers: #llvm-bolt

Subscribers: phabricatorlinter

Differential Revision: https://phabricator.intern.facebook.com/D31361547
rafaelauler authored and spupyrev committed Jul 11, 2022
1 parent f921985 commit 6d05286
Showing 10 changed files with 363 additions and 37 deletions.
1 change: 1 addition & 0 deletions bolt/lib/Core/BinaryEmitter.cpp
@@ -453,6 +453,7 @@ void BinaryEmitter::emitFunctionBody(BinaryFunction &BF, bool EmitColdPart,
// This assumes the second instruction in the macro-op pair will get
// assigned to its own MCRelaxableFragment. Since all JCC instructions
// are relaxable, we should be safe.
Streamer.emitNeverAlignCodeAtEnd(/*Alignment to avoid=*/64, *BC.STI);
}

if (!EmitCodeOnly && opts::UpdateDebugSections && BF.getDWARFUnit()) {
22 changes: 22 additions & 0 deletions llvm/include/llvm/MC/MCFragment.h
@@ -33,6 +33,7 @@ class MCFragment : public ilist_node_with_parent<MCFragment, MCSection> {
public:
enum FragmentType : uint8_t {
FT_Align,
FT_NeverAlign,
FT_Data,
FT_CompactEncodedInst,
FT_Fill,
@@ -340,6 +341,27 @@ class MCAlignFragment : public MCFragment {
}
};

class MCNeverAlignFragment : public MCFragment {
/// The alignment the end of the next fragment should avoid.
unsigned Alignment;

/// When emitting Nops some subtargets have specific nop encodings.
const MCSubtargetInfo &STI;

public:
MCNeverAlignFragment(unsigned Alignment, const MCSubtargetInfo &STI,
MCSection *Sec = nullptr)
: MCFragment(FT_NeverAlign, false, Sec), Alignment(Alignment), STI(STI) {}

unsigned getAlignment() const { return Alignment; }

const MCSubtargetInfo &getSubtargetInfo() const { return STI; }

static bool classof(const MCFragment *F) {
return F->getKind() == MCFragment::FT_NeverAlign;
}
};

class MCFillFragment : public MCFragment {
uint8_t ValueSize;
/// Value to use for filling bytes.
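
The `classof` hook is what lets `cast<>` and `dyn_cast<>` recognise the new fragment kind. The sketch below shows the pattern in isolation. All names in the `sketch` namespace are simplified stand-ins written for illustration, not the LLVM classes, and `dynCast` is only a minimal imitation of `llvm::dyn_cast`.

```cpp
#include <cstdint>

namespace sketch {

// Simplified stand-in for MCFragment: the kind tag drives all dispatch.
class Fragment {
public:
  enum FragmentType : uint8_t { FT_Align, FT_NeverAlign, FT_Data };
  explicit Fragment(FragmentType K) : Kind(K) {}
  FragmentType getKind() const { return Kind; }

private:
  FragmentType Kind;
};

// Simplified stand-in for MCNeverAlignFragment (subtarget info omitted).
class NeverAlignFragment : public Fragment {
  unsigned Alignment; // alignment the end of the next fragment should avoid

public:
  explicit NeverAlignFragment(unsigned A)
      : Fragment(FT_NeverAlign), Alignment(A) {}
  unsigned getAlignment() const { return Alignment; }
  static bool classof(const Fragment *F) {
    return F->getKind() == Fragment::FT_NeverAlign;
  }
};

// Minimal imitation of dyn_cast: nullptr when the kind tag does not match.
template <typename To> const To *dynCast(const Fragment *F) {
  return To::classof(F) ? static_cast<const To *>(F) : nullptr;
}

} // namespace sketch
```

With this in place, `sketch::dynCast<sketch::NeverAlignFragment>(F)` yields a typed pointer only for fragments tagged `FT_NeverAlign`, which is the same check the switch statements in the rest of the diff rely on.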
2 changes: 2 additions & 0 deletions llvm/include/llvm/MC/MCObjectStreamer.h
@@ -157,6 +157,8 @@ class MCObjectStreamer : public MCStreamer {
unsigned MaxBytesToEmit = 0) override;
void emitCodeAlignment(unsigned ByteAlignment, const MCSubtargetInfo *STI,
unsigned MaxBytesToEmit = 0) override;
void emitNeverAlignCodeAtEnd(unsigned ByteAlignment,
const MCSubtargetInfo &STI) override;
void emitValueToOffset(const MCExpr *Offset, unsigned char Value,
SMLoc Loc) override;
void emitDwarfLocDirective(unsigned FileNo, unsigned Line, unsigned Column,
6 changes: 6 additions & 0 deletions llvm/include/llvm/MC/MCStreamer.h
@@ -872,6 +872,12 @@ class MCStreamer {
const MCSubtargetInfo *STI,
unsigned MaxBytesToEmit = 0);

/// If the end of the fragment following this NeverAlign fragment ever gets
/// aligned to \p ByteAlignment, this fragment emits a single nop before the
/// following fragment to break this end-alignment.
virtual void emitNeverAlignCodeAtEnd(unsigned ByteAlignment,
const MCSubtargetInfo &STI);

/// Emit some number of copies of \p Value until the byte offset \p
/// Offset is reached.
///
118 changes: 81 additions & 37 deletions llvm/lib/MC/MCAssembler.cpp
@@ -290,6 +290,43 @@ bool MCAssembler::evaluateFixup(const MCAsmLayout &Layout,
return IsResolved;
}

/// Check if the branch crosses the boundary.
///
/// \param StartAddr start address of the fused/unfused branch.
/// \param Size size of the fused/unfused branch.
/// \param BoundaryAlignment alignment requirement of the branch.
/// \returns true if the branch crosses the boundary.
static bool mayCrossBoundary(uint64_t StartAddr, uint64_t Size,
Align BoundaryAlignment) {
uint64_t EndAddr = StartAddr + Size;
return (StartAddr >> Log2(BoundaryAlignment)) !=
((EndAddr - 1) >> Log2(BoundaryAlignment));
}

/// Check if the branch is against the boundary.
///
/// \param StartAddr start address of the fused/unfused branch.
/// \param Size size of the fused/unfused branch.
/// \param BoundaryAlignment alignment requirement of the branch.
/// \returns true if the branch is against the boundary.
static bool isAgainstBoundary(uint64_t StartAddr, uint64_t Size,
Align BoundaryAlignment) {
uint64_t EndAddr = StartAddr + Size;
return (EndAddr & (BoundaryAlignment.value() - 1)) == 0;
}

/// Check if the branch needs padding.
///
/// \param StartAddr start address of the fused/unfused branch.
/// \param Size size of the fused/unfused branch.
/// \param BoundaryAlignment alignment requirement of the branch.
/// \returns true if the branch needs padding.
static bool needPadding(uint64_t StartAddr, uint64_t Size,
Align BoundaryAlignment) {
return mayCrossBoundary(StartAddr, Size, BoundaryAlignment) ||
isAgainstBoundary(StartAddr, Size, BoundaryAlignment);
}

uint64_t MCAssembler::computeFragmentSize(const MCAsmLayout &Layout,
const MCFragment &F) const {
assert(getBackendPtr() && "Requires assembler backend");
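
The three predicates moved above `computeFragmentSize` can be tried out on plain integers. The sketch below restates them without `llvm::Align` (division stands in for the `Log2` shift) and assumes the boundary is a power of two. The `sketch` namespace and `isPowerOfTwo` are additions for illustration only.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

bool isPowerOfTwo(uint64_t V) { return V != 0 && (V & (V - 1)) == 0; }

// True if [Start, Start + Size) spans two different Boundary-sized windows.
bool mayCrossBoundary(uint64_t Start, uint64_t Size, uint64_t Boundary) {
  assert(isPowerOfTwo(Boundary) && Size > 0);
  uint64_t End = Start + Size;
  return (Start / Boundary) != ((End - 1) / Boundary);
}

// True if the range ends exactly on a multiple of Boundary.
bool isAgainstBoundary(uint64_t Start, uint64_t Size, uint64_t Boundary) {
  assert(isPowerOfTwo(Boundary));
  return ((Start + Size) & (Boundary - 1)) == 0;
}

// True if either condition holds.
bool needPadding(uint64_t Start, uint64_t Size, uint64_t Boundary) {
  return mayCrossBoundary(Start, Size, Boundary) ||
         isAgainstBoundary(Start, Size, Boundary);
}

} // namespace sketch
```

With a 64-byte boundary, a 4-byte range starting at 62 crosses the boundary but does not end on it, while the same range starting at 60 ends on it without crossing. The `FT_NeverAlign` case in this diff uses only `isAgainstBoundary`, which is why an instruction that merely crosses the cache line gets no padding.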
@@ -350,6 +387,41 @@ uint64_t MCAssembler::computeFragmentSize(const MCAsmLayout &Layout,
return Size;
}

case MCFragment::FT_NeverAlign: {
// Disclaimer: NeverAlign fragment size depends on the size of its immediate
// successor, but NeverAlign need not be a MCRelaxableFragment.
// NeverAlign fragment size is recomputed if the successor is relaxed:
// - If RelaxableFragment is relaxed, it gets invalidated by marking its
// predecessor as LastValidFragment.
// - This forces the assembler to call MCAsmLayout::layoutFragment on that
// relaxable fragment, which in turn will always ask the predecessor to
// compute its size (see "computeFragmentSize(prev)" in layoutFragment).
//
// In short, the simplest way to ensure that computeFragmentSize() is sane
// is to establish the following rule: it should never examine fragments
// after the current fragment in the section. If we logically need to
// examine any fragment after the current fragment, we need to do that using
// relaxation, inside MCAssembler::layoutSectionOnce.
const MCNeverAlignFragment &NAF = cast<MCNeverAlignFragment>(F);
const MCFragment *NF = F.getNextNode();
uint64_t Offset = Layout.getFragmentOffset(&NAF);
size_t NextFragSize = 0;
if (const auto *NextFrag = dyn_cast<MCRelaxableFragment>(NF)) {
NextFragSize = NextFrag->getContents().size();
} else if (const auto *NextFrag = dyn_cast<MCDataFragment>(NF)) {
NextFragSize = NextFrag->getContents().size();
} else {
llvm_unreachable("Didn't find the expected fragment after NeverAlign");
}
// Check if the next fragment ends at the alignment we want to avoid.
if (isAgainstBoundary(Offset, NextFragSize, Align(NAF.getAlignment()))) {
// Avoid this alignment by introducing minimum nop.
assert(getBackend().getMinimumNopSize() != NAF.getAlignment());
return getBackend().getMinimumNopSize();
}
return 0;
}

case MCFragment::FT_Org: {
const MCOrgFragment &OF = cast<MCOrgFragment>(F);
MCValue Value;
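
The dependence of the NeverAlign size on its successor, and the need to recompute it after relaxation, can be modelled with a toy layout pass. This is a simplified sketch under stated assumptions, not the MCAssembler implementation: the `sketch` types are hypothetical, layout always restarts from offset 0 instead of from the last valid fragment, and the assertion on the nop size is slightly stricter than the one in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {

enum class Kind { Data, Relaxable, NeverAlign };

struct Fragment {
  Kind K;
  uint64_t ContentSize; // encoded bytes; unused for NeverAlign
  uint64_t Alignment;   // boundary to avoid; used only by NeverAlign
};

// Size of fragment I when placed at Offset. For NeverAlign, peek at the
// immediate successor's current size and return one minimum-size nop if
// the successor would otherwise end on the boundary.
uint64_t fragmentSize(const std::vector<Fragment> &Frags, size_t I,
                      uint64_t Offset, uint64_t MinNopSize) {
  const Fragment &F = Frags[I];
  if (F.K != Kind::NeverAlign)
    return F.ContentSize;
  assert(I + 1 < Frags.size() && "NeverAlign needs a successor");
  assert(MinNopSize % F.Alignment != 0 && "nop must break the end alignment");
  uint64_t End = Offset + Frags[I + 1].ContentSize;
  return (End % F.Alignment == 0) ? MinNopSize : 0;
}

// Lay out every fragment from offset 0 and return the per-fragment sizes.
std::vector<uint64_t> layout(const std::vector<Fragment> &Frags,
                             uint64_t MinNopSize) {
  std::vector<uint64_t> Sizes;
  uint64_t Offset = 0;
  for (size_t I = 0; I < Frags.size(); ++I) {
    uint64_t Size = fragmentSize(Frags, I, Offset, MinNopSize);
    Sizes.push_back(Size);
    Offset += Size;
  }
  return Sizes;
}

} // namespace sketch
```

For 62 bytes of data followed by a NeverAlign(64) fragment and a 2-byte relaxable fragment, the NeverAlign fragment takes one nop byte. If the relaxable fragment grows to 6 bytes and the layout is recomputed, the nop is no longer needed and the size drops back to 0.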
@@ -574,6 +646,15 @@ static void writeFragment(raw_ostream &OS, const MCAssembler &Asm,
break;
}

case MCFragment::FT_NeverAlign: {
const MCNeverAlignFragment &NAF = cast<MCNeverAlignFragment>(F);
if (!Asm.getBackend().writeNopData(OS, FragmentSize,
&NAF.getSubtargetInfo()))
report_fatal_error("unable to write nop sequence of " +
Twine(FragmentSize) + " bytes");
break;
}

case MCFragment::FT_Data:
++stats::EmittedDataFragments;
OS << cast<MCDataFragment>(F).getContents();
@@ -1027,43 +1108,6 @@ bool MCAssembler::relaxLEB(MCAsmLayout &Layout, MCLEBFragment &LF) {
return OldSize != LF.getContents().size();
}

/// Check if the branch crosses the boundary.
///
/// \param StartAddr start address of the fused/unfused branch.
/// \param Size size of the fused/unfused branch.
/// \param BoundaryAlignment alignment requirement of the branch.
/// \returns true if the branch crosses the boundary.
static bool mayCrossBoundary(uint64_t StartAddr, uint64_t Size,
Align BoundaryAlignment) {
uint64_t EndAddr = StartAddr + Size;
return (StartAddr >> Log2(BoundaryAlignment)) !=
((EndAddr - 1) >> Log2(BoundaryAlignment));
}

/// Check if the branch is against the boundary.
///
/// \param StartAddr start address of the fused/unfused branch.
/// \param Size size of the fused/unfused branch.
/// \param BoundaryAlignment alignment requirement of the branch.
/// \returns true if the branch is against the boundary.
static bool isAgainstBoundary(uint64_t StartAddr, uint64_t Size,
Align BoundaryAlignment) {
uint64_t EndAddr = StartAddr + Size;
return (EndAddr & (BoundaryAlignment.value() - 1)) == 0;
}

/// Check if the branch needs padding.
///
/// \param StartAddr start address of the fused/unfused branch.
/// \param Size size of the fused/unfused branch.
/// \param BoundaryAlignment alignment requirement of the branch.
/// \returns true if the branch needs padding.
static bool needPadding(uint64_t StartAddr, uint64_t Size,
Align BoundaryAlignment) {
return mayCrossBoundary(StartAddr, Size, BoundaryAlignment) ||
isAgainstBoundary(StartAddr, Size, BoundaryAlignment);
}

bool MCAssembler::relaxBoundaryAlign(MCAsmLayout &Layout,
MCBoundaryAlignFragment &BF) {
// BoundaryAlignFragment that doesn't need to align any fragment should not be
12 changes: 12 additions & 0 deletions llvm/lib/MC/MCFragment.cpp
@@ -274,6 +274,9 @@ void MCFragment::destroy() {
case FT_Align:
delete cast<MCAlignFragment>(this);
return;
case FT_NeverAlign:
delete cast<MCNeverAlignFragment>(this);
return;
case FT_Data:
delete cast<MCDataFragment>(this);
return;
@@ -342,6 +345,9 @@ LLVM_DUMP_METHOD void MCFragment::dump() const {
OS << "<";
switch (getKind()) {
case MCFragment::FT_Align: OS << "MCAlignFragment"; break;
case MCFragment::FT_NeverAlign:
OS << "MCNeverAlignFragment";
break;
case MCFragment::FT_Data: OS << "MCDataFragment"; break;
case MCFragment::FT_CompactEncodedInst:
OS << "MCCompactEncodedInstFragment"; break;
@@ -381,6 +387,12 @@ LLVM_DUMP_METHOD void MCFragment::dump() const {
<< " MaxBytesToEmit:" << AF->getMaxBytesToEmit() << ">";
break;
}
case MCFragment::FT_NeverAlign: {
const MCNeverAlignFragment *NAF = cast<MCNeverAlignFragment>(this);
OS << "\n ";
OS << " Alignment:" << NAF->getAlignment() << ">";
break;
}
case MCFragment::FT_Data: {
const auto *DF = cast<MCDataFragment>(this);
OS << "\n ";
5 changes: 5 additions & 0 deletions llvm/lib/MC/MCObjectStreamer.cpp
@@ -663,6 +663,11 @@ void MCObjectStreamer::emitCodeAlignment(unsigned ByteAlignment,
cast<MCAlignFragment>(getCurrentFragment())->setEmitNops(true, STI);
}

void MCObjectStreamer::emitNeverAlignCodeAtEnd(unsigned ByteAlignment,
const MCSubtargetInfo &STI) {
insert(new MCNeverAlignFragment(ByteAlignment, STI));
}

void MCObjectStreamer::emitValueToOffset(const MCExpr *Offset,
unsigned char Value,
SMLoc Loc) {
2 changes: 2 additions & 0 deletions llvm/lib/MC/MCStreamer.cpp
@@ -1215,6 +1215,8 @@ void MCStreamer::emitValueToAlignment(unsigned ByteAlignment, int64_t Value,
void MCStreamer::emitCodeAlignment(unsigned ByteAlignment,
const MCSubtargetInfo *STI,
unsigned MaxBytesToEmit) {}
void MCStreamer::emitNeverAlignCodeAtEnd(unsigned ByteAlignment,
const MCSubtargetInfo &STI) {}
void MCStreamer::emitValueToOffset(const MCExpr *Offset, unsigned char Value,
SMLoc Loc) {}
void MCStreamer::emitBundleAlignMode(unsigned AlignPow2) {}
24 changes: 24 additions & 0 deletions llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
@@ -1145,6 +1145,7 @@ class X86AsmParser : public MCTargetAsmParser {
bool parseDirectiveArch();
bool parseDirectiveNops(SMLoc L);
bool parseDirectiveEven(SMLoc L);
bool parseDirectiveAvoidEndAlign(SMLoc L);
bool ParseDirectiveCode(StringRef IDVal, SMLoc L);

/// CodeView FPO data directives.
@@ -4633,6 +4634,8 @@ bool X86AsmParser::ParseDirective(AsmToken DirectiveID) {
return false;
} else if (IDVal == ".nops")
return parseDirectiveNops(DirectiveID.getLoc());
else if (IDVal == ".avoid_end_align")
return parseDirectiveAvoidEndAlign(DirectiveID.getLoc());
else if (IDVal == ".even")
return parseDirectiveEven(DirectiveID.getLoc());
else if (IDVal == ".cv_fpo_proc")
@@ -4727,6 +4730,27 @@ bool X86AsmParser::parseDirectiveEven(SMLoc L) {
return false;
}

/// Directive for NeverAlign fragment testing, not for general usage!
/// parseDirectiveAvoidEndAlign
/// ::= .avoid_end_align alignment
bool X86AsmParser::parseDirectiveAvoidEndAlign(SMLoc L) {
int64_t Alignment = 0;
SMLoc AlignmentLoc;
AlignmentLoc = getTok().getLoc();
if (getParser().checkForValidSection() ||
getParser().parseAbsoluteExpression(Alignment))
return true;

if (getParser().parseEOL("unexpected token in directive"))
return true;

if (Alignment <= 0)
return Error(AlignmentLoc, "expected a positive alignment");

getParser().getStreamer().emitNeverAlignCodeAtEnd(Alignment, getSTI());
return false;
}

/// ParseDirectiveCode
/// ::= .code16 | .code32 | .code64
bool X86AsmParser::ParseDirectiveCode(StringRef IDVal, SMLoc L) {
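
The `.avoid_end_align` operand check in this patch only rejects non-positive values, while the value later reaches `Align(NAF.getAlignment())`, which expects a power of two. The sketch below shows what a stricter operand check might look like. It is a hypothetical standalone function for illustration: `checkAvoidEndAlignOperand` and the second diagnostic string are assumptions, not part of the patch.

```cpp
#include <cstdint>

// Hypothetical stricter operand check (not in the patch). Returns nullptr
// when the operand is acceptable, otherwise a diagnostic message.
const char *checkAvoidEndAlignOperand(int64_t Alignment) {
  // This is the condition the patch checks.
  if (Alignment <= 0)
    return "expected a positive alignment";
  // Assumed extra check: reject values that are not a power of two.
  if ((Alignment & (Alignment - 1)) != 0)
    return "expected a power-of-two alignment";
  return nullptr;
}
```

Under this sketch, `.avoid_end_align 64` would be accepted, while operands such as `0`, `-8`, or `48` would be diagnosed at parse time.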
