Skip to content

Commit

Permalink
[BOLT] Fix shrink-wrapping bugs
Browse files Browse the repository at this point in the history
Summary:
Make shrink-wrapping more stable. Changes:

* Correctly detect landing pads at the dominance frontier, bailing
  on such cases because we are not prepared to split LPs that are target
  of a critical edge.
* Disable FOP's store removal by default - this is experimental and
  shouldn t go to prod because removing a store that we failed to detect
  it's actually necessary is disastrous. This pass currently doesn't
  have a great impact on the number of stores reduced, so it is not a
  problem. Most stores reduced are due shrink wrapping anyway.
* Fix stack access identification - correctly estimate memory length of
  weird instructions, bail if we don't know.
* Make rules for shrink-wrapping more strict: cancel shrink wrapping on
  a number of cases when we are not 100% sure that we are dealing with a
  regular callee-saved register.
* Add basic block folding to SW. Sometimes when splitting critical edges
  we create a lot of redundant BBs with the same instructions, same
  successor but different predecessor. Fold all identical BBs created by
  splitting critical edges.
* Change defaults: now the threshold used to determine when to perform
  SW is more conservative, to be sure we are moving a spill to a colder
  area. This effort, along with BB folding, helps us to avoid hurting
  icache performance by indiscriminately increasing code size.

(cherry picked from FBD5315086)
  • Loading branch information
rafaelauler authored and maksfb committed Jun 22, 2017
1 parent ec30439 commit 4ecd385
Show file tree
Hide file tree
Showing 7 changed files with 218 additions and 52 deletions.
3 changes: 2 additions & 1 deletion bolt/BinaryContext.h
Expand Up @@ -278,7 +278,8 @@ class BinaryContext {
uint64_t getHotThreshold() const {
static uint64_t Threshold{0};
if (Threshold == 0) {
Threshold = NumProfiledFuncs ? SumExecutionCount / NumProfiledFuncs : 1;
Threshold =
NumProfiledFuncs ? SumExecutionCount / (2 * NumProfiledFuncs) : 1;
}
return Threshold;
}
Expand Down
13 changes: 8 additions & 5 deletions bolt/Passes/FrameAnalysis.cpp
Expand Up @@ -106,14 +106,17 @@ class FrameAccessAnalysis {
MCPhysReg Reg{0};
int64_t StackOffset{0};
bool IsIndexed{false};
if (!BC.MIA->isStackAccess(
Inst, FIE.IsLoad, FIE.IsStore, FIE.IsStoreFromReg, Reg, SrcImm,
FIE.StackPtrReg, StackOffset, FIE.Size, FIE.IsSimple, IsIndexed)) {
if (!BC.MIA->isStackAccess(*BC.MRI, Inst, FIE.IsLoad, FIE.IsStore,
FIE.IsStoreFromReg, Reg, SrcImm, FIE.StackPtrReg,
StackOffset, FIE.Size, FIE.IsSimple,
IsIndexed)) {
return true;
}

if (IsIndexed) {
DEBUG(dbgs() << "Giving up on indexed memory access in the frame\n");
if (IsIndexed || FIE.Size == 0) {
DEBUG(dbgs() << "Giving up on indexed memory access/unknown size\n");
DEBUG(dbgs() << "Blame insn: ");
DEBUG(Inst.dump());
return false;
}

Expand Down
11 changes: 10 additions & 1 deletion bolt/Passes/FrameOptimizer.cpp
Expand Up @@ -40,6 +40,15 @@ FrameOptimization("frame-opt",
cl::ZeroOrMore,
cl::cat(BoltOptCategory));

cl::opt<bool>
RemoveStores("frame-opt-rm-stores",
cl::init(FOP_NONE),
cl::desc("apply additional analysis to remove stores (experimental)"),
cl::init(false),
cl::ZeroOrMore,
cl::cat(BoltOptCategory));


} // namespace opts

namespace llvm {
Expand Down Expand Up @@ -243,7 +252,7 @@ void FrameOptimizerPass::runOnFunctions(BinaryContext &BC,
NamedRegionTimer T1("remove loads", "FOP breakdown", opts::TimeOpts);
removeUnnecessaryLoads(RA, FA, BC, I.second);
}
{
if (opts::RemoveStores) {
NamedRegionTimer T1("remove stores", "FOP breakdown", opts::TimeOpts);
removeUnusedStores(FA, BC, I.second);
}
Expand Down

0 comments on commit 4ecd385

Please sign in to comment.