Commits on Jun 11, 2012
  1. @isanbard

    Re-enable the CMN instruction.

    isanbard committed
    We turned off the CMN instruction because it had semantics which we weren't
    getting correct. If we are comparing with an immediate, then it's okay to use
    the CMN instruction.
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
  2. @d0k

    InstCombine: factor code better.

    d0k committed
    No functionality change.
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
Commits on Jun 10, 2012
  1. @d0k

    InstCombine: Turn (zext A) == (B & (1<<X)-1) into A == (trunc B), nar…

    d0k committed
    …rowing the compare.
    This saves a cast, and zext is more expensive on platforms with subreg support
    than trunc is. This occurs in the BSD implementation of memchr(3), see PR12750.
    On the synthetic benchmark from that bug stupid_memchr and bsd_memchr have the
    same performance now when not inlining either function.
    stupid_memchr: 323.0us
    bsd_memchr: 321.0us
    memchr: 479.0us
    where memchr is the llvm-gcc compiled bsd_memchr from OS X Lion's libc. When
    inlining is enabled, bsd_memchr still regresses to the llvm-gcc memchr time.
    I haven't fully understood the issue yet, but something is grossly mangling
    the loop after inlining.
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
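
    The narrowing described above can be checked with a small sketch. It assumes
    an 8-bit A, a 32-bit B and X = 8; the function names are hypothetical and
    this is plain C++ illustrating the identity, not the InstCombine code itself.

```cpp
#include <cassert>
#include <cstdint>

// Wide form: (zext A) == (B & ((1 << 8) - 1)), compared at 32 bits.
bool wideCompare(std::uint8_t a, std::uint32_t b) {
    return static_cast<std::uint32_t>(a) == (b & ((1u << 8) - 1u));
}

// Narrow form: A == (trunc B), compared at 8 bits.
bool narrowCompare(std::uint8_t a, std::uint32_t b) {
    return a == static_cast<std::uint8_t>(b);
}

// Compares both forms for every 8-bit A against a handful of B values.
bool formsAgree() {
    const std::uint32_t samples[] = {0u,     1u,          0xFFu,      0x100u,
                                     0x1FFu, 0xDEADBEEFu, 0xFFFFFFFFu};
    for (unsigned a = 0; a < 256; ++a)
        for (std::uint32_t b : samples)
            if (wideCompare(static_cast<std::uint8_t>(a), b) !=
                narrowCompare(static_cast<std::uint8_t>(a), b))
                return false;
    return true;
}
```

    Both forms look only at the low 8 bits of B, which is why the mask and the
    zero extension can be dropped together.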
  2. @hfinkel

    Enable ILP scheduling for all nodes by default on PPC.

    hfinkel committed
    Over the entire test-suite, this has an insignificantly negative average
    performance impact, but reduces some of the worst slowdowns from the
    anti-dep. change (r158294).
    Largest speedups:
    SingleSource/Benchmarks/Stanford/Quicksort - 28%
    SingleSource/Benchmarks/Stanford/Towers - 24%
    SingleSource/Benchmarks/Shootout-C++/matrix - 23%
    MultiSource/Benchmarks/SciMark2-C/scimark2 - 19%
    MultiSource/Benchmarks/MiBench/automotive-bitcount/automotive-bitcount - 15%
    (matrix and automotive-bitcount were both in the top-5 slowdown list from the
    anti-dep. change)
    Largest slowdowns:
    MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 28%
    MultiSource/Benchmarks/mediabench/gsm/toast/toast - 26%
    MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan - 21%
    SingleSource/Benchmarks/CoyoteBench/lpbench - 20%
    MultiSource/Applications/d/make_dparser - 16%
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
  3. @nadavrot

    Add AutoUpgrade support for the SSE4 ptest intrinsics.

    nadavrot committed
    Patch by Michael Kuperstein.
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
  4. @hfinkel

    Use critical anti-dep. breaking on all PPC targets, but also add othe…

    hfinkel committed
    …r register classes.
    Using 'all' instead of 'critical' would be better because it would make it easier to
    satisfy the bundling constraints, but, as noted in the FIXME, that is currently not
    possible with the crs.
    This yields an average 1% speedup over the entire test suite (on Power 7). Largest speedups:
    SingleSource/Benchmarks/Shootout-C++/moments - 40%
    MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 28%
    SingleSource/Benchmarks/BenchmarkGame/nsieve-bits - 26%
    SingleSource/Benchmarks/McGill/misr - 23%
    MultiSource/Applications/JM/ldecod/ldecod - 22%
    Largest slowdowns:
    SingleSource/Benchmarks/Shootout-C++/matrix - -29%
    SingleSource/Benchmarks/Shootout-C++/ary3 - -22%
    MultiSource/Benchmarks/BitBench/uuencode/uuencode - -18%
    SingleSource/Benchmarks/Shootout-C++/ary - -17%
    MultiSource/Benchmarks/MiBench/automotive-bitcount/automotive-bitcount - -15%
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
  5. Add intrinsics for immediate form of XOP vprot instructions. Use i128…

    Craig Topper committed
    …mem instead of f128mem for integer XOP instructions.
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
Commits on Jun 9, 2012
  1. @hfinkel

    Improve ext/trunc patterns on PPC64.

    hfinkel committed
    The PPC64 backend had patterns for i32 <-> i64 extensions and truncations that
    would leave self-moves in the final assembly. Replacing those patterns with ones
    based on the SUBREG builtins yields better-looking code.
    Thanks to Jakob and Owen for their suggestions in this matter.
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
  2. Use XOP vpcom intrinsics in patterns instead of a target specific SDN…

    Craig Topper committed
    …ode type. Remove the custom lowering code that selected the SDNode type.
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
  3. Replace XOP vpcom intrinsics with fewer intrinsics that take the imme…

    Craig Topper committed
    …diate as an argument.
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
  4. @d0k

    Hashing: Remove outdated comment. Support for reserved hash values wa…

    d0k committed
    …s removed in r151865.
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
  5. @AaronBallman

    Disabling a spurious deprecation warning about using PathV1 from with…

    AaronBallman committed
    …in the PathV1 implementation file.
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
  6. @AaronBallman

    Fixing a typo in the comments.

    AaronBallman committed
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
  7. @d0k

    Allocate the contents of DwarfDebug's StringMaps in a single big Bump…

    d0k committed
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
  8. @CunningBaldrick

    Silence a gcc-4.6 warning: GCC fails to understand that secondReg and…

    CunningBaldrick committed
    … cmpOp2 are
    correlated, and thinks that cmpOp2 may be used uninitialized.
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
  9. @hfinkel

    Enable tail merging on PPC.

    hfinkel committed
    Tail merging had been disabled on PPC because it would disturb bundling decisions
    made during pre-RA scheduling on the 970 cores. Now, however, all bundling decisions
    are made during post-RA scheduling, and tail merging is generally beneficial (the
    average test-suite speedup is insignificantly positive).
    Largest test-suite speedups:
    MultiSource/Benchmarks/mediabench/gsm/toast/toast - 30%
    MultiSource/Benchmarks/BitBench/uuencode/uuencode - 23%
    SingleSource/Benchmarks/Shootout-C++/ary - 21%
    SingleSource/Benchmarks/Stanford/Queens - 17%
    Largest slowdowns:
    MultiSource/Benchmarks/MiBench/security-sha/security-sha - 24%
    MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 22%
    MultiSource/Applications/JM/ldecod/ldecod - 14%
    MultiSource/Benchmarks/mediabench/g721/g721encode/encode - 9%
    This is improved by using full (instead of just critical) anti-dependency breaking,
    but doing so still causes miscompiles and so cannot yet be enabled by default.
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
  10. @atrick

    Register pressure: added getPressureAfterInstr.

    atrick committed
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
  11. @stoklund

    Sketch a LiveRegMatrix analysis pass.

    stoklund committed
    The LiveRegMatrix represents the live range of assigned virtual
    registers in a Live interval union per register unit. This is not
    fundamentally different from the interference tracking in RegAllocBase
    that both RABasic and RAGreedy use.
    The important differences are:
    - LiveRegMatrix tracks interference per register unit instead of per
      physical register. This makes interference checks cheaper and
      assignments slightly more expensive. For example, the ARM D7 register
      has 24 aliases, so we would check 24 physregs before assigning to one.
      With unit-based interference, we check 2 units before assigning to 2
    - LiveRegMatrix caches regmask interference checks. That is currently
      duplicated functionality in RABasic and RAGreedy.
    - LiveRegMatrix is a pass which makes it possible to insert
      target-dependent passes between register allocation and rewriting.
      Such passes could tweak the register assignments with interference
      checking support from LiveRegMatrix.
    Eventually, RABasic and RAGreedy will be switched to LiveRegMatrix.
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
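
    A toy model of per-unit interference tracking, as described above, might
    look like the following. All types and names here are hypothetical and much
    simpler than LLVM's; live ranges are single half-open intervals and a
    register is just the list of units it covers.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Hypothetical toy types; these are not LLVM's classes.
using Unit = int;                   // register unit id
using Range = std::pair<int, int>;  // half-open live range [start, end)

struct ToyRegMatrix {
    // One list of occupied ranges per register unit.
    std::map<Unit, std::vector<Range>> occupied;

    static bool overlaps(const Range &a, const Range &b) {
        return a.first < b.second && b.first < a.second;
    }

    // Checks only the units of the candidate register, not every alias.
    bool interferes(const std::vector<Unit> &units, const Range &r) const {
        for (Unit u : units) {
            auto it = occupied.find(u);
            if (it == occupied.end()) continue;
            for (const Range &o : it->second)
                if (overlaps(o, r)) return true;
        }
        return false;
    }

    // Assignment records the range in every unit of the register.
    void assign(const std::vector<Unit> &units, const Range &r) {
        assert(!interferes(units, r) && "check interference before assigning");
        for (Unit u : units) occupied[u].push_back(r);
    }
};
```

    With a two-unit register such as D7, both the check and the assignment
    touch two entries, instead of walking every aliasing physical register.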
  12. Test commit

    Jack Carter committed
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
  13. @stoklund

    Also compute MBB live-in lists in the new rewriter pass.

    stoklund committed
    This deduplicates some code from the optimizing register allocators, and
    it means that it is now possible to change the register allocators'
    solutions simply by editing the VirtRegMap between the register
    allocator pass and the rewriter.
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
  14. @gribozavr

    Convert comments to proper Doxygen comments.

    gribozavr committed
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
Commits on Jun 8, 2012
  1. @atrick

    Removing strange "using" declarations from TargetInstrInfo.

    atrick committed
    I can't imagine why these were added. Trial and error.
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
  2. @stoklund

    Reintroduce VirtRegRewriter.

    stoklund committed
    OK, not really. We don't want to reintroduce the old rewriter hacks.
    This patch extracts virtual register rewriting as a separate pass that
    runs after the register allocator. This is possible now that
    CodeGen/Passes.cpp can configure the full optimizing register allocator
    The rewriter pass uses register assignments in VirtRegMap to rewrite
    virtual registers to physical registers, and it inserts kill flags based
    on live intervals.
    These finalization steps are the same for the optimizing register
    allocators: RABasic, RAGreedy, and PBQP.
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
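
    The rewriting step can be pictured with a minimal sketch. It assumes
    registers are plain integers, that ids at or above a made-up threshold are
    virtual, and that the assignments live in a simple map; none of these names
    come from LLVM, and kill-flag insertion is left out.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical convention: ids >= kFirstVirtual are virtual registers.
constexpr int kFirstVirtual = 1000;

struct ToyInstr {
    std::vector<int> operands;  // register operands only
};

// Replaces each virtual register operand with its assigned physical register.
// Returns false if a virtual register has no assignment.
bool rewrite(std::vector<ToyInstr> &code,
             const std::unordered_map<int, int> &virtToPhys) {
    for (ToyInstr &mi : code) {
        for (int &reg : mi.operands) {
            if (reg < kFirstVirtual) continue;  // already physical
            auto it = virtToPhys.find(reg);
            if (it == virtToPhys.end()) return false;
            reg = it->second;
        }
    }
    return true;
}
```

    Because the rewrite only reads the map, anything that edits the map before
    this step runs changes the final assignment without touching the allocator.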
  3. @stoklund

    Don't run RAFast in the optimizing regalloc pipeline.

    stoklund committed
    The fast register allocator is not supposed to work in the optimizing
    pipeline. It doesn't make sense to compute live intervals, run full copy
    coalescing, and then run RAFast.
    Fast register allocation in the optimizing pipeline is better done by
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
  4. @nunoplopes


    nunoplopes committed
    -%a + 42
    42 - %a
    previously we were emitting:
    -(%a + 42)
    This fixes the infinite loop in PR12338. The generated code is still not perfect, though.
    Will work on that next
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
  5. Start implementing pre-ra if-converter: using speculation and selects…

    Evan Cheng committed
    … to eliminate branches.
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
  6. @atrick

    TargetInstrInfo hooks implemented in codegen should be declared pure …

    atrick committed
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
  7. @CunningBaldrick

    Reapply commit 158073 with a fix (the testcase was already committed)…

    CunningBaldrick committed
    …. The
    problem was that by moving instructions around inside the function, the pass
    could accidentally move the iterator being used to advance over the function
    too.  Fix this by only processing the instruction equal to the iterator, and
    leaving processing of instructions that might not be equal to the iterator
    to later (later = after traversing the basic block; it could also wait until
    after traversing the entire function, but this might make the sets quite big).
    Original commit message:
    Grab-bag of reassociate tweaks.  Unify handling of dead instructions and
    instructions to reoptimize.  Exploit this to more systematically eliminate
    dead instructions (this isn't very useful in practice but is convenient for
    analysing some testcase I am working on).  No need for WeakVH any more: use
    an AssertingVH instead.
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
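
    The iterator problem and the deferral fix can be illustrated on a std::list.
    This is a toy with made-up rules (even values are removed, and removing x
    makes every x + 1 redundant), not the Reassociate pass; it only shows the
    pattern of touching the current element during the walk and postponing all
    other removals until the walk is done.

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <vector>

void processBlock(std::list<int> &block) {
    std::vector<std::list<int>::iterator> deferred;
    for (auto it = block.begin(); it != block.end();) {
        if (*it % 2 != 0) {
            ++it;
            continue;
        }
        const int removed = *it;
        // Only the element at the traversal iterator is erased here.
        it = block.erase(it);
        // Other affected elements may include the one `it` now points at,
        // so they are recorded rather than erased immediately.
        for (auto other = block.begin(); other != block.end(); ++other)
            if (*other == removed + 1 &&
                std::find(deferred.begin(), deferred.end(), other) ==
                    deferred.end())
                deferred.push_back(other);
    }
    // Safe now: no traversal iterator is live any more.
    for (auto dead : deferred) block.erase(dead);
}
```

    Erasing the redundant elements inside the loop would invalidate `it`
    whenever the redundant element is the next one to visit, which is the kind
    of accident the commit message describes.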
  8. @hfinkel

    Remove the TODO statement in the PPC README re: CTR loops

    hfinkel committed
    As Chris points out, this can now be removed!
    TODO: check if the associated section on viterbi's inner loop can also be removed.
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
  9. @hfinkel

    Enable PPC CTR loop formation by default.

    hfinkel committed
    Thanks to Jakob's help, this now causes no new test suite failures!
    Over the entire test suite, this gives an average 1% speedup. The largest speedups are:
    SingleSource/Benchmarks/Misc/pi - 108%
    SingleSource/Benchmarks/CoyoteBench/lpbench - 54%
    MultiSource/Benchmarks/Prolangs-C/unix-smail/unix-smail - 50%
    SingleSource/Benchmarks/Shootout/ary3 - 32%
    SingleSource/Benchmarks/Shootout-C++/matrix - 30%
    The largest slowdowns are:
    MultiSource/Benchmarks/mediabench/gsm/toast/toast - -30%
    MultiSource/Benchmarks/Prolangs-C/bison/mybison - -25%
    MultiSource/Benchmarks/BitBench/uuencode/uuencode - -22%
    MultiSource/Applications/d/make_dparser - -14%
    SingleSource/Benchmarks/Shootout-C++/ary - -13%
    In light of these slowdowns, additional profiling work is obviously needed!
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
  10. @hfinkel

    Mark the PPC CTRRC and CTRRC8 register classes as non-allocatable.

    hfinkel committed
    Marking these classes as non-allocatable allows CTR loop generation to
    work correctly with the block placement passes, etc. These register
    classes are currently used only by some unused TCRETURN patterns.
    In future cleanup, these will be removed.
    Thanks again to Jakob for suggesting this fix to the CTR loop problem!
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
  11. @mren2
  12. @mren2

    Test case for r158160

    mren2 committed
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
  13. @atrick

    Sched itinerary fix: Avoid static initializers.

    atrick committed
    This fixes an accidental dependence on static initialization order that I introduced yesterday.
    Thank you Lang!!!
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
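
    The commit message does not say how the dependence was removed. One common
    way to avoid relying on static initialization order is to build the object
    on first use inside a function, sketched here with hypothetical names and
    data.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical table. A function-local static is initialized the first time
// control reaches it, so callers in other translation units cannot observe
// it before it has been constructed.
const std::vector<std::string> &stageNames() {
    static const std::vector<std::string> names = {"fetch", "decode",
                                                   "execute"};
    return names;
}

std::size_t stageCount() { return stageNames().size(); }
```

    A namespace-scope global holding the same vector would be constructed at an
    unspecified point relative to globals in other files, which is the hazard
    this pattern sidesteps.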
  14. Fix a crash in APInt::lshr when shiftAmt > BitWidth.

    Chad Rosier committed
    Patch by James Benton.
    git-svn-id: 91177308-0d34-0410-b5e6-96231b3b80d8
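
    The underlying hazard is easy to reproduce with built-in integers, where
    shifting by the bit width or more is undefined behavior in C++. The helper
    below is a hypothetical sketch on a 64-bit word, not APInt; it assumes the
    intended result of an oversized logical right shift is zero.

```cpp
#include <cassert>
#include <cstdint>

// Logical shift right of the low `bitWidth` bits of `value`.
std::uint64_t lshrClamped(std::uint64_t value, unsigned bitWidth,
                          unsigned shiftAmt) {
    assert(bitWidth >= 1 && bitWidth <= 64 && "toy helper supports 1..64 bits");
    // Every bit is shifted out; returning early also avoids an undefined shift.
    if (shiftAmt >= bitWidth) return 0;
    const std::uint64_t mask =
        bitWidth == 64 ? ~0ULL : ((1ULL << bitWidth) - 1ULL);
    return (value & mask) >> shiftAmt;
}
```

    For example, lshrClamped(0xF0, 8, 4) yields 0x0F, while shift amounts of 8
    or 100 on the same 8-bit value both yield 0.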