Commits on Aug 29, 2016
  1. @tstellarAMD

    AMDGPU/SI: Implement a custom MachineSchedStrategy

    Summary:
    GCNSchedStrategy re-uses most of GenericScheduler; it just uses
    a different method to compute the excess and critical register
    pressure limits.
    
    It's not enabled by default; to enable it, pass -misched=gcn
    to llc.
    
    Shader DB stats:
    
    32464 shaders in 17874 tests
    Totals:
    SGPRS: 1542846 -> 1643125 (6.50 %)
    VGPRS: 1005595 -> 904653 (-10.04 %)
    Spilled SGPRs: 29929 -> 27745 (-7.30 %)
    Spilled VGPRs: 334 -> 352 (5.39 %)
    Scratch VGPRs: 1612 -> 1624 (0.74 %) dwords per thread
    Code Size: 36688188 -> 37034900 (0.95 %) bytes
    LDS: 1913 -> 1913 (0.00 %) blocks
    Max Waves: 254101 -> 265125 (4.34 %)
    Wait states: 0 -> 0 (0.00 %)
    
    Totals from affected shaders:
    SGPRS: 1338220 -> 1438499 (7.49 %)
    VGPRS: 886221 -> 785279 (-11.39 %)
    Spilled SGPRs: 29869 -> 27685 (-7.31 %)
    Spilled VGPRs: 334 -> 352 (5.39 %)
    Scratch VGPRs: 1612 -> 1624 (0.74 %) dwords per thread
    Code Size: 34315716 -> 34662428 (1.01 %) bytes
    LDS: 1551 -> 1551 (0.00 %) blocks
    Max Waves: 188127 -> 199151 (5.86 %)
    Wait states: 0 -> 0 (0.00 %)
    
    Reviewers: arsenm, mareko, nhaehnle, MatzeB, atrick
    
    Subscribers: arsenm, kzhuravl, llvm-commits
    
    Differential Revision: https://reviews.llvm.org/D23688
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279995 91177308-0d34-0410-b5e6-96231b3b80d8
    tstellarAMD committed Aug 29, 2016
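Note on the Shader DB stats in the GCNSchedStrategy commit above: the percentages appear to be plain relative changes, (after - before) / before. A minimal sketch for re-checking them follows; `percentChange` is a hypothetical helper name and is not taken from the Shader DB tooling.

```cpp
#include <cassert>

// Relative change from `before` to `after`, in percent.
// Hypothetical helper for illustration only; not part of the Shader DB scripts.
double percentChange(double before, double after) {
  assert(before != 0.0 && "relative change is undefined for a zero baseline");
  return (after - before) / before * 100.0;
}
```

For example, `percentChange(1542846, 1643125)` comes out at roughly 6.50 (SGPRS) and `percentChange(1005595, 904653)` at roughly -10.04 (VGPRS), in line with the totals listed.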
  2. @vitalybuka

    [asan] Enable new stack poisoning with store instruction by default

    Reviewers: eugenis
    
    Subscribers: llvm-commits
    
    Differential Revision: https://reviews.llvm.org/D23968
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279993 91177308-0d34-0410-b5e6-96231b3b80d8
    vitalybuka committed Aug 29, 2016
  3. @TNorthover

    GlobalISel: switch to SmallVector for pending legalizations.

    std::queue was doing far too many heap allocations to be healthy.
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279992 91177308-0d34-0410-b5e6-96231b3b80d8
    TNorthover committed Aug 29, 2016
  4. @tstellarAMD

    AMDGPU/SI: Improve SILoadStoreOptimizer and run it before the scheduler

    Summary:
    The SILoadStoreOptimizer can now look ahead more than one instruction when
    looking for instructions to merge, which greatly improves the number of
    loads/stores that we are able to merge.
    
    Moving the pass before scheduling avoids increasing register pressure after
    the scheduler, so that the scheduler's register pressure estimates will be
    more accurate.  It also gives more consistent results, since it is no longer
    affected by minor scheduling changes.
    
    Reviewers: arsenm
    
    Subscribers: arsenm, kzhuravl, llvm-commits
    
    Differential Revision: https://reviews.llvm.org/D23814
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279991 91177308-0d34-0410-b5e6-96231b3b80d8
    tstellarAMD committed Aug 29, 2016
  5. @TNorthover

    ASan: remove variable only used in assertions build

    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279990 91177308-0d34-0410-b5e6-96231b3b80d8
    TNorthover committed Aug 29, 2016
  6. @TNorthover

    GlobalISel: legalize frem to a libcall on AArch64.

    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279988 91177308-0d34-0410-b5e6-96231b3b80d8
    TNorthover committed Aug 29, 2016
  7. @TNorthover

    GlobalISel: rework CallLowering so that it can be used for libcalls too.

    There should be no functional change here, I'm just making the implementation
    of "frem" (to libcall) legalization easier for a followup.
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279987 91177308-0d34-0410-b5e6-96231b3b80d8
    TNorthover committed Aug 29, 2016
  8. @arsenm

    AMDGPU/R600: Fix fixups used for constant arrays

    Fixes bug 29289
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279986 91177308-0d34-0410-b5e6-96231b3b80d8
    arsenm committed Aug 29, 2016
  9. IfConversion: Fix branch predication bug.

    This bug shows up with diamonds that share unpredicable, unanalyzable branches.
    There's an included test case from Hexagon. What was happening was that we were
    attempting to predicate the branch instruction despite the fact that it was
    checked to be the same. Now for unanalyzable branches we skip over the branch
    instructions when predicating the block.
    
    Differential Revision: https://reviews.llvm.org/D23939
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279985 91177308-0d34-0410-b5e6-96231b3b80d8
    Kyle Butt committed Aug 29, 2016
  10. @vitalybuka

    Use store operation to poison allocas for lifetime analysis.

    Summary:
    Calling __asan_poison_stack_memory and __asan_unpoison_stack_memory for small
    variables is too expensive.
    
    Code is disabled by default and can be enabled by -asan-experimental-poisoning.
    
    PR27453
    
    Reviewers: eugenis
    
    Subscribers: llvm-commits
    
    Differential Revision: https://reviews.llvm.org/D23947
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279984 91177308-0d34-0410-b5e6-96231b3b80d8
    vitalybuka committed Aug 29, 2016
  11. @vitalybuka

    [asan] Separate calculation of ShadowBytes from calculating ASanStack…

    …FrameLayout
    
    Summary: No functional changes, just refactoring to make D23947 simpler.
    
    Reviewers: eugenis
    
    Subscribers: llvm-commits
    
    Differential Revision: https://reviews.llvm.org/D23954
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279982 91177308-0d34-0410-b5e6-96231b3b80d8
    vitalybuka committed Aug 29, 2016
  12. @majnemer

    [SimplifyCFG] Hoisting invalidates metadata

    We forgot to remove optimization metadata when performing hoisting during
    FoldTwoEntryPHINode.
    
    This fixes PR29163.
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279980 91177308-0d34-0410-b5e6-96231b3b80d8
    majnemer committed Aug 29, 2016
  13. @rnk

    Make vec_fabs.ll pass with MSVC 2013

    We should revert this change once we drop support for MSVC 2013.
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279979 91177308-0d34-0410-b5e6-96231b3b80d8
    rnk committed Aug 29, 2016
  14. [gold] Fix test accidentally regressed for newer gold

    With r279911 I accidentally regressed the gold/X86/start-lib-common.ll
    test for newer golds (v1.12+) that honor --start-lib/--end-lib.
    Remove the alignment which should not be there to make this work with
    both old and new gold linkers.
    
    Additionally, now that we have a subdirectory for v1.12+ gold tests,
    copy this test there and check specifically for the v1.12+ behavior.
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279977 91177308-0d34-0410-b5e6-96231b3b80d8
    Teresa Johnson committed Aug 29, 2016
  15. [AArch64] Adjust the scheduling model for Exynos M1.

    Further refine the model for loads.
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279976 91177308-0d34-0410-b5e6-96231b3b80d8
    Evandro Menezes committed Aug 29, 2016
  16. @annamthomas

    [StatepointsForGC] Rematerialize in the presence of PHIs

    Summary:
    While walking the use chain for identifying rematerializable values in RS4GC,
    add the case where the current value and base value are the same PHI nodes.
    
    This will aid rematerialization of geps and casts instead of relocating.
    
    Reviewers: sanjoy, reames, igor
    
    Subscribers: llvm-commits
    
    Differential Revision: https://reviews.llvm.org/D23920
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279975 91177308-0d34-0410-b5e6-96231b3b80d8
    annamthomas committed Aug 29, 2016
  17. [LTO] Remove extraneous output

    Remove some debugging output to stderr that snuck in with r279576.
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279974 91177308-0d34-0410-b5e6-96231b3b80d8
    Teresa Johnson committed Aug 29, 2016
  18. @rotateright

    [Constant] remove fdiv and frem from canTrap()

    Assuming the default FP env, we should not treat fdiv and frem any differently in terms of
    trapping behavior than any other FP op. I.e., FP ops do not trap with the default FP env.
    
    This matches how we treat the fdiv/frem in IR with isSafeToSpeculativelyExecute() and in 
    the backend after:
    https://reviews.llvm.org/rL279970
    
    
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279973 91177308-0d34-0410-b5e6-96231b3b80d8
    rotateright committed Aug 29, 2016
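Note on the canTrap() commit above: the claim that FP ops do not trap in the default FP environment can be observed from C++. This is a minimal sketch, assuming IEEE 754 doubles and an unmodified FP environment (traps masked). The helper names are hypothetical, and `std::fmod` stands in for what LLVM IR calls frem.

```cpp
#include <cassert>
#include <cmath>

// Performs the division at run time; `volatile` keeps the compiler from
// constant-folding it. Hypothetical helper, for illustration only.
double runtimeFDiv(double num, double den) {
  volatile double n = num;
  volatile double d = den;
  return n / d;
}

// Run-time floating-point remainder, the C++ counterpart of IR frem.
double runtimeFRem(double num, double den) {
  volatile double n = num;
  volatile double d = den;
  return std::fmod(n, d);
}
```

Under those assumptions, `runtimeFDiv(1.0, 0.0)` yields an infinity and `runtimeFRem(1.0, 0.0)` yields a NaN; the program continues rather than trapping, which is what makes speculating these ops safe.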
  19. @rotateright

    [SimplifyCFG] rename test file, regenerate checks, and add test

    The fdiv test shows a problem similar to:
    https://reviews.llvm.org/rL279970
    
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279972 91177308-0d34-0410-b5e6-96231b3b80d8
    rotateright committed Aug 29, 2016
  20. @GorNishanov

    [Coroutines] Part 9: Add cleanup subfunction.

    Summary:
    [Coroutines] Part 9: Add cleanup subfunction.
    
    This patch completes coroutine heap allocation elision. Now, the heap elision example from docs\Coroutines.rst compiles and produces the expected result (see test/Transform/Coroutines/ex3.ll).
    
    Intrinsic Changes:
    * coro.free gets a token parameter tying it to coro.id to allow reliably discovering all coro.frees associated with a particular coroutine.
    * coro.id gets an extra parameter that points back to a coroutine function. This allows checking whether a coro.id describes the enclosing function or belongs to a different function that was later inlined.
    
    CoroSplit now creates three subfunctions:
    # f$resume - resume logic
    # f$destroy - cleanup logic, followed by deallocation code
    # f$cleanup - just the cleanup code
    
    During devirtualization, the CoroElide pass replaces coro.destroy with either f$destroy or f$cleanup, depending on whether heap elision is performed.
    
    Other fixes, improvements:
    * Fixed buglet in Shape::buildFrame that was not creating coro.save properly if coroutine has more than one suspend point.
    
    * Switched to using a variable-width suspend index field (no longer limited to 32 bits; the index field can be as small as i1 or as large as i<whatever-size_t-is>).
    
    Reviewers: majnemer
    
    Subscribers: llvm-commits, mehdi_amini
    
    Differential Revision: https://reviews.llvm.org/D23844
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279971 91177308-0d34-0410-b5e6-96231b3b80d8
    GorNishanov committed Aug 29, 2016
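Note on the variable-width suspend index in the coroutines commit above: one plausible way to size such a field is the smallest bit width that can number every suspend point, with a floor of one bit (i1). The sketch below works under that assumption; `suspendIndexBits` is a hypothetical name and the commit's actual computation may differ.

```cpp
#include <cassert>
#include <cstdint>

// Smallest bit width able to hold indices 0 .. numSuspendPoints-1,
// never less than 1 so the field is at least an i1.
// Hypothetical helper; not copied from the CoroSplit sources.
unsigned suspendIndexBits(std::uint64_t numSuspendPoints) {
  unsigned bits = 0;
  // Grow the width until 2^bits values cover all suspend points.
  while (bits < 63 && (std::uint64_t{1} << bits) < numSuspendPoints)
    ++bits;
  return bits == 0 ? 1u : bits;
}
```

With this rule a coroutine with one or two suspend points gets an i1 index, three or four get i2, and 257 would need i9.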
  21. @rotateright

    [TargetLowering] remove fdiv and frem from canOpTrap() (PR29114)

    Assuming the default FP env, we should not treat fdiv and frem any differently in terms of 
    trapping behavior than any other FP op. I.e., FP ops do not trap with the default FP env.
    
    This matches how we treat these ops in IR with isSafeToSpeculativelyExecute(). There's a 
    similar bug in Constant::canTrap().
    
    This bug manifests in PR29114:
    https://llvm.org/bugs/show_bug.cgi?id=29114
    ...as a sequence of scalar divisions instead of a vector division on x86 for a <3 x float> 
    type.
    
    Differential Revision: https://reviews.llvm.org/D23974
    
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279970 91177308-0d34-0410-b5e6-96231b3b80d8
    rotateright committed Aug 29, 2016
  22. Do not use MRI::getMaxLaneMaskForVReg as a mask covering whole register

    MRI::getMaxLaneMaskForVReg does not always cover the whole register.
    For example, on X86 the upper 16 bits of EAX cannot be accessed via
    any subregister. Consequently, there is no lane mask that only covers
    that part of EAX. The getMaxLaneMaskForVReg will return the union of
    the lane masks for all subregisters, and in case of EAX, that union
    will not cover the upper 16 bits.
    
    This fixes https://llvm.org/bugs/show_bug.cgi?id=29132
    
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279969 91177308-0d34-0410-b5e6-96231b3b80d8
    Krzysztof Parzyszek committed Aug 29, 2016
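Note on the getMaxLaneMaskForVReg commit above: the EAX example can be pictured with a simplified model in which each subregister is described by the data bits it covers. This is an illustration only. Real LLVM lane masks are abstract lane sets rather than data-bit masks, and all constant and function names below are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Simplified, hypothetical model: which of EAX's 32 data bits each
// addressable subregister covers.
constexpr std::uint32_t kBitsAL  = 0x000000FFu; // low byte
constexpr std::uint32_t kBitsAH  = 0x0000FF00u; // second byte
constexpr std::uint32_t kBitsAX  = 0x0000FFFFu; // low 16 bits
constexpr std::uint32_t kBitsEAX = 0xFFFFFFFFu; // the whole register

// Union of everything reachable through some subregister.
constexpr std::uint32_t unionOfSubRegs() {
  return kBitsAL | kBitsAH | kBitsAX;
}

// Bits of the full register that no subregister can address.
constexpr std::uint32_t uncoveredBits() {
  return kBitsEAX & ~unionOfSubRegs();
}
```

In this model `uncoveredBits()` is `0xFFFF0000`: the union of the subregister masks misses the upper 16 bits, so it cannot serve as a mask for the whole register.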
  23. @tstellarAMD

    AMDGPU/SI: Improve register allocation hints for sopk instructions

    Summary:
    For shrinking SOPK instructions, we were creating a hint to tell the
    register allocator to use the register allocated for src0 for the dst
    operand as well.  However, this seems to not work sometimes depending
    on the order virtual registers are assigned physical registers.
    
    To fix this, I've added a second allocation hint which does the reverse:
    it asks that the register allocated for dst be used for src0.
    
    Reviewers: arsenm
    
    Subscribers: arsenm, llvm-commits, kzhuravl
    
    Differential Revision: https://reviews.llvm.org/D23862
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279968 91177308-0d34-0410-b5e6-96231b3b80d8
    tstellarAMD committed Aug 29, 2016
  24. @espindola

    Use the correct ctor/dtor section for dynamic-no-pic.

    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279967 91177308-0d34-0410-b5e6-96231b3b80d8
    espindola committed Aug 29, 2016
  25. @d0k

    Mark test as XFAIL instead of disabling it everywhere.

    There is no lit feature 'X86' so this test is just disabled completely.
    Make it XFAIL until a solution is found.
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279966 91177308-0d34-0410-b5e6-96231b3b80d8
    d0k committed Aug 29, 2016
  26. @espindola

    Move code only used by codegen out of MC. NFC.

    MC itself never needs to know about these sections.
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279965 91177308-0d34-0410-b5e6-96231b3b80d8
    espindola committed Aug 29, 2016
  27. @hokein

    Fix -Wunused-but-set-variable warning.

    Summary: A follow-up fix on r279958.
    
    Reviewers: bkramer
    
    Subscribers: cfe-commits
    
    Differential Revision: https://reviews.llvm.org/D23989
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279964 91177308-0d34-0410-b5e6-96231b3b80d8
    hokein committed Aug 29, 2016
  28. @tstellarAMD

    AMDGPU/SI: Query AA, if available, in areMemAccessesTriviallyDisjoint()

    Summary:
    The SILoadStoreOptimizer will need to use AliasAnalysis here in order to
    move it before scheduling.
    
    Reviewers: arsenm
    
    Subscribers: arsenm, llvm-commits, kzhuravl
    
    Differential Revision: https://reviews.llvm.org/D23813
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279963 91177308-0d34-0410-b5e6-96231b3b80d8
    tstellarAMD committed Aug 29, 2016
  29. Fixed a bug in type legalizer for masked gather.

    The problem occurs when the node is not updated in place and UpdateNodeOperation() returns a node that already exists.
    In this case an assert fails in PromoteIntegerOperand(), because N has 2 results (val + chain).
    
    Differential Revision: http://reviews.llvm.org/D23756
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279961 91177308-0d34-0410-b5e6-96231b3b80d8
    Igor Breger committed Aug 29, 2016
  30. [AVX512] In some cases KORTEST instruction may be used instead of ZEX…

    …T + TEST sequence.
    
    Differential Revision: http://reviews.llvm.org/D23490
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279960 91177308-0d34-0410-b5e6-96231b3b80d8
    Igor Breger committed Aug 29, 2016
  31. @hokein

    [InstructionSelect] NumBlocks isn't defined in DEBUG build.

    Summary: A follow-up fix for http://llvm.org/viewvc/llvm-project?view=revision&revision=279905.
    
    Reviewers: bkramer
    
    Subscribers: cfe-commits
    
    Differential Revision: https://reviews.llvm.org/D23985
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279959 91177308-0d34-0410-b5e6-96231b3b80d8
    hokein committed Aug 29, 2016
  32. [X86] Don't lower FABS/FNEG masking directly to a ConstantPool load. …

    …Just create a ConstantFPSDNode and let that be lowered.
    
    This allows broadcast loads to be used when available.
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279958 91177308-0d34-0410-b5e6-96231b3b80d8
    Craig Topper committed Aug 29, 2016
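Note on the FABS/FNEG commit above: the "masking" refers to the usual bit trick of clearing or flipping the IEEE 754 sign bit with a constant mask. The sketch below shows only that arithmetic on a scalar double, assuming a 64-bit IEEE 754 `double`. The helper names are hypothetical, and this is not the SelectionDAG lowering code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Sign bit of an IEEE 754 binary64 value.
constexpr std::uint64_t kSignBit = 0x8000000000000000ULL;

// Bit-exact conversions between double and its 64-bit pattern.
inline std::uint64_t toBits(double v) {
  std::uint64_t b;
  std::memcpy(&b, &v, sizeof b);
  return b;
}
inline double fromBits(std::uint64_t b) {
  double v;
  std::memcpy(&v, &b, sizeof v);
  return v;
}

// fabs: AND with a mask that clears the sign bit.
inline double fabsByMask(double v) { return fromBits(toBits(v) & ~kSignBit); }

// fneg: XOR with a mask that flips the sign bit.
inline double fnegByMask(double v) { return fromBits(toBits(v) ^ kSignBit); }
```

Because the mask is just a constant, a backend is free to materialize it however is cheapest, which is the flexibility the commit message points to with broadcast loads.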
  33. [AVX-512] Always use v8i64 when converting 512-bit FAND/FOR/FXOR/FAND…

    …N to integer operations when DQI isn't supported. This is consistent with the recent changes to promote logical operations to i64 vectors.
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279957 91177308-0d34-0410-b5e6-96231b3b80d8
    Craig Topper committed Aug 29, 2016
  34. [AVX-512] Add 512-bit fabs tests with and without AVX512DQ.

    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279956 91177308-0d34-0410-b5e6-96231b3b80d8
    Craig Topper committed Aug 29, 2016
  35. @lhames

    [Orc] Simplify LogicalDylib and move it back inside CompileOnDemandLa…

    …yer. Also
    
    switch to using one indirect stub manager per logical dylib rather than one per
    input module.
    
    LogicalDylib is a helper class used by the CompileOnDemandLayer to manage
    symbol resolution between modules during lazy compilation. In particular, it
    ensures that internal symbols resolve correctly even in the case where multiple
    input modules contain the same internal symbol name (which must be promoted
    to external hidden linkage so that functions in any given module can be split
    out by lazy compilation). LogicalDylib's resolution scheme (before this commit)
    required one stub-manager per input module. This made recompilation of functions
    (by adding a module containing a new definition) difficult, as the stub manager
    for any given symbol was bound to the module that supplied the original
    definition. By using one stub manager for the whole logical dylib, symbols can
    be more easily replaced, although support for doing this is not included in this
    patch (it will be implemented in a follow up).
    
    
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279952 91177308-0d34-0410-b5e6-96231b3b80d8
    lhames committed Aug 29, 2016