Permalink
Branch: release_38
Commits on Jun 4, 2016
-
tstellarAMD committed
Jun 4, 2016 ------------------------------------------------------------------------ r257663 | dimitry | 2016-01-13 11:48:50 -0800 (Wed, 13 Jan 2016) | 4 lines Remove bashism from merge.sh: POSIX sh does not have the `function` reserved word, and it is even superfluous in bash, for this particular instance. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271772 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 4, 2016 ------------------------------------------------------------------------ r268295 | thomas.stellard | 2016-05-02 13:11:44 -0700 (Mon, 02 May 2016) | 7 lines AMDGPU/SI: Use v_readfirstlane_b32 when restoring SGPRs spilled to scratch We were using v_readlane_b32 with the lane set to zero, but this won't work if thread 0 is not active. Differential Revision: http://reviews.llvm.org/D19745 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271771 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 4, 2016 ------------------------------------------------------------------------ r268287 | thomas.stellard | 2016-05-02 12:37:56 -0700 (Mon, 02 May 2016) | 19 lines AMDGPU/SI: Set the kill flag on temp VGPRs used to restore SGPRs from scratch Summary: When we restore an SGPR value from scratch, we first load it into a temporary VGPR and then use v_readlane_b32 to copy the value from the VGPR back into an SGPR. We weren't setting the kill flag on the VGPR in the v_readlane_b32 instruction, so the register scavenger wasn't able to re-use this temp value later. I wasn't able to create a lit test for this. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19744 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271770 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 4, 2016 ------------------------------------------------------------------------ r268259 | nhaehnle | 2016-05-02 10:37:01 -0700 (Mon, 02 May 2016) | 14 lines AMDGPU: llvm.SI.fs.constant is a source of divergence Summary: This intrinsic is used to get flat-shaded fragment shader inputs. Those are uniform across a primitive, but a fragment shader wave may process pixels from multiple primitives (as indicated by the prim_mask), and so that's where divergence can arise. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19747 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271769 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 4, 2016 ------------------------------------------------------------------------ r267916 | Matthew.Arsenault | 2016-04-28 11:38:48 -0700 (Thu, 28 Apr 2016) | 6 lines AMDGPU: Fix mishandling array allocations when promoting alloca The canonical form for allocas is a single allocation of the array type. In case we see a non-canonical array alloca, make sure we aren't replacing this with an array N times smaller. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271768 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 4, 2016 ------------------------------------------------------------------------ r266825 | nhaehnle | 2016-04-19 14:58:22 -0700 (Tue, 19 Apr 2016) | 12 lines AMDGPU: Guard VOPC instructions against incorrect commute Summary: The added testcase, which triggered this, was derived from a shader-db case via bugpoint. A separate question is why scalar branching wasn't used. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19208 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271767 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 4, 2016 ------------------------------------------------------------------------ r266824 | nhaehnle | 2016-04-19 14:58:17 -0700 (Tue, 19 Apr 2016) | 21 lines AMDGPU/SI: SGPR accounting in getSIProgramInfo must ignore exec_lo/hi Summary: A shader stored the live mask (initial exec mask) in an SGPR which was then spilled during register allocation. The allocator quite reasonably optimized turned the spill into v_writelane_b32 %vgpr, exec_lo, N v_writelane_b32 %vgpr, exec_hi, N+1 at the beginning of the shader, confusing the SGPR accounting. No test case, because si-sgpr-spill.ll together with an upcoming patch for WQM handling exhibits the problem. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19199 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271766 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 4, 2016 ------------------------------------------------------------------------ r266244 | thomas.stellard | 2016-04-13 13:44:16 -0700 (Wed, 13 Apr 2016) | 13 lines AMDGPU/SI: Add support for spilling VGPRs without having to scavenge registers Summary: When we are spilling SGPRs to scratch memory, we usually don't have free SGPRs to do the address calculation, so we need to re-use the ScratchOffset register for the calculation. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18917 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271765 91177308-0d34-0410-b5e6-96231b3b80d8
Commits on Jun 3, 2016
-
tstellarAMD committed
Jun 3, 2016 ------------------------------------------------------------------------ r266152 | thomas.stellard | 2016-04-12 16:57:30 -0700 (Tue, 12 Apr 2016) | 13 lines AMDGPU/SI: Fix spilling of 96-bit registers Summary: It seems like this was broken in r252327. I thought we had test cases for this, but it's really hard to tirgger spills of this exact register size since they aren't used very much. Reviewers: arsenm, nhaehnle Subscribers: nhaehnle, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19021 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271735 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 3, 2016 ------------------------------------------------------------------------ r266105 | thomas.stellard | 2016-04-12 11:40:43 -0700 (Tue, 12 Apr 2016) | 15 lines AMDGPU/SI: Insert wait states required after v_readfirstlane on SI Summary: We will be able to handle this case much better once the hazard recognizer is finished, but this conservative implementation fixes a hang with the piglit test: spec/arb_arrays_of_arrays/execution/sampler/fs-nested-struct-arrays-nonconst-nested-arra Reviewers: arsenm, nhaehnle Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18988 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271731 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 3, 2016 ------------------------------------------------------------------------ r266088 | nhaehnle | 2016-04-12 09:10:38 -0700 (Tue, 12 Apr 2016) | 16 lines AMDGPU/SI: Fix a mis-compilation of multi-level breaks Summary: Under certain circumstances, multi-level breaks (or what is understood by the control flow passes as such) could be miscompiled in a way that causes infinite loops, by emitting incorrect control flow intrinsics. This fixes a hang in dEQP-GLES3.functional.shaders.loops.while_dynamic_iterations.conditional_continue_vertex Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18967 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271730 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 3, 2016 ------------------------------------------------------------------------ r264214 | Matthew.Arsenault | 2016-03-23 16:17:29 -0700 (Wed, 23 Mar 2016) | 2 lines AMDGPU: Promote alloca should skip volatiles ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271729 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 3, 2016 ------------------------------------------------------------------------ r263627 | michel.daenzer | 2016-03-16 02:10:42 -0700 (Wed, 16 Mar 2016) | 11 lines AMDGPU: Verify instructions in non-debug builds as well And emit an error if it fails. This prevents illegal instructions from getting sent to the GPU, which would potentially result in a hang. This is a candidate for the stable branch(es). Reviewed-by: Marek Olšák <marek.olsak@amd.com> ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271724 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 3, 2016 ------------------------------------------------------------------------ r263441 | marek.olsak | 2016-03-14 08:57:14 -0700 (Mon, 14 Mar 2016) | 8 lines AMDGPU/SI: Incomplete shader binaries need to finish execution at the end Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D18058 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271723 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 3, 2016 ------------------------------------------------------------------------ r262732 | thomas.stellard | 2016-03-04 10:31:18 -0800 (Fri, 04 Mar 2016) | 12 lines AMDGPU/SI: Add support for spiling SGPRs to scratch buffer Summary: This is necessary for when we run out of VGPRs and can no longer use v_{read,write}_lane for spilling SGPRs. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17592 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271722 91177308-0d34-0410-b5e6-96231b3b80d8 -
tstellarAMD committed
Jun 3, 2016 ------------------------------------------------------------------------ r262728 | thomas.stellard | 2016-03-04 10:02:01 -0800 (Fri, 04 Mar 2016) | 19 lines AMDGPU/SI: Enable frame index scavenging during PrologEpilogueInserter Summary: This allows us to use virtual registers when we need extra registers for inserting spill instructions in SIRegisterInfo:eliminateFrameIndex(). Once all the frame indices have been eliminated, the PrologEpilogueInserter does an extra pass over the program to replace all virtual registers with physical ones. This allows us to make more efficient use of our emergency spill slots, so we only need to create one. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17591 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271721 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 3, 2016 ------------------------------------------------------------------------ r262577 | thomas.stellard | 2016-03-02 19:45:09 -0800 (Wed, 02 Mar 2016) | 12 lines AMDGPU/SI: Don't try to move scratch wave offset when there are no free SGPRs Summary: When there were no free SGPRs, we were trying to move this value into some of the reserved registers which was causing a segmentation fault. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17590 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271720 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 3, 2016 ------------------------------------------------------------------------ r262297 | Matthew.Arsenault | 2016-02-29 20:58:20 -0800 (Mon, 29 Feb 2016) | 2 lines AMDGPU: Don't use estimated stack size when we know the real stack size ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271719 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 3, 2016 ------------------------------------------------------------------------ r261385 | thomas.stellard | 2016-02-19 16:37:25 -0800 (Fri, 19 Feb 2016) | 20 lines AMDGPU/SI: Use v_readfirstlane to legalize SMRD with VGPR base pointer Summary: Instead of trying to replace SMRD instructions with a VGPR base pointer with an equivalent MUBUF instruction, we now copy the base pointer to SGPRs using v_readfirstlane. This is safe to do, because any load selected as an SMRD instruction has been proven to have a uniform base pointer, so each thread in the wave will have the same pointer value in VGPRs. This will fix some errors on VI from trying to replace SMRD instructions with addr64-enabled MUBUF instructions that don't exist. Reviewers: arsenm, cfang, nhaehnle Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17305 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271700 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 3, 2016 ------------------------------------------------------------------------ r260692 | changpeng.fang | 2016-02-12 09:11:04 -0800 (Fri, 12 Feb 2016) | 13 lines AMDGPU/SI: Annotate Loops with Constant Condition in SIAnnotateControlFlow pass. Summary: It is possible that the loop condition can be a boolean constant (infinite loop, for example). So we sould handle constant condition in annotating a loop. This patch adds this functionality to support annotating constant condition. Reviewers: tstellarAMD, arsenm Subscribers: llvm-commits, arsenm Differential Revision: http://reviews.llvm.org/D15093 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271685 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 3, 2016 ------------------------------------------------------------------------ r260658 | Matthew.Arsenault | 2016-02-11 22:31:30 -0800 (Thu, 11 Feb 2016) | 12 lines AMDGPU: Set flat_scratch from flat_scratch_init reg This was hardcoded to the static private size, but this would be missing the offset and additional size for someday when we have dynamic sizing. Also stops always initializing flat_scratch even when unused. In the future we should stop emitting this unless flat instructions are used to access private memory. For example this will initialize it almost always on VI because flat is used for global access. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271684 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 3, 2016 ------------------------------------------------------------------------ r260651 | Matthew.Arsenault | 2016-02-11 18:40:47 -0800 (Thu, 11 Feb 2016) | 7 lines AMDGPU: Set element_size in private resource descriptor Introduce a subtarget feature for this, and leave the default with the current behavior which assumes up to 16-byte loads/stores can be used. The field also seems to have the ability to be set to 2 bytes, but I'm not sure what that would be used for. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271679 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 3, 2016 ------------------------------------------------------------------------ r260645 | Matthew.Arsenault | 2016-02-11 18:16:10 -0800 (Thu, 11 Feb 2016) | 2 lines AMDGPU: Initialize SILowerControlFlow ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271643 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 3, 2016 ------------------------------------------------------------------------ r260599 | thomas.stellard | 2016-02-11 13:45:07 -0800 (Thu, 11 Feb 2016) | 14 lines AMDGPU/SI: Make sure MIMG descriptors and samplers stay in SGPRs Summary: It's possible to have resource descriptors and samplers stored in VGPRs, either by a VMEM instruction or in the case of samplers, floating-point calculations. When this happens, we need to use v_readfirstlane to copy these values back to sgprs. Reviewers: mareko, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17102 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271642 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 3, 2016 ------------------------------------------------------------------------ r260588 | thomas.stellard | 2016-02-11 13:14:34 -0800 (Thu, 11 Feb 2016) | 20 lines AMDGPU/SI: When splitting SMRD instructions, add its users to VALU worklist Summary: When we split SMRD instructions into two MUBUFs we were adding the users of the newly created MUBUFs to the VALU worklist. However, the only users these instructions had was the REG_SEQUENCE that was inserted by splitSMRD when the original SMRD instruction was split. We need to make sure to add the users of the original SMRD to the VALU worklist before it is split. I have a test case, but it requires one other bug fix, so it will be added in a later commt. Reviewers: mareko, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17101 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271641 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 3, 2016 ------------------------------------------------------------------------ r260495 | Matthew.Arsenault | 2016-02-10 22:15:39 -0800 (Wed, 10 Feb 2016) | 9 lines AMDGPU: Fix constant bus use check with subregisters If the two operands to an instruction were both subregisters of the same super register, it would incorrectly think this counted as the same constant bus use. This fixes the verifier error in fmin_legacy.ll which was missing -verify-machineinstrs. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271640 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 3, 2016 ------------------------------------------------------------------------ r259911 | Matthew.Arsenault | 2016-02-05 11:47:23 -0800 (Fri, 05 Feb 2016) | 5 lines AMDGPU: Preserve alignments on new created globals Also switch to internal linkage, and include the name of the function in the name. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271639 91177308-0d34-0410-b5e6-96231b3b80d8
Commits on Jun 2, 2016
-
tstellarAMD committed
Jun 2, 2016 ------------------------------------------------------------------------ r259894 | thomas.stellard | 2016-02-05 09:42:38 -0800 (Fri, 05 Feb 2016) | 8 lines AMDGPU/SI: Correctly initialize SIInsertWaits pass Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D16724 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271594 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 2, 2016 ------------------------------------------------------------------------ r259558 | Matthew.Arsenault | 2016-02-02 12:28:10 -0800 (Tue, 02 Feb 2016) | 4 lines AMDGPU: Handle promoting memmove Also add missing tests for the others. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271593 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 2, 2016 ------------------------------------------------------------------------ r259546 | Matthew.Arsenault | 2016-02-02 11:18:53 -0800 (Tue, 02 Feb 2016) | 5 lines AMDGPU: Whitelist handled intrinsics We shouldn't crash on unhandled intrinsics. Also simplify failure handling in loop. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271592 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 2, 2016 We need to correctly initialize the AMDGPUPromoteAlloca pass, because later commits will add tests that try to pass the -amdgpu-promote-alloca flag to opt. git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271591 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 2, 2016 ------------------------------------------------------------------------ r259059 | thomas.stellard | 2016-01-28 09:13:44 -0800 (Thu, 28 Jan 2016) | 14 lines AMDGPU: waitcnt operand fixes Summary: Allow lgkmcnt up to 0xF (hardware allows that). Fix mask for ExpCnt in AMDGPUInstPrinter. Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16314 Patch by: Nikolay Haustov ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271590 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 2, 2016 ------------------------------------------------------------------------ r258936 | thomas.stellard | 2016-01-27 07:53:52 -0800 (Wed, 27 Jan 2016) | 14 lines AMDGPU/SI: Fix commuting of 32-bit VOPC instructions Summary: We didn't have entries in the commuting table for the 32-bit instructions. I don't think we hit this problem now, but we will once uniform branching is enabled. Tests will come in a later commit. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D16600 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271589 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 2, 2016 ------------------------------------------------------------------------ r258901 | Matthew.Arsenault | 2016-01-26 18:17:49 -0800 (Tue, 26 Jan 2016) | 17 lines AMDGPU: Fix default device handling When no device name is specified, default to kaveri for HSA since SI is not supported and it woud fail. Default to "tahiti" instead of "SI" since these are effectively the same, and tahiti is an actual device. Move default device handling to the TargetMachine rather than the AMDGPUSubtarget. The module ISA version is computed from the device name provided with the target machine, so the attributes printed by the AsmPrinter were inconsistent with those computed in the subtarget. Also remove DevName field from subtarget since it's redundant with getCPU() in the superclass. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271588 91177308-0d34-0410-b5e6-96231b3b80d8
-
tstellarAMD committed
Jun 2, 2016 ------------------------------------------------------------------------ r258606 | Matthew.Arsenault | 2016-01-22 21:32:14 -0800 (Fri, 22 Jan 2016) | 5 lines AMDGPU: Remove Feature64BitPtr This is a leftover from AMDIL that doesn't do anything and doesn't belong here. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271587 91177308-0d34-0410-b5e6-96231b3b80d8