
Commit dbb9f9f

Commit contains 108 changes.
Change 1: Will check in another fix after opensource builds are back up made by: Xiao Lei
Change 2: To make subroutine call for emulation functions be on by default. made by: Junjie Gu
Change 3: [IGC Refactor][Builtins, IGC]: Update changes for Built-in Indexing Description for Open Source: these allow us to load only the builtins that we need into the user kernel made by: hudson_server
Change 4: made by: Michael Liao
Change 5: As summary. made by: Wei Pan
Change 6: made by: Alexander Paige
Change 7: Automated integration from mainline to DEV_IGC made by: IGC
Change 8: Need to use zext on PHI when doing integer promotion. For example, phi i2 [1, %b0] [-1, %b1] made by: Junjie Gu
Change 9: When icmp is signed, need to do sext (using shl + ashr) made by: Junjie Gu
Change 10: Missing description made by: Junjie Gu
Change 11: For CS, if simd8 is not the least allowed simd size, don't enable subroutine, as subroutine is simd8 only for now. made by: Junjie Gu
Change 12: made by: Thomas F Raoux
Change 13: As summary. made by: Wei Pan
Change 14: To make subroutine call for emulation functions be on by default. made by: hudson_server
Change 15: made by: Po-yu Chen
Change 16: add more acc restrictions made by: Weiyu Chen
Change 17: Make sure the unneeded code is indeed removed. made by: Junjie Gu
Change 18: Handle insertelementinst and waveshuffle specially made by: Junjie Gu
Change 19: In FF, unconstrained variables can be assigned freely, RR helps scheduling effort. Loosen the initial condition. Only nodes in the first-time unconstrained list have RR applied. made by: hudson_server
Change 20: Fix alignment in CreateBufferLoad() for types smaller than 4 bytes. Set correct address alignment in ldraw intrinsics. made by: Lukasz Gotszald
Change 21: made by: Thomas F Raoux
Change 22: made by: Thomas F Raoux
Change 23: made by: hudson_server
Change 24: made by: Thomas F Raoux
Change 25: Phi instructions always need to be at the beginning of a given basic block. made by: hudson_server
Change 26: [IGC Backout][IGC]: Revert CL#751879 Description for Open Source: - due to performance regression made by: hudson_server
Change 27: Provide global and local variable splitting in single pass, and handled in the same way. made by: hudson_server
Change 28: [IGC Refactor][IGC]: Step2 to remove deprecated sampler intrinsics. Missing file in previous CL Description for Open Source: made by: hudson_server
Change 29: In FF, unconstrained variables can be assigned freely, RR helps scheduling effort. Loosen the initial condition. Only nodes in the first-time unconstrained list have RR applied. made by: Bu Qi Cheng
Change 30: made by: Thomas F Raoux
Change 31: [IGC BugFix][DX9_FE]: Fix more regression due to refactoring Description for Open Source: Fix more regression due to refactoring made by: hudson_server
Change 32: Provide global and local variable splitting in single pass, and handled in the same way. made by: Bu Qi Cheng
Change 33: [IGC Refactor][IGC]: Last step of legacy sample removal Description for Open Source: Remove legacy intrinsics made by: hudson_server
Change 34: Fix wrong size of string passing. This caused the last char of "options" string to be cut off, therefore module metadata was incorrect. made by: Andrzej Ratajewski
Change 35: Back-out of one of previous change. made by: Michael Liao
Change 36: made by: Tomasz Bujewski
Change 37: made by: Thomas F Raoux
Change 38: Internal feature made by: hudson_server
Change 39: Unexpected problem has been detected in VulkanULT. It was not reproduced at Sanity, preETM nor locally. made by: Lukasz Gotszald
Change 40: Phi instructions always need to be at the beginning of a given basic block. made by: Jacek Jankowski
Change 41: This way IGC-NEO interface will be available in IGC when building IGC with Apple support (for non-apple OS). made by: Jaroslaw Chodor
Change 42: Fix alignment in CreateBufferLoad() for types smaller than 4 bytes. Set correct address alignment in ldraw intrinsics. made by: Lukasz Gotszald
Change 43: made by: Thomas F Raoux
Change 44: made by: Thomas F Raoux
Change 45: made by: Mariusz Merecki
Change 46: made by: Junjie Gu
Change 47: made by: Bu Qi Cheng
Change 48: Emulation pass also emulates i64 div/mod. So, make sure it is for double emulation before enabling subroutine support. Code Review: trivial made by: Junjie Gu
Change 49: add a missing check on whether mad src0 can be acc made by: Weiyu Chen
Change 50: As summary. made by: Wei Pan
Change 51: Enables stateless support on BDW by default made by: Xiao Lei
Change 52: Added compiler support for GTPin made by: Xiao Lei
Change 53: Refactored GTPin return data to contain the correct version and request status. made by: Xiao Lei
Change 54: made by: Thomas F Raoux
Change 55: made by: Thomas F Raoux
Change 56: Back-out of one of previous change. made by: Mariusz Merecki
Change 57: made by: IGC
Change 58: Provide global and local variable splitting in single pass, and handled in the same way. made by: IGC
Change 59: made by: Mariusz Merecki
Change 60: Provide global and local variable splitting in single pass, and handled in the same way. made by: Bu Qi Cheng
Change 61: internal feature made by: Weiyu Chen
Change 62: See if we can remove this seemingly useless check made by: Weiyu Chen
Change 63: made by: Thomas F Raoux
Change 64: made by: Thomas F Raoux
Change 65: made by: Piotr Mochocki
Change 66: Assert is unnecessary as it will be handled in other parts of the pass. Fixes a regression from CL751781 made by: Xiao Lei
Change 67: made by: Jose Santillan
Change 68: made by: Michael Liao
Change 69: made by: Pawel Jurek
Change 70: In FF, unconstrained variables can be assigned freely, RR helps scheduling effort. made by: IGC
Change 71: CPack: switch to component based packaging. Group IGC artifacts into igc component. This change enables creation of separable IGC installation packages (deb,rpm,tgz) by top level build system. made by: Lukasz Filipkowski
Change 72: made by: Michael Liao
Change 73: Internal feature made by: Weiyu Chen
Change 74: As summary. made by: Wei Pan
Change 75: enable multi-acc replacement for certain platforms made by: Weiyu Chen
Change 76: made by: Thomas F Raoux
Change 77: Accidental Checkin - Backing out of previous CL made by: Xiao Lei
Change 78: Enables GTPin input read and output writes made by: Xiao Lei
Change 79: don't do acc replacement for dst without any use made by: Weiyu Chen
Change 80: Added support in Legalization pass to handle store of illegal int types. Added support in PeepholeTypeLegalizer pass to handle PHI and Trunc instructions with illegal int types. made by: Xiao Lei
Change 81: acc substitution relies on def-use on virtual declares, and removeReudundMov pass can break it made by: Weiyu Chen
Change 82: - ensure all required values are present made by: Michael Liao
Change 83: - limit to OpenCL shader only made by: Michael Liao
Change 84: made by: Thomas F Raoux
Change 85: Back-out of one of previous change. made by: Michael Liao
Change 86: To make subroutine call for emulation functions be on by default. made by: Junjie Gu
Change 87: Turning full Half Promotion pass back on for OCL due to functional regressions. made by: Jacek Jankowski
Change 88: Gen9 doesn't support bindless sampler heap. These changes follow DirectX approach to support a large number of samplers. The idea is to use sampler header to pass pointer to the sampler state. These changes are not complete as we also need corresponding UMD changes but we can safely check them in and enable later when UMD is ready. Proposed solution is: - UMD creates separate descriptor set heap just for samplers in the indirect state heap. So UMD has to manage two bindless heaps one for surfaces and one for samplers. - The assumption is that a descriptor set offset "n" has the same offset in both heaps. This will limit the number of offsets that need to go through constant registers - Additionally UMD will pass us the base offset of the emulated bindless heap (offset relative to the dynamic state base address) - Compiler will add this base offset to every sampler binding and IGC will program the sampler state offset in the header. made by: Mariusz Merecki
Change 89: made by: Thomas F Raoux
Change 90: Update SpirV OpAtomicIDecrement instruction to use the DEC not PREDEC HW operation. made by: Lukasz Gotszald
Change 91: Backing out previous commit, as CI is down and checkins are halted, so no testing for this will occur. made by: Xiao Lei
Change 92: Added support in Legalization pass to handle store of illegal int types. Added support in PeepholeTypeLegalizer pass to handle PHI and Trunc instructions with illegal int types. made by: Xiao Lei
Change 93: Automated integration from mainline to DEV_IGC made by: IGC
Change 94: In FF, unconstrained variables can be assigned freely, RR helps scheduling effort. made by: Bu Qi Cheng
Change 95: revert CL#750441 made by: Michael Liao
Change 96: Change CS SIMD32 heuristics to consider loop stall cost, to allow more shaders to be compiled as SIMD32. made by: hudson_server
Change 97: --add check for add dst in VxH mode --add check for src modifier in mul inst made by: Weiyu Chen
Change 98: GTPin data is sent to driver through patch token made by: Xiao Lei
Change 99: add code to support acc on mad src0. currently disabled by default. made by: Weiyu Chen
Change 100: Change CS SIMD32 heuristics to consider loop stall cost, to allow more shaders to be compiled as SIMD32. made by: Peng Guo
Change 101: Added support for enabling subroutine for emulation functions. The default is off for now. Don't expect any functional changes. made by: Junjie Gu
Change 102: When lexical scope information is present for -gline-tables-only, it leads to unexpected behavior when building lexical scope DIE because of missing MCSymbol labels for instructions. This patch skips building lexical scope DIE for such cases. made by: Pratik J Ashar
Change 103: add platform checks on when to use multiAccSubstitution made by: hudson_server
Change 104: - 2nd attempt made by: hudson_server
Change 105: fix a bug in previous checkin where src2 was incorrectly applied acc made by: hudson_server
Change 106: Added support to 3DBuilder to build igc for debugging Metal Shaders and also to build IGC for open source made by: Juan1 Rodriguez
Change 107: If the address fill move has a different exec size than kernel's simd size, NoMask must be used (e.g., (W) mov (1) a0.0 r1.0:w made by: Weiyu Chen
Change 108: As Vulkan no longer needs FP64 emulation, do not set NeedFP64() for vulkan. This should have no functional change. made by: Junjie Gu
Change-Id: I2fc60e3bdbf85c279446c0a3a9a9c8dc60e09cdd
1 parent 017da36 commit dbb9f9f
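
Changes 8 and 9 in the commit message concern how a narrow integer such as i2 is widened during integer promotion: zext fills the new high bits with zeros, while sext replicates the sign bit, which the message says is done with shl + ashr. The following is a minimal standalone sketch of those two operations on an 8-bit carrier. It is not IGC code; `zextToI8` and `sextToI8` are hypothetical names chosen for illustration, and the sketch assumes the usual arithmetic behavior of `>>` on signed values (guaranteed from C++20, and what mainstream compilers do before that).

```cpp
#include <cassert>
#include <cstdint>

// Zero-extend the low `bits` bits of `raw` to 8 bits (what zext does).
inline std::uint8_t zextToI8(std::uint8_t raw, unsigned bits)
{
    assert(bits >= 1 && bits <= 8);
    const std::uint8_t mask = static_cast<std::uint8_t>((1u << bits) - 1u);
    return static_cast<std::uint8_t>(raw & mask);
}

// Sign-extend the low `bits` bits of `raw` to 8 bits with the shl + ashr idiom.
inline std::int8_t sextToI8(std::uint8_t raw, unsigned bits)
{
    assert(bits >= 1 && bits <= 8);
    const unsigned shift = 8u - bits;
    // shl: move the narrow value's sign bit up into bit 7.
    const std::int8_t shifted =
        static_cast<std::int8_t>(static_cast<std::uint8_t>(raw << shift));
    // ashr: an arithmetic right shift copies that sign bit back down.
    return static_cast<std::int8_t>(shifted >> shift);
}
```

For the i2 value -1 from Change 8's example (bit pattern 0b11), `zextToI8(0x3, 2)` yields 3 and `sextToI8(0x3, 2)` yields -1, so the choice of extension decides which value later instructions see.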

77 files changed: +1562 -1690 lines changed

Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
+/*===================== begin_copyright_notice ==================================
+
+Copyright (c) 2017 Intel Corporation
+
+Permission is hereby granted, free of charge, to any person obtaining a
+copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice shall be included
+in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+
+
+======================= end_copyright_notice ==================================*/
+
+
+/// Declaration of upgrader passes. When the IR changes, we add passes to keep compatibility with
+/// clients which haven't moved yet to the new representation.
+
+namespace llvm
+{
+    class Pass;
+}
+namespace IGC
+{
+    /// Transform legacy resource access intrinsics taking an integer representation to
+    /// the new intrinsics taking pointer representation
+    llvm::Pass* CreateUpgradeResourceIntrinsic();
+}
Lines changed: 217 additions & 0 deletions
@@ -0,0 +1,217 @@
+/*===================== begin_copyright_notice ==================================
+
+Copyright (c) 2017 Intel Corporation
+
+Permission is hereby granted, free of charge, to any person obtaining a
+copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice shall be included
+in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+
+
+======================= end_copyright_notice ==================================*/
+
+#include "AdaptorCommon/IRUpgrader/IRUpgrader.hpp"
+#include "GenISAIntrinsics/GenIntrinsics.h"
+#include "Compiler/CISACodeGen/helper.h"
+
+#include <vector>
+
+#include "common/LLVMWarningsPush.hpp"
+#include <llvm/Pass.h>
+#include <llvm/IR/InstVisitor.h>
+#include <llvm/IR/IRBuilder.h>
+#include "common/LLVMWarningsPop.hpp"
+
+using namespace llvm;
+
+class UpgradeResourceAccess : public FunctionPass, public InstVisitor<UpgradeResourceAccess>
+{
+public:
+    UpgradeResourceAccess() : FunctionPass(ID) {}
+    static char ID;
+    void getAnalysisUsage(llvm::AnalysisUsage &AU) const override
+    {
+        AU.setPreservesCFG();
+    }
+    bool runOnFunction(llvm::Function &F) override;
+    void visitCallInst(CallInst& C);
+private:
+    void ChangeIntrinsic(CallInst& C, GenISAIntrinsic::ID ID);
+    void FixSamplerSignature(SampleIntrinsic* sample);
+    bool m_changed = false;
+};
+char UpgradeResourceAccess::ID = 0;
+
+bool UpgradeResourceAccess::runOnFunction(llvm::Function &F)
+{
+    visit(F);
+    return m_changed;
+}
+
+static Value* GetResource(Module* m, IRBuilder<>& builder, Value* i)
+{
+    unsigned int addrSpace = IGC::EncodeAS4GFXResource(*i, IGC::RESOURCE, 0);
+    PointerType* ptrT = PointerType::get(i->getType(), addrSpace);
+    Value* img = nullptr;
+    if(isa<ConstantInt>(i))
+    {
+        img = ConstantPointerNull::get(ptrT);
+    }
+    else
+    {
+        Function* getBufferPointer = GenISAIntrinsic::getDeclaration(m, GenISAIntrinsic::GenISA_GetBufferPtr, ptrT);
+        img = builder.CreateCall(getBufferPointer, { i, builder.getInt32(IGC::RESOURCE) });
+    }
+    return img;
+}
+
+static Value* GetSampler(Module* m, IRBuilder<>& builder, Value* i)
+{
+    unsigned int addrSpace = IGC::EncodeAS4GFXResource(*i, IGC::SAMPLER, 0);
+    PointerType* ptrT = PointerType::get(i->getType(), addrSpace);
+    Value* s = nullptr;
+    if(isa<ConstantInt>(i))
+    {
+        s = ConstantPointerNull::get(ptrT);
+    }
+    else
+    {
+        Function* getBufferPointer = GenISAIntrinsic::getDeclaration(m, GenISAIntrinsic::GenISA_GetBufferPtr, ptrT);
+        s = builder.CreateCall(getBufferPointer, { i, builder.getInt32(IGC::SAMPLER) });
+    }
+    return s;
+}
+
+void UpgradeResourceAccess::ChangeIntrinsic(CallInst& C, GenISAIntrinsic::ID ID)
+{
+    std::vector<Type*> types;
+    std::vector<Value*> args;
+    IRBuilder<> builder(&C);
+    Module* m = C.getParent()->getParent()->getParent();
+    for(unsigned int i = 0; i < C.getNumArgOperands(); i++)
+    {
+        args.push_back(C.getOperand(i));
+    }
+    switch(ID)
+    {
+    case GenISAIntrinsic::GenISA_sampleptr:
+    case GenISAIntrinsic::GenISA_sampleBptr:
+    case GenISAIntrinsic::GenISA_sampleCptr:
+    case GenISAIntrinsic::GenISA_sampleBCptr:
+    case GenISAIntrinsic::GenISA_sampleLptr:
+    case GenISAIntrinsic::GenISA_sampleDptr:
+    case GenISAIntrinsic::GenISA_sampleKillPix: {
+        types.push_back(C.getType());
+        types.push_back(C.getOperand(0)->getType());
+        unsigned int resIndex = C.getNumOperands() - 6;
+        unsigned int samplerIndex = C.getNumOperands() - 5;
+        args[resIndex] = GetResource(m, builder, args[resIndex]);
+        args[samplerIndex] = GetSampler(m, builder, args[samplerIndex]);
+        types.push_back(args[resIndex]->getType());
+        types.push_back(args[samplerIndex]->getType());
+    }
+        break;
+    case GenISAIntrinsic::GenISA_gather4ptr: {
+        types.push_back(C.getType());
+        types.push_back(C.getOperand(0)->getType());
+        unsigned int resIndex = C.getNumOperands() - 7;
+        unsigned int samplerIndex = C.getNumOperands() - 6;
+        args[resIndex] = GetResource(m, builder, args[resIndex]);
+        args[samplerIndex] = GetSampler(m, builder, args[samplerIndex]);
+        types.push_back(args[resIndex]->getType());
+        types.push_back(args[samplerIndex]->getType());
+    }
+        break;
+    case GenISAIntrinsic::GenISA_ldmsptr:
+    case GenISAIntrinsic::GenISA_ldmcsptr:
+    case GenISAIntrinsic::GenISA_ldptr: {
+        types.push_back(C.getType());
+        unsigned int resIndex = C.getNumOperands() - 5;
+        args[resIndex] = GetResource(m, builder, args[resIndex]);
+        types.push_back(args[resIndex]->getType());
+    }
+        break;
+    default:
+        assert("unhandled intrinsic upgrade" && 0);
+        break;
+    }
+    Function* f = GenISAIntrinsic::getDeclaration(m, ID, types);
+    Value* newCall = builder.CreateCall(f, args);
+    C.replaceAllUsesWith(newCall);
+    C.eraseFromParent();
+    m_changed = true;
+}
+
+void UpgradeResourceAccess::visitCallInst(CallInst& C)
+{
+    if(C.getCalledFunction()->getName().startswith("genx.GenISA.sample."))
+    {
+        ChangeIntrinsic(C, GenISAIntrinsic::GenISA_sampleptr);
+    }
+    else if(C.getCalledFunction()->getName().startswith("genx.GenISA.sampleB."))
+    {
+        ChangeIntrinsic(C, GenISAIntrinsic::GenISA_sampleBptr);
+    }
+    else if(C.getCalledFunction()->getName().startswith("genx.GenISA.sampleD."))
+    {
+        ChangeIntrinsic(C, GenISAIntrinsic::GenISA_sampleDptr);
+    }
+    else if(C.getCalledFunction()->getName().startswith("genx.GenISA.sampleC."))
+    {
+        ChangeIntrinsic(C, GenISAIntrinsic::GenISA_sampleCptr);
+    }
+    else if(C.getCalledFunction()->getName().startswith("genx.GenISA.sampleL."))
+    {
+        ChangeIntrinsic(C, GenISAIntrinsic::GenISA_sampleLptr);
+    }
+    else if(C.getCalledFunction()->getName().startswith("genx.GenISA.gather4."))
+    {
+        ChangeIntrinsic(C, GenISAIntrinsic::GenISA_gather4ptr);
+    }
+    else if(C.getCalledFunction()->getName().startswith("genx.GenISA.ldms."))
+    {
+        ChangeIntrinsic(C, GenISAIntrinsic::GenISA_ldmsptr);
+    }
+    else if(C.getCalledFunction()->getName().equals("genx.GenISA.ldmcs"))
+    {
+        ChangeIntrinsic(C, GenISAIntrinsic::GenISA_ldmcsptr);
+    }
+    else if(C.getCalledFunction()->getName().startswith("genx.GenISA.ld."))
+    {
+        ChangeIntrinsic(C, GenISAIntrinsic::GenISA_ldptr);
+    }
+    else if(C.getCalledFunction()->getName().equals("genx.GenISA.sampleKill.legacy"))
+    {
+        ChangeIntrinsic(C, GenISAIntrinsic::GenISA_sampleKillPix);
+    }
+    else if(SampleIntrinsic* sample = dyn_cast<SampleIntrinsic>(&C))
+    {
+        if(!sample->getTextureValue()->getType()->isPointerTy())
+        {
+            ChangeIntrinsic(*sample, sample->getIntrinsicID());
+        }
+    }
+}
+
+
+namespace IGC
+{
+    Pass* CreateUpgradeResourceIntrinsic()
+    {
+        return new UpgradeResourceAccess();
+    }
+}
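
The index arithmetic in `ChangeIntrinsic` is easier to follow with the operand layout spelled out. In LLVM a `CallInst` stores the callee as its final operand, so `getNumOperands()` is one larger than the argument count for a call without operand bundles. `getNumOperands() - 6` therefore names the argument that sits five places from the end of the argument list, and `getNumOperands() - 5` the one right after it. The sketch below models only that arithmetic for the sample case; `SamplerSlots` and `sampleSlots` are hypothetical names, and nothing here is taken from IGC beyond the two offsets.

```cpp
#include <cassert>
#include <cstddef>

// Positions of the resource and sampler arguments within a call's argument list.
struct SamplerSlots
{
    std::size_t resource;
    std::size_t sampler;
};

// `numOperands` counts the arguments plus the trailing callee operand,
// mirroring CallInst::getNumOperands() for a call without operand bundles.
inline SamplerSlots sampleSlots(std::size_t numOperands)
{
    assert(numOperands >= 6); // otherwise the subtraction would wrap around
    return { numOperands - 6, numOperands - 5 };
}
```

For a sample call with eight arguments (nine operands), `sampleSlots(9)` gives resource index 3 and sampler index 4, leaving three arguments after the sampler.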

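`visitCallInst` picks the replacement intrinsic by matching the callee's name against a chain of prefixes, and it relies on the trailing dot so that `genx.GenISA.sample.` does not also match `genx.GenISA.sampleB.`. LLVM's `CallInst::getCalledFunction()` returns null for indirect calls, so if such calls can reach this pass a guard before `getName()` may be worth considering. The following is a minimal table-driven sketch of the same matching with that guard. It is not the pass's code: `Upgrade`, `PrefixRule`, and `classifyLegacyCall` are hypothetical names, only four of the prefixes are modeled, and a null pointer stands in for "no named callee".

```cpp
#include <string>

// Hypothetical stand-in for the GenISAIntrinsic::ID values used by the pass.
enum class Upgrade { SamplePtr, SampleBPtr, LdmsPtr, LdPtr };

struct PrefixRule
{
    const char* prefix;
    Upgrade     target;
};

// Returns true and sets `out` when `calleeName` starts with a legacy prefix.
// A null `calleeName` models an indirect call, which has no named callee.
inline bool classifyLegacyCall(const std::string* calleeName, Upgrade& out)
{
    static const PrefixRule rules[] = {
        { "genx.GenISA.sample.",  Upgrade::SamplePtr  },
        { "genx.GenISA.sampleB.", Upgrade::SampleBPtr },
        { "genx.GenISA.ldms.",    Upgrade::LdmsPtr    },
        { "genx.GenISA.ld.",      Upgrade::LdPtr      },
    };
    if (calleeName == nullptr)
        return false;
    for (const PrefixRule& rule : rules)
    {
        const std::string prefix(rule.prefix);
        // Compare only the first prefix.size() characters of the name.
        if (calleeName->compare(0, prefix.size(), prefix) == 0)
        {
            out = rule.target;
            return true;
        }
    }
    return false;
}
```

A table keeps each prefix next to its target, which makes it harder to introduce a prefix that shadows a longer one by accident.
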
IGC/AdaptorOCL/SPIRV/SPIRVInternal.h

Lines changed: 10 additions & 1 deletion
@@ -406,6 +406,15 @@ _SPIRV_OP(OpGenericCastToPtrExplicit)
   _SPIRV_OP(OpSubgroupBallotKHR)
   _SPIRV_OP(OpSubgroupFirstInvocationKHR)
 #undef _SPIRV_OP
+
+  // Intel Subgroups builtins
+#define _SPIRV_OP(x, y) add(Op##y, #x);
+  _SPIRV_OP(intel_sub_group_shuffle, SubgroupShuffleINTEL)
+  _SPIRV_OP(intel_sub_group_shuffle_down, SubgroupShuffleDownINTEL)
+  _SPIRV_OP(intel_sub_group_shuffle_up, SubgroupShuffleUpINTEL)
+  _SPIRV_OP(intel_sub_group_shuffle_xor, SubgroupShuffleXorINTEL)
+#undef _SPIRV_OP
+
 }
 typedef SPIRVMap<Op, std::string, SPIRVInstruction>
   OCLSPIRVBuiltinMap;
@@ -696,7 +705,7 @@ void decorateSPIRVExtInst(std::string &S, std::vector<Type*> ArgTypes);
 bool isFunctionBuiltin(llvm::Function* F);
 
 /// Get a canonical function name for a SPIR-V op code.
-std::string getSPIRVBuiltinName(Op OC, std::vector<Type*> ArgTypes, std::string suffix);
+std::string getSPIRVBuiltinName(Op OC, SPIRVInstruction *BI, std::vector<Type*> ArgTypes, std::string suffix);
 
 /// Mutates function call instruction by changing the arguments.
 /// \param ArgMutate mutates the function arguments.
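
The `_SPIRV_OP(x, y)` entries added in the first hunk use two preprocessor operators: `Op##y` pastes tokens to form the opcode enumerator, and `#x` turns the builtin name into a string literal, so each line expands to one `add(opcode, "name")` call. The sketch below reproduces that expansion in isolation. It is an illustration under stated assumptions: the `Op` enum, its values, and `buildBuiltinMap` are hypothetical stand-ins, and the macro is renamed `SKETCH_SPIRV_OP` because identifiers that begin with an underscore and a capital letter are reserved in C++.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the SPIR-V opcode enum; the values are arbitrary.
enum Op
{
    OpSubgroupShuffleINTEL,
    OpSubgroupShuffleDownINTEL,
    OpSubgroupShuffleUpINTEL,
    OpSubgroupShuffleXorINTEL
};

inline std::map<Op, std::string> buildBuiltinMap()
{
    std::map<Op, std::string> table;
    auto add = [&table](Op op, const char* name) { table[op] = name; };

    // Op##y forms the enumerator name; #x stringizes the builtin name.
#define SKETCH_SPIRV_OP(x, y) add(Op##y, #x);
    SKETCH_SPIRV_OP(intel_sub_group_shuffle, SubgroupShuffleINTEL)
    SKETCH_SPIRV_OP(intel_sub_group_shuffle_down, SubgroupShuffleDownINTEL)
    SKETCH_SPIRV_OP(intel_sub_group_shuffle_up, SubgroupShuffleUpINTEL)
    SKETCH_SPIRV_OP(intel_sub_group_shuffle_xor, SubgroupShuffleXorINTEL)
#undef SKETCH_SPIRV_OP

    return table;
}
```

Defining the macro immediately before the list and undefining it right after, as the hunk does, keeps the short macro name from leaking into the rest of the header.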

IGC/AdaptorOCL/SPIRV/SPIRVReader.cpp

Lines changed: 3 additions & 2 deletions
@@ -2518,7 +2518,8 @@ SPIRVToLLVM::transValueWithoutDecoration(SPIRVValue *BV, Function *F,
   auto OC = BV->getOpCode();
   if (isSPIRVCmpInstTransToLLVMInst(static_cast<SPIRVInstruction*>(BV))) {
     return mapValue(BV, transCmpInst(BV, BB, F));
-  } else if (OCLSPIRVBuiltinMap::find(OC)) {
+  } else if (OCLSPIRVBuiltinMap::find(OC) ||
+             isIntelSubgroupOpCode(OC)) {
     return mapValue(BV, transSPIRVBuiltinFromInst(
         static_cast<SPIRVInstruction *>(BV), BB));
   } else if (isBinaryShiftLogicalBitwiseOpCode(OC) ||
@@ -2852,7 +2853,7 @@ SPIRVToLLVM::transSPIRVBuiltinFromInst(SPIRVInstruction *BI, BasicBlock *BB) {
     ArgTys.insert(ArgTys.begin(), RetTy);
   }
 
-  std::string builtinName(getSPIRVBuiltinName(OC, ArgTys, suffix));
+  std::string builtinName(getSPIRVBuiltinName(OC, BI, ArgTys, suffix));
 
   if (hasReturnTypeInTypeList)
   {

IGC/AdaptorOCL/SPIRV/SPIRVUtil.cpp

Lines changed: 48 additions & 11 deletions
@@ -63,6 +63,7 @@ SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 ///
 //===----------------------------------------------------------------------===//
 
+#include "libSPIRV/SPIRVInstruction.h"
 #include "SPIRVInternal.h"
 #include "Mangler/ParameterType.h"
 
@@ -225,17 +226,53 @@ isFunctionBuiltin(llvm::Function* F) {
 }
 
 std::string
-getSPIRVBuiltinName(Op OC, std::vector<Type*> ArgTypes, std::string suffix) {
-  std::string name;
-  if (OCLSPIRVBuiltinMap::find(OC, &name)){
-    name = name + suffix;
-    decorateSPIRVBuiltin(name, ArgTypes);
-  }
-  else
-  {
-    spirv_assert(0 && "Couldn't find opcode in map!");
-  }
-  return name;
+getSPIRVBuiltinName(Op OC, SPIRVInstruction *BI, std::vector<Type*> ArgTypes, std::string suffix) {
+  std::string name = "";
+
+  if (isIntelSubgroupOpCode(OC)) {
+    std::stringstream tmpName;
+    SPIRVType *DataTy = nullptr;
+    switch (OC) {
+    case OpSubgroupBlockReadINTEL:
+    case OpSubgroupImageBlockReadINTEL:
+      tmpName << "intel_sub_group_block_read";
+      DataTy = BI->getType();
+      break;
+    case OpSubgroupBlockWriteINTEL:
+      tmpName << "intel_sub_group_block_write";
+      DataTy = BI->getOperands()[1]->getType();
+      break;
+    case OpSubgroupImageBlockWriteINTEL:
+      tmpName << "intel_sub_group_block_write";
+      DataTy = BI->getOperands()[2]->getType();
+      break;
+    default:
+      tmpName << OCLSPIRVBuiltinMap::map(OC);
+    }
+    if (DataTy) {
+      if (DataTy->getBitWidth() == 16)
+        tmpName << "_us";
+      if (DataTy->isTypeVector()) {
+        if (unsigned ComponentCount = DataTy->getVectorComponentCount())
+          tmpName << ComponentCount;
+      }
+    }
+    name = tmpName.str();
+  }
+  else
+  {
+    name = OCLSPIRVBuiltinMap::map(OC);
+  }
+
+  if (!name.empty()) {
+    name = name + suffix;
+    decorateSPIRVBuiltin(name, ArgTypes);
+  }
+  else
+  {
+    spirv_assert(0 && "Couldn't find opcode in map!");
+  }
+  return name;
 }
 
 CallInst *
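
The new branch in `getSPIRVBuiltinName` builds the block read/write builtin name from three pieces: a base name, a `_us` marker when the data type is 16 bits wide, and the vector component count when the data type is a vector. The following is a minimal sketch of just that string assembly. `subgroupBlockName` is a hypothetical helper, a `componentCount` of 0 stands in for a scalar type, and the later `suffix` and `decorateSPIRVBuiltin` steps are not modeled.

```cpp
#include <sstream>
#include <string>

// Assemble "<base>[_us][<componentCount>]" the way the new branch does.
// bitWidth:       width in bits of the data type being read or written.
// componentCount: vector component count, or 0 for a scalar type.
inline std::string subgroupBlockName(const std::string& base,
                                     unsigned bitWidth,
                                     unsigned componentCount)
{
    std::stringstream name;
    name << base;
    if (bitWidth == 16)
        name << "_us";          // 16-bit data selects the "_us" variant
    if (componentCount != 0)
        name << componentCount; // vector types append their component count
    return name.str();
}
```

Under these assumptions a 16-bit, 8-component read comes out as `intel_sub_group_block_read_us8`, while a 32-bit scalar write stays `intel_sub_group_block_write`.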
