Errors on non-immediate load/gather offsets#3283
Conversation
Add errors for gather and load intrinsics when offsets are non- literals or out of range. These largely use the slightly altered validation paths that the sample intrinsics use. Reword error when texture access offsets are not immediates to be more all encompassing and grammatical. Incendentally remove duplicate shaders being used for the validation test from the old directory while identical copies in the validation directory went unused. Redirected validation test to the appropriate copies. This is consistent with the test shader re-org's stated intent Fixes microsoft#2590 Fixes microsoft#2713
|
❌ Build DirectXShaderCompiler 1.0.3929 failed (commit 65a78e3c8f by @pow2clk) |
Old mistakes and new omissions
|
✅ Build DirectXShaderCompiler 1.0.3931 completed (commit d5604807e6 by @pow2clk) |
| r += text1.GatherAlpha(samp1, a.xy, int2(0, 0), int2(-1, 1), int2(3, 4), int2(-8, 7), status); r += status; | ||
|
|
||
| r += text2.GatherBlue(samp1, a.xyz, b.xy, b.zw, a.xy, a.zw, status); r += status; | ||
| r += text2.GatherBlue(samp1, a.xyz, int2(0, 0), int2(-1, 1), int2(3, 4), int2(-8, 7), status); r += status; |
There was a problem hiding this comment.
These old tests flagrantly violating this purported restriction and the lack of any documentation of it do make me wonder if maybe this should be allowed. When next I have access to FXC, I'll see what that does with these.
There was a problem hiding this comment.
I still think this is correct, but for the record, FXC produces errors on out of range and variable offset parameters for Load and Samples, but not for Gathers.
vcsharma
left a comment
There was a problem hiding this comment.
This change looks good, but it may also unearth existing shaders/tests using offsets which are not immediate for texture load/gather ops. As an example, I shared with you offline a failing internal test that we might want to address it as part of this change.
| CollectIllegalOffsets(illegalOffsets, CurF, DXIL::OpCode::TextureGather, hlslOP); | ||
| CollectIllegalOffsets(illegalOffsets, CurF, DXIL::OpCode::TextureGatherCmp, hlslOP); | ||
| CollectIllegalOffsets(illegalOffsets, CurF, DXIL::OpCode::TextureLoad, hlslOP); |
There was a problem hiding this comment.
nit: given that we are now expanding scope of this pass to also legalize offsets for texture gather and load ops, it might be more clearer to rename this pass as DxilLegalizeOffsetsPass or something else which more accurately reflects the expanded scope.
There was a problem hiding this comment.
Perhaps, but in my opinion, the name is ok, as it encompasses offsets to texture sample locations for textures, and not offsets for a variety of other things we might use the word "offset" for, even if it's for more than just the "Sample*()" methods.
| } | ||
| } else { | ||
| if (!isa<UndefValue>(offsets[i])) { | ||
| if (i < offsets.size() && !isa<UndefValue>(offsets[i])) { |
There was a problem hiding this comment.
It's not very clear to me why we need to do additional bound check here (i < offsets.size()). Is it because we can no longer expect kMaxNumOffsets to be 3? It might change based on resource kind?
There was a problem hiding this comment.
It's been so long I don't remember exactly what I was thinking here. You are correct that kMaxNumOffsets should no longer be 3 in all cases. It could have needed to be 4 for the existing cases before too. It shouldn't really be a constant either.
I think I'll remove this conditional and change the kMaxNumOffsets loop to use the size of offsets and add some more tests for cases that have different expected numbers of offsets and different maximum numbers of offsets
There was a problem hiding this comment.
Futher up in the stack, only the expected offsets are passed into this function as an array ref. This array size is not always kMaxNumOffsets, so this bound check prevents OOB index on offsets[i].
There was a problem hiding this comment.
@pow2clk I think that should be ok: using offsets.size() instead of kMaxNumOffsets for loop, and removing the bound check here. I think this was leftover from a version of the code that had an assumed fixed sized array.
| // CHK_RANGE: error: offset texture instructions must take offset which can resolve to integer literal in the range -8 to 7. | ||
| // CHK_RANGE: error: offset texture instructions must take offset which can resolve to integer literal in the range -8 to 7. | ||
| // CHK_RANGE: error: offset texture instructions must take offset which can resolve to integer literal in the range -8 to 7. | ||
| // CHK_RANGE: error: offset texture instructions must take offset which can resolve to integer literal in the range -8 to 7. | ||
|
|
||
| // CHK_VAROFF: Offsets to texture access operations must be immediate values | ||
| // CHK_VAROFF: Offsets to texture access operations must be immediate values | ||
| // CHK_VAROFF: Offsets to texture access operations must be immediate values | ||
| // CHK_VAROFF: Offsets to texture access operations must be immediate values | ||
| // CHK_VAROFF: Offsets to texture access operations must be immediate values | ||
| // CHK_VAROFF: Offsets to texture access operations must be immediate values | ||
| // CHK_VAROFF: Offsets to texture access operations must be immediate values | ||
| // CHK_VAROFF: Offsets to texture access operations must be immediate values |
Sample operations have a maximum of 3 args, but gather has a maximum of two since it always operates on 2D images. So gather operations only pass in two offset args to the offset validation, which resulted in an invalid access. Rather than adding a specialized condition to evade this, just iterate over the number of elements in the array. For sample it will be 3 and for gather 2 and it will still check for expected undefined args appropriately. For the offset legalization pass, the opcode is used to determine the start and end of the offset args
If the resource is invalid, it fires an assert when trying to retrieve the number of offsets expected. An error has already been generated, so just bail.
Only produce the loop unroll suggestion when within a loop Base error line on call instruction instead of source of the offset Sort by location in source when possible and remove duplicates Adapt tests to verify and match these changes
Gather operations that take four separate offsets are meant to allow for programmable, non-immediate values. This removes them from the immediate and range validations by giving the immediate versions their own opcodes
…mpiler into offset_validation
|
I divided the new changes into logically independent commits. It might be easier to review them individually. |
|
✅ Build DirectXShaderCompiler 1.0.4350 completed (commit 64f1e84b72 by @pow2clk) |
pow2clk
left a comment
There was a problem hiding this comment.
Explanatory comments
| 220 Pack4x8 packs vector of 4 signed or unsigned values into a packed datatype, drops or clamps unused bits | ||
| 221 IsHelperLane returns true on helper lanes in pixel shaders | ||
| 222 TextureGatherImm same as TextureGather, except offsets are limited to immediate values between -8 and 7 | ||
| 223 TextureGatherCmpImm same as TextureGatherCmp, except offsets are limited to immediate values between -8 and 7 |
There was a problem hiding this comment.
These are the new dummy(for now) intrinsics that indicate offsets need to be immediate in the -8 to 7 range)
| // Metadata | ||
| bool requiresUniformInputs() const { return false; } | ||
| }; | ||
|
|
There was a problem hiding this comment.
This is all autogenerated
| class DxilLegalizeSampleOffsetPass : public FunctionPass { | ||
|
|
||
| LoopInfo LI; | ||
|
|
There was a problem hiding this comment.
This is moved to the class because it is needed to determine if the loop unrolling suggestion is relevant to the error message.
| hlsl::OP *hlslOP = DM.GetOP(); | ||
|
|
||
| std::vector<Instruction *> illegalOffsets; | ||
| std::vector<std::pair<Value *, CallInst*>> illegalOffsets; |
There was a problem hiding this comment.
This is to have the CallInst available for the error message since citing the location of the value passed into the call instruction is not as useful and sometimes lacks any location information entirely.
There was a problem hiding this comment.
This std::vector<std::pair<Value *, CallInst*>> would be much better as a typedef'd type.
There was a problem hiding this comment.
If I'm going to have a name, I might as well make a struct that allows meaningful field access
| LegalizeOffsets(ssaIllegalOffsets); | ||
|
|
||
| FinalCheck(illegalOffsets, F, hlslOP); | ||
| FinalCheck(F, hlslOP); |
There was a problem hiding this comment.
This parameter wasn't used, so I removed it incidentally.
| // Invalid resource type for gather. | ||
| ValCtx.EmitInstrError(CI, ValidationRule::InstrResourceKindForGather); | ||
| break; | ||
| return; |
There was a problem hiding this comment.
Return on error to prevent calling ValidateResourceOffset on an invalid resource as happens with some of our tests specifically targeting the validator by altering DXIL after it is produced
| ValCtx.EmitInstrError(CI, | ||
| ValidationRule::InstrResourceKindForTextureLoad); | ||
| break; | ||
| return; |
There was a problem hiding this comment.
As above, return prevents attempts to validate invalid resources as an error is already produced
| if (hasSampleOffsets) { | ||
| statusIdx = HLOperandIndex::kGatherStatusWithSampleOffsetArgIndex; | ||
| } else { | ||
| opcode = OP::OpCode::TextureGatherImm; |
There was a problem hiding this comment.
Where there aren't 4 offset parameters, the immediate restricted gather should be used.
| @@ -1,17 +0,0 @@ | |||
| // RUN: %dxc -E main -T ps_6_0 -Od %s | FileCheck %s | |||
There was a problem hiding this comment.
These files were redundant. Where they were used, the more appropriate location was in dxilvalidation/ where they were copied with the mass test file migration probably with the intent to make this exact change. Rather than alter these as they should be, I resolved what seemed like a mistake.
| r += text1.GatherCmpAlpha(samp1, a, cmpVal, uint2(-3,7)); | ||
| r += text1.GatherCmpAlpha(samp1, a, cmpVal, uint2(-3,7),status); r += status; | ||
| r += text1.GatherCmpAlpha(samp1, a, cmpVal, uint2(-3,8),uint2(-2,3), uint2(-3,8),uint2(-2,3)); | ||
| r += text1.GatherCmpAlpha(samp1, a, cmpVal, uint2(-3,8),uint2(8,-3), uint2(8,-3), uint2(-3,2), status); r+=status; |
There was a problem hiding this comment.
Chose to leave the 4-offset variant with values outside the range even though I doubt that was the original intent. It verifies that these are still possible
There was a problem hiding this comment.
Right, they should support range of at least [-32..31], as well as variable offsets, however, I think immediate values within the [-8..7] range could be optimized to sets of immediate offset calls by the driver, or in some future version of DXC.
| for i in "RenderTargetGetSamplePosition,RenderTargetGetSampleCount".split(","): | ||
| self.name_idx[i].shader_stages = ("pixel",) | ||
| for i in "TextureGather,TextureGatherCmp".split(","): | ||
| for i in "TextureGather,TextureGatherCmp,TextureGatherImm,TextureGatherCmpImm".split(","): |
There was a problem hiding this comment.
These new operations need to have a shader_model that indicates they are not legal for any existing shader model, that might mean setting (6,15) which aligns with experimental, but that might also break something somewhere. Maybe that break can be fixed to handle this new case though.
Otherwise, these will be allowed in validation for all shader models, which would be bad.
| self.set_op_count_for_version(1, 6, next_op_idx) | ||
| assert next_op_idx == 222, "222 is expected next operation index but encountered %d and thus opcodes are broken" % next_op_idx | ||
|
|
||
| self.add_dxil_op("TextureGatherImm", next_op_idx, "TextureGatherImm", "same as TextureGather, except offsets are limited to immediate values between -8 and 7", "hfwi", "ro", [ |
There was a problem hiding this comment.
Perhaps we can add a new self.first_experimental_op_idx(next_op_idx), or imply by the highest one set in set_op_count_for_version, then we can delineate the operations that aren't part of any official shader model this way for spec and generated code.
| r += text1.GatherCmpAlpha(samp1, a, cmpVal, uint2(-3,7)); | ||
| r += text1.GatherCmpAlpha(samp1, a, cmpVal, uint2(-3,7),status); r += status; | ||
| r += text1.GatherCmpAlpha(samp1, a, cmpVal, uint2(-3,8),uint2(-2,3), uint2(-3,8),uint2(-2,3)); | ||
| r += text1.GatherCmpAlpha(samp1, a, cmpVal, uint2(-3,8),uint2(8,-3), uint2(8,-3), uint2(-3,2), status); r+=status; |
There was a problem hiding this comment.
Right, they should support range of at least [-32..31], as well as variable offsets, however, I think immediate values within the [-8..7] range could be optimized to sets of immediate offset calls by the driver, or in some future version of DXC.
| hlsl::OP *hlslOP = DM.GetOP(); | ||
|
|
||
| std::vector<Instruction *> illegalOffsets; | ||
| std::vector<std::pair<Value *, CallInst*>> illegalOffsets; |
There was a problem hiding this comment.
This std::vector<std::pair<Value *, CallInst*>> would be much better as a typedef'd type.
| illegalOffsets.emplace_back(I); | ||
| illegalOffsets.emplace_back(std::make_pair(I, CI)); | ||
| else if(ConstantInt *cOffset = dyn_cast<ConstantInt>(offset)) { | ||
| int val = cOffset->getValue().getSExtValue(); |
There was a problem hiding this comment.
This returns int64_t, and you are relying on implicit cast to int, recommend making this an explicit cast so we don't get build breaks.
There was a problem hiding this comment.
No reason not to make val int64_t
| return aLoc.getLine() < bLoc.getLine(); | ||
| // No line numbers, just compare pointers so that matching instructions will be adjacent | ||
| return a.second < b.second; | ||
| } |
There was a problem hiding this comment.
Nice, but what about source file? Won't this interleave messages based on line number from different source files?
Additionally, I don't think pointer compares will produce deterministic output. Even though that's in the error message ordering and not generated code, it still seems yucky.
| auto offsetEnd = finalIllegalOffsets.end(); | ||
|
|
||
| std::sort(offsetBegin, offsetEnd, InstLT); | ||
| offsetEnd = std::unique(offsetBegin, offsetEnd, InstEq); |
There was a problem hiding this comment.
Maybe we could avoid re-ordering based on debug info by checking for debug info first? Then reordering by pointers doesn't have to happen and the order can at least be deterministic?
| hlsl::OP *hlslOP = DM.GetOP(); | ||
|
|
||
| std::vector<Instruction *> illegalOffsets; | ||
| std::vector<std::pair<Value *, CallInst*>> illegalOffsets; |
There was a problem hiding this comment.
If I'm going to have a name, I might as well make a struct that allows meaningful field access
| illegalOffsets.emplace_back(I); | ||
| illegalOffsets.emplace_back(std::make_pair(I, CI)); | ||
| else if(ConstantInt *cOffset = dyn_cast<ConstantInt>(offset)) { | ||
| int val = cOffset->getValue().getSExtValue(); |
There was a problem hiding this comment.
No reason not to make val int64_t
| return aLoc.getLine() < bLoc.getLine(); | ||
| // No line numbers, just compare pointers so that matching instructions will be adjacent | ||
| return a.second < b.second; | ||
| } |
There was a problem hiding this comment.
pointer compares are not deterministic across compiles and the results are weird, but without debug info, we don't get any line numbers anyway so it's just the same error repeated. It does allow elimination of duplicates within a single compile which means at least the number of errors is right
| auto offsetEnd = finalIllegalOffsets.end(); | ||
|
|
||
| std::sort(offsetBegin, offsetEnd, InstLT); | ||
| offsetEnd = std::unique(offsetBegin, offsetEnd, InstEq); |
There was a problem hiding this comment.
As I said above, ordering by pointers is weird, but it at least allows eliminating duplicates
|
✅ Build DirectXShaderCompiler 1.0.4464 completed (commit 3dfc7365fc by @pow2clk) |
6.15 to coincide with the other "future" dxil value correct string compares by using std::
|
✅ Build DirectXShaderCompiler 1.0.4468 completed (commit dd67047b9a by @pow2clk) |
Add errors for gather and load intrinsics when offsets are non-
literals or out of range. These largely use the slightly altered
validation paths that the sample intrinsics use.
Reword error when texture access offsets are not immediates to be
more all encompassing and grammatical.
Incendentally remove duplicate shaders being used for the validation
test from the old directory while identical copies in the validation
directory went unused. Redirected validation test to the appropriate
copies. This is consistent with the test shader re-org's stated intent
Fixes #2590
Fixes #2713