Skip to content

[spirv] Invalid codegen in RWByteAddressBuffer InterlockedMin and Max#5707

Closed
Jasper-Bekkers wants to merge 2 commits intomicrosoft:mainfrom
Traverse-Research:patch-rwbab-bug
Closed

[spirv] Invalid codegen in RWByteAddressBuffer InterlockedMin and Max#5707
Jasper-Bekkers wants to merge 2 commits intomicrosoft:mainfrom
Traverse-Research:patch-rwbab-bug

Conversation

@Jasper-Bekkers
Copy link
Copy Markdown

@Jasper-Bekkers Jasper-Bekkers commented Sep 13, 2023

RWByteAddressBuffer has overloads for InterlockedMin and InterlockedMax for signed ints that fail to compile when used from SPIR-V.

    fatal error: generated SPIR-V is invalid: AtomicSMax: expected Value to be of type Result Type
      %536 = OpAtomicSMax %uint %535 %uint_1 %uint_0 %532

Obviously we can't return a %uint from a signed atomic operation, however in processRWByteAddressBufferAtomicMethods the call was always assumed to take %uint's.

This PR attempt to resolve that in the same way as for other Interlocked operations by using the result expected type instead.

Repo cases

RWByteAddressBuffer rwbab: register(u0);

[numthreads(4, 16, 1)]
void main(uint3 global_id : SV_DispatchThreadID, uint3 local_id : SV_GroupThreadID, uint3 wg_id : SV_GroupID)
{
    int k = 100;
    rwbab.InterlockedMax(0, k);

    return;
}

@Jasper-Bekkers
Copy link
Copy Markdown
Author

Iteration times made it such that I couldn't test this until now: this change doesn't work (yet).

@s-perron s-perron self-assigned this Sep 29, 2023
@s-perron
Copy link
Copy Markdown
Collaborator

You will want to add a unit test. That could help with your turn around time too.

See examples in https://github.com/microsoft/DirectXShaderCompiler/tree/main/tools/clang/test/CodeGenSPIRV_Lit.

The can be run by building the target check-all. If you use ninja for you build tools, then do ninja check-all.

@Jasper-Bekkers
Copy link
Copy Markdown
Author

I actually think the fix I did is not correct, but I lack the expertise to make it correct.

An alternative way to address this I think would be to do a twos complement flip for negative numbers before feeding them into the u32 atomics.

spvBuilder.getConstantInt(astContext.UnsignedIntTy, llvm::APInt(32, 0));
auto *offset = doExpr(expr->getArg(0));
const auto *dest = expr->getArg(0);
const auto baseType = dest->getType()->getCanonicalTypeUnqualified();
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the DXIL generated for these cases, the type of the operation is determined by the type of value, not the type of dest, which is actually an offset.

SpirvInstruction *originalVal = spvBuilder.createAtomicOp(
translateAtomicHlslOpcodeToSpirvOpcode(opcode),
astContext.UnsignedIntTy, ptr, spv::Scope::Device,
atomicOp, baseType, ptr, spv::Scope::Device,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that the type for baseType does not match the type for value. That causes a validation error. You cannot cast the pointer to another type, but you could cast the input type to baseType, when they differ, and are the same number of bits. Do not change the opcode. The opcode seem correct as is.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Take this example: https://godbolt.org/z/3sz6MEx9s.

It produces:

         %39 = OpAccessChain %_ptr_StorageBuffer_uint %buffer %uint_0 %37
         %40 = OpLoad %int %inputValue
         %41 = OpAtomicSMax %uint %39 %uint_1 %uint_0 %40

This is invalid because the pointer %39 points to a uint, and the value %40 is %int. However you can make it valid by changing it to:

         %39 = OpAccessChain %_ptr_StorageBuffer_uint %buffer %uint_0 %37
         %40 = OpLoad %int %inputValue
         %bc = OpBitcast %uint %40
         %41 = OpAtomicSMax %uint %39 %uint_1 %uint_0 %bc

Comment on lines +3868 to +3875
if (atomicOp == spv::Op::OpAtomicUMax && baseType->isSignedIntegerType())
atomicOp = spv::Op::OpAtomicSMax;
if (atomicOp == spv::Op::OpAtomicSMax && baseType->isUnsignedIntegerType())
atomicOp = spv::Op::OpAtomicUMax;
if (atomicOp == spv::Op::OpAtomicUMin && baseType->isSignedIntegerType())
atomicOp = spv::Op::OpAtomicSMin;
if (atomicOp == spv::Op::OpAtomicSMin && baseType->isUnsignedIntegerType())
atomicOp = spv::Op::OpAtomicUMin;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes are not needed. I believe this will give incorrect results. The parser already determine the correct opcode.

@sudonatalie
Copy link
Copy Markdown
Collaborator

@Jasper-Bekkers Are you still looking to address this PR or should we opt to close it at this point?

@Jasper-Bekkers
Copy link
Copy Markdown
Author

@Jasper-Bekkers Are you still looking to address this PR or should we opt to close it at this point?

Probably alright, I don't have the bandwidth to look into this anymore from our side but the bug still remains unfortunately.

sudonatalie added a commit to sudonatalie/DirectXShaderCompiler that referenced this pull request May 10, 2024
RWByteAddressBuffer has overloads for InterlockedMin and InterlockedMax for
signed ints that were failing to compile due to mismatched types in the
generated SPIR-V instruction. This adds the missing cast if necessary.

At the same time, some redundant code is removed from the InterlockedMin/Max
intrinsic non-member functions' codegen to modify the opcode. If it was
necessary in the past, the frontend has since been fixed and it is no longer
necessary. Tests to verify these combinations and the necessary implicit casts
have also been added.

Fixes microsoft#3196
Related to microsoft#4189, microsoft#6254, microsoft#5707
sudonatalie added a commit that referenced this pull request May 14, 2024
RWByteAddressBuffer has overloads for InterlockedMin and InterlockedMax
for signed ints that were failing to compile due to mismatched types in
the generated SPIR-V instruction. This adds the missing cast if
necessary.

At the same time, some redundant code is removed from the
InterlockedMin/Max intrinsic non-member functions' codegen to modify the
opcode. If it was necessary in the past, the frontend has since been
fixed and it is no longer necessary. Tests to verify these combinations
and the necessary implicit casts have also been added.

Fixes #3196
Related to #4189, #6254, #5707
SjMxr233 pushed a commit to ShaderHelper/DirectXShaderCompiler that referenced this pull request Jul 24, 2025
RWByteAddressBuffer has overloads for InterlockedMin and InterlockedMax
for signed ints that were failing to compile due to mismatched types in
the generated SPIR-V instruction. This adds the missing cast if
necessary.

At the same time, some redundant code is removed from the
InterlockedMin/Max intrinsic non-member functions' codegen to modify the
opcode. If it was necessary in the past, the frontend has since been
fixed and it is no longer necessary. Tests to verify these combinations
and the necessary implicit casts have also been added.

Fixes microsoft#3196
Related to microsoft#4189, microsoft#6254, microsoft#5707
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

3 participants