Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Question: why OpSelectionMerge and OpLoopMerge are separate instructions? #147

Closed
ghost opened this issue Mar 10, 2020 · 11 comments
Closed

Comments

@ghost
Copy link

ghost commented Mar 10, 2020

If the idea behind OpSelectionMerge and OpLoopMerge is that threads should know when they should diverge and converge, why have 2 separate instructions for conditional statements and loops? The fact that there are 2 instructions gives rise to many corner case questions, questions such as "should a OpLoopMerge block have OpSelectionMerge blocks within itself or not" (SPIR-V documentation example says no, while I sometimes see SPIR-V code in the wild that says yes). I see driver programmers from a big IHV hit these corner cases and complain about it on the Internet in the year of 2019, 5 years after the release of the spec, this is not normal, there are Android phones out there with broken SPIR-V compilers because of that, and many more will come if someone won't clarify all of this in detail.

@ghost
Copy link
Author

ghost commented Mar 10, 2020

Here's my theory that so far fits all the SPIR-V code I saw from the Khronos Group and other vendors:

  • OpLoopMerge and OpSelectionMerge are functionally the same, they both are about thread divergence and convergence only, the only reason they were split to two is because OpSelectionMerge needs only one parameter, while OpLoopMerge needs two. (This rule aligns with the SPIR-V documentation example that doesn't specify OpSelectionMerge for OpBranchConditional %56 %50 %51 because it relies on OpLoopMerge's merge block, so if OpLoopMerge and OpSelectionMerge were about functionally different things then this wouldn't be legal. You could've make one instruction and ignore the second parameter for conditional statements then, but we can't go back in time and change the past, so whatever.)
  • You need to specify OpLoopMerge and OpSelectionMerge within other OpLoopMerge and OpSelectionMerge blocks even tho the threads are diverged already. (What? Diverge the already divergent threads? This rule aligns with what I saw in some SPIR-V code that had a OpSelectionMerge block within a parent OpLoopMerge block tho, so ok, whatever.)
  • You can omit OpSelectionMerge only if the conditional statement converges to already specified by the parent OpLoopMerge or OpSelectionMerge block's merge block. (This rule aligns with the SPIR-V documentation example and some other SPIR-V code I saw, so alright, fine.)

@ghost
Copy link
Author

ghost commented Mar 10, 2020

Correct me if I'm wrong.

@alan-baker alan-baker transferred this issue from KhronosGroup/SPIRV-Tools Mar 11, 2020
@johnkslang
Copy link
Member

Can you give an example of an implicit (undeclared) merge block?

We don't need merge instructions for branches that are reconverging the control flow, just for those that are introducing new nested divergence.

@ghost
Copy link
Author

ghost commented Mar 11, 2020

Can you give an example of an implicit (undeclared) merge block?

A merge block is always present, it's that

We don't need merge instructions for branches that are reconverging the control flow

makes OpSelectionMerge implicit (undeclared) sometimes. I hope only for reconverging the control flow.

just for those that are introducing new nested divergence.

Why divergence nests?

@ghost
Copy link
Author

ghost commented Mar 11, 2020

Specifically,

You need to specify OpLoopMerge and OpSelectionMerge within other OpLoopMerge and OpSelectionMerge blocks even tho the threads are diverged already.

This rule aligns with what I saw in some SPIR-V code that had a OpSelectionMerge block within a parent OpLoopMerge block

Why OpSelectionMerge needs to be issued in a OpLoopMerge block that is already divergent?

@ghost
Copy link
Author

ghost commented Mar 11, 2020

For example:

#version 450

layout(location = 0) in vec4 color1;
layout(location = 1) in vec4 multiplier;
layout(location = 2) noperspective in vec4 color2;

layout(location = 0) out vec4 color;

struct S {
    bool b;
    vec4 v[5];
    int i;
};

layout(binding = 0) uniform blockName {
    S s;
    bool cond;
};

void main()
{
    vec4 scale = vec4(1.0, 1.0, 2.0, 1.0);

    for (int i = 0; i < 4; ++i) {
        if (cond)
            color *= color1 + s.v[2];
        else
            color *= sqrt(color2) * scale;
        color *= multiplier;
    }
}
; SPIR-V
; Version: 1.0
; Generator: Khronos Glslang Reference Front End; 8
; Bound: 67
; Schema: 0
               OpCapability Shader
          %1 = OpExtInstImport "GLSL.std.450"
               OpMemoryModel Logical GLSL450
               OpEntryPoint Fragment %main "main" %color %color1 %color2 %multiplier
               OpExecutionMode %main OriginUpperLeft
               OpSource GLSL 450
               OpName %main "main"
               OpName %scale "scale"
               OpName %i "i"
               OpName %S "S"
               OpMemberName %S 0 "b"
               OpMemberName %S 1 "v"
               OpMemberName %S 2 "i"
               OpName %blockName "blockName"
               OpMemberName %blockName 0 "s"
               OpMemberName %blockName 1 "cond"
               OpName %_ ""
               OpName %color "color"
               OpName %color1 "color1"
               OpName %color2 "color2"
               OpName %multiplier "multiplier"
               OpDecorate %_arr_v4float_uint_5 ArrayStride 16
               OpMemberDecorate %S 0 Offset 0
               OpMemberDecorate %S 1 Offset 16
               OpMemberDecorate %S 2 Offset 96
               OpMemberDecorate %blockName 0 Offset 0
               OpMemberDecorate %blockName 1 Offset 112
               OpDecorate %blockName Block
               OpDecorate %_ DescriptorSet 0
               OpDecorate %_ Binding 0
               OpDecorate %color Location 0
               OpDecorate %color1 Location 0
               OpDecorate %color2 NoPerspective
               OpDecorate %color2 Location 2
               OpDecorate %multiplier Location 1
       %void = OpTypeVoid
          %3 = OpTypeFunction %void
      %float = OpTypeFloat 32
    %v4float = OpTypeVector %float 4
%_ptr_Function_v4float = OpTypePointer Function %v4float
    %float_1 = OpConstant %float 1
    %float_2 = OpConstant %float 2
         %12 = OpConstantComposite %v4float %float_1 %float_1 %float_2 %float_1
        %int = OpTypeInt 32 1
%_ptr_Function_int = OpTypePointer Function %int
      %int_0 = OpConstant %int 0
      %int_4 = OpConstant %int 4
       %bool = OpTypeBool
       %uint = OpTypeInt 32 0
     %uint_5 = OpConstant %uint 5
%_arr_v4float_uint_5 = OpTypeArray %v4float %uint_5
          %S = OpTypeStruct %uint %_arr_v4float_uint_5 %int
  %blockName = OpTypeStruct %S %uint
%_ptr_Uniform_blockName = OpTypePointer Uniform %blockName
          %_ = OpVariable %_ptr_Uniform_blockName Uniform
      %int_1 = OpConstant %int 1
%_ptr_Uniform_uint = OpTypePointer Uniform %uint
     %uint_0 = OpConstant %uint 0
%_ptr_Output_v4float = OpTypePointer Output %v4float
      %color = OpVariable %_ptr_Output_v4float Output
%_ptr_Input_v4float = OpTypePointer Input %v4float
     %color1 = OpVariable %_ptr_Input_v4float Input
      %int_2 = OpConstant %int 2
%_ptr_Uniform_v4float = OpTypePointer Uniform %v4float
     %color2 = OpVariable %_ptr_Input_v4float Input
 %multiplier = OpVariable %_ptr_Input_v4float Input
       %main = OpFunction %void None %3
          %5 = OpLabel
      %scale = OpVariable %_ptr_Function_v4float Function
          %i = OpVariable %_ptr_Function_int Function
               OpStore %scale %12
               OpStore %i %int_0
               OpBranch %17
         %17 = OpLabel
               OpLoopMerge %19 %20 None
               OpBranch %21
         %21 = OpLabel
         %22 = OpLoad %int %i
         %25 = OpSLessThan %bool %22 %int_4
               OpBranchConditional %25 %18 %19
         %18 = OpLabel
         %35 = OpAccessChain %_ptr_Uniform_uint %_ %int_1
         %36 = OpLoad %uint %35
         %38 = OpINotEqual %bool %36 %uint_0
               OpSelectionMerge %40 None
               OpBranchConditional %38 %39 %53
         %39 = OpLabel
         %45 = OpLoad %v4float %color1
         %48 = OpAccessChain %_ptr_Uniform_v4float %_ %int_0 %int_1 %int_2
         %49 = OpLoad %v4float %48
         %50 = OpFAdd %v4float %45 %49
         %51 = OpLoad %v4float %color
         %52 = OpFMul %v4float %51 %50
               OpStore %color %52
               OpBranch %40
         %53 = OpLabel
         %55 = OpLoad %v4float %color2
         %56 = OpExtInst %v4float %1 Sqrt %55
         %57 = OpLoad %v4float %scale
         %58 = OpFMul %v4float %56 %57
         %59 = OpLoad %v4float %color
         %60 = OpFMul %v4float %59 %58
               OpStore %color %60
               OpBranch %40
         %40 = OpLabel
         %62 = OpLoad %v4float %multiplier
         %63 = OpLoad %v4float %color
         %64 = OpFMul %v4float %63 %62
               OpStore %color %64
               OpBranch %20
         %20 = OpLabel
         %65 = OpLoad %int %i
         %66 = OpIAdd %int %65 %int_1
               OpStore %i %66
               OpBranch %17
         %19 = OpLabel
               OpReturn
               OpFunctionEnd

graph dot

Why do you need to issue OpSelectionMerge %40 None if the whole loop is already in the divergent state by OpLoopMerge %19 %20 None?

@johnkslang
Copy link
Member

The rules nest. Strip away the loop, and that's the convergence of the divergence that occurred. Not ignoring the loop, it's not a full convergence, but it's still a local convergence, rather than a further divergence.

@ghost
Copy link
Author

ghost commented Mar 11, 2020

Strip away the loop, and that's the convergence of the divergence that occurred.

I see, so basically SPIR-V is always about explicitly specified by the user divergence and convergence and you guys don't want to deal with cfg analysis and automatic structurization like in DXIL?

@ghost
Copy link
Author

ghost commented Mar 11, 2020

Because if SPIR-V code transform tools are allowed to insert new merge instructions, then you can strip that loop and insert OpSelectionMerge %40 None for the user.

@johnkslang
Copy link
Member

SPIR-V for graphics explicitly declares its structured control flow. CFG analysis can be done to verify what was declared, but complex inference is not required.

It's a common theme in SPIR-V: Rather than infer (e.g., result types), declare and verify, because in its full generality, verification of a statement is simpler than inference of the statement.

@ghost
Copy link
Author

ghost commented Mar 12, 2020

I see. Thank you for answers, John!

The issue can be closed for now.

This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant