Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fleshing: Vulkan::Calling vkWaitForFences Timeout #53

Open
Jack-Clark opened this issue Mar 28, 2022 · 7 comments
Open

Fleshing: Vulkan::Calling vkWaitForFences Timeout #53

Jack-Clark opened this issue Mar 28, 2022 · 7 comments
Assignees

Comments

@Jack-Clark
Copy link
Collaborator

When executing some of the amber files that result from fleshing the xml file in the s057 folder that is generated by the Vulkan CTS scraper, I repeatedly get a Vulkan::Calling vkWaitForFences Timeout error.

I have drawn out the CFG below:
PNG image

The error doesn't occur when following the path B4->LH1->B0. Whenever the LH1->LH0 branch is taken, it results in the error. I've generated many different amber files which contain paths that cycle the loop different numbers of times and they all reproduce the error 100% of the time.

I've checked the generated spirv and it looks ok to me. Does anyone see anything wrong?

#!amber

SHADER compute compute_shader SPIRV-ASM

; Follow the path:
; 8 -> <9> -> <12> -> 13 -> 11 -> <9> -> <12> -> 13 -> 11 -> <9> -> 10
;
; 2 CFG nodes have OpBranchConditional or OpSwitch as their terminators (denoted <n>): 9 and 12.
;
; To follow this path, we need to make these decisions each time we reach 9 or 12.
; This path was generated with the seed 3413273072763214729 and has length 11.
;
; We equip the shader with 2+1 storage buffers:
; - An input storage buffer with the directions for each node 9 or 12
; - An output storage buffer that records the blocks that are executed

; SPIR-V
; Version: 1.3
; Generator: Khronos Glslang Reference Front End; 8
; Bound: 15
; Schema: 0

               OpCapability Shader
               OpMemoryModel Logical GLSL450
               OpEntryPoint GLCompute %7 "main"
               OpExecutionMode %7 LocalSize 1 1 1
               
               ; Below, we declare various types and variables for storage buffers.
               ; These decorations tell SPIR-V that the types and variables relate to storage buffers


               OpDecorate %size_2_struct_type BufferBlock
               OpMemberDecorate %size_2_struct_type 0 Offset 0
               OpDecorate %size_2_array_type ArrayStride 4

               OpDecorate %size_3_struct_type BufferBlock
               OpMemberDecorate %size_3_struct_type 0 Offset 0
               OpDecorate %size_3_array_type ArrayStride 4

               OpDecorate %output_struct_type BufferBlock
               OpMemberDecorate %output_struct_type 0 Offset 0
               OpDecorate %output_array_type ArrayStride 4

               OpDecorate %directions_9_variable DescriptorSet 0
               OpDecorate %directions_9_variable Binding 0

               OpDecorate %directions_12_variable DescriptorSet 0
               OpDecorate %directions_12_variable Binding 1

               OpDecorate %output_variable DescriptorSet 0
               OpDecorate %output_variable Binding 2


          %1 = OpTypeVoid
          %2 = OpTypeFunction %1
          %3 = OpTypeBool
          %4 = OpTypeInt 32 0
          %5 = OpConstantTrue %3
          %6 = OpConstant %4 0
          

               %constant_0 = OpConstant %4 0
               %constant_1 = OpConstant %4 1
               %constant_2 = OpConstant %4 2
               %constant_3 = OpConstant %4 3
               %constant_8 = OpConstant %4 8
               %constant_9 = OpConstant %4 9
               %constant_10 = OpConstant %4 10
               %constant_11 = OpConstant %4 11
               %constant_12 = OpConstant %4 12
               %constant_13 = OpConstant %4 13
               %constant_14 = OpConstant %4 14


               ; Declaration of storage buffers for the 2 directions and the output
               

               %size_2_array_type = OpTypeArray %4 %constant_2
               %size_2_struct_type = OpTypeStruct %size_2_array_type
               %size_2_pointer_type = OpTypePointer Uniform %size_2_struct_type
               %directions_12_variable = OpVariable %size_2_pointer_type Uniform

               %size_3_array_type = OpTypeArray %4 %constant_3
               %size_3_struct_type = OpTypeStruct %size_3_array_type
               %size_3_pointer_type = OpTypePointer Uniform %size_3_struct_type
               %directions_9_variable = OpVariable %size_3_pointer_type Uniform

               %output_array_type = OpTypeArray %4 %constant_11
               %output_struct_type = OpTypeStruct %output_array_type
               %output_pointer_type = OpTypePointer Uniform %output_struct_type
               %output_variable = OpVariable %output_pointer_type Uniform

               ; Pointer type for declaring local variables of int type
               %local_int_ptr = OpTypePointer Function %4

               ; Pointer type for integer data in a storage buffer
               %storage_buffer_int_ptr = OpTypePointer Uniform %4


          %7 = OpFunction %1 None %2

          %8 = OpLabel ; validCFG/Block$4
               %output_index = OpVariable %local_int_ptr Function %constant_0
               %directions_9_index = OpVariable %local_int_ptr Function %constant_0
               %directions_12_index = OpVariable %local_int_ptr Function %constant_0


   %temp_8_0 = OpLoad %4 %output_index
   %temp_8_1 = OpAccessChain %storage_buffer_int_ptr %output_variable %constant_0 %temp_8_0
               OpStore %temp_8_1 %constant_8
   %temp_8_2 = OpIAdd %4 %temp_8_0 %constant_1
               OpStore %output_index %temp_8_2
               OpBranch %9


          %9 = OpLabel ; validCFG/LoopHeader$1
   %temp_9_0 = OpLoad %4 %output_index
   %temp_9_1 = OpAccessChain %storage_buffer_int_ptr %output_variable %constant_0 %temp_9_0
               OpStore %temp_9_1 %constant_9
   %temp_9_2 = OpIAdd %4 %temp_9_0 %constant_1
               OpStore %output_index %temp_9_2
   %temp_9_3 = OpLoad %4 %directions_9_index
   %temp_9_4 = OpAccessChain %storage_buffer_int_ptr %directions_9_variable %constant_0 %temp_9_3
   %temp_9_5 = OpLoad %4 %temp_9_4
   %temp_9_6 = OpIEqual %3 %temp_9_5 %constant_1
   %temp_9_7 = OpIAdd %4 %temp_9_3 %constant_1
               OpStore %directions_9_index %temp_9_7
               OpLoopMerge %10 %11 None
               OpBranchConditional %temp_9_6 %12 %10


         %12 = OpLabel ; validCFG/LoopHeader$0
  %temp_12_0 = OpLoad %4 %output_index
  %temp_12_1 = OpAccessChain %storage_buffer_int_ptr %output_variable %constant_0 %temp_12_0
               OpStore %temp_12_1 %constant_12
  %temp_12_2 = OpIAdd %4 %temp_12_0 %constant_1
               OpStore %output_index %temp_12_2
  %temp_12_3 = OpLoad %4 %directions_12_index
  %temp_12_4 = OpAccessChain %storage_buffer_int_ptr %directions_12_variable %constant_0 %temp_12_3
  %temp_12_5 = OpLoad %4 %temp_12_4
  %temp_12_6 = OpIEqual %3 %temp_12_5 %constant_1
  %temp_12_7 = OpIAdd %4 %temp_12_3 %constant_1
               OpStore %directions_12_index %temp_12_7
               OpLoopMerge %13 %14 None
               OpBranchConditional %temp_12_6 %13 %13


         %10 = OpLabel ; validCFG/Block$0
  %temp_10_0 = OpLoad %4 %output_index
  %temp_10_1 = OpAccessChain %storage_buffer_int_ptr %output_variable %constant_0 %temp_10_0
               OpStore %temp_10_1 %constant_10
  %temp_10_2 = OpIAdd %4 %temp_10_0 %constant_1
               OpStore %output_index %temp_10_2
               OpReturn


         %13 = OpLabel ; validCFG/Block$2
  %temp_13_0 = OpLoad %4 %output_index
  %temp_13_1 = OpAccessChain %storage_buffer_int_ptr %output_variable %constant_0 %temp_13_0
               OpStore %temp_13_1 %constant_13
  %temp_13_2 = OpIAdd %4 %temp_13_0 %constant_1
               OpStore %output_index %temp_13_2
               OpBranch %11


         %14 = OpLabel ; validCFG/Block$3
               OpBranch %12


         %11 = OpLabel ; validCFG/Block$1
  %temp_11_0 = OpLoad %4 %output_index
  %temp_11_1 = OpAccessChain %storage_buffer_int_ptr %output_variable %constant_0 %temp_11_0
               OpStore %temp_11_1 %constant_11
  %temp_11_2 = OpIAdd %4 %temp_11_0 %constant_1
               OpStore %output_index %temp_11_2
               OpBranch %9

               OpFunctionEnd

 END

 BUFFER directions_9 DATA_TYPE uint32 STD430 DATA 1 1 0 END
 BUFFER directions_12 DATA_TYPE uint32 STD430 DATA 1 1 END

 BUFFER output DATA_TYPE uint32 STD430 SIZE 11 FILL 0

 PIPELINE compute pipeline
   ATTACH compute_shader

   BIND BUFFER directions_9 AS storage DESCRIPTOR_SET 0 BINDING 0
   BIND BUFFER directions_12 AS storage DESCRIPTOR_SET 0 BINDING 1

   BIND BUFFER output AS storage DESCRIPTOR_SET 0 BINDING 2
 END

 RUN pipeline 1 1 1

 EXPECT directions_9 IDX 0 EQ 1 1 0
 EXPECT directions_12 IDX 0 EQ 1 1
 EXPECT output IDX 0 EQ 8 9 12 13 11 9 12 13 11 9 10

This was run on my laptop which is using the Intel open-source Mesa driver version 88088582 (API version 1.2.182).

Has anyone encountered this error before?

@Jack-Clark Jack-Clark self-assigned this Mar 28, 2022
@afd
Copy link
Member

afd commented Mar 28, 2022

I tried the Amber file on my laptop, which also has Intel Mesa drivers (not sure of the driver version), and I get the same. A VkWaitForFences means that the test case has timed out.

I tried this out using the NVIDIA drivers on fennel (I built Amber locally - you should be able to do the same, e.g. make a jack directory under /data) and it passes instantly there.

This could already be a compiler bug.

Can you look into simplifying the SPIR-V assembly to get something with the same control flow but even simpler instructions, to see whether the problem still triggers? E.g., the data on when to go left/right could all be embedded in the SPIR-V, so no need for corresponding Amber declarations.

@Jack-Clark
Copy link
Collaborator Author

I'll have a look at simplifying this tomorrow. I tried it with the SwiftShader driver enabled on my desktop and it also passes instantly there.

@Jack-Clark
Copy link
Collaborator Author

Jack-Clark commented Mar 30, 2022

I have a few strange findings.

  1. Changing OpBranchConditional %temp_12_6 %13 %13 to OpBranchConditional %temp_12_6 %13 %14 in LoopHeader$0 gets the test to pass successfully. Interestingly the docs for OpBranchConditional disallow using the same id from version 1.6 onwards. Here is the amber code:
#!amber

SHADER compute compute_shader SPIRV-ASM

; Follow the path:
; 8 -> <9> -> <12> -> 13 -> 11 -> <9> -> 10
;
; 2 CFG nodes have OpBranchConditional or OpSwitch as their terminators (denoted <n>): 9 and 12.
;
; To follow this path, we need to make these decisions each time we reach 9 or 12.
; This path was generated with the seed 5228420583341109684 and has length 7.
;
; We equip the shader with 2+1 storage buffers:
; - An input storage buffer with the directions for each node 9 or 12
; - An output storage buffer that records the blocks that are executed

; SPIR-V
; Version: 1.3
; Generator: Khronos Glslang Reference Front End; 8
; Bound: 15
; Schema: 0

               OpCapability Shader
               OpMemoryModel Logical GLSL450
               OpEntryPoint GLCompute %7 "main"
               OpExecutionMode %7 LocalSize 1 1 1
               
               ; Below, we declare various types and variables for storage buffers.
               ; These decorations tell SPIR-V that the types and variables relate to storage buffers

               OpDecorate %output_struct_type BufferBlock
               OpMemberDecorate %output_struct_type 0 Offset 0
               OpDecorate %output_array_type ArrayStride 4

               OpDecorate %output_variable DescriptorSet 0
               OpDecorate %output_variable Binding 0


          %1 = OpTypeVoid
          %2 = OpTypeFunction %1
          %3 = OpTypeBool
          %4 = OpTypeInt 32 0
          %5 = OpConstantTrue %3
          %6 = OpConstant %4 0
          

               %constant_0 = OpConstant %4 0
               %constant_1 = OpConstant %4 1
               %constant_2 = OpConstant %4 2
               %constant_7 = OpConstant %4 7
               %constant_8 = OpConstant %4 8
               %constant_9 = OpConstant %4 9
               %constant_10 = OpConstant %4 10
               %constant_11 = OpConstant %4 11
               %constant_12 = OpConstant %4 12
               %constant_13 = OpConstant %4 13
               %constant_14 = OpConstant %4 14


               ; Declaration of storage buffers for the 2 directions and the output            
               %output_array_type = OpTypeArray %4 %constant_7
               %output_struct_type = OpTypeStruct %output_array_type
               %output_pointer_type = OpTypePointer Uniform %output_struct_type
               %output_variable = OpVariable %output_pointer_type Uniform

               ; Pointer type for declaring local variables of int type
               %local_int_ptr = OpTypePointer Function %4

               ; Pointer type for integer data in a storage buffer
               %storage_buffer_int_ptr = OpTypePointer Uniform %4
              
          %7 = OpFunction %1 None %2

          %8 = OpLabel ; validCFG/Block$4
               %output_index = OpVariable %local_int_ptr Function %constant_0
               %loop_exit_cond = OpVariable %local_int_ptr Function %constant_0


   %temp_8_0 = OpLoad %4 %output_index
   %temp_8_1 = OpAccessChain %storage_buffer_int_ptr %output_variable %constant_0 %temp_8_0
               OpStore %temp_8_1 %constant_8
   %temp_8_2 = OpIAdd %4 %temp_8_0 %constant_1
               OpStore %output_index %temp_8_2
               OpBranch %9


          %9 = OpLabel ; validCFG/LoopHeader$1
   
   ; Handle Output
   %temp_9_0 = OpLoad %4 %output_index
   %temp_9_1 = OpAccessChain %storage_buffer_int_ptr %output_variable %constant_0 %temp_9_0
               OpStore %temp_9_1 %constant_9
   %temp_9_2 = OpIAdd %4 %temp_9_0 %constant_1
               OpStore %output_index %temp_9_2
   
   ; Conditional jump to validCFG/LoopHeader$0
       %cond_val = OpLoad %4 %loop_exit_cond
   %new_cond_val = OpIAdd %4 %cond_val %constant_1
                   OpStore %loop_exit_cond %new_cond_val
       %temp_9_6 = OpIEqual %3 %cond_val %constant_0
                   OpLoopMerge %10 %11 None
                   OpBranchConditional %temp_9_6 %12 %10


         %12 = OpLabel ; validCFG/LoopHeader$0
  ; Handle Output
  %temp_12_0 = OpLoad %4 %output_index
  %temp_12_1 = OpAccessChain %storage_buffer_int_ptr %output_variable %constant_0 %temp_12_0
               OpStore %temp_12_1 %constant_12
  %temp_12_2 = OpIAdd %4 %temp_12_0 %constant_1
               OpStore %output_index %temp_12_2
  
     ; Conditional jump to validCFG/Block$2
  %temp_12_6 = OpIEqual %3 %constant_1 %constant_1
               OpLoopMerge %13 %14 None
               OpBranchConditional %temp_12_6 %13 %14
               

         %13 = OpLabel ; validCFG/Block$2
  ; Handle Output
  %temp_13_0 = OpLoad %4 %output_index
  %temp_13_1 = OpAccessChain %storage_buffer_int_ptr %output_variable %constant_0 %temp_13_0
               OpStore %temp_13_1 %constant_13
  %temp_13_2 = OpIAdd %4 %temp_13_0 %constant_1
               OpStore %output_index %temp_13_2
               OpBranch %11


         %11 = OpLabel ; validCFG/Block$1
  %temp_11_0 = OpLoad %4 %output_index
  %temp_11_1 = OpAccessChain %storage_buffer_int_ptr %output_variable %constant_0 %temp_11_0
               OpStore %temp_11_1 %constant_11
  %temp_11_2 = OpIAdd %4 %temp_11_0 %constant_1
               OpStore %output_index %temp_11_2
               OpBranch %9
               

         %10 = OpLabel ; validCFG/Block$0
  %temp_10_0 = OpLoad %4 %output_index
  %temp_10_1 = OpAccessChain %storage_buffer_int_ptr %output_variable %constant_0 %temp_10_0
               OpStore %temp_10_1 %constant_10
  %temp_10_2 = OpIAdd %4 %temp_10_0 %constant_1
               OpStore %output_index %temp_10_2
               OpReturn


         %14 = OpLabel ; validCFG/Block$3
               OpBranch %12

               OpFunctionEnd

 END

 BUFFER output DATA_TYPE uint32 STD430 SIZE 7 FILL 0

 PIPELINE compute pipeline
   ATTACH compute_shader
   
   BIND BUFFER output AS storage DESCRIPTOR_SET 0 BINDING 0
 
 END

 RUN pipeline 1 1 1

 EXPECT output IDX 0 EQ 8 9 12 13 11 9 10
  1. I can also get the test to finish executing by removing all output related code, although obviously we can't tell if it passed in the usual sense. What is very strange is that I can get the test to execute and fail the output checks by removing the output related code in the LoopHeader$0 and LoopHeader$1 blocks only. The failure condition is Verifier failed: 10 == 9, at index 1. 10 is the id of Block$0 and 9 is the id of LoopHeader$1. This suggests that removing the output code in the two loop header blocks causes a different path to be taken i.e. the path Block$4->LoopHeader$1->Block$0 is taken rather than the expected path Block$4->LoopHeader$1->LoopHeader$0->.... If I change the expected output to EXPECT output IDX 0 EQ 8 10, the test passes. Here is the amber code with the output code stripped from LoopHeader$1 and LoopHeader$0:
#!amber

SHADER compute compute_shader SPIRV-ASM

; Follow the path:
; 8 -> <9> -> <12> -> 13 -> 11 -> <9> -> 10
;
; 2 CFG nodes have OpBranchConditional or OpSwitch as their terminators (denoted <n>): 9 and 12.
;
; To follow this path, we need to make these decisions each time we reach 9 or 12.
; This path was generated with the seed 5228420583341109684 and has length 7.
;
; We equip the shader with 2+1 storage buffers:
; - An input storage buffer with the directions for each node 9 or 12
; - An output storage buffer that records the blocks that are executed

; SPIR-V
; Version: 1.3
; Generator: Khronos Glslang Reference Front End; 8
; Bound: 15
; Schema: 0

               OpCapability Shader
               OpMemoryModel Logical GLSL450
               OpEntryPoint GLCompute %7 "main"
               OpExecutionMode %7 LocalSize 1 1 1
               
               ; Below, we declare various types and variables for storage buffers.
               ; These decorations tell SPIR-V that the types and variables relate to storage buffers

               OpDecorate %output_struct_type BufferBlock
               OpMemberDecorate %output_struct_type 0 Offset 0
               OpDecorate %output_array_type ArrayStride 4

               OpDecorate %output_variable DescriptorSet 0
               OpDecorate %output_variable Binding 0


          %1 = OpTypeVoid
          %2 = OpTypeFunction %1
          %3 = OpTypeBool
          %4 = OpTypeInt 32 0
          %5 = OpConstantTrue %3
          %6 = OpConstant %4 0
          

               %constant_0 = OpConstant %4 0
               %constant_1 = OpConstant %4 1
               %constant_2 = OpConstant %4 2
               %constant_7 = OpConstant %4 7
               %constant_8 = OpConstant %4 8
               %constant_9 = OpConstant %4 9
               %constant_10 = OpConstant %4 10
               %constant_11 = OpConstant %4 11
               %constant_12 = OpConstant %4 12
               %constant_13 = OpConstant %4 13
               %constant_14 = OpConstant %4 14


               ; Declaration of storage buffers for the 2 directions and the output            
               %output_array_type = OpTypeArray %4 %constant_7
               %output_struct_type = OpTypeStruct %output_array_type
               %output_pointer_type = OpTypePointer Uniform %output_struct_type
               %output_variable = OpVariable %output_pointer_type Uniform

               ; Pointer type for declaring local variables of int type
               %local_int_ptr = OpTypePointer Function %4

               ; Pointer type for integer data in a storage buffer
               %storage_buffer_int_ptr = OpTypePointer Uniform %4
              
          %7 = OpFunction %1 None %2

          %8 = OpLabel ; validCFG/Block$4
               %output_index = OpVariable %local_int_ptr Function %constant_0
               %loop_exit_cond = OpVariable %local_int_ptr Function %constant_0

   %temp_8_0 = OpLoad %4 %output_index
   %temp_8_1 = OpAccessChain %storage_buffer_int_ptr %output_variable %constant_0 %temp_8_0
               OpStore %temp_8_1 %constant_8
   %temp_8_2 = OpIAdd %4 %temp_8_0 %constant_1
               OpStore %output_index %temp_8_2
               OpBranch %9


          %9 = OpLabel ; validCFG/LoopHeader$1    
   
   ; Conditional jump to validCFG/LoopHeader$0
       %cond_val = OpLoad %4 %loop_exit_cond
   %new_cond_val = OpIAdd %4 %cond_val %constant_1
                   OpStore %loop_exit_cond %new_cond_val
       %temp_9_6 = OpIEqual %3 %cond_val %constant_0
                   OpLoopMerge %10 %11 None
                   OpBranchConditional %temp_9_6 %12 %10


         %12 = OpLabel ; validCFG/LoopHeader$0

     ; Conditional jump to validCFG/Block$2
  %temp_12_6 = OpIEqual %3 %constant_1 %constant_1
               OpLoopMerge %13 %14 None
               OpBranchConditional %temp_12_6 %13 %13
               

         %13 = OpLabel ; validCFG/Block$2
  ; Handle Output
  %temp_13_0 = OpLoad %4 %output_index
  %temp_13_1 = OpAccessChain %storage_buffer_int_ptr %output_variable %constant_0 %temp_13_0
               OpStore %temp_13_1 %constant_13
  %temp_13_2 = OpIAdd %4 %temp_13_0 %constant_1
               OpStore %output_index %temp_13_2
               OpBranch %11


         %11 = OpLabel ; validCFG/Block$1
  %temp_11_0 = OpLoad %4 %output_index
  %temp_11_1 = OpAccessChain %storage_buffer_int_ptr %output_variable %constant_0 %temp_11_0
               OpStore %temp_11_1 %constant_11
  %temp_11_2 = OpIAdd %4 %temp_11_0 %constant_1
               OpStore %output_index %temp_11_2
               OpBranch %9
               

         %10 = OpLabel ; validCFG/Block$0
  %temp_10_0 = OpLoad %4 %output_index
  %temp_10_1 = OpAccessChain %storage_buffer_int_ptr %output_variable %constant_0 %temp_10_0
               OpStore %temp_10_1 %constant_10
  %temp_10_2 = OpIAdd %4 %temp_10_0 %constant_1
               OpStore %output_index %temp_10_2
               OpReturn


         %14 = OpLabel ; validCFG/Block$3
               OpBranch %12

               OpFunctionEnd

 END

 BUFFER output DATA_TYPE uint32 STD430 SIZE 7 FILL 0

 PIPELINE compute pipeline
   ATTACH compute_shader
   
   BIND BUFFER output AS storage DESCRIPTOR_SET 0 BINDING 0
 
 END

 RUN pipeline 1 1 1

 EXPECT output IDX 0 EQ 8 10

Note that I've reduced both of the above examples by removing input related code and hard-coding the path. The reduced program that still fails is below:

#!amber

SHADER compute compute_shader SPIRV-ASM

; Follow the path:
; 8 -> <9> -> <12> -> 13 -> 11 -> <9> -> 10
;
; 2 CFG nodes have OpBranchConditional or OpSwitch as their terminators (denoted <n>): 9 and 12.
;
; To follow this path, we need to make these decisions each time we reach 9 or 12.
; This path was generated with the seed 5228420583341109684 and has length 7.
;
; We equip the shader with 2+1 storage buffers:
; - An input storage buffer with the directions for each node 9 or 12
; - An output storage buffer that records the blocks that are executed

; SPIR-V
; Version: 1.3
; Generator: Khronos Glslang Reference Front End; 8
; Bound: 15
; Schema: 0

               OpCapability Shader
               OpMemoryModel Logical GLSL450
               OpEntryPoint GLCompute %7 "main"
               OpExecutionMode %7 LocalSize 1 1 1
               
               ; Below, we declare various types and variables for storage buffers.
               ; These decorations tell SPIR-V that the types and variables relate to storage buffers

               OpDecorate %output_struct_type BufferBlock
               OpMemberDecorate %output_struct_type 0 Offset 0
               OpDecorate %output_array_type ArrayStride 4

               OpDecorate %output_variable DescriptorSet 0
               OpDecorate %output_variable Binding 0


          %1 = OpTypeVoid
          %2 = OpTypeFunction %1
          %3 = OpTypeBool
          %4 = OpTypeInt 32 0
          %5 = OpConstantTrue %3
          %6 = OpConstant %4 0
          

               %constant_0 = OpConstant %4 0
               %constant_1 = OpConstant %4 1
               %constant_2 = OpConstant %4 2
               %constant_7 = OpConstant %4 7
               %constant_8 = OpConstant %4 8
               %constant_9 = OpConstant %4 9
               %constant_10 = OpConstant %4 10
               %constant_11 = OpConstant %4 11
               %constant_12 = OpConstant %4 12
               %constant_13 = OpConstant %4 13
               %constant_14 = OpConstant %4 14


               ; Declaration of storage buffers for the 2 directions and the output            
               %output_array_type = OpTypeArray %4 %constant_7
               %output_struct_type = OpTypeStruct %output_array_type
               %output_pointer_type = OpTypePointer Uniform %output_struct_type
               %output_variable = OpVariable %output_pointer_type Uniform

               ; Pointer type for declaring local variables of int type
               %local_int_ptr = OpTypePointer Function %4

               ; Pointer type for integer data in a storage buffer
               %storage_buffer_int_ptr = OpTypePointer Uniform %4
              
          %7 = OpFunction %1 None %2

          %8 = OpLabel ; validCFG/Block$4
               %output_index = OpVariable %local_int_ptr Function %constant_0
               %loop_exit_cond = OpVariable %local_int_ptr Function %constant_0


   %temp_8_0 = OpLoad %4 %output_index
   %temp_8_1 = OpAccessChain %storage_buffer_int_ptr %output_variable %constant_0 %temp_8_0
               OpStore %temp_8_1 %constant_8
   %temp_8_2 = OpIAdd %4 %temp_8_0 %constant_1
               OpStore %output_index %temp_8_2
               OpBranch %9


          %9 = OpLabel ; validCFG/LoopHeader$1
   
   ; Handle Output
   %temp_9_0 = OpLoad %4 %output_index
   %temp_9_1 = OpAccessChain %storage_buffer_int_ptr %output_variable %constant_0 %temp_9_0
               OpStore %temp_9_1 %constant_9
   %temp_9_2 = OpIAdd %4 %temp_9_0 %constant_1
               OpStore %output_index %temp_9_2
   
   ; Conditional jump to validCFG/LoopHeader$0
       %cond_val = OpLoad %4 %loop_exit_cond
   %new_cond_val = OpIAdd %4 %cond_val %constant_1
                   OpStore %loop_exit_cond %new_cond_val
       %temp_9_6 = OpIEqual %3 %cond_val %constant_0
                   OpLoopMerge %10 %11 None
                   OpBranchConditional %temp_9_6 %12 %10


         %12 = OpLabel ; validCFG/LoopHeader$0
  %temp_12_0 = OpLoad %4 %output_index
  %temp_12_1 = OpAccessChain %storage_buffer_int_ptr %output_variable %constant_0 %temp_12_0
               OpStore %temp_12_1 %constant_12
  %temp_12_2 = OpIAdd %4 %temp_12_0 %constant_1
               OpStore %output_index %temp_12_2
  
     ; Conditional jump to validCFG/Block$2
  %temp_12_6 = OpIEqual %3 %constant_1 %constant_1
               OpLoopMerge %13 %14 None
               OpBranchConditional %temp_12_6 %13 %13
               

         %13 = OpLabel ; validCFG/Block$2
  ; Handle Output
  %temp_13_0 = OpLoad %4 %output_index
  %temp_13_1 = OpAccessChain %storage_buffer_int_ptr %output_variable %constant_0 %temp_13_0
               OpStore %temp_13_1 %constant_13
  %temp_13_2 = OpIAdd %4 %temp_13_0 %constant_1
               OpStore %output_index %temp_13_2
               OpBranch %11


         %11 = OpLabel ; validCFG/Block$1
  %temp_11_0 = OpLoad %4 %output_index
  %temp_11_1 = OpAccessChain %storage_buffer_int_ptr %output_variable %constant_0 %temp_11_0
               OpStore %temp_11_1 %constant_11
  %temp_11_2 = OpIAdd %4 %temp_11_0 %constant_1
               OpStore %output_index %temp_11_2
               OpBranch %9
               

         %10 = OpLabel ; validCFG/Block$0
  %temp_10_0 = OpLoad %4 %output_index
  %temp_10_1 = OpAccessChain %storage_buffer_int_ptr %output_variable %constant_0 %temp_10_0
               OpStore %temp_10_1 %constant_10
  %temp_10_2 = OpIAdd %4 %temp_10_0 %constant_1
               OpStore %output_index %temp_10_2
               OpReturn


         %14 = OpLabel ; validCFG/Block$3
               OpBranch %12

               OpFunctionEnd

 END

 BUFFER output DATA_TYPE uint32 STD430 SIZE 7 FILL 0

 PIPELINE compute pipeline
   ATTACH compute_shader
   
   BIND BUFFER output AS storage DESCRIPTOR_SET 0 BINDING 0
 
 END

 RUN pipeline 1 1 1

 EXPECT output IDX 0 EQ 8 9 12 13 11 9 10

@afd
Copy link
Member

afd commented Mar 30, 2022

This is looking more and more like a Mesa bug (albeit perhaps a slightly boring one since it depends on the feature of a block having duplicate successors which, as you say, will be disallowed in future SPIR-V versions).

Would you be able to spend say 30 mins preparing as simple a repro for this bug as you can, and then maybe we can go through it carefully in our meeting with @johnwickerson today? Then if all looks good you can report it to Mesa (assuming it still repros on the latest release driver, if that's not what you're testing.)

In the repro, I think removing output code that isn't relevant to the expected output is a good thing if it still causes the test to fail.

Compiler bugs always do have really weird effects, and changing seemingly unrelated parts of the test case often has an impact (because something bad has gone wrong).

Aside: if "duplicate successors" is the tell tale sign for this bug then perhaps you could write some Mesa-testing infrastructure to skip test cases that have this property. More generally, perhaps your experimental scripts should support an "ignore" script, which will run on each asm file before it is fleshed and discard the asm file if it fails some target-specific checks. This could be a nice way to get around what @johnwickerson has dubbed "The Cludedo Problem".

@Jack-Clark
Copy link
Collaborator Author

I've been able to reproduce the issue on the latest release and on the main branch of Mesa.

As you suggested I created a modified version with a block B5 that is the target of the false case in the OpBranchConditional in LH0. This causes the test to pass. I can also make B3 the false case target and it will pass.

Here is the minimal amber file that reproduces the issue. There must be a side effect in LH0 in order for that path to be taken, so the output variable is still necessary. Is this in a suitable form to submit as a bug report?

#!amber

SHADER compute compute_shader SPIRV-ASM

; SPIR-V
; Version: 1.3
; Generator: Khronos Glslang Reference Front End; 8
; Bound: 16
; Schema: 0

                             OpCapability Shader
                             OpMemoryModel Logical GLSL450
                             OpEntryPoint GLCompute %7 "main"
                             OpExecutionMode %7 LocalSize 1 1 1
                        
                             OpDecorate %output_struct_type BufferBlock
                             OpMemberDecorate %output_struct_type 0 Offset 0
                             OpDecorate %output_array_type ArrayStride 4
                             OpDecorate %output_variable DescriptorSet 0
                             OpDecorate %output_variable Binding 0

                        %1 = OpTypeVoid
                        %2 = OpTypeFunction %1
                        %3 = OpTypeBool
                        %4 = OpTypeInt 32 0

               %constant_0 = OpConstant %4 0
               %constant_1 = OpConstant %4 1

        %output_array_type = OpTypeArray %4 %constant_1
       %output_struct_type = OpTypeStruct %output_array_type
      %output_pointer_type = OpTypePointer Uniform %output_struct_type
          %output_variable = OpVariable %output_pointer_type Uniform

            %local_int_ptr = OpTypePointer Function %4
               %sb_int_ptr = OpTypePointer Uniform %4

                        %7 = OpFunction %1 None %2

                        %8 = OpLabel ; validCFG/Block$4
           %loop_exit_cond = OpVariable %local_int_ptr Function %constant_0
                             OpBranch %9

                        %9 = OpLabel ; validCFG/LoopHeader$1                  
                 %cond_val = OpLoad %4 %loop_exit_cond
             %new_cond_val = OpIAdd %4 %cond_val %constant_1
                             OpStore %loop_exit_cond %new_cond_val
                 %temp_9_6 = OpIEqual %3 %cond_val %constant_0
                             OpLoopMerge %10 %11 None
                             OpBranchConditional %temp_9_6 %12 %10

                       %12 = OpLabel ; validCFG/LoopHeader$0
                %temp_12_1 = OpAccessChain %sb_int_ptr %output_variable %constant_0 %constant_0
                             OpStore %temp_12_1 %constant_1 ; This is a necessary side effect to force this path to be taken
                %temp_12_6 = OpIEqual %3 %constant_1 %constant_1
                             OpLoopMerge %13 %14 None
                             OpBranchConditional %temp_12_6 %13 %13
               
                       %13 = OpLabel ; validCFG/Block$2
                             OpBranch %11

                       %11 = OpLabel ; validCFG/Block$1
                             OpBranch %9                              

                       %10 = OpLabel ; validCFG/Block$0
                             OpReturn

                       %14 = OpLabel ; validCFG/Block$3
                             OpBranch %12

                             OpFunctionEnd

 END

 BUFFER output DATA_TYPE uint32 STD430 SIZE 1 FILL 0

 PIPELINE compute pipeline
   ATTACH compute_shader
   
   BIND BUFFER output AS storage DESCRIPTOR_SET 0 BINDING 0
 
 END

 RUN pipeline 1 1 1

The output of running the above is:

$ amber -v 1.1 error.amber 
[ERROR] validation layer (Validation):
Validation Error: [ VUID-vkDestroyFence-fence-01120 ] Object 0: handle = 0xf56c9b0000000004, type = VK_OBJECT_TYPE_FENCE; | MessageID = 0x5d296248 | VkFence 0xf56c9b0000000004[] is in use. The Vulkan spec states: All queue submission commands that refer to fence must have completed execution (https://vulkan.lunarg.com/doc/view/1.3.204.1/linux/1.3-extensions/vkspec.html#VUID-vkDestroyFence-fence-01120)
[ERROR] validation layer (Validation):
Validation Error: [ VUID-vkFreeCommandBuffers-pCommandBuffers-00047 ] Object 0: handle = 0x55f7661dacd0, type = VK_OBJECT_TYPE_COMMAND_BUFFER; | MessageID = 0x1ab902fc | Attempt to free VkCommandBuffer 0x55f7661dacd0[] which is in use. The Vulkan spec states: All elements of pCommandBuffers must not be in the pending state (https://vulkan.lunarg.com/doc/view/1.3.204.1/linux/1.3-extensions/vkspec.html#VUID-vkFreeCommandBuffers-pCommandBuffers-00047)
error.amber: Vulkan::Calling vkWaitForFences Timeout

Summary of Failures:
  error.amber

Summary: 0 pass, 1 fail
[ERROR] validation layer (Validation):
Validation Error: [ VUID-vkDestroyDevice-device-00378 ] Object 0: handle = 0x55f7660fd3d0, type = VK_OBJECT_TYPE_DEVICE; Object 1: handle = 0xcfef35000000000a, type = VK_OBJECT_TYPE_PIPELINE_LAYOUT; | MessageID = 0x71500fba | OBJ ERROR : For VkDevice 0x55f7660fd3d0[], VkPipelineLayout 0xcfef35000000000a[] has not been destroyed. The Vulkan spec states: All child objects created on device must have been destroyed prior to destroying device (https://vulkan.lunarg.com/doc/view/1.3.204.1/linux/1.3-extensions/vkspec.html#VUID-vkDestroyDevice-device-00378)
[ERROR] validation layer (Validation):
Validation Error: [ VUID-vkDestroyDevice-device-00378 ] Object 0: handle = 0x55f7660fd3d0, type = VK_OBJECT_TYPE_DEVICE; Object 1: handle = 0xec4bec000000000b, type = VK_OBJECT_TYPE_PIPELINE; | MessageID = 0x71500fba | OBJ ERROR : For VkDevice 0x55f7660fd3d0[], VkPipeline 0xec4bec000000000b[] has not been destroyed. The Vulkan spec states: All child objects created on device must have been destroyed prior to destroying device (https://vulkan.lunarg.com/doc/view/1.3.204.1/linux/1.3-extensions/vkspec.html#VUID-vkDestroyDevice-device-00378)

@afd
Copy link
Member

afd commented Apr 1, 2022

Great! Please go ahead and file to Mesa. If you look in the doc about building and fuzzing Mesa, it links to some bug reports that we filed previously. Please follow the structure of one of those for decent practice on what info to provide in the report.

@Jack-Clark
Copy link
Collaborator Author

Issue filed

@afd afd added modelling and removed modelling labels Oct 19, 2022
@afd afd transferred this issue from another repository Oct 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants