PixelShaderGen: Use subgroup reduction for bounding box #7904
#ifdef SUPPORTS_WARP_REDUCTION
WARP_MIN(minpos);
WARP_MAX(maxpos);
if (IS_FIRST_ACTIVE_WARP)
It's possible that this new condition will prevent the loads for the bbox_* values from being scheduled near the beginning of the shader, so it may actually end up being slower. Maybe help the compiler a bit by moving them there manually.
@Degerz I don't really have any desire to implement it in D3D11 with vendor-specific stuff, since it's kinda messy. SM6 could be an option with the D3D12 backend, but we have to merge that first. The only concern I would have is bloating the download size with DXCompiler, as it doesn't seem to be available anywhere in the system (only in the SDK AFAICT).
Works now and doesn't crash immediately on a new game anymore. Is ~20% faster in OpenGL and 2% faster in Vulkan.
I checked the latest Metal Shading Language spec and found out that it supports SIMD-group functions, which look pretty similar to Vulkan's subgroup operations. Is there any chance that SPIRV-Cross can handle this for Metal?
@Degerz I don't see why SPIRV-Cross couldn't implement the extension, assuming the semantics are the same. There may be extra steps required to ensure the same behavior (e.g. helper or discarded threads/invocations).
As KHR_shader_subgroup seems to be usable on AMD now, shall we switch the OGL implementation to use KHR_shader_subgroup instead of NV_shader_thread_group? If I interpret the numbers here correctly, https://opengl.gpuinfo.org/listreports.php?extension=GL_NV_shader_thread_group vs https://opengl.gpuinfo.org/listreports.php?extension=GL_KHR_shader_subgroup , AMD very recently gained support for the newer extension and Intel has had it for a while already. Neither supports the NV extension, though. However, I'm unsure how many Nvidia users we might lose by switching to KHR_shader_subgroup on OGL. What is your opinion? If you want to try it: #11523
Currently, we perform 4 atomic operations for every fragment being shaded when bounding box is enabled (assuming they aren't elided by the branch).
This branch reduces the number of atomic operations by up to a factor of 32 (or the GPU's warp/wave size), by doing a warp-wise min/max reduction, and only performing the atomic operations on the first active thread.
Unfortunately for GL, this is NVIDIA-only, since as far as I can tell there's no vendor-neutral extension for doing shuffles or subgroup reduction operations. Vulkan support is dependent on the vendor implementing Vulkan 1.1 and supporting GroupNonUniformArithmetic?
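To make the saving in atomic operations concrete, here is a rough CPU-side model in C++. It is not the shader code from this PR. The names `Fragment`, `Result`, `PerFragment` and `WarpReduced` are made up for illustration, and it assumes fragments are packed into consecutive warps with every lane active, which a real GPU does not guarantee (helper invocations, partially filled warps).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical types for this sketch only; not taken from the PR.
struct Fragment { int x; int y; };

struct Result {
  int min_x = std::numeric_limits<int>::max();
  int max_x = std::numeric_limits<int>::min();
  int min_y = std::numeric_limits<int>::max();
  int max_y = std::numeric_limits<int>::min();
  std::size_t atomic_ops = 0;  // simulated atomicMin/atomicMax calls
};

// Stand-in for the four atomic min/max updates on the bounding box.
static void AtomicUpdate(Result& r, int lo_x, int hi_x, int lo_y, int hi_y) {
  r.min_x = std::min(r.min_x, lo_x);
  r.max_x = std::max(r.max_x, hi_x);
  r.min_y = std::min(r.min_y, lo_y);
  r.max_y = std::max(r.max_y, hi_y);
  r.atomic_ops += 4;
}

// Baseline: every fragment issues its own four atomics.
Result PerFragment(const std::vector<Fragment>& frags) {
  Result r;
  for (const Fragment& f : frags)
    AtomicUpdate(r, f.x, f.x, f.y, f.y);
  return r;
}

// Warp-wise reduction: min/max across each warp first, then a single
// lane (the "first active thread") issues the four atomics.
Result WarpReduced(const std::vector<Fragment>& frags, std::size_t warp_size) {
  assert(warp_size > 0);
  Result r;
  for (std::size_t base = 0; base < frags.size(); base += warp_size) {
    const std::size_t end = std::min(base + warp_size, frags.size());
    int lo_x = frags[base].x, hi_x = frags[base].x;
    int lo_y = frags[base].y, hi_y = frags[base].y;
    for (std::size_t i = base + 1; i < end; ++i) {
      lo_x = std::min(lo_x, frags[i].x);
      hi_x = std::max(hi_x, frags[i].x);
      lo_y = std::min(lo_y, frags[i].y);
      hi_y = std::max(hi_y, frags[i].y);
    }
    AtomicUpdate(r, lo_x, hi_x, lo_y, hi_y);
  }
  return r;
}
```

Under these assumptions, 64 fragments with a warp size of 32 need 256 atomics in the baseline and 8 with the reduction, and both produce the same bounding box. Real-world gains depend on how many lanes are actually active per warp, which is why the description says "up to" a factor of 32.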