-
Notifications
You must be signed in to change notification settings - Fork 304
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a bunch of limits. #1863
Add a bunch of limits. #1863
Conversation
- `minUniformBufferOffsetAlignment` at 256. Required from all APIs. - `minStorageBufferOffsetAlignment` at 256. Required for most Vulkan Android devices. - `maxVertexOutputComponents` at 64. Required for early Apple GPUs on Metal. - `maxFragmentInputComponents` at 60. Required for early Apple GPUs on Metal. - `maxColorAttachments` at 4. Required for early Apple GPUs on Metal and many Android Vulkan devices. - `maxComputeWorkgroupStorageSize` at 16352. Required for early Apple GPUs on Metal and many Android Vulkan devices. - `maxComputeWorkgroupInvocations` at 256. Required for many Android Vulkan devices. Chose to not have per-dimension limits as they would be more constraining and can be worked around when translating from WGSL to SPIR-V. - `maxComputePerDimensionDispatchSize` at 65535. Required for many Android Vulkan devices. See gpuweb#1343
(in talking with @kainino0x @Kangz and @kvark, I think this is where we ended up feeling:) Ideally we can require We might want |
I think we should unite the in/out location limits. Quickly glancing over them on Android Vulkan shows 64 being a safe bet. And yes, max color attachments should just be a hard limit in the spec. The |
Maybe we should put 🌶️ or 🌶️🌶️ next to the spicier limits with respect to portability, like on a restaurant menu. :) |
For the record, we also discussed whether we can believe Metal's 60 limitations for fragment shaders or not given use of the position builtin would probably cost 0.
That should be the case for all limits. |
See #1865 about having limits which exist but aren't configurable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still LGTM % #1865 and the open comment
{{GPUShaderModule}} entry-point. | ||
|
||
<tr><td><dfn>maxComputeWorkgroupInvocations</dfn> | ||
<td>{{GPUSize32}} <td>Higher <td>256 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For Metal, the workgroup size limit is per pipeline, not per device. In theory, any particular compute shader may have a lower maximum workgroup size than the device's maximum due to heavy resource usage.
I don't have any direct experience with this for Metal, so I'm just curious as to whether we are confident that we'll always be able to support a workgroup size of 256 for all possible compute shaders?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed this a while ago in #275 but never resolved. In general pipeline compilation can fail for many different reasons (couldn't allocate registers, or too much spilling is another one). 256 is probably safe in most cases, but to be 100% safe developers should use createRenderPipelineAsync
to know if the compilation succeeded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I suppose there is no better option than to expect 256 to be supported on Metal pipelines :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my understanding, Metal is actually strictly less restrictive than other platforms - all of the APIs can fail pipeline creation arbitrarily (e.g. because they couldn't manage to allocate all of the memory needed by the shader at a particular workgroup size). Instead, Metal does not require workgroup_size
at compile time, so they let you create a pipeline, and then tell you what the maximum workgroup_size
is afterward*. In other words, Metal fails softly (reduces the maximum workgroup size) while other APIs fail hard (fail to create the pipeline at all).
(* In practice, I suspect this only helps for compute shaders which do not use workgroup memory, as workgroup memory size is almost always dependent on workgroup size.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree no need to block on #1865, but have more comments
{{GPUShaderModule}} entry-point. | ||
|
||
<tr><td><dfn>maxComputeWorkgroupInvocations</dfn> | ||
<td>{{GPUSize32}} <td>Higher <td>256 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my understanding, Metal is actually strictly less restrictive than other platforms - all of the APIs can fail pipeline creation arbitrarily (e.g. because they couldn't manage to allocate all of the memory needed by the shader at a particular workgroup size). Instead, Metal does not require workgroup_size
at compile time, so they let you create a pipeline, and then tell you what the maximum workgroup_size
is afterward*. In other words, Metal fails softly (reduces the maximum workgroup size) while other APIs fail hard (fail to create the pipeline at all).
(* In practice, I suspect this only helps for compute shaders which do not use workgroup memory, as workgroup memory size is almost always dependent on workgroup size.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Merging as discussed with Kai in the editor's meeting today. |
minUniformBufferOffsetAlignment
at 256. Required from all APIs.minStorageBufferOffsetAlignment
at 256. Required for most VulkanAndroid devices.
maxVertexOutputComponents
at 64. Required for early Apple GPUson Metal.
maxFragmentInputComponents
at 60. Required for early Apple GPUson Metal.
maxColorAttachments
at 4. Required for early Apple GPUs on Metaland many Android Vulkan devices.
maxComputeWorkgroupStorageSize
at 16352. Required for earlyApple GPUs on Metal and many Android Vulkan devices.
maxComputeWorkgroupInvocations
at 256. Required for many AndroidVulkan devices. Chose to not have per-dimension limits as they would
be more constraining and can be worked around when translating from
WGSL to SPIR-V.
maxComputePerDimensionDispatchSize
at 65535. Required for manyAndroid Vulkan devices.
See #1343
Preview | Diff