Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a bunch of limits. #1863

Merged
merged 5 commits into from
Jun 28, 2021
Merged

Add a bunch of limits. #1863

merged 5 commits into from
Jun 28, 2021

Conversation

Kangz
Copy link
Contributor

@Kangz Kangz commented Jun 21, 2021

  • minUniformBufferOffsetAlignment at 256. Required from all APIs.
  • minStorageBufferOffsetAlignment at 256. Required for most Vulkan
    Android devices.
  • maxVertexOutputComponents at 64. Required for early Apple GPUs
    on Metal.
  • maxFragmentInputComponents at 60. Required for early Apple GPUs
    on Metal.
  • maxColorAttachments at 4. Required for early Apple GPUs on Metal
    and many Android Vulkan devices.
  • maxComputeWorkgroupStorageSize at 16352. Required for early
    Apple GPUs on Metal and many Android Vulkan devices.
  • maxComputeWorkgroupInvocations at 256. Required for many Android
    Vulkan devices. Chose to not have per-dimension limits as they would
    be more constraining and can be worked around when translating from
    WGSL to SPIR-V.
  • maxComputePerDimensionDispatchSize at 65535. Required for many
    Android Vulkan devices.

See #1343


Preview | Diff

    - `minUniformBufferOffsetAlignment` at 256. Required from all APIs.
    - `minStorageBufferOffsetAlignment` at 256. Required for most Vulkan
      Android devices.
    - `maxVertexOutputComponents` at 64. Required for early Apple GPUs
      on Metal.
    - `maxFragmentInputComponents` at 60. Required for early Apple GPUs
      on Metal.
    - `maxColorAttachments` at 4. Required for early Apple GPUs on Metal
      and many Android Vulkan devices.
    - `maxComputeWorkgroupStorageSize` at 16352. Required for early
      Apple GPUs on Metal and many Android Vulkan devices.
    - `maxComputeWorkgroupInvocations` at 256. Required for many Android
      Vulkan devices. Chose to not have per-dimension limits as they would
      be more constraining and can be worked around when translating from
      WGSL to SPIR-V.
    - `maxComputePerDimensionDispatchSize` at 65535. Required for many
      Android Vulkan devices.

See gpuweb#1343
@Kangz Kangz requested review from kvark and kainino0x June 21, 2021 13:16
@github-actions
Copy link
Contributor

Previews, as seen when this build job started (b2ee676):
WebGPU | IDL
WGSL
Explainer

@kainino0x
Copy link
Contributor

#1069

@kdashg
Copy link
Contributor

kdashg commented Jun 21, 2021

(in talking with @kainino0x @Kangz and @kvark, I think this is where we ended up feeling:)

Ideally we can require maxColorAttachments: 8, and probably don't need it as a limit, since no one seems to be offering >8.

We might want maxVertexOutputComponents and maxFragmentInputComponents to match. We're unsure about merging them into one var in case adding new shader stages causes a merged name to get confusing or obsolete.
E.g., in GL, MAX_TEXTURE_IMAGE_UNITS ended up being for frag shaders only, vs MAX_VERTEX_TEXTURE_IMAGE_UNITS and MAX_COMBINED_TEXTURE_IMAGE_UNITS.

@kvark
Copy link
Contributor

kvark commented Jun 21, 2021

I think we should unite the in/out location limits. Quickly glancing over them on Android Vulkan shows 64 being a safe bet.

And yes, max color attachments should just be a hard limit in the spec.

The minUniformBufferOffsetAlignment and minStorageBufferOffsetAlignment limits could be a bit dangerous, but I'm hoping that people realize they are opting into non-portability if they request them lower than 256.

@kdashg
Copy link
Contributor

kdashg commented Jun 21, 2021

Maybe we should put 🌶️ or 🌶️🌶️ next to the spicier limits with respect to portability, like on a restaurant menu. :)

@Kangz
Copy link
Contributor Author

Kangz commented Jun 21, 2021

We might want maxVertexOutputComponents and maxFragmentInputComponents to match. We're unsure about merging them into one var in case adding new shader stages causes a merged name to get confusing or obsolete.

For the record, we also discussed whether we can believe Metal's 60 limitations for fragment shaders or not given use of the position builtin would probably cost 0.

The minUniformBufferOffsetAlignment and minStorageBufferOffsetAlignment limits could be a bit dangerous, but I'm hoping that people realize they are opting into non-portability if they request them lower than 256.

That should be the case for all limits.

spec/index.bs Show resolved Hide resolved
spec/index.bs Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Show resolved Hide resolved
spec/index.bs Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
@kainino0x
Copy link
Contributor

See #1865 about having limits which exist but aren't configurable.

Copy link
Contributor

@kainino0x kainino0x left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo #1865 and #1069 (dropping support for < MTLGPUFamilyApple2)

spec/index.bs Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Show resolved Hide resolved
spec/index.bs Show resolved Hide resolved
Copy link
Contributor

@kainino0x kainino0x left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still LGTM % #1865 and the open comment

{{GPUShaderModule}} entry-point.

<tr><td><dfn>maxComputeWorkgroupInvocations</dfn>
<td>{{GPUSize32}} <td>Higher <td>256
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Metal, the workgroup size limit is per pipeline, not per device. In theory, any particular compute shader may have a lower maximum workgroup size than the device's maximum due to heavy resource usage.

I don't have any direct experience with this for Metal, so I'm just curious as to whether we are confident that we'll always be able to support a workgroup size of 256 for all possible compute shaders?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed this a while ago in #275 but never resolved. In general pipeline compilation can fail for many different reasons (couldn't allocate registers, or too much spilling is another one). 256 is probably safe in most cases, but to be 100% safe developers should use createRenderPipelineAsync to know if the compilation succeeded.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I suppose there is no better option than to expect 256 to be supported on Metal pipelines :/

Copy link
Contributor

@kainino0x kainino0x Jun 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my understanding, Metal is actually strictly less restrictive than other platforms - all of the APIs can fail pipeline creation arbitrarily (e.g. because they couldn't manage to allocate all of the memory needed by the shader at a particular workgroup size). Instead, Metal does not require workgroup_size at compile time, so they let you create a pipeline, and then tell you what the maximum workgroup_size is afterward*. In other words, Metal fails softly (reduces the maximum workgroup size) while other APIs fail hard (fail to create the pipeline at all).

(* In practice, I suspect this only helps for compute shaders which do not use workgroup memory, as workgroup memory size is almost always dependent on workgroup size.)

@Kangz
Copy link
Contributor Author

Kangz commented Jun 23, 2021

Still LGTM % #1865 and the open comment

Do we need to block on #1865? Maybe we can revisit later?

Copy link
Contributor

@kainino0x kainino0x left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree no need to block on #1865, but have more comments

spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
{{GPUShaderModule}} entry-point.

<tr><td><dfn>maxComputeWorkgroupInvocations</dfn>
<td>{{GPUSize32}} <td>Higher <td>256
Copy link
Contributor

@kainino0x kainino0x Jun 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my understanding, Metal is actually strictly less restrictive than other platforms - all of the APIs can fail pipeline creation arbitrarily (e.g. because they couldn't manage to allocate all of the memory needed by the shader at a particular workgroup size). Instead, Metal does not require workgroup_size at compile time, so they let you create a pipeline, and then tell you what the maximum workgroup_size is afterward*. In other words, Metal fails softly (reduces the maximum workgroup size) while other APIs fail hard (fail to create the pipeline at all).

(* In practice, I suspect this only helps for compute shaders which do not use workgroup memory, as workgroup memory size is almost always dependent on workgroup size.)

Copy link
Member

@toji toji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@toji
Copy link
Member

toji commented Jun 28, 2021

Merging as discussed with Kai in the editor's meeting today.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants