Add GL_EXT_mesh_shader#640
|
Re discussion on the mesa issue, Khronos does not "approve" new vendor extensions, though we do try and consistency-check them and make sure they're following the extension guidelines before we include them in the extension registry and hand out enum allocations. It's true that GL spec activity is very minimal within Khronos, but vendor and EXT extension development do not have to happen inside Khronos.
So the first thing to ask is whether there is a commitment to implement this on the part of someone actually writing Mesa drivers. There's no point in publishing an extension spec in the registry if nobody has implemented it.
Then, how and why does it differ from the NV extension? I see a slight signature change on one of the APIs but haven't tried to review the whole thing. Because of the close relationship between them, there should at least be a section down around the "Interactions" discussing the things that are the same, and those that had to be changed, and why. Would it be possible to implement the NV extension as it stands today on your target GPUs, and then add a really small extension on top of that to accommodate the changed signature, rather than duplicate so much of that language?
BTW, when promoting an extension we keep the enum values unchanged so long as they are indistinguishable semantically from the point of view of the driver they are passed to. Only if there's a need to behave differently depending on which extension is being used would the enum value need to change. |
Thanks for the explanation.
Yeah, I'm going to implement it in mesa if it's accepted.
The difference with NV extension has been listed in the issue Q&A:
It's not possible to stack a new extension on top of the NV one. The NV extension interface (mostly the GLSL part) is not suitable for other GPU vendors, which is why Vulkan created VK_EXT_mesh_shader. We could implement the NV extension with many ugly workarounds in the driver, but that would hurt performance:
The runtime API part mostly comes from VK_EXT_mesh_shader, to leverage the existing agreement between the different GPU vendors. I can keep the enum values that are the same as in the NV extension and assign fake values to the new ones. |
|
Hi,
Yes, there is interest in implementing it in RadeonSI as well as in Zink (which would work on top of the Vulkan EXT_mesh_shader exposed by the underlying Vulkan driver).
Same as the Vulkan NV vs. EXT extensions. In a nutshell, the NV extension makes it impossible to implement mesh shaders with reasonable performance on other vendors' HW; and EXT fixes that. Furthermore, EXT is better aligned with D3D12 mesh shaders and therefore benefits developers by providing a more familiar programming model. If you are interested in the exact details, they have been discussed in the Vulkan EXT_mesh_shader blog post and also on the spec MR here, among other places. This comment in the Mesa repo goes through the main issues with implementing the NV extension on HW that wasn't designed for it. |
|
I'd like to see the mesa implementation at least well underway before this is released, but I think it's a great addition to the ecosystem. Bringing cross-vendor support to GL mesh shading will enable things like nvidium to finally run on more platforms. |
|
Then can the numbers be allocated first, so that I can update headers and start implementation? |
|
Yeah that seems good. @oddhack do you take care of that or am I supposed to do something? |
@yuq how many do you need? We allocate in blocks of 16. I think I counted 62 enums in the spec as it stands, so I can give you a block of 64 if you need that many (the new bit values not included in that total since they are semantically in a different namespace). |
|
I need 22; rounded up to a multiple of 16, that is 32. I reused some enum values from GL_NV_mesh_shader, so only those beginning with 0xF need to be allocated. What about the extension serial number (I faked it as 1024)? Does it have to be allocated at release? |
Done, see d8fdb8d (enums 0x9740-0x975F). The extension number is assigned when we publish. It isn't actually used as anything but an ordering mechanism. |
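The block arithmetic in this exchange can be checked with a short sketch. This is only an illustration: the block size of 16 and the counts (22, 62) and range (0x9740-0x975F) come from the comments above, while the helper names are hypothetical and not part of any registry tooling.

```python
BLOCK = 16  # enum values are allocated in blocks of 16, per the comment above


def reserved_size(count, block=BLOCK):
    """Round `count` enums up to a whole number of blocks."""
    blocks = -(-count // block)  # ceiling division
    return blocks * block


def range_size(first, last):
    """Number of enum values in the inclusive range [first, last]."""
    return last - first + 1


# 22 new enums need two blocks; the 62 counted in the spec would need four.
assert reserved_size(22) == 32
assert reserved_size(62) == 64

# The allocated range 0x9740-0x975F is exactly those two blocks,
# and it starts on a block boundary.
assert range_size(0x9740, 0x975F) == 32
assert 0x9740 % BLOCK == 0
```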
|
OK, thanks. |
|
We are 100% in support of this proposal and are excited about working with this functionality. |
|
I've done the implementation in mesa for AMD GPUs. Next I'm going to upstream the code while testing it further. https://gitlab.freedesktop.org/yuq825/mesa/-/commits/topic/mesh-shader |
|
@yuq WG has some review comments pending. Also will wait to ship until nvidium is at least semi-working with this (pending) to ensure things are usable as expected. |
I think nvidium only works on NVIDIA cards? Is switching from NV's mesh shader implementation to this enough to make it run on AMD cards?
MCRcortex left a comment
Main thing is probably rasterization order and if possible having driver preferences on fast path payload sizes
* MESH_PREFERS_COMPACT_PRIMITIVE_OUTPUT_EXT, TRUE if the implementation
  will perform best if there are no unused primitives in the output array.
It may be useful to have MAX_PREFERRED_PAYLOAD_SIZE_EXT etc. for the optimal task-to-mesh payload size, e.g. on nvidia this is 128 bytes for the fast path if I remember correctly
Usually this kind of tuning parameter lives in the application or game engine. I added these parameters only because this extension forks VK_EXT_mesh_shader. Of course MAX_PREFERRED_PAYLOAD_SIZE_EXT can be added if required, but VK_EXT_mesh_shader does not have this parameter, so a GL-on-Vulkan translation layer would always have to return the max payload size for it.
I don't think that a MAX_PREFERRED_PAYLOAD_SIZE_EXT makes much sense here. All GPU manufacturers suggest using as little payload as possible, so the "preferred" amount would be zero.
I also don't think it's a good idea to deviate from Vulkan's VkPhysicalDeviceMeshShaderPropertiesEXT. We don't want to reinvent the wheel here.
"e.g. on nvidia this is 128 bytes for the fast path if I remember correctly"
That sounds like an implementation detail for NVidia and even they didn't feel like it's worth including that in the NV extension.
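For readers unfamiliar with the MESH_PREFERS_COMPACT_PRIMITIVE_OUTPUT_EXT hint quoted at the top of this review thread, here is a rough CPU-side sketch of the two output styles it distinguishes. It assumes "unused primitives" means culled ones, as in the corresponding Vulkan property; it is not GLSL, and the function names are hypothetical.

```python
def emit_sparse(visible):
    """Keep one slot per primitive and flag the culled ones.

    Returns (primitive index, culled flag) pairs; culled slots stay in
    the output array as unused entries.
    """
    return [(i, not v) for i, v in enumerate(visible)]


def emit_compact(visible):
    """Write only the surviving primitives, packed at the front."""
    return [i for i, v in enumerate(visible) if v]


def primitive_output(visible, prefers_compact):
    """Pick the output style based on the (assumed) driver hint."""
    return emit_compact(visible) if prefers_compact else emit_sparse(visible)


visible = [True, False, True, False]
assert primitive_output(visible, prefers_compact=True) == [0, 2]
assert len(primitive_output(visible, prefers_compact=False)) == 4
```

An application would query the hint once and choose its culling strategy accordingly; which style is faster is implementation dependent.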
|
Are there any guarantees about how/where the mesh shader workgroups are launched? I know it would be hardware dependent, but as an example: does launching many mesh tasks from one task shader and only a few from other task shaders perform significantly worse than a roughly even distribution of mesh tasks? Are there also any guarantees about how work is dispatched from the mesh shader to the remaining raster pipeline? That is, if a single mesh shader workgroup from a set of mesh shader workgroups (dispatched from a task shader) takes significantly longer to complete, would this block the other workgroups from dispatching work to the raster pipeline? (Asking because of the possibility of doing compute raster in the mesh shader while still dispatching larger tris to the HW raster pipeline, which would also enable higher GPU silicon utilization and hopefully maximize throughput.) |
@Headcrabed As far as I see the author of nvidium @MCRcortex is here with us, so I will interpret his presence as being interested in porting nvidium to use the EXT mesh shaders instead of the NV extension. Assuming no other NVidia specifics are used by nvidium, it should then work on other GPUs too.
@Ristovski Previous GPUs don't have the hardware capability to implement mesh shaders. Most notably, only GFX10.3 and newer support per-primitive outputs. |
@MCRcortex Not sure if this thread is the right one to discuss GPU specific implementation details, but I'm happy to answer your questions about how it works at least on AMD HW. I reached out to you on your Discord. |
|
With regard to rasterization order: it seems that both AMD and NVidia do guarantee the "strict" rasterization order in their currently released GPUs. I haven't got any info about other hardware vendors yet. Considering that the Vulkan spec differs from this proposal as well as from the D3D12 spec, I asked for clarification on the Vulkan spec to confirm whether the current spec is what was intended. For consistency between OpenGL and Vulkan, I suggest waiting until that is resolved before we move forward here. |
|
Am doing some work on mesh shaders now and had some other questions, would it be possible to get a am doubtful of being able to get a |
Yes. GLSL spec described it: https://github.com/KhronosGroup/GLSL/blob/main/extensions/ext/GLSL_EXT_mesh_shader.txt#L964 |
|
Oh awesome, ty, missed that in the spec |
|
I have a project that should make this extension easier to test (unfortunately I'm missing the hardware to do so; I don't have any AMD hardware) |
|
How does this interact with ARB_separate_shader_objects? There are no explicit notes in the spec here for it. |
ARB_separate_shader_objects was added to core in OpenGL 4.1, and this extension is written against the OpenGL 4.6 spec, so there are no explicit notes on the ARB_separate_shader_objects interaction. There are some notes that add the mesh shader stages to the spec language introduced by ARB_separate_shader_objects: |
|
Ah, thanks, I missed that. |
|
Talked a bit about this with @zmike, but it should also be raised here: how do mesh shaders interact with |
The way I understand it, it should work the same as for any other pre-rasterization stage. Why should it be ignored? EDIT: I'd like it to be consistent with Vulkan. Does Vulkan ignore it? |
|
It's not defined in the Vulkan spec from what I understand (hence zmike requesting clarification).
Furthermore, if rasterization discard is enabled, a fragment shader is not strictly required to even be attached (from what I have read; this is probably incorrect, however) |
|
With Vulkan the rasterization discard state still applies when drawing with mesh shaders. |
This is an OpenGL extension forking VK_EXT_mesh_shader to provide OpenGL mesh shader functionality.
Required by NVIDIA.
This is an OpenGL extension forking VK_EXT_mesh_shader to provide OpenGL mesh shader functionality.
Numbers in the spec haven't been allocated yet, so it uses fake numbers for now. There are no header/XML updates in the PR either.
This extension was requested by nvidium users, to add OpenGL mesh shader support to drivers for GPUs other than NVIDIA's: