Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MSL: Implement geometry shaders using object and mesh stages. #2200

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

js6i
Copy link
Contributor

@js6i js6i commented Sep 15, 2023

That's a cleaned up and updated implementation of geometry shaders that we shipped with CrossOver 23. Marked as a draft, because there's still a couple hacks that I want to revisit. There's obviously room for improvements, but I'm creating the PR to get the conversation about upstreaming started, and in particular to get comments on what would need to be changed before it could be merged.

Vertex shaders get compiled as [[object]] and spawned over 1D grid of threads, one per primitive. There's a rudimentary vertex loader, we then run the original vertex function over the vertices, and put transformed vertices into [[payload]]. At the end a single mesh thread is dispatched.

The geometry shader goes in the mesh stage. The wrapper just wrangles the payload into a transposed format that the geometry shader expects and runs it. There's an extra object of type spvMeshStream passed to it, which implements emitting vertices etc.

@js6i js6i changed the title Implement geometry shaders using object and mesh stages. MSL: Implement geometry shaders using object and mesh stages. Sep 21, 2023
@js6i js6i force-pushed the geometry_shaders branch 3 times, most recently from 3dbf356 to fa702ce Compare September 21, 2023 09:58
@js6i js6i marked this pull request as ready for review September 21, 2023 11:12
spirv_cross_c.h Outdated
SPVC_MSL_VERTEX_FORMAT_OTHER = SPVC_MSL_SHADER_VARIABLE_FORMAT_OTHER,
SPVC_MSL_VERTEX_FORMAT_UINT8 = SPVC_MSL_SHADER_VARIABLE_FORMAT_UINT8,
SPVC_MSL_VERTEX_FORMAT_UINT16 = SPVC_MSL_SHADER_VARIABLE_FORMAT_UINT16,
SPVC_MSL_SHADER_INPUT_FORMAT_OTHER = SPVC_MSL_SHADER_VARIABLE_FORMAT_OTHER,
SPVC_MSL_SHADER_INPUT_FORMAT_UINT8 = SPVC_MSL_SHADER_VARIABLE_FORMAT_UINT8,
SPVC_MSL_SHADER_INPUT_FORMAT_UINT16 = SPVC_MSL_SHADER_VARIABLE_FORMAT_UINT16,
SPVC_MSL_SHADER_INPUT_FORMAT_ANY16 = SPVC_MSL_SHADER_VARIABLE_FORMAT_ANY16,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a breaking change, isn't it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I restored the any's, if anything, to keep existing tests from breaking.

@@ -319,6 +326,7 @@ class CompilerMSL : public CompilerGLSL
uint32_t shader_input_buffer_index = 22;
uint32_t shader_index_buffer_index = 21;
uint32_t shader_patch_input_buffer_index = 20;
uint32_t draw_info_index = 20;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conflict with shader_patch_input_buffer_index?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, though tessellation is not yet supported together with geometry shaders. 19 felt a bit low, but I guess it can default to that.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, though tessellation is not yet supported together with geometry shaders

what is the plan to implement this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no concrete plan, partly because, as far as I know, we have not seen a game that needs it. I'm guessing that we'll want to bake tessellation programs into the object stage and generate a tessellated payload (i.e. multiple primitives instead of just one), and then spawn a larger mesh thread grid over it. But yes, it needs to be hashed out.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seeing as MoltenVK advertises tessellation, that needs to be solved before geometry shaders could be merged… it doesn’t really matter what current games aren’t using, the reality is that these features are in the specifications, expected to work together, and presumably tested in CTS.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a fine point to raise in the corresponding MoltenVK PR, and I don't completely disagree, but I see no reason to delay upstreaming of useful, if incomplete, functionality. It should be fine spec-wise if MoltenVK decides to just not advertise geometry shaders until it makes sense to, and on our side that will lift the burden of rebasing until it's done.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just gonna leave this here:

Screenshot_2023-06-11-04-58-43-84_40deb401b9ffe8e1df2f1cc5ba480b12.jpg

spirv_msl.hpp Outdated
@@ -505,6 +513,14 @@ class CompilerMSL : public CompilerGLSL
// Note: Only Apple's GPU compiler takes advantage of the lack of coherency, so make sure to test on Apple GPUs if you disable this.
bool readwrite_texture_fences = true;

// Compile for use with a geometry shader. If set, vertex shaders will be compiled as [[object]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird indent.

spirv_msl.cpp Outdated
@@ -36,6 +36,21 @@ static const uint32_t k_unknown_location = ~0u;
static const uint32_t k_unknown_component = ~0u;
static const char *force_inline = "static inline __attribute__((always_inline))";


static bool builtin_is_per_primitive_mesh_output(BuiltIn builtin) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Brace on next line

Copy link
Contributor

@HansKristian-Work HansKristian-Work left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not seeing any tests here, so that needs to be done first before I'll look at this.

Similar to the Xfb PR, please add a more detailed description of what the implementation strategy is, how the code is supposed to look, what the corner cases are.

Can this implementation handle triangle strip + restart for example? What about tessellation + geometry shader? Is performance any good, or is this purely a feature to hack around game compat? Does it pass CTS?

@js6i js6i force-pushed the geometry_shaders branch 2 times, most recently from fbb0d4c to 1ca83d8 Compare October 5, 2023 14:03
@js6i
Copy link
Contributor Author

js6i commented Oct 5, 2023

The strategy here was to get some useful results in terms of running games with CrossOver, and later refine and get the CTS passing. Right now hardly any geometry test can pass, because there's some problem with handling the gl_in block, which happens to not occur with SPIRV generated by vkd3d-shader. That's the first thing I want to look into next. This is also why the test I added does not use gl_in[i].gl_Position. There's no support for geometry+tessellation, primitive restart (there's a simple check, but it has not been tested), and probably other features which did not come up in our testing. Bugs are expected.
Performance should be reasonable, there's nothing excessive going on, and the validity of this general approach was confirmed by our contacts at Apple. I'm not sure however if this is how they ended up doing things in the game porting toolkit - this is something to look into.

@alyssarosenzweig
Copy link

alyssarosenzweig commented Nov 13, 2023

As I previously said, this implementation strategy is a non-starter with tessellation, which you already support and therefore must work, as well as with transform feedback, which is closely related and has work in progress code with a similar set of issues. Geometry shaders and transform feedback have complex interactions and it’s not possible to properly implement one without considering the other.

(This is putting aside the issues with this implementation that could be fixed with a lot of work. Primitive restart for example, is very difficult to handle with certain topologies.)

In general, now that there exists conformant drivers supporting this functionality for the hardware families supported by metal, knowingly merging/shipping broken implementations hurts the wider open graphics ecosystem. There is no fundamental reason that a conformant implementation of geometry shaders — including with tessellation and transform feedback — could not be implemented on top of metal, so I don’t see a “portability” reason to skip the hard problems. But to do so would require a substantially different approach than what is proposed here, and so merging this would be a step back for correctness and for the ecosystem.

As such, I don’t think this should be merged in its current form. Sorry.

@js6i
Copy link
Contributor Author

js6i commented Nov 21, 2023

As I previously said, this implementation strategy is a non-starter

I must have missed that, as I'm unaware of any reasons why it would be impossible to implement the missing features on top of this PR. I'd probably agree with your objection if that was substantiated, otherwise your argument makes not much sense to me - requiring a fully featured implementation in a single PR seems unnecessary, regardless if there exists a conforming implementation somewhere else or not.

But to do so would require a substantially different approach than what is proposed here

Different how? I'm not familiar with how the driver you mentioned works, is there a write-up somewhere?

@HansKristian-Work let me know what's your stance on this.

@HansKristian-Work
Copy link
Contributor

The strategy here was to get some useful results in terms of running games with CrossOver, and later refine and get the CTS passing.

The motivation here seems to get something working for very specific games that only benefits CrossOver, rather than attempting to do something that is useful to the larger ecosystem. That is fine for your goals of course, but probably not for something upstream.

and later refine and get the CTS passing

I'm very wary here, because as me and Alyssa know from experience, getting working Tess/Geom/XFB working together is extremely difficult when you don't have it supported natively by the target API/hardware. The current Geom/XFB work seems to only consider these features in complete isolation. Their interactions are critical to have a conformant implementation. If you expose Tess and Geom, they have to work together, if you expose XFB, they have to work together with Tess/Geom.

It seems like the CTS concern is hand-waved away in a "we're going to make it work somehow, but we haven't thought it through yet". For more simple features, I'd be inclined to do iterations over many PRs, since monster PRs are horrible to review, since it's fairly obvious how things would progress over time and I would be able to see the finish line, but getting Tess/Geom/XFB/PrimitiveRestart working together is something that needs to be well thought out before you start committing to writing code. It's not a trivial problem to solve.

Apple seems to have gone the approach of doing it all in mesh shaders, not even bothering with their native tessellator (because it cannot be conformant with XFB), I'm not sure how, but it might look similar to RDNA2+ NGG style implementation. A correct solution is probably related to that.

The existing designs I've seen so far in these PRs can only work in the most trivial situations, which may be enough to get some specific games to run, but I'm a bit confused why this should be pushed upstream. I'm not particularly happy maintaining code paths I know are broken. And once this is pushed up, it has to be maintained since someone might start relying on it for their own engine. SPIRV-Cross Metal output is used by several projects, not just MoltenVK.

@HansKristian-Work
Copy link
Contributor

HansKristian-Work commented Nov 21, 2023

If I had infinite spare time I'd look into an open implementation that would take SPIR-V for vertex/tess/geom and fart out a task + mesh shader with whatever glue shaders are necessary. Wondering if that kind of approach is necessary to make tess/geom/xfb/indirect/restart and friends viable on Metal.

Then all of this would hinge on a solid mesh shader implementation in SPIRV-Cross, which is hard (since MSL mesh shaders are kinda incompatible with Vulkan/D3D12 >_<), but it is at least a useful goalpost.

@HansKristian-Work
Copy link
Contributor

Different how? I'm not familiar with how the driver you mentioned works, is there a write-up somewhere?

Other than the work being done in Mesa, I don't think there is anything open that can be looked at. This is typically the domain of proprietary suffering done by IHVs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants