Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for shader storage buffer objects (SSBOs) #5950

Merged
merged 20 commits into from
Aug 15, 2023

Conversation

smoogipoo
Copy link
Contributor

@smoogipoo smoogipoo commented Jul 28, 2023

Prereqs:

SSBOs are a way to submit large amounts of data to GPUs.

While SSBOs are pretty well supported by modern GPUs1, this PR encapsulates two modes of operation under the "SSBO" umbrella - one which implements SSBOs directly, and a fallback mode which utilises UBOs. This encapsulation is provided by IRenderer.CreateShaderStorageBufferObject<T>(uboSize, ssboSize).

UBOs are generally limited to at least 16KiB23, though modern GL implementations frequently provide a larger size (usually up to 64KiB). SSBOs can generally be up to the size of available VRAM.

For the time being, this implementation provides no validation as to the size of the buffers, so it is left up to users to specify "safe" values as documented by the CreateShaderStorageBufferObject method. This may change with future work.

In general, however, users should instead use a secondary implementation provided in this PR - ShaderStorageBufferObjectStack<T>. This automates the heavy lifting process of binding multiple UBOs when the data exceeds the 16KiB limit, or when the total amount of data isn't known ahead of time.

Using either these implementations is somewhat complicated as the shader needs to implement both pathways, however a sample implementation of both is provided in TestSceneShaderStorageBufferObject, sh_SSBOTest.vs, and sh_SSBOTest.fs. This is deserving of its own wiki page.

The OSU_GRAPHICS_NO_SSBO environment variable may be set to 1 to globally disable SSBOs.

Footnotes

  1. 80% of these GPUs support GL_ARB_shader_storage_buffer_object.

  2. OpenGL: MAX_UNIFORM_BLOCK_SIZE

  3. D3D11: Each constant buffer can hold up to 4096 vectors; each vector contains up to four 32-bit values.

Now looks and behaves much more like the Veldrid implementation.

The extra call to `glUniformBlockBinding` is unlikely to be an issue -
although this is the main drawing function, it's not "hot" per-se (in
that it shouldn't receive more than 1000 draw calls in an extreme
setting).

SSBOs and UBOs don't share the same binding points - `g_GlobalUniforms`
and `g_ColourBuffer` (in the test scene's example) could (and in
practice, do) have an index of 0. What discerns them is the range type,
which is why we need to use `glUniformBlockBinding` for one and
`glShaderStorageBlockBinding` for the other.
Comment on lines +217 to +220
// ... and these are the same items from the first pushes above.
Assert.That(stack.CurrentBuffer[2], Is.EqualTo(size + 1));
Assert.That(stack.CurrentBuffer[1], Is.EqualTo(size));
Assert.That(stack.CurrentBuffer[0], Is.EqualTo(copiedItem1));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, I don't get this. These items were popped, so why are they still in the buffer in any way? I'd expect them to get overwritten?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SSBOStack never overwrites data because it doesn't know if a draw call has been launched for the previous data. I'm also unsure whether it would be more expensive to overwrite said data while it's in use by the GPU, in this case.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay I guess I somehow didn't read xmldoc so that's my bad.

I do feel like the current xmldoc is leaving things half-stated, because if I'm understanding correctly, you would generally call .Clear() at the start of every draw call to reclaim all of the buffer space and prevent runaway VRAM usage, right? I'd spell that out explicitly there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that sounds about right... I do wonder if this is yet another one of those things that should be "automagically" handled for you like every other instance of ResetCounters() or whatever (VBOs/UBOs...).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've also added some XMLDoc to .Clear(), but if you have any better suggestions then I'm all ears. Proper documentation without requiring developers to have white-box knowledge about everything, is still one of the most difficult things for me.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably okay as is. The main reference as to how the thing should be used is the test, and it spells things out pretty clearly.

There are things you could do to force the consumers to do the clear every frame by API design but I'm not sure it's worth the hassle and implementation overheads.

@bdach
Copy link
Collaborator

bdach commented Aug 14, 2023

The veldrid prereq is merged, but it doesn't look like the veldrid bump made it here yet?

@peppy peppy self-requested a review August 15, 2023 05:15
@peppy
Copy link
Sponsor Member

peppy commented Aug 15, 2023

The veldrid prereq is merged, but it doesn't look like the veldrid bump made it here yet?

Probably need to cherry-pick d71f9c7 across to here?

@peppy peppy removed their request for review August 15, 2023 06:52
@bdach
Copy link
Collaborator

bdach commented Aug 15, 2023

Uh... possible bad news. On my box, on windows, upon opening the SSBO test scene, I get this:

Exception Info: System.Runtime.InteropServices.SEHException (0x80004005): External component has thrown an exception.
   at Vortice.Direct3D11.ID3D11DeviceContext.DrawIndexed(Int32 indexCount, Int32 startIndexLocation, Int32 baseVertexLocation)
   at osu.Framework.Graphics.Veldrid.VeldridRenderer.DrawVertices(PrimitiveTopology type, Int32 vertexStart, Int32 verticesCount) in D:\Open Source\osu-framework\osu.Framework\Graphics\Veldrid\VeldridRenderer.cs:line 601
   at osu.Framework.Graphics.Veldrid.Batches.VeldridVertexBatch`1.Draw() in D:\Open Source\osu-framework\osu.Framework\Graphics\Veldrid\Batches\VeldridVertexBatch.cs:line 161
   at osu.Framework.Graphics.Veldrid.Batches.VeldridVertexBatch`1.Add(T v) in D:\Open Source\osu-framework\osu.Framework\Graphics\Veldrid\Batches\VeldridVertexBatch.cs:line 116
   at osu.Framework.Tests.Visual.Graphics.TestSceneShaderStorageBufferObject.RawStorageBufferDrawNode.Draw(IRenderer renderer) in D:\Open Source\osu-framework\osu.Framework.Tests\Visual\Graphics\TestSceneShaderStorageBufferObject.cs:line 129
   at osu.Framework.Graphics.Containers.CompositeDrawable.CompositeDrawableDrawNode.Draw(IRenderer renderer) in D:\Open Source\osu-framework\osu.Framework\Graphics\Containers\CompositeDrawable_DrawNode.cs:line 204

Wonderfully unspecific, of course, looks to be a generic access violation. Not sure how do I even find out what this is exactly.

Co-authored-by: Dan Balasescu <smoogipoo@smgi.me>

With the Direct3D11 surface active, opening
`TestSceneShaderStorageBufferObject` would crash with:

    Exception Info: System.Runtime.InteropServices.SEHException (0x80004005): External component has thrown an exception.
       at Vortice.Direct3D11.ID3D11DeviceContext.DrawIndexed(Int32 indexCount, Int32 startIndexLocation, Int32 baseVertexLocation)
       at osu.Framework.Graphics.Veldrid.VeldridRenderer.DrawVertices(PrimitiveTopology type, Int32 vertexStart, Int32 verticesCount) in D:\Open Source\osu-framework\osu.Framework\Graphics\Veldrid\VeldridRenderer.cs:line 601
       at osu.Framework.Graphics.Veldrid.Batches.VeldridVertexBatch`1.Draw() in D:\Open Source\osu-framework\osu.Framework\Graphics\Veldrid\Batches\VeldridVertexBatch.cs:line 161
       at osu.Framework.Graphics.Veldrid.Batches.VeldridVertexBatch`1.Add(T v) in D:\Open Source\osu-framework\osu.Framework\Graphics\Veldrid\Batches\VeldridVertexBatch.cs:line 116
       at osu.Framework.Tests.Visual.Graphics.TestSceneShaderStorageBufferObject.RawStorageBufferDrawNode.Draw(IRenderer renderer) in D:\Open Source\osu-framework\osu.Framework.Tests\Visual\Graphics\TestSceneShaderStorageBufferObject.cs:line 129
       at osu.Framework.Graphics.Containers.CompositeDrawable.CompositeDrawableDrawNode.Draw(IRenderer renderer) in D:\Open Source\osu-framework\osu.Framework\Graphics\Containers\CompositeDrawable_DrawNode.cs:line 204

After some _extensive_ probing of this opaque error, an ETW trace turned
up the following debug message:

    ID3D11DeviceContext::DrawIndexed: The Shader Resource View in slot 0
    of the Pixel Shader unit was not created with
    the D3D11_BUFFEREX_SRV_FLAG_RAW flag, however the shader expects a
    RAW Buffer. This mismatch is invalid if the shader actually uses
    the view (e.g. it is not skipped due to shader code branching).

As per the documentation (https://learn.microsoft.com/en-us/windows/win32/direct3d11/overviews-direct3d-11-resources-intro#raw-views-of-buffers):

    You can think of a raw buffer, which can also be called a byte
    address buffer, as a bag of bits to which you want raw access, that
    is, a buffer that you can conveniently access through chunks of one
    to four 32-bit typeless address values.

The HLSL shader code emitted by the cross-compiler is using one of those
access methods, namely `Load4()`.

To resolve, specify `rawBuffer: true` when creating the SSBO
(in the `UseStructuredBuffers` path, which indicates they're supported).
This flag only has an effect on D3D11, so it should come with no
negative effects on other platforms.
@bdach bdach merged commit 91b82bc into ppy:master Aug 15, 2023
11 of 13 checks passed
@smoogipoo smoogipoo deleted the shader-storage-buffer-objects branch September 11, 2023 02:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants