-
Notifications
You must be signed in to change notification settings - Fork 399
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for shader storage buffer objects (SSBOs) #5950
Conversation
Now looks and behaves much more like the Veldrid implementation. The extra call to `glUniformBlockBinding` is unlikely to be an issue - although this is the main drawing function, it's not "hot" per-se (in that it shouldn't receive more than 1000 draw calls in an extreme setting). SSBOs and UBOs don't share the same binding points - `g_GlobalUniforms` and `g_ColourBuffer` (in the test scene's example) could (and in practice, do) have an index of 0. What discerns them is the range type, which is why we need to use `glUniformBlockBinding` for one and `glShaderStorageBlockBinding` for the other.
osu.Framework/Graphics/OpenGL/Buffers/GLShaderStorageBufferObject.cs
Outdated
Show resolved
Hide resolved
osu.Framework.Tests/Visual/Graphics/TestSceneShaderStorageBufferObject.cs
Outdated
Show resolved
Hide resolved
// ... and these are the same items from the first pushes above. | ||
Assert.That(stack.CurrentBuffer[2], Is.EqualTo(size + 1)); | ||
Assert.That(stack.CurrentBuffer[1], Is.EqualTo(size)); | ||
Assert.That(stack.CurrentBuffer[0], Is.EqualTo(copiedItem1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, I don't get this. These items were popped, so why are they still in the buffer in any way? I'd expect them to get overwritten?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SSBOStack never overwrites data because it doesn't know if a draw call has been launched for the previous data. I'm also unsure whether it would be more expensive to overwrite said data while it's in use by the GPU, in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I guess I somehow didn't read xmldoc so that's my bad.
I do feel like the current xmldoc is leaving things half-stated, because if I'm understanding correctly, you would generally call .Clear()
at the start of every draw call to reclaim all of the buffer space and prevent runaway VRAM usage, right? I'd spell that out explicitly there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that sounds about right... I do wonder if this is yet another one of those things that should be "automagically" handled for you like every other instance of ResetCounters()
or whatever (VBOs/UBOs...).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've also added some XMLDoc to .Clear()
, but if you have any better suggestions then I'm all ears. Proper documentation without requiring developers to have white-box knowledge about everything, is still one of the most difficult things for me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably okay as is. The main reference as to how the thing should be used is the test, and it spells things out pretty clearly.
There are things you could do to force the consumers to do the clear every frame by API design but I'm not sure it's worth the hassle and implementation overheads.
The veldrid prereq is merged, but it doesn't look like the veldrid bump made it here yet? |
Probably need to cherry-pick d71f9c7 across to here? |
Uh... possible bad news. On my box, on windows, upon opening the SSBO test scene, I get this:
Wonderfully unspecific, of course, looks to be a generic access violation. Not sure how do I even find out what this is exactly. |
Co-authored-by: Dan Balasescu <smoogipoo@smgi.me> With the Direct3D11 surface active, opening `TestSceneShaderStorageBufferObject` would crash with: Exception Info: System.Runtime.InteropServices.SEHException (0x80004005): External component has thrown an exception. at Vortice.Direct3D11.ID3D11DeviceContext.DrawIndexed(Int32 indexCount, Int32 startIndexLocation, Int32 baseVertexLocation) at osu.Framework.Graphics.Veldrid.VeldridRenderer.DrawVertices(PrimitiveTopology type, Int32 vertexStart, Int32 verticesCount) in D:\Open Source\osu-framework\osu.Framework\Graphics\Veldrid\VeldridRenderer.cs:line 601 at osu.Framework.Graphics.Veldrid.Batches.VeldridVertexBatch`1.Draw() in D:\Open Source\osu-framework\osu.Framework\Graphics\Veldrid\Batches\VeldridVertexBatch.cs:line 161 at osu.Framework.Graphics.Veldrid.Batches.VeldridVertexBatch`1.Add(T v) in D:\Open Source\osu-framework\osu.Framework\Graphics\Veldrid\Batches\VeldridVertexBatch.cs:line 116 at osu.Framework.Tests.Visual.Graphics.TestSceneShaderStorageBufferObject.RawStorageBufferDrawNode.Draw(IRenderer renderer) in D:\Open Source\osu-framework\osu.Framework.Tests\Visual\Graphics\TestSceneShaderStorageBufferObject.cs:line 129 at osu.Framework.Graphics.Containers.CompositeDrawable.CompositeDrawableDrawNode.Draw(IRenderer renderer) in D:\Open Source\osu-framework\osu.Framework\Graphics\Containers\CompositeDrawable_DrawNode.cs:line 204 After some _extensive_ probing of this opaque error, an ETW trace turned up the following debug message: ID3D11DeviceContext::DrawIndexed: The Shader Resource View in slot 0 of the Pixel Shader unit was not created with the D3D11_BUFFEREX_SRV_FLAG_RAW flag, however the shader expects a RAW Buffer. This mismatch is invalid if the shader actually uses the view (e.g. it is not skipped due to shader code branching). As per the documentation (https://learn.microsoft.com/en-us/windows/win32/direct3d11/overviews-direct3d-11-resources-intro#raw-views-of-buffers): You can think of a raw buffer, which can also be called a byte address buffer, as a bag of bits to which you want raw access, that is, a buffer that you can conveniently access through chunks of one to four 32-bit typeless address values. The HLSL shader code emitted by the cross-compiler is using one of those access methods, namely `Load4()`. To resolve, specify `rawBuffer: true` when creating the SSBO (in the `UseStructuredBuffers` path, which indicates they're supported). This flag only has an effect on D3D11, so it should come with no negative effects on other platforms.
Prereqs:
SSBOs are a way to submit large amounts of data to GPUs.
While SSBOs are pretty well supported by modern GPUs1, this PR encapsulates two modes of operation under the "SSBO" umbrella - one which implements SSBOs directly, and a fallback mode which utilises UBOs. This encapsulation is provided by
IRenderer.CreateShaderStorageBufferObject<T>(uboSize, ssboSize)
.UBOs are generally limited to at least 16KiB23, though modern GL implementations frequently provide a larger size (usually up to 64KiB). SSBOs can generally be up to the size of available VRAM.
For the time being, this implementation provides no validation as to the size of the buffers, so it is left up to users to specify "safe" values as documented by the
CreateShaderStorageBufferObject
method. This may change with future work.In general, however, users should instead use a secondary implementation provided in this PR -
ShaderStorageBufferObjectStack<T>
. This automates the heavy lifting process of binding multiple UBOs when the data exceeds the 16KiB limit, or when the total amount of data isn't known ahead of time.Using either these implementations is somewhat complicated as the shader needs to implement both pathways, however a sample implementation of both is provided in
TestSceneShaderStorageBufferObject
,sh_SSBOTest.vs
, andsh_SSBOTest.fs
. This is deserving of its own wiki page.The
OSU_GRAPHICS_NO_SSBO
environment variable may be set to1
to globally disable SSBOs.Footnotes
80% of these GPUs support
GL_ARB_shader_storage_buffer_object
. ↩OpenGL:
MAX_UNIFORM_BLOCK_SIZE
↩D3D11:
Each constant buffer can hold up to 4096 vectors; each vector contains up to four 32-bit values.
↩