
Add ability to pass compute buffers to vertex/fragment shaders #6989

Open
jtsorlinis opened this issue May 31, 2023 · 13 comments

@jtsorlinis

jtsorlinis commented May 31, 2023

Describe the project you are working on

Procedural mesh generation in compute shaders.

Describe the problem or limitation you are having in your project

The buffers containing the mesh data have to be read back and processed on the CPU into vertex buffers.
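
For context, this is roughly the round trip the current API forces (a hedged sketch; rd is a RenderingDevice, positions_buffer and mesh are illustrative names, and the buffer is assumed to hold vec4 positions):

var output_bytes := rd.buffer_get_data(positions_buffer) # GPU -> CPU readback stalls here
var positions := output_bytes.to_float32_array()
var verts := PackedVector3Array()
for i in range(0, positions.size(), 4): # vec4 stride, drop w
    verts.push_back(Vector3(positions[i], positions[i + 1], positions[i + 2]))
var arrays := []
arrays.resize(Mesh.ARRAY_MAX)
arrays[Mesh.ARRAY_VERTEX] = verts
mesh.clear_surfaces() # mesh is an ArrayMesh
mesh.add_surface_from_arrays(Mesh.PRIMITIVE_TRIANGLES, arrays) # CPU -> GPU upload again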

Describe the feature / enhancement and how it helps to overcome the problem or limitation

I think it would be really helpful to allow storage buffers to be bound to, and accessed from, vertex/fragment shaders.

This would greatly expand the potential uses for compute shaders, as the current implementation requires CPU readback, which can be a significant performance bottleneck.

Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams

A storage buffer could be bound to a vertex/fragment shader with a function similar to set_shader_parameter, e.g.

var buffer := rd.storage_buffer_create(input_bytes.size(), input_bytes)
myShaderMaterial.set_shader_parameter("myBuffer", buffer) # something like this

The buffer could then be used in that shader to allow procedural drawing, e.g.

shader_type spatial;

layout(set = 0, binding = 0, std430) restrict buffer bufferType {
    vec4 positions[];
} myBuffer;

void vertex() {
    VERTEX += myBuffer.positions[VERTEX_ID].xyz;
}

void fragment() {
    ALBEDO = vec3(0.4, 0.6, 0.9);
}

If this enhancement will not be used often, can it be worked around with a few lines of script?

I don't believe it's possible to implement this as a plugin, as it's part of the core rendering pipeline.

Is there a reason why this should be core and not an add-on in the asset library?

As above, I don't think it's possible.

@Ali32bit

This should absolutely be a thing. It's such a huge bottleneck, and a disservice to what Godot can do in the right hands, if we don't have "render textures" and "compute buffers" that we can pass to shaders.

This would allow for: real-time interactive shaders such as dynamic wind and dynamic water; real-time mirrors or planar reflections; and much more efficient interactive TV screens, viewports, or anything else that requires rendering part of the game and using it as a texture somewhere else. Currently, implementing any of this is painfully slow. Adding just 3 viewports for passing compute shaders can kill the framerate, and it's super unreliable because it depends on setting up the node paths every time you want to load assets that use such shaders, which tend to break completely when you load the scene somewhere else.

@NomiChirps

+1! I'm working on a project that requires simulating an arbitrarily deformable/cuttable/etc object with high precision. I'm currently using a 512x512x512 3D texture to store an SDF representing the object's surface. Writing a fragment shader to render it was "no problem" (ha ha), but making changes to it from the CPU is extremely slow. I'd love it if Godot could make it easy to use a compute shader to directly modify such a buffer on the GPU.
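
For reference, a rough sketch of the CPU-side update path being described (assuming the SDF lives in an ImageTexture3D named sdf_texture; all names are illustrative):

var images: Array[Image] = []
for z in 512:
    var slice := Image.create(512, 512, false, Image.FORMAT_RF)
    # ...write SDF values for this slice on the CPU (the slow part)...
    images.push_back(slice)
sdf_texture.update(images) # full re-upload on every edit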

@Daedalus1400

Let's extend this to its logical conclusion: all shaders should be able to access compute buffers, and compute shaders should be able to access render buffers.

I'm currently working on a compute-shader-based particle simulation, and the frame rate is terrible for large simulations despite neither my CPU nor GPU being taxed. It's bottlenecked by writing the position data to a MultiMeshInstance3D from the CPU. If I could access the position buffer from inside a particle shader, the particles would position themselves with no CPU overhead.

The complete memory isolation of gdshader scripts is a huge limitation.
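
To make the bottleneck concrete, a hedged sketch of the per-frame round trip being described (position_buffer, multimesh, and instance_count are illustrative, and positions are assumed to be stored as vec4s):

var bytes := rd.buffer_get_data(position_buffer) # GPU -> CPU readback
var positions := bytes.to_float32_array()
var buffer := PackedFloat32Array()
buffer.resize(instance_count * 12) # 3x4 transform per 3D instance
for i in instance_count:
    var o := i * 12
    buffer[o] = 1.0 # identity basis diagonal
    buffer[o + 5] = 1.0
    buffer[o + 10] = 1.0
    buffer[o + 3] = positions[i * 4] # origin.x
    buffer[o + 7] = positions[i * 4 + 1] # origin.y
    buffer[o + 11] = positions[i * 4 + 2] # origin.z
RenderingServer.multimesh_set_buffer(multimesh.get_rid(), buffer) # CPU -> GPU again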

@wojtekpil

I totally agree on the importance of this proposal. Without it, the uses of compute shaders are very limited. With the ability to edit buffers and textures from compute shaders, we create opportunities for custom GPU-based culling systems and custom LOD systems (even for multimeshes; imagine using LOD per instance). Add the ability to read render buffers from compute shaders, and even post-processing effects that are hard to achieve otherwise could be easily added and chained (e.g. a Gaussian blur that currently needs 2-3 viewports could instead be chained in 2 simple compute shaders). It could also help avoid issues with the transparency of full-screen post-processing planes in front of the camera. It would also make life easier for things like GPU painting applications or terrain editors, as we would no longer be limited to a single 8-bit viewport texture.

@Facundo15

This should definitely be something integrated into Godot. I am working on a projectile simulation system in which I am calculating the movement, plus some very primitive physics (rectangles and a buffer of all the rectangular shapes), to detect collisions and send them to the CPU.
But I use Particles to visualize the behavior, and having to send everything to the CPU and then on to the particle shader makes it a very slow process.

@Facundo15

@jtsorlinis I have an idea for a way this could be implemented at the code level when you want to make that association.

Basically, the buffers would be "shared", so that the compute shader can write directly to uniforms within the Godot shader. It could look something like this:

var buffer_rid := rd.storage_buffer_create(input_bytes.size(), input_bytes)
material_shader.set_shared_uniform("uniform_data", buffer_rid)

shader_type canvas_item;

shared uniform vec2 uniform_data;

This might be a simpler way to tell the shader code that the buffer will be shared between the compute shader and the Godot shader (a simpler kind of addition, in keeping with Godot's simple philosophy).

I'm not an expert in GLSL and shaders, so I don't know whether the shared keyword could be used for this case.

@oxi-dev0

oxi-dev0 commented Oct 5, 2023

I also agree with this proposal. I may be misunderstanding, but in general I think it would be very beneficial for the engine to support sharing resources GPU-to-GPU rather than requiring a copy through the CPU. E.g. in my blood surface system, I render to a framebuffer in order to generate a mask to use in a material. However, in the current 4.1.2 build there is no way to create the framebuffer from an ImageTexture that the material can sample from. I have to use a custom build of the engine (based on this fork: huisedenanhai/godot@32a05a5) that allows me to get the render device id of the ImageTexture. The current "supported" method would be to create a texture with the render device and use that for the framebuffer, then at some point sync and copy the data from that texture to the CPU, then copy it into the ImageTexture for the material to use, which is very slow and inefficient.
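
For reference, the "supported" path described above would look roughly like this (a sketch; rd_texture, width, height, and image_texture are illustrative names):

var bytes := rd.texture_get_data(rd_texture, 0) # forces a sync and a GPU -> CPU copy
var img := Image.create_from_data(width, height, false, Image.FORMAT_RGBA8, bytes)
image_texture.update(img) # then a CPU -> GPU upload again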

@kpietraszko

@Facundo15 Keep in mind that uniforms have a much lower size limit compared to storage buffers, so in many cases they're not a solution.

@BenMcLean

BenMcLean commented Jan 3, 2024

Because of this issue, I invested a good deal of time working out how to treat a texture uniform as if it was a storage buffer to "smuggle" my raw byte data onto the GPU as a workaround.
https://gist.github.com/BenMcLean/9327690b93690b8a92a921df003f7954
My eventual goal would be to do rendering from a sparse voxel octree in a fragment shader and a storage buffer would be the ideal way to send the octree data from the CPU to the GPU for that, but in the meantime, I'm going to be trying to fit my octree data inside a texture uniform.
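
The gist has the details, but the core of the workaround is something like this (a hedged sketch; get_octree_bytes, the width, and the uniform name are all illustrative):

var bytes: PackedByteArray = get_octree_bytes() # hypothetical raw byte data
var width := 1024
var height := ceili(bytes.size() / (width * 4.0)) # 4 bytes per RGBA8 texel
bytes.resize(width * height * 4) # pad to full texel rows
var img := Image.create_from_data(width, height, false, Image.FORMAT_RGBA8, bytes)
var tex := ImageTexture.create_from_image(img)
material.set_shader_parameter("byte_data", tex)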

@critopadolf

> Because of this issue, I invested a good deal of time working out how to treat a texture uniform as if it was a storage buffer to "smuggle" my raw byte data onto the GPU as a workaround.
>
> https://gist.github.com/BenMcLean/9327690b93690b8a92a921df003f7954
>
> My eventual goal would be to do rendering from a sparse voxel octree in a fragment shader and a storage buffer would be the ideal way to send the octree data from the CPU to the GPU for that, but in the meantime, I'm going to be trying to fit my octree data inside a texture uniform.

Haha, nice to see someone else trying the same thing. I used a Texture2DArrayRD where my struct fits onto a 2x2xN float texture. It sounds like yours is more generalized, though I haven't looked it over.

@TokisanGames

TokisanGames commented Jan 3, 2024

@BenMcLean You can do this more directly, reading and writing 32-bit uints on the CPU and in the shader; 32-bit floats are even easier. There's no need to use an rgba8 format and convert numbers, risking precision loss.

For a working example of transferring non-image data to the shader in production: we transfer a texture array holding up to 1GB (16k^2, 32-bit) of bit-packed uint data. You can look through the code for Terrain3D: our bit-packed control map format, en/decoders, CPU writer, shader and here.
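
A minimal sketch of the 32-bit approach (illustrative values; FORMAT_RF stores each texel as one raw 32-bit value, which the shader can reinterpret after a texelFetch with floatBitsToUint):

var control := PackedInt32Array([0x00010003, 0x00020001]) # hypothetical bit-packed values
var img := Image.create_from_data(control.size(), 1, false, Image.FORMAT_RF, control.to_byte_array())
var tex := ImageTexture.create_from_image(img)
material.set_shader_parameter("control_map", tex)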


Regarding this ticket
Compute -> vertex/fragment without copying to the CPU: doesn't Bastiaan's water compute demo already do this using a Texture2DRD? He says "Instead of copying data from texture to texture to create this history, we simply cycle the RIDs." and in the code I don't see any functions that pull the texture from compute into an Image. It just gets the texture RID from the compute shader and assigns it to the shader material uniform. @BastiaanOlij?
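
For reference, the Texture2DRD path looks roughly like this (available since Godot 4.2; a sketch assuming rd_texture_rid is a texture created on the main RenderingDevice by the compute pass):

var tex := Texture2DRD.new()
tex.texture_rd_rid = rd_texture_rid # no CPU copy; the material samples the RD texture directly
material.set_shader_parameter("wave_texture", tex)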

@kb173

kb173 commented Jan 10, 2024

> Regarding this ticket: compute -> vertex/fragment without copying to the CPU, doesn't Bastiaan's water compute demo already do this using a Texture2DRD? He says "Instead of copying data from texture to texture to create this history, we simply cycle the RIDs." and in the code I don't see any functions that pull the texture from compute into an Image. It just gets the texture RID from the compute shader and assigns it to the shader material uniform.

This issue (as I understand it) is about passing vertex data, i.e. meshes, directly from compute to vertex shader to facilitate procedural mesh generation, right? So workarounds with textures don't work because we still can't create new vertices without involving the CPU.

@clayjohn
Member

Just to add some context here: we already have plans to expose a way to create/update meshes using compute shaders without copying the data to the CPU and back; see #7209.

Using individual SSBOs for each channel will be a bit too cumbersome, I think, and pretty limiting, as it wouldn't allow you to create meshes entirely on the GPU and then use them in rendering.
