This repository has been archived by the owner on Oct 22, 2020. It is now read-only.

Entrypoint definition #69

Open
Kamayuq opened this issue Sep 21, 2018 · 6 comments

Comments

@Kamayuq

Kamayuq commented Sep 21, 2018

Have you given any thought to how the binding model on the CPU side of Rust could look? I have been thinking about this and wanted to ask what you would think of having something along these lines on the GPU side:

#[spirv(DescriptorLayout)]
struct InputData
{
	#[spirv(ShaderVisible_PS ...)]
	diffuse_textures: &[Texture],  //dynamic (un)bounded array slice
	
	#[spirv(ShaderVisible_ALL ...)]
	some_constant: u32,
	
	...
}

#[spirv(Interpolants)]
struct Interpolants
{
	color : vec3,
	uv: vec2,
	#[spirv(NoInterpolation)]
	index: u32,
}

#[spirv(vertex)]
fn vertex_shader(..., input: InputData) -> Interpolants
{ ... }

#[spirv(OutputMerger)]
struct OutputData
{
	#[spirv(Blendmode_Additive, ...)]
	color : vec3
}

#[spirv(fragment)]
fn fragment_shader(... , interp: Interpolants, input: InputData) -> OutputData
{
 	let color = input.diffuse_textures[interp.index].Sample(interp.uv);
 	...
}

#[spirv(ShaderPipeline)]
struct VSPSPipeline
{
	descriptor: InputData,
	blendmode: OutputData,
	vertex: typeof(vertex_shader),
	pixel: typeof(fragment_shader),
}

and some code similar to this on the CPU to create and launch a draw:

fn main()
{	
	let some_textures : Vec<Texture> = get_textures();
	let some_constant = 42;
	let input_data = InputData::new(some_textures , some_constant);

	let rendertarget: RenderTarget<vec3> = get_rendertarget();
	let output_data = OutputData::new(rendertarget);
	
	let pipeline = VSPSPipeline::new(input_data, output_data, vertex_shader, fragment_shader);
	
	cmd_buffer.set_pipeline(pipeline);
	cmd_buffer.draw_indexed(...);
}

One reason behind this is that I would like a typesafe interface between CPU and GPU, and maybe even go further down the road of having units of measure and types like NormalMapTexture to bind against. The other reason is that I am really, really sick of writing all that CPU binding code, which is super boring. I am not attached to any particular syntax; maybe you have another proposal?
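Just to make the "types like NormalMapTexture" idea concrete, here is a minimal sketch; all names here, including NormalMapTexture and Meters, are hypothetical and not part of any existing API:

struct Texture; // stand-in for a real texture handle

// A texture that is known at the type level to contain normal-map data.
struct NormalMapTexture(Texture);

// A simple unit-of-measure wrapper: a plain f32 cannot be passed where
// meters are expected without an explicit conversion.
#[derive(Clone, Copy)]
struct Meters(f32);

// A binding struct that only accepts the more specific types, so mixing
// up a normal map with an albedo texture becomes a compile error.
struct MaterialBindings {
    normal_map: NormalMapTexture,
    displacement_scale: Meters,
}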

@MaikKlein
Owner

I am not sure if you are aware, but this is how it currently looks:

#[spirv(fragment)]
fn fragment(
    frag: Fragment,
    uv: Input<N0, Vec2<f32>>,
    time: Uniform<N0, N0, f32>,
) -> Output<N0, Vec4<f32>> {
...
}

It is relatively similar to what you are proposing; I just haven't really looked at the CPU side.

I think most of this should be built on top of rlsl. Of course some things like #[spirv(NoInterpolation)] are still necessary.

I just explicitly set the binding/set, which gives you more flexibility, but I think what you are proposing makes more sense.

#[spirv(DescriptorLayout)]
#[spirv(set = 0)]
struct InputData
{
	#[spirv(binding = 0)]
	some_constant: u32,

	#[spirv(binding = 1)]
	diffuse_textures: &[Texture],  //dynamic (un)bounded array slice
	...
}

Because you need that information anyway. Also, most attributes can probably be hidden by a type:

#[spirv(Interpolants)]
struct Interpolants
{
	color : vec3,
	uv: vec2,
	index: NonInterp<u32>,
}

I think this library should be developed in lockstep with an actual rendering library. I hacked together a framegraph to experiment more on the CPU side of things, although at the moment I am spending most of my time on rlsl; there are a few things that have a much higher priority right now.

I am not sure that I like #[spirv(ShaderVisible_PS ...)]: in rlsl the data can be accessed in both the vertex and the fragment shader, so we would have to check the usage at compile time. Although this might still be the best alternative.

@Kamayuq
Author

Kamayuq commented Sep 22, 2018

Yeah, I have looked at what you have already, and the GPU shader entry point definitions are already how everyone would do them. I am not a huge fan of bind points anymore, just because I have written so much binding and plumbing code in my life that I have really grown sick of it. Now I see language-integrated shading languages (a.k.a. language inception) and the opportunity to finally get rid of it. This guy does something similar for compute shaders and I was in love the first time I saw it: aleacuda
One could still annotate the bindings like in your example (#[spirv(set = 0)], #[spirv(binding = 0)]) if someone really wants to write his or her own plumbing. I personally would just never use it if there is an automatic way, even if it is not the most optimal layout. Little life anecdote: whether we like it or not, if my rendering is not optimal, a new driver or new hardware will fix it for me anyway.

I really do like the idea of hiding interpolation properties behind types; it is much cleaner the way you propose!
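For what it's worth, a wrapper like NonInterp could be as small as a newtype plus a Deref impl. This is only a sketch of the idea, not rlsl's actual definition; the backend would presumably map the type to the SPIR-V Flat decoration:

use std::ops::Deref;

// Marker newtype: the compiler backend would recognize this type and
// emit the no-interpolation decoration instead of requiring an attribute.
pub struct NonInterp<T>(pub T);

impl<T> Deref for NonInterp<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.0
    }
}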

I don't like the shader visibility flags either; maybe we should just drop them entirely and have a different input descriptor binding struct for each shader stage?
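A rough sketch of the per-stage variant; the struct and field names are made up for illustration, and visibility is implied by which struct a resource lives in, so no ShaderVisible_* flags are needed:

struct Texture; // placeholder resource types
struct Sampler;

// Only what the vertex stage needs.
struct VertexInputs {
    view_projection: [[f32; 4]; 4],
}

// Only what the fragment stage needs.
struct FragmentInputs<'a> {
    diffuse_textures: &'a [Texture],
    default_sampler: Sampler,
}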

I have been working on a framegraph myself, RDAG (excuse the ugliness of the C++). At this point I only handle resource management in the graph. Binding is difficult to do concisely because there are passes with many draws and many different pipelines. In my graph I could easily add a shader as another input type and generate that mapping, but that ultimately feels dissonant to me. What one could do instead is have the graph output the binding structure, and then at bind or set-pipeline time the entries with the same name and type are matched and uploaded automatically, even if that comes at some runtime cost. Maybe the bind function could be a macro that looks at the names in both structures (the shader-input one and the one that falls out of the framegraph) and just copies over the matching names.
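To make the name-matching idea a bit more concrete, a very rough sketch; copy_matching_fields! is hypothetical, and a real version would probably be a derive or procedural macro that discovers the common fields itself:

// Copies the listed fields from `src` to `dst`; a real implementation
// would derive the field list automatically instead of spelling it out.
macro_rules! copy_matching_fields {
    ($dst:ident, $src:ident, [$($field:ident),* $(,)?]) => {
        $( $dst.$field = $src.$field.clone(); )*
    };
}

#[derive(Default, Clone)]
struct GraphOutputs {
    shadow_map: u32, // stand-ins for real resource handles
    scene_color: u32,
    frame_index: u64,
}

#[derive(Default)]
struct ShaderInputs {
    shadow_map: u32,
    scene_color: u32,
}

fn main() {
    let graph = GraphOutputs { shadow_map: 1, scene_color: 2, frame_index: 42 };
    let mut inputs = ShaderInputs::default();
    // Only fields that exist (by name and type) in both structs are copied.
    copy_matching_fields!(inputs, graph, [shadow_map, scene_color]);
    assert_eq!(inputs.scene_color, 2);
}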

@MaikKlein
Owner

I have been working on a framegraph myself, RDAG

I'll definitely check it out! I ran into some design issues, e.g. how I handle external resources like those from an asset manager, or how I get data inside the passes. It is all a bit hacky right now.

I personally would just never use it if there is an automatic way, even if it is not the most optimal layout. Little life anecdote: whether we like it or not, if my rendering is not optimal, a new driver or new hardware will fix it for me anyway.

Interestingly, when I started to write rlsl everything was automatic, but I moved to something more explicit. I am definitely open to a more automatic solution, though.

I don't like the shader visibility flags either; maybe we should just drop them entirely and have a different input descriptor binding struct for each shader stage?

We might be able to compute them if rlsl knows about the pipelines, but I am not sure that would be a good idea. Also, we generally would need a way to move data from rlsl to rustc.

At this point I only handle resource management in the graph. Binding is difficult to do concisely because there are passes with many draws and many different pipelines.

The way I approach this is by having my passes do only one thing. A pass can only have one type of pipeline, but you can easily combine them like this:

pub fn render_pass(fg: &mut Framegraph<Recording>, resolution: Resolution) {
    let triangle_compute = TriangleCompute::add_pass(fg);
    let triangle_data = TrianglePass::add_pass(fg, triangle_compute.storage_buffer, resolution);
    Presentpass::add_pass(fg, triangle_data.color);
}

I end up with way more nodes, but I think that is a good thing because it gives me more room to reorder/merge the passes.

I also infer the vertex input layout and the descriptor layout from the types and then runtime-check them against the shader when I try to load it. Not perfect, but it works right now. I definitely would like a more reliable bridge.
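A stripped-down sketch of that kind of load-time check; the reflection data here is a made-up struct, not any specific crate's API:

// What the Rust side expects, derived from the vertex type.
#[derive(Debug, PartialEq)]
struct VertexAttribute {
    location: u32,
    size_in_bytes: u32,
}

// Pretend reflection of the compiled shader; in reality this would come
// from a SPIR-V reflection library.
fn reflect_shader_inputs() -> Vec<VertexAttribute> {
    vec![
        VertexAttribute { location: 0, size_in_bytes: 12 }, // vec3 position
        VertexAttribute { location: 1, size_in_bytes: 8 },  // vec2 uv
    ]
}

fn main() {
    let expected = vec![
        VertexAttribute { location: 0, size_in_bytes: 12 },
        VertexAttribute { location: 1, size_in_bytes: 8 },
    ];
    // The runtime check: fail loudly at shader load time if layouts diverge.
    assert_eq!(expected, reflect_shader_inputs(), "vertex input layout mismatch");
}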

@Kamayuq
Author

Kamayuq commented Sep 22, 2018

how I handle external resources like those from an asset manager

Immutable resources (like textures and meshes for drawing) I would not manage at all; just capture them in the execution lambda. Most of the resources in the graph are changing and transient (i.e. the graph manages their lifetime). And then there are things like the framebuffer, temporal AA history, and cached shadow maps that I would treat as "external": in my code I just import them into the graph, but I don't do any lifetime management for them.
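A tiny sketch of what "just capture them in the execution lambda" could look like; the framegraph API here is invented purely for illustration:

use std::sync::Arc;

struct Texture;       // stand-in for a GPU texture handle
struct CommandBuffer; // stand-in for a command buffer

// A pass is just a boxed closure the graph runs at execute time.
struct Framegraph {
    passes: Vec<Box<dyn Fn(&mut CommandBuffer)>>,
}

impl Framegraph {
    fn add_pass(&mut self, pass: impl Fn(&mut CommandBuffer) + 'static) {
        self.passes.push(Box::new(pass));
    }
}

fn main() {
    let mut fg = Framegraph { passes: Vec::new() };

    // Immutable asset: not tracked by the graph at all, just moved into
    // the closure that records the draw.
    let diffuse: Arc<Texture> = Arc::new(Texture);

    fg.add_pass(move |_cmd| {
        // use `diffuse` here; the graph never needs to know it exists
        let _ = &diffuse;
    });

    for pass in &fg.passes {
        pass(&mut CommandBuffer);
    }
}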

Also we generally would need a way to move the data from rlsl to rustc.

Isn't having the same data layout good enough? rlsl could just generate the bytecode for the shader and some include for rustc (literally using the same struct). Or what are the issues with that approach?
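In other words, something like a single shared module that both compilers consume; just a sketch of the idea, with made-up field names:

// shared.rs -- compiled by rustc for the CPU side and, hypothetically,
// by rlsl for the GPU side, so the layout is defined exactly once.
#[repr(C)]
pub struct FrameConstants {
    pub time: f32,
    pub resolution: [f32; 2],
    pub frame_index: u32,
}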

A pass can only have one type of pipeline

We rejected such an approach as impractical because in the game engines I worked on we had thousands of artist-created shaders and pipelines, and we would just end up with too many nodes that don't really convey any useful information. Unity's ScriptableRenderPipeline solves part of rendering the meshes quite well; they build filters for culling etc. And I believe these three problems are orthogonal:

  1. Shader binding and compilation
  2. Frame-resource management (mostly UAVs and rendertargets)
  3. Collecting and drawing meshes from the world

I definitely would like a more reliable bridge.

Unreal has the concept of UniformBuffers (they are sort of like descriptor tables); those are defined in the host language (in this case C++) and then they are literally printed in front of each shader.

One could also do it the other way around: define it in the guest language and then import that description into the host. Either way is fine, I think, especially in this case where the host (Rust) and guest language (rlsl) are so similar.
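A very rough sketch of the "define it on the host and print it in front of the shader" direction; the emitted GLSL and all names here are purely illustrative:

struct UniformMember {
    name: &'static str,
    glsl_type: &'static str,
}

// Builds the text of a uniform block that would be prepended to the
// shader source before compiling it.
fn emit_uniform_block(block_name: &str, members: &[UniformMember]) -> String {
    let mut out = format!("layout(set = 0, binding = 0) uniform {} {{\n", block_name);
    for m in members {
        out.push_str(&format!("    {} {};\n", m.glsl_type, m.name));
    }
    out.push_str("};\n");
    out
}

fn main() {
    let members = [
        UniformMember { name: "time", glsl_type: "float" },
        UniformMember { name: "resolution", glsl_type: "vec2" },
    ];
    println!("{}", emit_uniform_block("FrameConstants", &members));
}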

@MaikKlein
Owner

Unity's ScriptableRenderPipeline solves part of rendering the meshes quite well; they build filters for culling etc. And I believe these three problems are orthogonal:

I definitely should look more closely at the ScriptableRenderPipeline; I only went over it very briefly.

We rejected such an approach as impractical because in the game engines I worked on we had thousands of artist-created shaders and pipelines.

For me, passes shouldn't convey much meaning once they are compiled; for example, a pass could be a simple image copy. But you can of course conceptually combine many smaller passes into a bigger pass. At least that is the idea.

Isn't having the same data layout good enough? rlsl could just generate the bytecode for the shader and some include for rustc (literally using the same struct). Or what are the issues with that approach?

I meant generally: rlsl has much more information than rustc has, and maybe some of that information could be used for codegen. For the layout we need to make sure that everything is aligned correctly. I know Rust has a few attributes, but I need to look at them more closely; for example #[repr(C, align(16))] might be good enough. I also would want to make sure at compile time that everything has the correct layout, but that is trivial to check in rlsl.
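For reference, a small example of the attribute plus a compile-time size check on the rustc side; the struct and its expected size of 80 bytes are just illustrative:

// 16-byte alignment via repr, roughly matching a std140-style constant buffer.
#[repr(C, align(16))]
struct Globals {
    view_projection: [[f32; 4]; 4], // 64 bytes
    time: f32,                      // 4 bytes + padding up to the next 16
}

// Poor man's static assertion: this fails to compile if the size changes.
const _: [(); 80] = [(); std::mem::size_of::<Globals>()];

fn main() {
    println!("size = {}", std::mem::size_of::<Globals>());
}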

Thanks for opening the issue; it is always good to see different perspectives, and I agree that something more automatic would be very beneficial.

@Kamayuq
Author

Kamayuq commented Sep 27, 2018

For me, passes shouldn't convey much meaning once they are compiled; for example, a pass could be a simple image copy.

Having another pass for a single image copy is fine; in fact, everything that is involved in native passes or transitions should be in its own pass. I was just referring to the mesh-drawing passes, where different objects in the world can have different materials: there I have seen cases where the average material reuse was around two draws, which would explode into thousands of passes, and reordering them might no longer be feasible for efficiency reasons.

For the layout we need to make sure that everything is aligned correctly. I know Rust has a few attributes, but I need to look at them more closely; for example #[repr(C, align(16))] might be good enough.

I believe one can also implement one's own attribute and packing rules with procedural macros, if necessary. Or, for the beginning, one could keep the constant buffer opaque and focus on the views as well as SRVs, samplers, and UAVs.

Thanks for opening the issue; it is always good to see different perspectives, and I agree that something more automatic would be very beneficial.

You are welcome. I am not a super-specialist myself, but I will do some research as well. If I come up with something interesting I will let you know.
