Render target / render pass investigation #23

Closed
Kangz opened this issue Jun 28, 2017 · 2 comments

Kangz commented Jun 28, 2017

Render targets / Render passes

Things overlooked: programmable sample position, layered rendering.

Basically:

  • All APIs require the sample count and formats of the attachments at pipeline creation time.
  • D3D12 is "immediate mode" and you can set render targets, clear, resolve, discard at any time.
  • In Metal, render targets are part of the encoder state, and clears, resolves and discards must be declared at encoder creation time. Probably because it helps mobile tilers a lot.
  • Vulkan is similar to Metal but adds a "renderpass" concept declaring multiple passes at the same time, so that some of the attachments can be kept in tile memory only.

D3D12

In D3D12, attachments are defined by Render-Target View (RTV) descriptors living in a CPU RTV descriptor heap. These descriptors are created by ID3D12Device::CreateRenderTargetView and defined by a D3D12_RENDER_TARGET_VIEW_DESC. I didn't look at every texture dimension, but you seem to be able to select the mip level, array slice, and depth (for a 3D texture) to render to. For depth-stencil, things are similar but with Depth-Stencil Views (DSV).

RTV and DSV descriptors are then used in ID3D12GraphicsCommandList::OMSetRenderTargets to set the current attachments, without any other information. Clearing, discarding and resolving resources are respectively done through ID3D12GraphicsCommandList::ClearRenderTargetView (and ID3D12GraphicsCommandList::ClearDepthStencilView), ID3D12GraphicsCommandList::DiscardResource and ID3D12GraphicsCommandList::ResolveSubresource.

At pipeline creation time, only the formats of the RTVs and the DSV need to be declared in D3D12_GRAPHICS_PIPELINE_STATE_DESC, along with the sample count.
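
To make the "immediate mode" point concrete, here is a minimal Python sketch. It is a toy model, not D3D12 code: `PipelineDesc`, `View` and `CommandList` are hypothetical stand-ins I made up for illustration. The only thing it tries to show is that targets can be set and cleared at any point in the command stream, and that a draw can only be validated against the formats and sample count the pipeline declared.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class PipelineDesc:
    """Hypothetical stand-in for the attachment part of a pipeline description."""
    rtv_formats: Tuple[str, ...]
    dsv_format: Optional[str]
    sample_count: int


@dataclass(frozen=True)
class View:
    """Hypothetical stand-in for an RTV or DSV: a format plus a sample count."""
    fmt: str
    sample_count: int


@dataclass
class CommandList:
    """Toy immediate-mode recorder: no pass structure, just a stream of commands."""
    rtvs: Tuple[View, ...] = ()
    dsv: Optional[View] = None
    log: List[str] = field(default_factory=list)

    def set_render_targets(self, rtvs, dsv=None):
        # Only the views are given here, no load/store information.
        self.rtvs, self.dsv = tuple(rtvs), dsv
        self.log.append("set_render_targets")

    def clear_render_target(self, index):
        # A clear is just another command; it can appear before or after draws.
        self.log.append(f"clear rtv {index}")

    def draw(self, pipeline: PipelineDesc):
        # The pipeline only knows formats and sample count, so that is all we check.
        bound_formats = tuple(v.fmt for v in self.rtvs)
        bound_dsv = self.dsv.fmt if self.dsv is not None else None
        counts = {v.sample_count for v in self.rtvs}
        if self.dsv is not None:
            counts.add(self.dsv.sample_count)
        if bound_formats != pipeline.rtv_formats or bound_dsv != pipeline.dsv_format:
            raise ValueError("attachment formats do not match the pipeline")
        if counts != {pipeline.sample_count}:
            raise ValueError("sample count does not match the pipeline")
        self.log.append("draw")
```

In this model a clear issued between two draws is perfectly legal, which is exactly what the Metal and Vulkan models below disallow.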

Metal

In Metal, render targets are bound for the duration of a MTLRenderCommandEncoder by specifying a MTLRenderPassDescriptor at render encoder creation. Each attachment (color, depth or stencil) is specified with a texture, a mip level, a slice (for arrays) and a depth plane (for 3D textures). Attachments also get a MTLLoadAction (don't care, clear or load) to help optimize memory traffic, as well as a MTLStoreAction (don't care, store and/or resolve). The texture storing the resolved data is specified per attachment with a texture, level, slice and depth as well. A clear value can be provided when the load action is "clear".

At pipeline creation time, the format of each attachment is set separately, but the sample count is set for the whole pipeline; see MTLRenderPipelineDescriptor's doc. I didn't find it in the doc, but presumably the pipeline's attachment formats and sample count must match those of the render encoder's attachments.

Inside the render encoder, you can't clear, resolve or discard attachments, but you can choose the store action for an attachment whose store action was previously set as unknown.
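
A rough Python sketch of that encoder model follows. Again this is a toy, not Metal code: `Attachment`, `RenderEncoder`, `set_store_action` and `end_encoding` are hypothetical names, and the enum values only mirror the load/store choices listed above. It shows load and store actions being fixed at encoder creation, with the one exception of a store action left as unknown and decided before the encoder ends.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional, Tuple


class LoadAction(Enum):
    DONT_CARE = "dont_care"
    CLEAR = "clear"
    LOAD = "load"


class StoreAction(Enum):
    DONT_CARE = "dont_care"
    STORE = "store"
    RESOLVE = "resolve"
    STORE_AND_RESOLVE = "store_and_resolve"
    UNKNOWN = "unknown"  # to be decided later, before the encoder ends


def _needs_resolve(action: StoreAction) -> bool:
    return action in (StoreAction.RESOLVE, StoreAction.STORE_AND_RESOLVE)


@dataclass
class Attachment:
    texture: str  # hypothetical texture name
    load: LoadAction
    store: StoreAction
    clear_value: Optional[Tuple[float, float, float, float]] = None
    resolve_texture: Optional[str] = None


class RenderEncoder:
    """Toy encoder: everything about the attachments is declared up front."""

    def __init__(self, attachments: Dict[str, Attachment]):
        for name, a in attachments.items():
            if a.load is LoadAction.CLEAR and a.clear_value is None:
                raise ValueError(f"{name}: clear load action needs a clear value")
            if _needs_resolve(a.store) and a.resolve_texture is None:
                raise ValueError(f"{name}: resolve needs a resolve texture")
        self.attachments = attachments
        self.ended = False

    def set_store_action(self, name: str, action: StoreAction):
        a = self.attachments[name]
        # Only an attachment declared as UNKNOWN may be changed mid-encoder.
        if a.store is not StoreAction.UNKNOWN:
            raise ValueError("store action was already fixed at encoder creation")
        if action is StoreAction.UNKNOWN:
            raise ValueError("a concrete store action is required")
        if _needs_resolve(action) and a.resolve_texture is None:
            raise ValueError("resolve needs a resolve texture")
        a.store = action

    def end_encoding(self):
        pending = [n for n, a in self.attachments.items()
                   if a.store is StoreAction.UNKNOWN]
        if pending:
            raise ValueError(f"store action still unknown for: {pending}")
        self.ended = True
```

Note there is deliberately no clear or resolve method on the encoder: in this model those can only happen through the load and store actions.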

Vulkan

See both parts of "Renderpass" in these slides

Vulkan has a concept of renderpass where the structure of the rendering algorithm is described to the driver in advance. A renderpass contains a list of attachment formats and sample counts, and a list of subpasses that use these attachments. Think of each subpass as the equivalent of a MTLRenderPassDescriptor. Dependencies between subpasses are expressed explicitly so that the driver can optimize. Vulkan also has a concept of "input attachment", which is an attachment that can potentially stay in tile memory, or in the texture cache, between different subpasses. This allows things like a GBuffer to stay close to the ALU, instead of having to be stored in main memory. Renderpasses are created with vkCreateRenderPass.

Pipelines are created with vkCreateGraphicsPipelines for use at a specific subpass of a renderpass, so that shader code manipulating the tile memory can be emitted.

In command buffers, renderpasses are started, stepped and ended with vkCmdBeginRenderPass, vkCmdNextSubpass and vkCmdEndRenderPass. The attachments must be given when beginning a renderpass as a compatible VkFramebuffer created with vkCreateFramebuffer from a bunch of VkImageViews.
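
Here is a small Python sketch of the renderpass structure and the begin / next / end stepping. It is a toy model with hypothetical names (`AttachmentDesc`, `Subpass`, `RenderPass`, `CommandBuffer`), not the Vulkan API. `tile_only_attachments` is my own simplification: it assumes an attachment can stay in tile memory when it is neither loaded at the start of the renderpass nor stored at the end, which is the GBuffer case described above. A real driver decides this with more information.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class AttachmentDesc:
    fmt: str
    samples: int
    load_op: str   # "load", "clear" or "dont_care"
    store_op: str  # "store" or "dont_care"


@dataclass
class Subpass:
    color: List[int] = field(default_factory=list)   # attachments written
    inputs: List[int] = field(default_factory=list)  # attachments read as input


@dataclass
class RenderPass:
    attachments: List[AttachmentDesc]
    subpasses: List[Subpass]

    def validate(self):
        for sp in self.subpasses:
            for i in sp.color + sp.inputs:
                if not 0 <= i < len(self.attachments):
                    raise ValueError(f"attachment index {i} out of range")

    def tile_only_attachments(self) -> Set[int]:
        # Simplifying assumption: never loaded and never stored means the
        # contents only need to exist while the renderpass is running.
        return {i for i, a in enumerate(self.attachments)
                if a.load_op != "load" and a.store_op == "dont_care"}


class CommandBuffer:
    """Toy recorder for the begin / next subpass / end sequence."""

    def __init__(self):
        self.render_pass: Optional[RenderPass] = None
        self.subpass = -1

    def begin_render_pass(self, render_pass: RenderPass,
                          framebuffer: List[Tuple[str, int]]):
        # The framebuffer is modeled as one (format, samples) pair per image view
        # and must be compatible with the renderpass attachments.
        render_pass.validate()
        expected = [(a.fmt, a.samples) for a in render_pass.attachments]
        if list(framebuffer) != expected:
            raise ValueError("framebuffer is not compatible with the renderpass")
        self.render_pass, self.subpass = render_pass, 0

    def next_subpass(self):
        rp = self.render_pass
        if rp is None or self.subpass >= len(rp.subpasses) - 1:
            raise ValueError("no subpass left to step to")
        self.subpass += 1

    def end_render_pass(self):
        rp = self.render_pass
        if rp is None or self.subpass != len(rp.subpasses) - 1:
            raise ValueError("renderpass must be ended in its last subpass")
        self.render_pass, self.subpass = None, -1
```

For a two-subpass deferred renderer where subpass 0 writes two GBuffer attachments (cleared, not stored) and subpass 1 reads them as input attachments while writing a stored final color, this model reports the two GBuffer attachments as tile-only.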

Suggestion for WebGPU

The APIs all need the attachment formats and sample count for creating pipelines, so let's do that (specifying a subpass of a renderpass also gives that information). Metal and Vulkan need renderpasses explicitly started and ended with load and store actions to help with tilers, so let's do that.

The only open question is whether we should have Metal-style single passes or Vulkan-style render passes. Some people [needs citation] have seen more than 30% improvement in power and perf on mobile when using Vulkan render passes, so I think we should go in that direction.

kvark commented Jun 29, 2017

The APIs all need the attachment formats and sample count for creating pipelines, so let's do that
Metal and Vulkan need renderpasses explicitly started and ended with load and store actions to help with tilers, so let's do that.

Agreed, that seems straightforward.

Some people [needs citation] ...

Here is an excerpt from Obsidian API doc:

during the "Vulkan Game Development on Mobile" session at GDC, Hans-Kristian (of ARM) showed 30% FPS improvement and 80% bandwidth reduction from using sub-passes in a deferred renderer, when testing on Galaxy S7

The full talk can be seen on YouTube.

Kangz commented Sep 2, 2021

Render passes in WebGPU stabilized a long time ago. There is another investigation for a subpass-like feature in #435
