This page is initially a dump/transfer from my personal redmine, where I tracked everything from super long-term goals (some of these items are 18 months old) to little tasks & feature requests I've received, as well as bugs - which are now tracked as github issues.
I've tried to organise the tasks logically, and I've marked up tasks which I reckon are small, isolated or simple enough for someone to tackle as a way to get a handle on the project. Some of these tasks are incredibly simple and the only reason I haven't gotten around to them is that they haven't been a priority, so don't be surprised to see trivial tasks listed :).
Since it turns out wiki pages charge the very reasonable rate of $0/page, I've linked out to separate pages for any topics where I have more of a braindump or ideas to put down.
I think all of the tasks here are understandable but let me know if anything is unclear!
Improvements to existing functionality
- Some ASM instructions & operands aren't implemented yet
- No support yet for debugging GS, HS or DS. These shaders have more complex execution patterns than VS, PS and CS, as they aren't a simple input-output mapping. The primary problems to solve here are getting the inputs and handling the overall control flow; the core loop of iterating through instructions and emulating them will largely just work.
- Compute shaders are simulated in isolation as if only one thread was running. In some cases this assumption is fine, in others it's not. Simulating an entire thread group either in lockstep or randomly (to simulate timing issues) is probably not practical unless the entire simulation is offloaded to the GPU (potentially doable, but a lot of work).
- Per-instruction breakout view highlighting input and output registers.
- We don't display data after the HS. There are many issues with doing this - both actually getting at the data (custom DS with matching signature?) and displaying it sensibly at different frequencies across the mesh.
- It would be good to have the mesh viewer able to collapse down headers, so that you could look at only the column you're interested in. Better options for highlighting to visually make sense of the giant table of data would be great. small task
- The texture viewer doesn't currently allow you to drop more than one texture tab side-by-side. This would be very valuable to compare two textures or view a whole gbuffer, and should be supported.
- The texture list is rather anemic. It should probably be split out to a separate window, and have more flexible filtering support, thumbnail previews (either of one highlighted texture or a thumbnail grid).
- The callstack pane splitter doesn't drag up from the bottom, you have to double click it (because that's how the underlying control works). small task
- At the moment the data in the API view is just the text-serialised form of the function calls, so isn't very useful or programmatically usable. The data should be in a machine format so that we can do things like link resources to the pipeline state etc.
- The current texture usage bar is good but could be expanded to be able to show the dependencies and dependents for a drawcall (ie. anything that uses as input the outputs of this draw, or vice-versa, and recursively).
- The texture usage markers need to scale better, currently important markers can be missed because they're skipped drawing as too close to another, when you're zoomed out. There's not necessarily enough room for all of them, but there needs to be some indication.
- Add a real autos window that will show registers from the few instructions before/after and their contents. small task
- I'm sure there's a better way to trigger editing shaders from the shader viewer than going back to the pipeline state and clicking edit. It would also be nice to choose to edit only the instance of the shader on that drawcall, or globally (currently it's always global).
- The event browser should also have a filter-as-you-type box that will filter the tree down to what is specified, in addition to the existing find box. small task
- It would be useful to have previous/next drawcall buttons that jump regardless of any markers to whichever is the next drawcall. Also have the ability to choose whether this should apply to only draw calls rather than clears and other events.
- There are many cases where invalid data passed to D3D (out-of-bounds indices or lengths, null pointers where they aren't valid, etc.) isn't being handled robustly. RenderDoc should not crash no matter what the app does, aside from things out of its control like driver crashes due to invalid work or GPU load. People use debuggers when they have bugs, not when they have a perfect program. small task
- Where relevant, the standard multisampling pattern of offsets is assumed. We should readback from the GPU what the actual pattern is, per (count,quality) combination.
- When marking resources as used, they are marked whenever they are bound at the point of a drawcall, dispatch, etc. This means even a Dispatch will mark VS/PS resources as used, incorrectly. small task
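As a tiny sketch of the fix here - only mark bindings for the stages that the action type actually uses. All names below (the action strings, the bindings layout) are made up for illustration, not RenderDoc's actual types:

```python
# Stage-aware resource usage marking: a Dispatch only marks CS bindings,
# a Draw only marks the graphics stage bindings.

DRAW_STAGES = ["VS", "HS", "DS", "GS", "PS"]
DISPATCH_STAGES = ["CS"]

def mark_used(action_type, bindings, used):
    """bindings: dict of stage -> list of resource ids bound at that stage.
    Adds the resources relevant to this action type to the 'used' set."""
    stages = DISPATCH_STAGES if action_type == "Dispatch" else DRAW_STAGES
    for stage in stages:
        used.update(bindings.get(stage, []))
    return used
```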
- Within the D3D driver whenever we need to do some operations that will change state we push and pop the entire D3D state vector. If this becomes a bottleneck at any point (either for overlay in-process or in the UI somewhere) there should be a selective push/pop tracker that will only restore the state that's modified (possibly a small subset).
- Drawcall timers are still not accurate, needs more investigation as to how to get truly reliable timestamps out.
- It would be good to expose the existing functionality to get texture data back - probably with an option to have it block-decompressed for you. This would be useful for automated use of RenderDoc - both for end users and for testing during dev.
- The UI in general is fairly synchronous. This isn't too much of a problem on a small log locally, but on large logs or especially replaying remotely (across a network) the UI can become sluggish and unresponsive. There are a few things that can be done to mitigate this:
- The UI shouldn't try to synchronously paint anything - any Invoke() calls right now to render to the screen should be asynchronous.
- Any requests/paints/etc should be able to be cancelled or discarded so that e.g. the user doesn't have to wait for the whole UI to update and paint after changing events before they can switch to a new event. This would allow faster jumping when people don't need to see the results and are just browsing.
- For long UI operations there should be a spinner or progress bar that shows progress (especially for long tasks like shader debugging), so that the UI can be responsive while the user is aware that something is happening.
- For the network specifically, I suspect that caching contents locally on each end and transferring only the delta for a rect/region that has changed would speed things up. For drawing small objects without much screen coverage this could reduce the data that needs to be transferred by an order of magnitude.
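As a rough illustration of the delta idea - both ends keep the last image, and only the bounding rect of changed pixels gets sent. This is purely a sketch with a toy 2D-list pixel representation, not actual RenderDoc code:

```python
def dirty_rect(old, new):
    """Return the (x0, y0, x1, y1) bounding box of differing pixels,
    or None if the images are identical. old/new: equal-size 2D lists."""
    rows = [y for y, (a, b) in enumerate(zip(old, new)) if a != b]
    if not rows:
        return None
    cols = [x for y in rows for x in range(len(old[y])) if old[y][x] != new[y][x]]
    return (min(cols), min(rows), max(cols) + 1, max(rows) + 1)

def encode_delta(old, new):
    """Package up only the changed region for transfer."""
    r = dirty_rect(old, new)
    if r is None:
        return None
    x0, y0, x1, y1 = r
    return (r, [row[x0:x1] for row in new[y0:y1]])

def apply_delta(old, delta):
    """Patch the cached image on the receiving end."""
    if delta is None:
        return old
    (x0, y0, x1, y1), patch = delta
    out = [row[:] for row in old]
    for dy, prow in enumerate(patch):
        out[y0 + dy][x0:x1] = prow
    return out
```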
- Every time we replay part of a frame, we apply the initial contents of all resources to ensure we have a clean start. For some (many) resources, they haven't changed in the frame at all so this initial contents apply is redundant and wastes time. small task
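A minimal sketch of how this could work - track which resources the frame's events actually write, and only re-apply initial contents for those (illustrative names only, not the real structures):

```python
def modified_resources(frame_events):
    """frame_events: list of (event_name, set_of_written_resource_ids)."""
    written = set()
    for _, writes in frame_events:
        written.update(writes)
    return written

def apply_initial_contents(resources, frame_events, apply_fn):
    """Apply initial contents only to resources the frame writes to;
    untouched resources already hold the right data. Returns the count
    of applies actually performed."""
    dirty = modified_resources(frame_events)
    applied = 0
    for res in resources:
        if res in dirty:
            apply_fn(res)
            applied += 1
    return applied
```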
- In situations with memory pressure - this comes up most often on 32bit applications - the overheads from RenderDoc are too high and can cause out of memory problems. There are a lot of options to heavily reduce this pressure, but they mostly trade off memory against performance so it might be best to have these be configurable (perhaps default-enabled for 32bit programs?).
- Memory overhead primarily comes from a) CPU-side shadow copies of buffers or textures, so that we can skip fetching initial resource contents. b) GPU copies of resources at the start of a frame that might be modified in the frame, so that we can serialise it out at the end of the frame.
- To combat (a) we can place a memory limit on shadow copies and start kicking buffers out and falling back to lazily fetching the contents at frame start.
- We also store a lot of chunks with CPU-side data, e.g. for maps, updates and such. These chunks can likewise be kicked out if necessary to save space and fetched lazily.
- An alternative to kicking out data, if it proves a valuable tradeoff, would be to serialise data that might be used out to disk, and bring it back in later to merge into the final capture file. This means more disk I/O, but depending on the OS there may be kernel operations to make shifting data file-to-file efficient.
- Potentially with the above, (b) becomes even worse as we need to lazily fetch even more data. Since we don't know which data we need to fetch before the frame, we need to be conservative and save it all, but instead of keeping these copies as GPU objects we can copy them out to CPU memory immediately and save them to disk. This would be quite a hit, but it means you perhaps only need enough memory free to copy the largest resource - possibly less if you do partial copies out of GPU memory.
- At the moment we have the whole capture file in memory (in chunks) before flushing, but this is purely for convenience and doesn't need to happen at all.
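To sketch the memory-limited shadow copy idea from (a) above - a budgeted cache that kicks out the least-recently-used buffers and falls back to a slower lazy fetch for evicted ones. This is a hypothetical structure, not the real implementation:

```python
from collections import OrderedDict

class ShadowCache:
    """Shadow copies under a byte budget; LRU entries are evicted and
    later reads of them fall back to the slow fetch path (e.g. a GPU
    readback at frame start)."""

    def __init__(self, budget_bytes, fetch_fn):
        self.budget = budget_bytes
        self.fetch = fetch_fn        # slow fallback for evicted entries
        self.cache = OrderedDict()   # res_id -> bytes, in LRU order
        self.used = 0

    def put(self, res_id, data):
        if res_id in self.cache:
            self.used -= len(self.cache.pop(res_id))
        self.cache[res_id] = data
        self.used += len(data)
        while self.used > self.budget:
            _, evicted = self.cache.popitem(last=False)  # evict LRU entry
            self.used -= len(evicted)

    def get(self, res_id):
        if res_id in self.cache:
            self.cache.move_to_end(res_id)  # mark as recently used
            return self.cache[res_id]
        return self.fetch(res_id)           # slow path for evicted data
```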
Replay UI improvements
- There's basically been no optimisation done so there's a ton of low hanging fruit here. I don't want to put much down as really profiling needs to happen first before any optimisation.
- One algorithmic thing that can change is that when moving forward in the frame, we should be able to just replay the parts that we're advancing by, rather than always replaying from the start of the frame.
- To expand on this, we could do something similar going backwards if we saved snapshots of state + resource contents at different points in the frame, so that we can efficiently rewind to one and then replay just the delta. This could work by detecting passes, saving out only the couple of render targets that change in each pass, and making a snapshot/checkpoint at the start of the pass.
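The checkpoint-and-rewind idea above could look something like this - a sketch with hypothetical snapshot/apply/execute callbacks standing in for the real state capture and replay machinery:

```python
import bisect

class Replayer:
    """Jump to an event by restoring the nearest checkpoint at or before
    it, then replaying only the delta - rather than the whole frame."""

    def __init__(self, events, snapshot_fn, apply_fn, exec_fn):
        self.events = events          # event ids, ascending order
        self.snap, self.apply, self.exec = snapshot_fn, apply_fn, exec_fn
        self.checkpoints = {}         # event id -> saved state snapshot

    def checkpoint(self, event_id):
        # e.g. called at the start of each detected pass
        self.checkpoints[event_id] = self.snap(event_id)

    def jump_to(self, event_id):
        starts = sorted(e for e in self.checkpoints if e <= event_id)
        start = starts[-1] if starts else self.events[0]
        if start in self.checkpoints:
            self.apply(self.checkpoints[start])   # rewind to checkpoint
        # replay only the events between the checkpoint and the target
        i = bisect.bisect_left(self.events, start)
        j = bisect.bisect_right(self.events, event_id)
        for e in self.events[i:j]:
            self.exec(e)
        return self.events[i:j]       # the events actually replayed
```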
- Attaching to running programs
The Render Doctor (High-level performance/hazard/bug analysis)
- Any parameters specified at creation generally aren't visible in RenderDoc, unless they correspond to some aspect of the current pipeline state. In particular, texture views that cast the format or only show part of a resource (some mips, some slices) aren't handled very gracefully.
- The texture viewer needs a kind of 'sticky' mode, where you can select a pixel and it will do its best to continue showing that pixel so you can flip between inputs and outputs, or between targets, etc, to trace a pixel through the frame.
- A new overlay for the texture viewer to display mip usage over an object. One difficulty in this is getting the uv mapping, since there are no guarantees about which pixel shader input should be used as UVs (there may not even be one). This is also non-trivial on the UI side because you'd need to be able to choose the texture slot - maybe the overlay dropdown should have nested options?
- Related to the above, an overlay on textures instead of on outputs that shows the unwrapped model in UV space mapped to the texture. Same problem as the mip usage overlay - how to determine the UV inputs to the pixel shader in the general case.
- Quick-diagnosing overlays like stencil/depth test fail for backface culling, write mask or potentially blending/blend mode. small task
- You should be able to visualise cubemaps either by viewing them as a flat cube from the inside (standard lookup), or by placing them onto a cube or sphere that you can look at/spin around. small task
- Export options in the mesh viewer should be expanded. The only difficulty with these is marking up streams like UVs etc but this can be done by educated guesswork or configuration.
- Export to OBJ format small task
- Export to FBX small task
- Much requested feature - highlighting redundant API calls. We have all the data we need as we already track the pipeline state, so all we need to do is note down per API event whether the pipeline state it's modifying has actually changed. Then figure out a UI to display it in a friendly manner :).
- The ability to diff two events against each other. This would be particularly powerful if you could do this between two captures, and it would tell you the difference in state, textures bound, shaders bound, etc. Would also be really powerful in combination with automation support - in theory you could have a script that takes two captures perhaps from a regression test, and diff them down to tell you exactly what part of the rendering changed.
- There currently isn't anything like this, but having a more advanced overlay in-process might be desirable to some - e.g. a bit more timing/performance information for people using RenderDoc as a profiler (when that is practical). It would speed up the iteration loop compared to having to capture + analyse every time. However, maybe many people have this kind of thing built in already and it wouldn't be useful?
- Editing state & resources. The whole pipeline state should be editable, both in terms of changing the state at a given drawcall, as well as (where relevant) changing a property across the whole frame.
- Extremely basic support for this already exists, in terms of replacing a shader across the frame with an edited shader.
- This shouldn't just be limited to changing a fixed-function property of the pipeline; you should also be able to bind different resources to different slots, or even bring in a resource that didn't exist, load it up and bind it.
- Currently the shader viewer will show HLSL if available, and D3D bytecode. Where possible it would be nice to also display IHV-specific IL or ASM. I think the only relevant & available instance of this is AMD IL and GCN ISA bytecode, but it would be great to support in general.
- A RenderDoc API could expose extremely powerful features, and provide better ways for games and engines to integrate with RenderDoc if they wish, beyond existing functionality like perf markers or resource names.
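Going back to the redundant API call highlighting above - the core tracking is simple enough to sketch. The call/state representation here is purely illustrative:

```python
def find_redundant(calls):
    """calls: list of (call_name, {state_key: value}) in submission order.
    Replay the writes against a tracked state dict and return the indices
    of calls that changed nothing."""
    state, redundant = {}, []
    for i, (_, writes) in enumerate(calls):
        if all(state.get(k) == v for k, v in writes.items()):
            redundant.append(i)   # every value it sets was already set
        state.update(writes)
    return redundant
```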