Assorted updates to the batch renderer #1961

mosra · 2022-12-06T00:17:53Z

Motivation and Context

Contains various minor updates and fixes that don't depend on changes from #1947, thus hopefully possible to be reviewed & merged in isolation:

Ability to bake-in an initial transformation for added hierarchy (i.e., without it being overwritten next time the hierarchy transform is updated). Will be used for specifying initial scale and coordinate frame correction and should fix 99.7% of nasty transformation bugs I had in the initial gfx-replay demo screenshots in September.
Opt-in flag to generate texture mipmaps. This is something that Vulkan will not give us for free in the driver (and so the assets will need to contain full mip pyramids themselves), so it's not enabled by default, but it's currently used to have a pixel-perfect match with the classic renderer.
Adds a more efficient way to retrieve rendered images -- instead of putting them into a newly-allocated memory, it reads them to a pre-allocated view.
Extracted projection matrix calculation out of CameraSensor to a utility function on the CameraSensorSpec so I can use it even in scenegraph-less contexts. I felt like redoing all that calculation from scratch on my side would be too error prone, so I did it like this, and simplified the code slightly in the process. I hope it doesn't conflict with the upcoming/ongoing sensor refactor. The new API entrypoint isn't used anywhere at the moment, but it'll be in the ReplayBatchRenderer that depends on Decouple gfx-replay player from scene graph; integrate the batch renderer into it #1947.
Adds gfx::RenderTarget::blitRgbaTo() which is like blitRgbaToDefault() but with an ability to choose an arbitrary rectangle of an arbitrary framebuffer, which I need for feature parity in the batch renderer replay GUI. The blitRgbaToDefault() is now built on top of this new API.

How Has This Been Tested

The first two parts have a corresponding update in GfxBatchRendererTest.

The projection matrix is I assume tested somewhere already. I tried to exactly match the original behavior, including taking the hfov override from CameraSensor instead of CameraSensorSpec, please let me know if something feels off.

As there's already an API to change the transformation of a hierarchy right after it's added, this parameter was kinda redundant. But I need to have a way to set an initial "coordinate frame correction" transformation and keep it there, forever, which is what this parameter could do instead -- so now it applies the transform to the immediate child nodes instead, leaving the top-level transformation as an identity.

Need this to achieve a pixel-perfect correspondence with the classic renderer.

Need this for a scenegraph-less rendering.

0mdc · 2022-12-06T05:36:07Z

Looks good!

It was in the CUDA interop test already but that one isn't ran on non-CUDA builds.

I don't intend these to be used that much, nevertheless it's good to not do stupidly inefficient things such as allocating and then copying a huge buffer every frame just because "it's fine for a test". It also allows for easy querying of image subranges, which is a nice bonus.

Generalizes blitRgbaToDefault(), which was a bit too limited.

0mdc · 2022-12-07T15:25:55Z

src/esp/gfx/RenderTarget.cpp

@@ -180,22 +180,22 @@ struct RenderTarget::Impl {

  void renderExit() {}

-  void blitRgbaToDefault() {
+  void blitRgbaTo(Mn::GL::AbstractFramebuffer& target,
+                  const Mn::Range2Di& targetRectangle) {


Thanks for that! I was doing the same thing on my side to visualize all batch renderer environments side-by-side.

0mdc · 2022-12-07T15:36:47Z

src/esp/gfx_batch/RendererStandalone.cpp

@@ -164,13 +165,42 @@ Mn::Image2D RendererStandalone::colorImage() {
                                  colorFramebufferFormat());
 }

+void RendererStandalone::colorImageInto(const Magnum::Range2Di& rectangle,
+                                        const Mn::MutableImageView2D& image) {


Nitpick: I would rename colorImageInto and depthImageInto to something a bit more explicit.

…rge it cleanly. Because there's a thing called clang-format which makes any clean merges just straight impossible. Yay!

…erer into it (#1947) * Decouple SceneNode from replay player callbacks. * Move replay player callbacks into a struct. * Decouple gfx-replay player from scene graph. * Code format. * Standardize header guard in PlayerCallbacks.h. * Minor fixes. * Clean up code. * Code clean up. * Minor fixes. * Change createSceneGraphPlayerCallbacks description. * Avoid copying PlayerCallbacks when constructing player instances. Other minor fixes. * gfx_batch: change addMeshHierarchy() to bake in the initial transform. As there's already an API to change the transformation of a hierarchy right after it's added, this parameter was kinda redundant. But I need to have a way to set an initial "coordinate frame correction" transformation and keep it there, forever, which is what this parameter could do instead -- so now it applies the transform to the immediate child nodes instead, leaving the top-level transformation as an identity. * gfx_batch: actually test the images for import success. * gfx_batch: opt-in flag to generate image mipmaps. Need this to achieve a pixel-perfect correspondence with the classic renderer. * gfx_batch: forgot to assert on scene ID bounds here. * sensor: add projection matrix calculation directly on CameraSensorSpec. Need this for a scenegraph-less rendering. * [wip] dirty, DO NOT LOOK * gfx_batch: wait what is the product() nonsense?? * gfx_renderer: properly test the depth image retrieval too. It was in the CUDA interop test already but that one isn't ran on non-CUDA builds. * gfx_batch: add a more efficient way to get images on the CPU. I don't intend these to be used that much, nevertheless it's good to not do stupidly inefficient things such as allocating and then copying a huge buffer every frame just because "it's fine for a test". It also allows for easy querying of image subranges, which is a nice bonus. * gfx: add RenderTarget::blitRgbaTo(framebuffer, rectangle). Generalizes blitRgbaToDefault(), which was a bit too limited. * [wip] more, now even a GUI * Revert everything that went into #1961 in order to merge it cleanly. Because there's a thing called clang-format which makes any clean merges just straight impossible. Yay! * Actually I do need to link. * sim: change how AbstractReplayRenderer classes get implemented. The public base class interface is now non-virtual, which allows it to perform various checks before delegating into the concrete implementation. Which allows the implementations to omit all checks that would otherwise be duplicated between them. * sim: change loadAndCreateRenderAssetInstance() back to return SceneNode*. * gfx/replay: use a virtual interface instead of heavy std::function. Also putting that back to a single file -- a Player can't really be used without the implementation class anyway so a separate header doesn't make sense. The light setup and semantic ID setting is optional, with the default implementation being a no-op. Also changed gfx::replay::GfxReplayNode* to gfx::replay::NodeHandle. It's shorter and the pointer is a part of the typedef, thus removing a potential ambiguity about whether one should use a * with this type or not. * clang! format! * replayer: further optimize keyframe loading. I felt physical pain when the string views had to be turned into a std::string, and then with a strlen() applied on it again. It's megabytes of data we're dealing with here! * sim: add composite file preloading API to ReplayBatchRenderer. It's a no-op by default (and for the classic renderer), but it's an important piece of functionality for the batch renderer. * replayer: make it possible to pause the simulation. * sim: cleanup. Haha silly me, inherited the batch player implementation from the scenegraph one, so when I removed the dummy virtual overrides it attempted to interpret the node handles as SceneNode* and caught fire. * sim: don't clear in the ReplayRenderer implementation. Doing it in the replayer instead, which allows me to exclude it from CPU profiling and lets me see the actual CPU overhead coming from the implementation, not waits for a next vsync. * gfx_batch: reset framebuffer viewport back to its original state after. * See, Clang Format, you made me not care for indentation anymore. * gfx/replay: add an API for deleting all instances. Even though deleting one by one wasn't really implemented yet for the batch renderer, it'd be extremely inefficient in any data-oriented implementation. This is a better approach, and is implemented there already. * gfx_batch,replayer: tiny fixes for SERIOUS bugs. * Clang Format is so advanced that puny humans can't guess what is The One and Only Correct Way To Format The Code. * Fix some valid Clang Tidy complaints. * Suppress / work around the remaining -- insane -- Clang Tidy suggestions. * C++17 was a mistake, as I always said. * Don't build the replayer utility on Emscripten. I don't see any practical use for it there. * Fix a TODO related to crash on exit. * Clean up, clarify TODOs, revert useless changes. * replayer: repeat the replays by default. Pass --once to the app to repeat just once. Useful for benchmarking the renderer without the JSON parsing overhead, which is apparently significantly higher than all operations the renderer needs to do :D * Let the machine make the code soulless again. * Forgot to add the actual Clang Tidy suppression. Because *imagine* how it would be if we could put comments next to the actual option we're adding. IMAGINE. * Make the Emscripten build compile at least. Not sure how important is to have this particular thing working there, so it's just an unreachable assert. * Give up and use a shared_ptr for PlayerImplementation. It was crashing because the python bindings for ReplayManager::readKeyframesFromFile() returned a Player but didn't tell Python the Player depends on the PlayerImplementation stored inside the ReplayManager. Which would involve some additional bookkeeping in the bindings, and ... everything else is using a shared_ptr anyway, so trying to avoid the unnecessary overhead in just a single place of all is like a drop in the ocean. Doesn't crash anymore with this change. * Forgot to link KtxImporter to the replayer class. * replayer: add some actual help text. * Rename ReplayRenderer to ClassicReplayRenderer for clarity. * sim: split up *ReplayRenderer classes into dedicated files. * Just don't use the Clang Tidy option at all, instead of enable + disable. The comment stays there though, for historical records. * Bad habits, sorry. Co-authored-by: Vladimír Vondruš <mosra@centrum.cz>

mosra added 6 commits December 6, 2022 01:02

gfx_batch: actually test the images for import success.

4ac516a

gfx_batch: opt-in flag to generate image mipmaps.

eab0a8b

Need this to achieve a pixel-perfect correspondence with the classic renderer.

gfx_batch: forgot to assert on scene ID bounds here.

2b806f3

sensor: add projection matrix calculation directly on CameraSensorSpec.

ce47845

Need this for a scenegraph-less rendering.

"AI" changed my code again. I feel robbed.

baccf39

mosra requested review from aclegg3, eundersander and 0mdc December 6, 2022 00:17

facebook-github-bot added the CLA Signed Do not delete this pull request or issue due to inactivity. label Dec 6, 2022

0mdc approved these changes Dec 6, 2022

View reviewed changes

mosra added 4 commits December 6, 2022 22:07

gfx_batch: wait what is the product() nonsense??

9595c2e

gfx_batch: properly test the depth image retrieval too.

a0a2306

It was in the CUDA interop test already but that one isn't ran on non-CUDA builds.

gfx: add RenderTarget::blitRgbaTo(framebuffer, rectangle).

53406af

Generalizes blitRgbaToDefault(), which was a bit too limited.

mosra force-pushed the gfx-batch-assorted-updates branch from 33923b2 to 53406af Compare December 6, 2022 21:14

0mdc self-requested a review December 7, 2022 15:19

0mdc approved these changes Dec 7, 2022

View reviewed changes

mosra merged commit 976cf36 into main Dec 7, 2022

mosra deleted the gfx-batch-assorted-updates branch December 7, 2022 15:39

mosra added a commit to 0mdc/habitat-sim that referenced this pull request Dec 7, 2022

Revert everything that went into facebookresearch#1961 in order to me…

90a07a0

…rge it cleanly. Because there's a thing called clang-format which makes any clean merges just straight impossible. Yay!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Assorted updates to the batch renderer #1961

Assorted updates to the batch renderer #1961

mosra commented Dec 6, 2022 •

edited

0mdc commented Dec 6, 2022

0mdc Dec 7, 2022

0mdc Dec 7, 2022

Assorted updates to the batch renderer #1961

Assorted updates to the batch renderer #1961

Conversation

mosra commented Dec 6, 2022 • edited

Motivation and Context

How Has This Been Tested

0mdc commented Dec 6, 2022

0mdc Dec 7, 2022

Choose a reason for hiding this comment

0mdc Dec 7, 2022

Choose a reason for hiding this comment

mosra commented Dec 6, 2022 •

edited