Render all contributing gaussian IDs/weights #340
Merged
…ng_gaussian_ids (and its sparse equivalent) without the need for max_samples. Removed the need for shared memory to store the IDs, weights and ordering during rasterization. Started fixing tests for the new return types. Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Fix for batched rendering check Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
…dense pixel spec codepath Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Update filenames, update fvdb-test-data hash Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
…ing_gaussians so that the 'top' behaviour is run by setting the 'top_k_contributors' parameter to a non-0 value. This is because it can often be much more expensive to compute the exhaustive set of contributing gaussian IDs and sometimes we might just want the top few. Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
This refactors `render_top_contributing_gaussian_ids` as `render_contributing_gaussian_ids`, fixes #331. This function returns an exhaustive list of all the contributing Gaussian IDs and weights per-pixel, per-image. The original behaviour is still available through an optional integer parameter, `top_k_contributors` (defaults to 0), which, if >0, returns only the top K contributing Gaussians.

The motivation was that, given we had implemented `render_num_contributing_gaussians`, we could know how many contributing Gaussians there were per-pixel, per-camera and could make rendering the complete set of IDs/weights tractable. This could a.) save the user from having to guess the number of contributors based on intuition or a deep understanding of how their scene renders and b.) potentially save a lot of memory by not tracking empty or over-specified pixels. The intent is to allocate a `JaggedTensor` of ldim=2: the outer level is jagged over the number of pixels per image/camera and the inner level is jagged over the Gaussian IDs per pixel, so each level can be correctly sized for each batch and then written to by the rasterization kernel. This format would become the new return data from `render_contributing_gaussian_ids` (in either the square or sparse-pixel render modes).

However, when implementing this, there were two aspects that stopped me from writing the ideal implementation:

1. `RasterizeCommonArgs` would need to be changed, and that code is re-used across the other rasterization kernels and contains a lot of common functionality (ideally I don't want to fork it).
2. In `RasterizeCommonArgs`, we'd need to extend the `JaggedAccessor` to be aware of JaggedTensors with `ldim==2`.

Updating these seemed like more effort than we wanted to take on at this stage, before we prove out this functionality, so I decided to just modify the existing 'topContributingGaussians' kernel to run the same logic but on the `max(numContributingGaussians)` and then copy the results into an optimally sized JaggedTensor of ldim==2 to return the results to the user. In this form, the API keeps the obvious shape for this data and we can leave the optimization of the last layer of this refactor for a later stage. In effect, this approach may allocate more memory during rasterization than is needed, compared to being able to optimally allocate and use a JaggedTensor at that stage, and it adds some runtime for the copy into the return format. This further optimization is left for later work (#341).

After implementing this, I found that for a test scene (the garden scene) rendering all the contributing Gaussian IDs was ~9x slower than the top-k kernel (27ms vs 3ms) and, with the over-allocation of memory needed to fit all the results plus the extra results copy, it used ~4x the memory (250MB vs 60MB). However, I found other examples in our `test_gsplat` unit tests that were faster and used less memory when using the render 'all' kernel vs. the 'top-k' kernel. Depending on the use-case and scene characteristics, there could be a strong reason to use the 'top-k' style of rendering (even if our implementation of 'all' was the most optimal it could be).

Therefore, from those results, I thought it prudent to preserve both methods of execution: the user can supply a `top_k_contributors` argument to `render_contributing_gaussian_ids` to get the top_k behaviour (returned in the new format) for a faster, reduced set of results, or leave it at 0 to get all the contributing IDs. Both modes produce results in the same format, so the choice should be transparent to the user.

I have also updated the tests to test `render_top_contributing_gaussian_ids` and compare the results of the `top_k_contributors` mode, when run with a number of contributors equal to the maximum number of contributors for any pixel, against the 'all' mode to validate that the kernels produce the same results.
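
For readers unfamiliar with the two-level jagged return format, here is a rough pure-Python sketch of the approach described above: write per-pixel results into fixed-width rows sized by the maximum contributor count (or by `top_k`), then copy them into a packed layout with per-pixel and per-image offsets. It is only an illustration and does not use fvdb. The helper names, the `-1` padding sentinel, the offsets layout and the descending-weight ordering are all assumptions made for this example, not the actual kernel or `JaggedTensor` internals.

```python
from typing import List, Tuple

Contribution = Tuple[int, float]   # (gaussian_id, weight)
Image = List[List[Contribution]]   # one list of contributions per pixel


def rasterize_padded(image: Image, slots: int, pad_id: int = -1):
    """Toy stand-in for the kernel (hypothetical): keep up to `slots`
    contributions per pixel, highest weight first, and pad the rest."""
    ids, weights = [], []
    for pixel in image:
        ranked = sorted(pixel, key=lambda c: c[1], reverse=True)[:slots]
        pad = slots - len(ranked)
        ids.append([gid for gid, _ in ranked] + [pad_id] * pad)
        weights.append([w for _, w in ranked] + [0.0] * pad)
    return ids, weights


def pack_jagged(batch_ids, batch_weights, pad_id: int = -1):
    """Copy padded rows into a packed two-level jagged layout:
    flat data, per-pixel offsets into it, and per-image offsets into pixels."""
    flat_ids, flat_weights = [], []
    pixel_offsets = [0]
    image_offsets = [0]
    for ids, weights in zip(batch_ids, batch_weights):
        for row_ids, row_w in zip(ids, weights):
            for gid, w in zip(row_ids, row_w):
                if gid != pad_id:          # drop the over-allocated slots
                    flat_ids.append(gid)
                    flat_weights.append(w)
            pixel_offsets.append(len(flat_ids))
        image_offsets.append(len(pixel_offsets) - 1)
    return flat_ids, flat_weights, pixel_offsets, image_offsets


def render_contributing(batch: List[Image], top_k: int = 0):
    """top_k == 0 means 'all': rows are sized by the max per-pixel count."""
    max_count = max((len(p) for img in batch for p in img), default=0)
    slots = top_k if top_k > 0 else max_count
    padded = [rasterize_padded(img, slots) for img in batch]
    return pack_jagged([p[0] for p in padded], [p[1] for p in padded])


batch = [
    [[(7, 0.6), (2, 0.3)], [], [(5, 0.9)]],         # image 0: 3 pixels
    [[(1, 0.2), (4, 0.5), (9, 0.1)], [(3, 1.0)]],   # image 1: 2 pixels
]
all_mode = render_contributing(batch)             # exhaustive mode
top_max = render_contributing(batch, top_k=3)     # k == max contributors
top_one = render_contributing(batch, top_k=1)     # reduced result set
```

In this toy setup, running top-k with `k` equal to the largest per-pixel contributor count yields the same packed result as the exhaustive mode, which mirrors the validation the updated tests perform; a smaller `k` returns the same structure with fewer entries per pixel.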