[3.x] Implement Octahedral Map Normal/Tangent Attribute Compression #46800

The-O-King · 2021-03-08T18:28:09Z

Bugsquad edit: 3.x version of #60309.

Using spherical coordinates, normal and tangent vectors can be compressed more
efficiently than the current compression technique used
Using sphere mapping techniques described in this link and this link
normal and tangent vectors can be compressed more efficiently than the current
compression technique we have implemented

Using an octahedral mapping technique described in this paper normal and tangent
vectors can be compressed more efficiently than the current compression technique
we have implemented
Rather than each normal/tangent being a vec4 they can be stored as
vec2 saving 2 bytes each

This will help reduce vertex memory bandwidth usage, which is particularly useful
on mobile devices

Needed to update the GLES2 shader class since I hit the limit of the 32 bit mask
being used to set the state of conditionals in the shaders themselves - let me
know if the updated implementation needs changing

Currently, this change would require a reimport of all assets since the underlying
data representation has changed
Also it is more important now to use the "Ensure Correct Normals" flag when
using non-uniform scaling as decompression can result in slightly skewed normals
that will skew further with scaling

Implementation of godotengine/godot-proposals#2395

BastiaanOlij

Looks like a neat way to add this in as a feature. My only question is if it would be more efficient to just leave one axis out and doing a z = sqrt(1.0 - x*x - y*y) approach. Don't know which is faster on modern GPUs, I can imagine cos/sin lookups are faster then sqrt.

Also this probably needs an implementation on master as well.

I feel @reduz should provide an opinion on this as well.

clayjohn · 2021-03-18T06:28:25Z

Two things:

First to mirror Bastiaan above, I'm not sure spherical coordinates are the best approach. The spheremap transform described here has better quality and speed. And the simple reconstruction has better speed as well. Spherical coordinates are going to use a lot of cycles. Have you profiled how this PR affects performance, in theory there should be a nice gain by reducing bandwidth pressure, but for some render tasks bandwidth isn't the bottleneck.

Second, this PR is only necessary for 3.x so don't worry about 4.0. in 4.0 reduz removed the "compress" option.

The-O-King · 2021-03-18T15:19:32Z

I thought about the leave-the-z-element-out approach but I don't think that works for raw mesh data because that approach needs to assume the sign of the z element, which may work for G-Buffer deferred rendering (you can assume your vertices are facing towards the screen as they would be in view space) but using it for general mesh storage requires us to store the sign somewhere so it's more tricky

I did do some profiling and I did not see any performance degradation using spherical coordinates (although there was more ALU usage), only reductions in vertex memory bandwidth usage and reductions in vertex fetch stalls.
The spheremap transformation is quite interesting though, and doesn't seem difficult at all to implement! I'll try to implement it as well and see how it looks/performs (seems more promising than spherical not gonna lie!) let me know if you all would like that as an alternative

There is also a third option I have used before - QTangents, which store the entire tangent space, but might take a little more effort to get working within the current setup that Godot has which treats the normal/tangent as separate vectors entirely (although not impossible)

Let me know what you all think :)

clayjohn · 2021-03-21T05:30:54Z

@The-O-King Subject to your testing, I think it makes sense to just use the spheremap transformation rather than use spherical coordinates (no need for an option). If https://aras-p.info/texts/CompactNormalStorage.html is still accurate on modern hardware, then spheremap should result in better quality and less ALU usage.

The-O-King · 2021-03-22T19:57:55Z

Got it, yup I got a chance to do the implementation and it was cheaper to run while providing better visual quality :) definitely worth it!

I've updated the PR and the description for this change, should I also go back and update the original Proposal as well?

Also you mentioned that 4.0 doesn't need this change implemented - does it do mesh import compression differently (or have an even better technique?)

Calinou · 2021-03-22T20:47:42Z

I've updated the PR and the description for this change, should I also go back and update the original Proposal as well?

Feel free to update the proposal, but don't feel obliged to do it if you don't have time 🙂

clayjohn · 2021-03-22T21:55:20Z

In 4.0 instead of choosing between compressed and uncompressed, it just uses uses a single format that should be close to optimal. See commit here: 70f5972#diff-dc12bc981c6b42b8579800ffb22b4f82e4f1153acb40db6dfc98a6655bcad0b8

See the datatypes here:

godot/servers/rendering_server.h

Lines 216 to 229 in a7fb5f8

    
           ARRAY_VERTEX = 0, // RG32F or RGB32F (depending on 2D bit) 
        
           ARRAY_NORMAL = 1, // A2B10G10R10 
        
           ARRAY_TANGENT = 2, // A2B10G10R10, A flips sign of binormal 
        
           ARRAY_COLOR = 3, // RGBA16F 
        
           ARRAY_TEX_UV = 4, // RG32F 
        
           ARRAY_TEX_UV2 = 5, // RG32F 
        
           ARRAY_CUSTOM0 = 6, // depends on ArrayCustomFormat 
        
           ARRAY_CUSTOM1 = 7, 
        
           ARRAY_CUSTOM2 = 8, 
        
           ARRAY_CUSTOM3 = 9, 
        
           ARRAY_BONES = 10, // RGBA16UI (x2 if 8 weights) 
        
           ARRAY_WEIGHTS = 11, // RGBA16UNORM (x2 if 8 weights) 
        
           ARRAY_INDEX = 12, // 16 or 32 bits depending on length > 0xFFFF 
        
           ARRAY_MAX = 13

Not sure if your PR is actually relevant or not for 4.0. If you can further compress these values, then a PR would be welcome.

Regarding the shader permutations, we have been avoiding making the keys 64 bits and instead of been using #if defined(DEF_NAME) instead. This results in fewer permutations. We discussed this in adding #31062

I am not opposed to adding a 64 bit key, but I know that it has been rejected in the past.

The-O-King · 2021-03-22T22:39:54Z

For 4.0 - 32 bits are used to store normals and tangent each, but with this technique only 16 bits are needed (along with some decompression in the vertex shader), so memory wise it would be saving 2 bytes for each normal/tangent
I've seen A2B10G10R10 be used pretty commonly for normals/tangents definitely, but I'm not super sure how big a quality difference it and Sphere Mapping would have, that could be something researched though if that savings seems worth it!

For shader permutations - since this change requires that the actual inputs types of the vertex attributes change (from vec3/vec4 to vec2) wouldn't this necessitate additional permutations? I guess I'm also not sure what the difference between a #ifdef DEF_NAME and #if defined(DEF_NAME) is in terms of reducing permutations

servers/visual_server.cpp

doc/classes/Mesh.xml

doc/classes/SurfaceTool.xml

doc/classes/VisualServer.xml

servers/visual_server.h

scene/resources/mesh.h

Implement Octahedral Compression for normal/tangent vectors *Oct32 for uncompressed vectors *Oct16 for compressed vectors Reduces vertex size for each attribute by *Uncompressed: 12 bytes, vec4<float32> -> vec2<unorm16> *Compressed: 2 bytes, vec4<unorm8> -> vec2<unorm8> Binormal sign is encoded in the y coordinate of the encoded tangent Added conversion functions to go from octahedral mapping to cartesian for normal and tangent vectors sprite_3d and soft_body meshes write to their vertex buffer memory directly and need to convert their normals and tangents to the new oct format before writing Created a new mesh flag to specify whether a mesh is using octahedral compression or not Updated documentation to discuss new flag/defaults Created shader flags to specify whether octahedral or cartesian vectors are being used Updated importers to use octahedral representation as the default format for importing meshes Updated ShaderGLES2 to support 64 bit version codes as we hit the limit of the 32-bit integer that was previously used as a bitset to store enabled/disabled flags

The-O-King · 2021-07-30T14:30:34Z

Thanks for pointing out the _bind_method() @akien-mga! Hopefully that takes care of the documentation

akien-mga · 2021-07-30T15:34:50Z

Thanks!

aaronfranke · 2021-08-05T01:07:53Z

This PR causes a regression that locks up my entire desktop environment on Ubuntu with Nvidia. I also tested with the follow-up PR to this PR (#51258) and I still get the same freezing.

clayjohn · 2021-08-05T01:22:20Z

@aaronfranke Have you tried turning on the split stream as suggested in #51258 (comment)?

aaronfranke · 2021-08-05T01:29:16Z

@clayjohn Adding mesh_storage/split_stream=true to project.godot and opening the project with #51258 did not help much, my system still freezes and locks up. However, it did give me this backtrace:

Dumping the backtrace. Please include this when reporting the bug on https://github.com/godotengine/godot/issues
[1] /lib/x86_64-linux-gnu/libc.so.6(+0x46210) [0x7f10cbf17210] (??:0)
[2] /lib/x86_64-linux-gnu/libnvidia-glcore.so.460.91.03(+0xeeed71) [0x7f10c9444d71] (??:0)
[3] /lib/x86_64-linux-gnu/libnvidia-glcore.so.460.91.03(+0xeef12c) [0x7f10c944512c] (??:0)
[4] /lib/x86_64-linux-gnu/libnvidia-glcore.so.460.91.03(+0xa2fcc6) [0x7f10c8f85cc6] (??:0)
[5] /lib/x86_64-linux-gnu/libnvidia-glcore.so.460.91.03(+0xa2e5fb) [0x7f10c8f845fb] (??:0)
[6] /lib/x86_64-linux-gnu/libnvidia-glcore.so.460.91.03(+0xa377ef) [0x7f10c8f8d7ef] (??:0)
[7] RasterizerSceneGLES3::_render_list(RasterizerSceneGLES3::RenderList::Element**, int, Transform const&, CameraMatrix const&, RasterizerStorageGLES3::Sky*, bool, bool, bool, bool, bool) (/home/aaronfranke/workspace/godot-3/drivers/gles3/rasterizer_scene_gles3.cpp:1894)
[8] RasterizerSceneGLES3::render_scene(Transform const&, CameraMatrix const&, int, bool, RasterizerScene::InstanceBase**, int, RID*, int, RID*, int, RID, RID, RID, RID, int) (/home/aaronfranke/workspace/godot-3/drivers/gles3/rasterizer_scene_gles3.cpp:4028)
[9] VisualServerScene::_render_scene(Transform, CameraMatrix const&, int, bool, RID, RID, RID, RID, int) (/home/aaronfranke/workspace/godot-3/servers/visual/visual_server_scene.cpp:2738)
[10] VisualServerScene::render_camera(RID, RID, Vector2, RID) (/home/aaronfranke/workspace/godot-3/servers/visual/visual_server_scene.cpp:2317 (discriminator 2))
[11] VisualServerViewport::_draw_3d(VisualServerViewport::Viewport*, ARVRInterface::Eyes) (/home/aaronfranke/workspace/godot-3/servers/visual/visual_server_viewport.cpp:68)
[12] VisualServerViewport::_draw_viewport(VisualServerViewport::Viewport*, ARVRInterface::Eyes) (/home/aaronfranke/workspace/godot-3/servers/visual/visual_server_viewport.cpp:109)
[13] VisualServerViewport::draw_viewports() (/home/aaronfranke/workspace/godot-3/servers/visual/visual_server_viewport.cpp:338)
[14] VisualServerRaster::draw(bool, double) (/home/aaronfranke/workspace/godot-3/servers/visual/visual_server_raster.cpp:108 (discriminator 2))
[15] VisualServerWrapMT::draw(bool, double) (/home/aaronfranke/workspace/godot-3/servers/visual/visual_server_wrap_mt.cpp:92)
[16] Main::iteration() (/home/aaronfranke/workspace/godot-3/main/main.cpp:2166)
[17] OS_X11::run() (/home/aaronfranke/workspace/godot-3/platform/x11/os_x11.cpp:3638)
[18] /home/aaronfranke/workspace/godot-3/bin/godot.x11.tools.64(main+0x125) [0x1777a9b] (/home/aaronfranke/workspace/godot-3/platform/x11/godot_x11.cpp:57)
[19] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x7f10cbef80b3] (??:0)
[20] /home/aaronfranke/workspace/godot-3/bin/godot.x11.tools.64(_start+0x2e) [0x17778be] (??:?)
-- END OF BACKTRACE --

The-O-King · 2021-08-05T02:01:28Z

@aaronfranke @clayjohn Updated that PR (#51258) that should fix the issues you are reporting here, it was a single line fix in RasterizerSceneGLES3 that didn't make it in this PR (RasterizerSceneGLES2 should be working fine)

aaronfranke · 2021-08-05T14:21:54Z

I can confirm that the bug is fixed with both the updated #51258 and the latest 3.x as of writing. Thanks!

nikitalita · 2021-12-27T09:29:25Z

@The-O-King why was this only implemented in 3.4 and not in master (4.x)?

clayjohn · 2021-12-28T06:54:13Z

@nikitalita it will be implemented in master eventually.

The-O-King · 2021-12-28T16:54:01Z

Yup I will be looking into the 4.x implementation probably at the start of the new year :) for now we have been working on 3.x and getting an idea for all the potential problem spots/considerations, that way we will have a bit more context on what we may encounter moving it over to 4.x!

The-O-King requested review from a team as code owners March 8, 2021 18:28

Calinou added enhancement topic:3d topic:rendering labels Mar 8, 2021

Calinou added this to the 3.2 milestone Mar 8, 2021

The-O-King force-pushed the normal_compression branch 2 times, most recently from d381791 to 034e36f Compare March 8, 2021 19:24

Base automatically changed from 3.2 to 3.x March 16, 2021 11:11

aaronfranke modified the milestones: 3.2, 3.3 Mar 16, 2021

BastiaanOlij approved these changes Mar 18, 2021

View reviewed changes

BastiaanOlij requested a review from reduz March 18, 2021 01:43

Calinou mentioned this pull request Mar 19, 2021

(Epilepsy Warning) Random black Screenflashes with GLES3 and dense Meshes with Blendshapes #45591

Closed

The-O-King force-pushed the normal_compression branch 2 times, most recently from ec7a18d to 432a017 Compare March 22, 2021 19:36

The-O-King changed the title ~~[3.2] Implement Spherical Coordinate Normal/Tangent Attribute Compression~~ [3.2] Implement Sphere Map Normal/Tangent Attribute Compression Mar 22, 2021

The-O-King force-pushed the normal_compression branch from 432a017 to 4f3d483 Compare March 22, 2021 19:57

akien-mga changed the title ~~[3.2] Implement Sphere Map Normal/Tangent Attribute Compression~~ [3.3] Implement Sphere Map Normal/Tangent Attribute Compression Mar 22, 2021

akien-mga modified the milestones: 3.3, 3.4 Mar 22, 2021

akien-mga changed the title ~~[3.3] Implement Sphere Map Normal/Tangent Attribute Compression~~ [3.x] Implement Sphere Map Normal/Tangent Attribute Compression Mar 22, 2021

The-O-King commented Apr 22, 2021

View reviewed changes

servers/visual_server.cpp Outdated Show resolved Hide resolved