New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add 3D MSAA and scaling support to GLES3 #83976
Conversation
3ca0b5f
to
1c278d0
Compare
cc @Calinou , you might enjoy testing this :) |
1c278d0
to
10df63b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested locally, it works as expected. Downsampling is also possible with a resolution scale set above 1.0, and it can be combined with MSAA.
Performance
Tested on https://github.com/godotengine/godot-demo-projects/tree/master/3d/truck_town in 3840×2160 fullscreen.
Largely academic because a RTX 4090 is hardly kept busy when using the Compatibility rendering method, but I figure I'd include these nonetheless. Expect the MSAA impact to be greater on slower GPUs.
Similar test on integrated graphics: #83976 (comment)
MSAA
No MSAA | 2× MSAA | 4× MSAA | 8× MSAA |
---|---|---|---|
1274 FPS (0.78 mspf) | 1246 FPS (0.80 mspf) | 1245 FPS (0.80 mspf) | 1241 FPS (0.81 mspf) |
Resolution scaling
0.25 | 0.99 | 1.0 | 1.5 | 2.0 | 2.0 + 8× MSAA |
---|---|---|---|---|---|
1240 FPS (0.80 mspf) | 1244 FPS (0.80 mspf) | 1274 FPS (0.78 mspf) | 988 FPS (1.01 mspf) | 654 FPS (1.52 mspf) | 336 FPS (2.97 mspf) |
654 FPS in an internal resolution of 8K is nothing to scoff at, if you ask me. 🙂
Performance is slightly lower when scaling kicks in at sub-native resolutions, but this is likely because the cost of additional copies/framebuffers outweighs the cost of actually rendering the scene on such a fast GPU.
Issues encountered
I noticed the FSR2 error is printed 4 times when you set the scaling 3D mode to FSR2. This doesn't occur with FSR1. Also, this should be a warning rather than an error for consistency with the FSR1 warning:
Tests done on https://github.com/godotengine/godot-demo-projects/tree/master/3d/truck_town, in 2560×1600 fullscreen on i9-13900H's Intel Iris Xe (DDR5-5600 C40 dual channel). Laptop on battery in power-saving power profile – this is a worst-case scenario, but I made sure it was consistent across all tests. MSAAForward+
Forward Mobile
Compatibility
Resolution scaling
Forward Mobile
Compatibility
It seems the Compatibility renderer's scaling method is much more expensive, despite bilinear scaling being used in all tests here (also in Forward+/Mobile). This is also something you could notice on the RTX 4090 tests above (albeit it's much less significant there). Notice how |
@Calinou great testing, thanks! The FSR2 error was already there, indeed agree we should make this a warning and ensure it's outputted once. Note when looking at scaling, if you set scaling to 1.0 and method to bilinear, scaling is disabled (this was actually broken when FSR2 was introduced), so the difference between 0.9 and 1.0 is that with 1.0 scaling is skipped, while 0.9 adds an extra blit pass. I suspect the difference in performance will be more extreme on TBDR. As I mentioned when fixing things in the Vulkan renderer a few days ago, I believe that we should actually have a proper "OFF" mode for scaling, instead of relying on scaling being set to 1.0, but that would be a breaking change due to us wanting OFF to be mode 0 instead of Bilinear being mode 0. |
10df63b
to
0f68664
Compare
Ok, so turns out MSAA is different on OpenGLES than it is on OpenGL. Working on resolving that... edit did a first pass of this but haven't tested yet, so may not work. |
2ba8be1
to
fb2dfff
Compare
Ok, after talking to Clay I'm going to change things around to using |
So.... diving more into MSAA I've run into a snag.
Now these last two extensions actually make a lot of sense on mobile hardware, especially considering the warning ARM gave that on mobile GPUs you should never write out MSAA data to texture, it kills all your bandwidth. This is also something that Meta goes into some detail in in their materials (I think Meta took the lead on defining So it really would make sense to base our MSAA approach on these later two extensions with maybe a fallback to the first if the others are not supported (that would only apply to normal older devices anyway) and thus disabling MSAA in multiview unless I have some ideas for this and will try them out, I'm writing this mostly to get my thoughts down, but it fits well with our single buffer approach. Our biggest issue is the split we have in render buffers (2D) and scene buffers (3D), with The idea I have is to base rendering purely on these two extensions, with maybe a fallback to using the first but just not making that available if multiview is used. |
I haven't ever implemented this, but if you use the Oculus multiview extension for WebXR (as opposed to the standard one), you can implement MSAA in a way that appears to work similar to the latter two extension that you link. You setup the FBO with |
@dsnopek I think it makes a lot of sense, I'll be busy for a few days yet but I think I have a pretty good picture of making it work. |
@clayjohn @dsnopek so it turns out that the GL_EXT_multiview_texture_multisample This extension is however an OpenGL 4 extension that removes a limitation on Funky how all these work differently. I'm rewriting stuff so we use the correct extension depending on whether we're on mobile hardware or not... |
fb2dfff
to
d0439ed
Compare
Ok, I think I have most of 3D MSAA working now, I'm still having an issue when combining MSAA and upscaling on mobile hardware using the Also still hunting down a few stray errors and still need to test multiview. Once thats all confirmed to be working I'll implement 2D MSAA which I think will mostly be copy/paste. |
6791410
to
638a85e
Compare
I just posted draft PR #84686 which adds support for MSAA with multiview on the web (via WebXR). It depends on the changes in this PR, and the diff that I put in my previous comment. |
2595477
to
4007d92
Compare
@dsnopek your changes are merged in so you can rebase your PR, thanks! I've also removed all the 2D MSAA stuff and moving that into a separate PR, so that we can move to merging this once 4.2 is released |
Thank you guys for the great progress. With @dsnopek changes, is it ready to test some MSAA for 3D for Web Export ? |
Yep! It worked in my testing, but more testing would certainly be welcome :-) |
@BastiaanOlij @dsnopek |
@MrZak-dev Ah, I think I know the solution to this! The key error in your screenshot is "Samples must not be greater than the maximum supported value for the format". In my WebXR PR, I added some code to limit the samples to the reported maximum, which should probably be brought over into this PR. Until then, you could try setting a lower MSAA setting, maybe 2x, and see if that works? @BastiaanOlij See 63dd028#diff-f6e1f37382eb13c287a1e289ce2eb6f83c2301827e2c2b94cc8ad2ac2ba47571R244 - I can make a diff of just that later |
@dsnopek Yes it did work on 2x and 4x . |
Great, thanks! @BastiaanOlij Merge this diff into this PR to fix the problem on the web that @MrZak-dev encountered: diff --git a/drivers/gles3/storage/render_scene_buffers_gles3.cpp b/drivers/gles3/storage/render_scene_buffers_gles3.cpp
index c096f0ec6e..65f7eaed1f 100644
--- a/drivers/gles3/storage/render_scene_buffers_gles3.cpp
+++ b/drivers/gles3/storage/render_scene_buffers_gles3.cpp
@@ -215,6 +215,12 @@ void RenderSceneBuffersGLES3::configure(const RenderSceneBuffersConfiguration *p
const GLsizei samples[] = { 1, 2, 4, 8 };
msaa3d.samples = samples[msaa3d.mode];
+ GLint max_samples;
+ glGetIntegerv(GL_MAX_SAMPLES, &max_samples);
+ if (msaa3d.samples > max_samples) {
+ msaa3d.samples = max_samples;
+ }
+
if (!use_multiview && !config->rt_msaa_supported) {
// Render to texture extensions not supported? fall back to MSAA framebuffer through GL_EXT_framebuffer_multisample.
// Note, if 2D MSAA matches 3D MSAA and we're not scaling, it would be ideal if we reuse our 2D MSAA buffer here. |
4007d92
to
d955fba
Compare
@dsnopek I did it slightly differently so we retrieve this once in GLES3::Config and then re-use the value, we'll later need to embed this for 2D MSAA as well. Let met know if it works fine for you guys. |
@BastiaanOlij The code looks good, and works in my testing! |
will we see this implemented on the next beta update for Godot ? |
4.2 is in feature freeze, so any new features will be for 4.3 at the earliest. |
Indeed, and I think these changes are too substantial to cherry pick for a 4.2.1 release. |
d955fba
to
b61d18e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested locally and this works great. Left a suggestion to fix a harmless OpenGL error.
Code seems fine to me.
I can't test all the edge cases with the multisample extensions. So I have to trust that you tested them all sufficiently.
b61d18e
to
caddce1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me now!
Thanks! |
Huzzah! 🚀 Now that this one is merged, the WebXR/multiview MSAA PR is ready for review - see #84686 |
This PR adds MSAA and scaling support to the compatibility renderer.
Implemented in this PR are:
Moved to followup PR:
I did a big rewrite of this so GL_EXT_multisampled_render_to_texture and GL_OVR_multiview_multisampled_render_to_texture are used if supported. These are OpenGLES extensions so only available on Android. The advantage of using these techniques is that MSAA data is never written out of tile memory but resolved when written out to texture.
The added advantage for VR is that this can be combined with foveated rendering.
On other devices separate MSAA buffers are being used, just like in Godot 3. This can not be combined with foveated rendering (just doesn't have a performance improvement).
Stupid little test project ;)
MSAA test.zip