[rlsw] Simplify framebuffer logic and add blit/copy fast path #5312
|
Oh and the first commit adds the |
my french slipped out
|
@Bigfoot71 Nice! Thanks for the improvement! 🚀 |
|
@Bigfoot71 It seems this PR breaks compilation on Visual Studio...
|
…5312 Added a workaround but it has other probably undesired implications
|
Reviewed the issue but not sure if that's the best approach, is it possible to just avoid alignas()? |
|
@Bigfoot71 Found another issue when trying to build for |
`PLATFORM_DRM` depends on it, but if there is a better approach to get the buffer, it can just be removed again and replaced by an alternative.
I was going to make another PR, but I'll take care of this first, I'll fix this problem for DRM |
We could, but it's still preferable, I'll review that too, I forgot it was C11 :/ |

I further simplified the framebuffer management logic, mainly by unifying the color and depth buffers.
I also added a fast path for blit/copy operations where a blit with identical dst/src dimensions now falls back to a simple copy, and a copy with the same dimensions and format (RGBA32/RGB16) as the framebuffer performs a simple linear copy.
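To illustrate the dispatch described above, here is a minimal sketch of a blit that falls back to a linear copy when the dimensions match. The `surface_t` type and the function names are hypothetical and are not rlsw's actual API. For simplicity it assumes tightly packed RGBA8 pixels; with an interleaved color/depth layout the linear path would more likely be a single loop than one `memcpy`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical surface type: tightly packed RGBA8, 4 bytes per pixel.
typedef struct {
    uint8_t *px;
    int w, h;
} surface_t;

// Fast path: same dimensions and format, so one linear copy is enough.
static void copy_linear(surface_t *dst, const surface_t *src)
{
    memcpy(dst->px, src->px, (size_t)src->w * (size_t)src->h * 4);
}

// General path: nearest-neighbor resampling, one pixel at a time.
static void blit_nearest(surface_t *dst, const surface_t *src)
{
    for (int y = 0; y < dst->h; y++) {
        int sy = y * src->h / dst->h;
        for (int x = 0; x < dst->w; x++) {
            int sx = x * src->w / dst->w;
            memcpy(dst->px + ((size_t)y * dst->w + x) * 4,
                   src->px + ((size_t)sy * src->w + sx) * 4, 4);
        }
    }
}

// Returns 1 when the fast path was taken, 0 otherwise.
static int blit(surface_t *dst, const surface_t *src)
{
    assert(dst && src && dst->px && src->px);
    if (dst->w == src->w && dst->h == src->h) {
        copy_linear(dst, src);
        return 1;
    }
    blit_nearest(dst, src);
    return 0;
}
```

The return value is only there to make the chosen path observable in this sketch; a real implementation would not need it.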
`glClearDepth()` and `GL_DEPTH_CLEAR_VALUE` (getter) have also been added.
There was also a fix in the framebuffer copy: if the dimensions weren't exact, the pointer was being incorrectly incremented.
I also revisited the SIMD functions for `float <=> unorm8` color conversion; they now avoid explicit clamp (min/max) operations.
Regarding the unified buffer, it has both advantages and drawbacks...
The main drawback is that it's an AoS, which is suboptimal (or even incompatible) if we ever move to fully manual SIMD rasterization.
However, reverting to separate buffers at that point would be trivial compared to the rest of the work.
The advantages are a cleaner and more natural logic, a bit less pointer arithmetic, and logically fewer cache misses during rasterization.
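As a rough picture of the trade-off, a unified (AoS) framebuffer could look like the sketch below, where a depth test and the color write touch one struct instead of two separate arrays. The `fb_pixel_t` layout, the field types, and the function names are assumptions made for illustration; the actual rlsw layout may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical unified pixel: color and depth stored side by side (AoS).
typedef struct {
    uint8_t color[4];
    float depth;
} fb_pixel_t;

// Clear every pixel to one color and one depth value
// (the depth argument plays the role of a clear-depth setting).
static void fb_clear(fb_pixel_t *fb, size_t count, const uint8_t color[4], float depth)
{
    assert(fb && color);
    for (size_t i = 0; i < count; i++) {
        memcpy(fb[i].color, color, 4);
        fb[i].depth = depth;
    }
}

// "Less than" depth test followed by the write; both hit the same struct,
// so a single address computation serves color and depth.
static int fb_write_if_closer(fb_pixel_t *fb, int width, int x, int y,
                              const uint8_t color[4], float depth)
{
    fb_pixel_t *p = &fb[(size_t)y * (size_t)width + (size_t)x];
    if (depth >= p->depth) return 0;   // fragment is hidden, keep the old pixel
    memcpy(p->color, color, 4);
    p->depth = depth;
    return 1;
}
```

With separate buffers, the same operation would index a color array and a depth array independently, which is the layout a fully manual SIMD rasterizer would usually prefer.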
Initially, after unifying the buffers, I saw no visible performance difference on my side with `-O3`, which is good considering the code simplification.
With the subsequent changes, I noticed a slight performance increase on the bunnymark test.
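For the float to unorm8 conversion mentioned earlier, one common way to drop the explicit min/max is to rely on saturating pack instructions. The sketch below shows that idea with SSE2; it is an assumption about the general technique, not a copy of the rlsw code, and `float4_to_unorm8` is a hypothetical name. A scalar fallback (which does clamp) is included only so the example builds on non-SSE2 targets.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Convert 4 floats (nominally in [0, 1]) to 4 unorm8 values.
static void float4_to_unorm8(uint8_t out[4], const float in[4])
{
    assert(out && in);
#if defined(__SSE2__)
    __m128 scaled = _mm_mul_ps(_mm_loadu_ps(in), _mm_set1_ps(255.0f));
    __m128i i32 = _mm_cvtps_epi32(scaled);     // round using the current rounding mode
    __m128i i16 = _mm_packs_epi32(i32, i32);   // signed saturation to int16
    __m128i u8 = _mm_packus_epi16(i16, i16);   // unsigned saturation to [0, 255]
    // Note: inputs far outside the int32 range (or NaN) are not handled here.
    uint32_t packed = (uint32_t)_mm_cvtsi128_si32(u8);
    memcpy(out, &packed, 4);                   // lane 0 ends up in out[0] (little-endian)
#else
    // Portable fallback with an explicit clamp; rounds half up, so exact
    // ties may differ from the SSE2 path.
    for (int i = 0; i < 4; i++) {
        float v = in[i] * 255.0f;
        if (v < 0.0f) v = 0.0f;
        if (v > 255.0f) v = 255.0f;
        out[i] = (uint8_t)(v + 0.5f);
    }
#endif
}
```

The two pack steps do the clamping implicitly: values below 0 saturate to 0 and values above 255 saturate to 255, so no separate `min`/`max` is needed on the SSE2 path.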