ogl bringup part 1 #12114

Merged
merged 23 commits into from May 31, 2022

kd-11
Contributor

@kd-11 kd-11 commented May 28, 2022

A rather large set of OpenGL fixes. Initially I was investigating one bug, but this backend has suffered from a lot of rot and is missing a lot of features. Highlights:

  • Do texture deswizzling on the GPU. Cuts texture upload times by 80% in some games that use this RSX feature.
  • Avoid needless state changes by committing to the central state management structure (gl_state).
  • Rewrite the whole buffer object API to be bindless (DSA). This helps avoid leaking buffer bindings and improves performance.
  • Rewrite the whole FBO API to be fully DSA as well.
  • Work around poor OpenGL performance on AMD proprietary drivers. Investigation shows they do not properly support persistently mapped index buffers: if you use one, the driver manually copies the index data inline on every draw call. Ouch.
  • Optimize OGL compute dispatches.
  • Optimize texture upload utils by using simple_array instead of vector. It turns out vector zero-initializes its storage, which we do not want since this is just scratch memory. Saves around 3ms in my test scenario (Resogun).
  • Implement buffer-to-d24x8 shader pipes to avoid CPU readback for typeless operations (Mesa + AMD).
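
For readers unfamiliar with deswizzling, the following is a CPU reference sketch of what the operation computes. It is not the code from this PR, which does the work on the GPU. It assumes the swizzled layout is plain Morton (Z-order) for a square, power-of-two 2D texture with 32-bit texels; the names `morton_index` and `deswizzle` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Interleave the low 16 bits of x and y into a Morton (Z-order) index.
// Bit i of x lands at bit 2*i, bit i of y lands at bit 2*i + 1.
inline std::uint32_t morton_index(std::uint32_t x, std::uint32_t y)
{
    std::uint32_t result = 0;
    for (std::uint32_t bit = 0; bit < 16; ++bit)
    {
        result |= ((x >> bit) & 1u) << (2 * bit);
        result |= ((y >> bit) & 1u) << (2 * bit + 1);
    }
    return result;
}

// Convert a square power-of-two texture from Morton order to linear
// row-major order. Assumption: one 32-bit word per texel.
std::vector<std::uint32_t> deswizzle(const std::vector<std::uint32_t>& swizzled,
                                     std::uint32_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);          // power of two
    assert(swizzled.size() == std::size_t(size) * size);    // square texture

    std::vector<std::uint32_t> linear(swizzled.size());
    for (std::uint32_t y = 0; y < size; ++y)
    {
        for (std::uint32_t x = 0; x < size; ++x)
        {
            linear[std::size_t(y) * size + x] = swizzled[morton_index(x, y)];
        }
    }
    return linear;
}
```

Every output texel depends on exactly one input texel, so the loop body maps naturally onto one GPU invocation per texel, which is presumably why moving it off the CPU pays off.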

Fixes some issues too:
Fixes #12083
Fixes #12082

Also partially addresses #11197
And hopefully #11941
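
The simple_array point in the highlights comes down to how C++ initializes storage: `std::vector<T>(n)` value-initializes every element, which means a zero fill for trivial types. A minimal sketch of a scratch buffer that skips the fill is shown below. It is not RPCS3's simple_array; the class name `scratch_buffer` and its interface are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>

// Scratch storage that is allocated but not zero-filled.
// Contents are indeterminate until written, so callers must write before reading.
template <typename T>
class scratch_buffer
{
    static_assert(std::is_trivially_default_constructible_v<T> &&
                  std::is_trivially_destructible_v<T>,
                  "scratch_buffer is only meant for trivial element types");

    std::unique_ptr<T[]> m_data;
    std::size_t m_size = 0;

public:
    explicit scratch_buffer(std::size_t count)
        : m_data(new T[count]) // default-initialization: no zero fill for trivial T
        , m_size(count)
    {
    }

    T* data() { return m_data.get(); }
    std::size_t size() const { return m_size; }

    T& operator[](std::size_t index)
    {
        assert(index < m_size);
        return m_data[index];
    }
};
```

Writing `new T[count]()` instead (with the trailing parentheses) would bring the zero fill back, so the difference is easy to lose in a refactor.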

kd-11 added 16 commits May 28, 2022 23:05
- There is special handling for some cross-aspect bitcasts in vulkan, but this is not possible using OpenGL
- Gets rid of spammy BindBuffer calls on every draw
- Avoids making too many invocations, especially given the 1D nature of some GPU dispatch handlers
…alize

- This cuts down processing time significantly by eliminating calls to memset_stosb
- Keep buffers around longer to allow driver heuristics to work
- Properly initialize the shaders to allow optimal workgroup dispatch size
- Turns out the AMD driver really hates it if you render with a mapped index buffer.
  The driver internally seems to make a copy of the consumed indices and uses that. Very slow.
  I was able to isolate this after observing that glDrawArrays is not entirely shit, but glDrawElements duration scaled linearly with the number of vertices.
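
The commit notes above mention sizing compute dispatches so they do not launch too many invocations from a 1D work list. One way to do that is sketched below; it is an illustration under stated assumptions, not the code from these commits. The struct `dispatch_dims` and the function `compute_dispatch` are hypothetical names. The default limit of 65535 is the minimum value OpenGL guarantees for GL_MAX_COMPUTE_WORK_GROUP_COUNT per dimension; a real implementation would query the actual limit.

```cpp
#include <cassert>
#include <cstdint>

// Number of work groups to launch in X and Y.
struct dispatch_dims
{
    std::uint32_t x;
    std::uint32_t y;
};

// Map a 1D list of work items onto a dispatch grid.
// local_size is the number of invocations per work group (the shader's local_size_x).
dispatch_dims compute_dispatch(std::uint64_t work_items,
                               std::uint32_t local_size,
                               std::uint32_t max_groups_x = 65535)
{
    assert(local_size > 0 && max_groups_x > 0);

    // Round up so the last, partially filled group is still launched.
    const std::uint64_t groups = (work_items + local_size - 1) / local_size;
    if (groups <= max_groups_x)
    {
        return { static_cast<std::uint32_t>(groups), 1 };
    }

    // Too many groups for one dimension: spill the remainder into rows along Y.
    // The shader must then rebuild a linear index and bounds-check it.
    const std::uint64_t rows = (groups + max_groups_x - 1) / max_groups_x;
    return { max_groups_x, static_cast<std::uint32_t>(rows) };
}
```

For example, 1000 work items with a local size of 64 need 16 groups, not 1000; the final group has 24 idle invocations that the shader should skip with a bounds check.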
@kd-11
Contributor Author

kd-11 commented May 28, 2022

This PR doesn't resolve all OpenGL issues; there is still a ton of work to be done there. But at least I now know what the main problems are and how to resolve them. Some redesign will have to be done for this backend in the coming days to at least make it usable as a fallback.

@Realmantik

I tested Resident Evil Veronica. Texture upload time has improved significantly and fps is now ~30, compared with ~12 on the master build. But the PR introduces some artefacts. See screenshots:

master build
current master build openGL

PR
kd-11 openGL improvement

@RainbowCookie32
Contributor

RainbowCookie32 commented May 29, 2022

This PR fixes the menus on Uncharted 1 on OpenGL, but framerate got nuked out of orbit (from 60fps to 4).

Master:
image

PR:
image

Log if it's worth anything:
RPCS3.log.gz

Edit: Same deal with Split/Second:

Master:
image

PR:
image

RPCS3.log.gz

@kd-11 kd-11 marked this pull request as draft May 29, 2022 16:41
@kd-11
Contributor Author

kd-11 commented May 29, 2022

Marked draft. I'll investigate all the regressions reported:

  • RE veronica artifacts
  • RDR artifacts
  • UC1 broken ingame visuals
  • UC1 performance
  • Split/Second performance

Even if I cannot fix them right now because the scope is too large, I'll try to find the root cause and open a tracker with the pending work.

@Linear524

@kd-11,
Thank you for reviving OpenGL and keeping this backend alive! :)
OpenGL is the grandfather of all the latest APIs, and it's the best gift from the 3dfx Voodoo era. Even the real PS3 graphics API is a tweaked OpenGL...
Sometimes it is important to compare RPCS3 rendering bugs between Vulkan and OpenGL in order to track down issues.

@kd-11
Contributor Author

kd-11 commented May 30, 2022

Time for a retest @RainbowCookie32 @Realmantik

@RainbowCookie32
Contributor

Performance is back on both games 👍🏼

image

image

@Darkhost1999
Contributor

image
Fixes #12082 and looks good doing it

@kd-11 kd-11 marked this pull request as ready for review May 31, 2022 12:38
@kd-11
Contributor Author

kd-11 commented May 31, 2022

That's enough for this PR. The main remaining issue is AMD performance when doing untyped copies (for bitcasts). That will need custom compute or graphics pipelines to basically reimplement glGetTexSubImage/glTexSubImage in shaders.

@ghost

ghost commented Jun 1, 2022

Just a heads up @kd-11, I've seen multiple people on AMD GPUs whose RPCS3 crashes with this error, which is solved by reverting to 0.0.22-13670 (the build before this PR):

E RSX: ERROR: 0:44: 'f32_to_d24' : no matching overloaded function found 
ERROR: 0:44: 'assign' : cannot convert from ' const float' to ' temp highp uint'
ERROR: 0:44: '' : compilation terminated 
ERROR: 3 compilation errors. No code generated.


E RSX: 
F {RSX [0x0082920]} SIG: Thread terminated due to fatal error: Failed to compile compute shader

@kd-11
Contributor Author

kd-11 commented Jun 1, 2022

No new code doing an f32 conversion was added, which means it was always broken, but another bug prevented the code from being run. Please open a ticket with steps to reproduce so I can investigate.
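
For context on what the failing shader function is presumably meant to compute, here is a C++ sketch of a float-to-D24 depth conversion. It is an illustration only, not the GLSL from RPCS3: the function name is borrowed from the log, and the rounding mode is an assumption. The packing helper follows the GL_UNSIGNED_INT_24_8 layout, with depth in the upper 24 bits and stencil in the lower 8; `pack_d24s8` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Convert a normalized depth value in [0, 1] to a 24-bit unsigned integer.
// Out-of-range inputs are clamped. Rounding to nearest is an assumption here.
std::uint32_t f32_to_d24(float depth)
{
    assert(!std::isnan(depth));
    const double clamped = std::clamp(static_cast<double>(depth), 0.0, 1.0);
    return static_cast<std::uint32_t>(std::lround(clamped * 16777215.0)); // 2^24 - 1
}

// Pack depth and stencil into one 32-bit word:
// depth in bits 31..8, stencil in bits 7..0 (GL_UNSIGNED_INT_24_8 layout).
std::uint32_t pack_d24s8(std::uint32_t depth24, std::uint8_t stencil)
{
    assert(depth24 <= 0xFFFFFFu);
    return (depth24 << 8) | stencil;
}
```

The compiler messages in the log are consistent with a call site that expects a uint result but finds no visible function with a matching signature, which is what a missing or mismatched definition of such a helper would produce.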

@Realmantik

Resident Evil Code Veronica still has artifacts. Many tickets about regressions have already been opened, so I will open one if this is not fixed in future patches.
