Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache framebuffer copies (for self-texturing) until the next TexFlush GPU instruction #17032

Merged
merged 4 commits into from
Mar 6, 2023

Conversation

hrydgard
Copy link
Owner

@hrydgard hrydgard commented Mar 1, 2023

Fixes #17030 , or at least improves on it - for optimal performance that big framebuffer used for bloom should be split like in Killzone, but it's not trivial. So there's still a bunch of copies happening, just less of them.

The performance regression in Dante's Inferno in 1.14 is fixed with this, at least.

I tried it with a few other games with no issues - it seems games are using TexFlush when needed. But let's see if it really is safe to rely on that...

There might also be other places we should call DiscardFramebufferCopy in.

For these copies, ideally we should look at texture coordinates to figure out the regions to copy, and if the previously copied region was enough. Actually we do that already, in some cases - and I just realized that this isn't necessarily valid in that case. So need a couple more checks. EDIT: Added - works because that path isn't currently active in Dante for some reason.

… instruction.

Fixes #17030 , or at least improves on it - for optimal performance that
big framebuffer used for bloom should be split like in Killzone, but it's not trivial.

The regression in 1.14 is fixed with this, at least.

I tried it with a few other games with no issues - it seems games are
using TexFlush when needed. But let's see if it really is safe to rely
on that...

There might also be other places we should call DiscardFramebufferCopy
in.
@hrydgard hrydgard added the GE emulation Backend-independent GPU issues label Mar 1, 2023
@hrydgard hrydgard added this to the v1.15.0 milestone Mar 1, 2023
@hrydgard hrydgard marked this pull request as draft March 1, 2023 21:46
@hrydgard hrydgard marked this pull request as ready for review March 1, 2023 21:51
@unknownbrackets
Copy link
Collaborator

Hm, this seems slightly dangerous but maybe okay. I did have reason to believe people's assertions about texsync or texflush were a bit off, but I think it was texsync.

Let me check in a couple games, though. Especially concerned about Motorstorm for example.

-[Unknown]

Copy link
Collaborator

@unknownbrackets unknownbrackets left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to discard in a few more situations, let me see real quick...

-[Unknown]

// If max is not > min, we probably could not detect it. Skip.
// See the vertex decoder, where this is updated.
// TODO: We're currently not hitting this path in Dante. See #17032
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably not using vertex coords that get checked, so maxU probably equals minU. We don't currently set the range for all vertex types, just 16-bit through iirc. This has been true forever, initially to avoid any perf impact to vertex decode for the few games this was initially helping.

-[Unknown]

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right. Possibly it would always be worth it to find the min/max in through mode, although then this change would need more sophistication. So, maybe one day :)

This avoids retaining the framebuffer copy any longer than the current
framebuffer target.
GPU: Discard framebuffer copy when clearing
@hrydgard
Copy link
Owner Author

hrydgard commented Mar 6, 2023

I'm gonna go for it, easy to revert if it causes problems, but it doesn't really seem to.

As mentioned, future work will include tracking the affected areas more accurately.

@hrydgard hrydgard merged commit ebd8a63 into master Mar 6, 2023
@hrydgard hrydgard deleted the cache-framebuffer-copy branch March 6, 2023 08:03
hrydgard added a commit that referenced this pull request May 11, 2023
…es for overlap

Fixes #17451, while also keeping the Dante performance fix from #17032.

Of course, it's possible that something else could slow down now... But
hopefully not. This could also fix other problems.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
GE emulation Backend-independent GPU issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Dante's Inferno buffer copy performance problem
2 participants