Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TextureCache: Track efb copies used in a partially updated texture (fixes regression for 5.0) #3902

Merged

Conversation

@phire
Copy link
Member

phire commented Jun 16, 2016

Fixes a major performance regression in Skies of Arcadia during battle transitions, caused by #3752

I had plans for a more advanced version of this code after 5.0, but here is a minimal implementation for now.


This change is Reviewable

// This is used to keep track of both:
// * efb copies used by this partially updated texture
// * partially updated textures which refer to this efb copy
std::unordered_set<TCacheEntryBase*> refrences;

This comment has been minimized.

Copy link
@degasus

degasus Jun 16, 2016

Member

As we cache those entries, this set must be cleaned on every release of an entry. So I doubt this will work fine.

What do you think about an u64 counter of all generated efb copies and just store this number of applied copies here?

This comment has been minimized.

Copy link
@phire

phire Jun 16, 2016

Author Member

It gets cleared in freeTexture()

Unless there is some other code path which can result in releasing a texture entry???

@phire

This comment has been minimized.

Copy link
Member Author

phire commented Jun 16, 2016

The additional plans along this line involve partially updated textures keeping track of all the textures they are built up of (both efb copies and regular textures). Then texture cache can call CalculateHash() on each subtexture to see if the partially updated texture is still valid (currently it just updates the texture anyway)

This also requires the ability to split texture cache entries (into potentially 4 different entries) when an efb copy overwrites just part of the texture and calculate new hashes.

This allows us to correctly handle the case where a copy overlaps a previous copy (needed to fix both Spyro: A Heros Tail and Super-Mega-Awesome-Hybrid-XFB). It's also very useful for locking, where we need to preserve parts of efb copies in the cache until they are copied to memory or completely overwritten.

@JMC47

This comment has been minimized.

Copy link
Contributor

JMC47 commented Jun 16, 2016

Tested Wind Waker, Super Mario Sunshine, Mario Superstar Baseball, Sonic riders, sonic riders zero gravity, new super mario bros. Wii, and Donkey Kong Country Returns.

And of course Skies of Arcadia, which is the main purpose of this.

@Helios747

This comment has been minimized.

Copy link
Contributor

Helios747 commented Jun 16, 2016

LGTM


Reviewed 2 of 2 files at r2.
Review status: all files reviewed at latest revision, 1 unresolved discussion.


Comments from Reviewable

@delroth

This comment has been minimized.

Copy link
Member

delroth commented Jun 16, 2016

@phire: This comment grants you the permission to merge this pull request whenever you think it is ready. After addressing the remaining comments, click this link to merge.


If @mimimi085181 is fine with the change, feel free to merge. I'm sure I will regret that, but oh well.

@dolphin-emu-bot allowmerge

@BigheadSMZ

This comment has been minimized.

Copy link

BigheadSMZ commented Jun 16, 2016

@phire Everything you said in your second post, would this, or is it possible, to work around the "issue" mentioned here: https://forums.dolphin-emu.org/Thread-gc-skies-of-arcadia-legends--26018?pid=374027#pid374027

Currently the game is stacking a screenshot on top of a UI texture, or a UI texture on top of a UI texture, making a retexture for either impossible.

@lioncash

This comment has been minimized.

Copy link
Member

lioncash commented Jun 16, 2016

Source/Core/VideoCommon/TextureCacheBase.cpp, line 1331 [r2] (raw file):

  // Unlink any references
  for (auto& reference : entry->references)

This block can likely be it's own function, UnlinkEntryReferencesor something, which would be more self-documenting. It can also likely be a member function of TCacheEntryBase itself


Comments from Reviewable

@phire

This comment has been minimized.

Copy link
Member Author

phire commented Jun 16, 2016

@BigheadSMZ Yes, I think that would actually work.

Would require a slight change to the design in my head to force the split textures to go through custom texture loading.

@JMC47

This comment has been minimized.

Copy link
Contributor

JMC47 commented Jun 16, 2016

Can that wait until after release? Why not make an issue report to memorialize the idea it's any kind of significant change.

@Helios747

This comment has been minimized.

Copy link
Contributor

Helios747 commented Jun 16, 2016

Agreed. I really only prefer to touch the minimum amount of code necessary to fix this issue.

@phire

This comment has been minimized.

Copy link
Member Author

phire commented Jun 16, 2016

Review status: all files reviewed at latest revision, 2 unresolved discussions.


Source/Core/VideoCommon/TextureCacheBase.cpp, line 1331 [r2] (raw file):

Previously, lioncash (Mat M.) wrote…

This block can likely be it's own function, UnlinkEntryReferencesor something, which would be more self-documenting. It can also likely be a member function of TCacheEntryBase itself

Would it make more sense to have an .reset() function on TCacheEntryBase which does everything including clearing the set and setting framecount to invalid.

Comments from Reviewable

@lioncash

This comment has been minimized.

Copy link
Member

lioncash commented Jun 16, 2016

Source/Core/VideoCommon/TextureCacheBase.cpp, line 1331 [r2] (raw file):

Previously, phire (Scott Mansell) wrote…

Would it make more sense to have an .reset() function on TCacheEntryBase which does everything including clearing the set and setting framecount to invalid.

Yeah, probably.

Comments from Reviewable

@phire

This comment has been minimized.

Copy link
Member Author

phire commented Jun 16, 2016

@JMC47 @Helios747 Stop panicking. We are talking about the improved future version of these changes here, not this PR.

@BigheadSMZ Actually, we made a bunch of changes since mid-2015 and custom textures might actually work now. If you make sure efb2ram is disabled, we will always write black to that copy of the screen, so you will only have one or two versions of that texture.

@BigheadSMZ

This comment has been minimized.

Copy link

BigheadSMZ commented Jun 16, 2016

Oh my, I think you're right! It's been awhile since I checked it out, and I'm only getting 2 versions of the texture now using the latest master. This is wonderful, looks like nothing needs to be done after all!

@phire

This comment has been minimized.

Copy link
Member Author

phire commented Jun 16, 2016

Review status: 0 of 2 files reviewed at latest revision, 2 unresolved discussions.


Source/Core/VideoCommon/TextureCacheBase.cpp, line 1331 [r2] (raw file):

Previously, lioncash (Mat M.) wrote…

Yeah, probably.

Done.

Comments from Reviewable

@mimimi085181

This comment has been minimized.

Copy link
Contributor

mimimi085181 commented Jun 16, 2016

I don't know how this unordered set works, especially what:
entry->refrences.emplace(entry_to_update);
does. To me it looks like it writes the pointer to the texture cache entry into the set. So this would really stop applying the same update to the same texture over and over. But wouldn't it be possible for another texture to get exactly the same pointer, once the old texture is released?

I would give each texture a unique u32 or u64 counter to identify them.

@degasus

This comment has been minimized.

Copy link
Member

degasus commented Jun 16, 2016

:lgtm:


Reviewed 2 of 2 files at r3.
Review status: all files reviewed at latest revision, all discussions resolved.


Comments from Reviewable

@phire

This comment has been minimized.

Copy link
Member Author

phire commented Jun 17, 2016

@mimimi085181 I think I'll add some more comments, degasus raised the same concerns.

The magic actually happens in TextureCacheBase::CopyRenderTargetToTexture() which you might have missed because I didn't have to change any code there. One of the first things it does during an efb copy is free any textures which share a starting address with the new efb copy. The old efb copy is going to be in that set. (BTW, it will also remove a partially updated texture that starts at that address, which will impact performance, but we won't worry about that before 5,0)

As part of the FreeTexture() call, it now calls Reset() on the texture cache entry, and iterates through references which will be pointers to all the partially updated textures which use this efb copy as a sub-texture. It then removes the reference from each partially updated texture to this efb copy before clearing the whole references set.

The efb copy will continue, and there is a high chance it will get the texture cache entry with the same pointer from the pool, but the next time PartialTextureUpdates() is called it will iterate through all overlapping textures and see the new efb copy, which is guaranteed to not be in it's set of references. And it works as expected.

TextureCacheBase::CopyRenderTargetToTexture() also calls FreeTexture() on all textures which overlap with a non-strided efb copy. One again less than ideal, and I have plans to change this for mega-super-duper-extreme-hybrid-xfb after 5,0, but it ensures there will never be overlapping efb copies in texture cache.

@phire

This comment has been minimized.

Copy link
Member Author

phire commented Jun 17, 2016

Come to think about it, FreeTexture() is very much the wrong name for what that function does.

Fixes a major preformance regression in Skies of Arcadia during
battle transisions.

I had plans for a more advanced version of this code after 5.0,
but here is a minimal implemenation for now.
@phire phire force-pushed the phire:lets_just_release_5_0_already branch from e16d89c to 974ace6 Jun 17, 2016
@degasus

This comment has been minimized.

Copy link
Member

degasus commented Jun 17, 2016

Yeah, the new comments sound fine.


Reviewed 2 of 2 files at r4.
Review status: all files reviewed at latest revision, 1 unresolved discussion.


Comments from Reviewable

@phire

This comment has been minimized.

Copy link
Member Author

phire commented Jun 17, 2016

Reviewed 2 of 2 files at r4.
Review status: :shipit: all files reviewed at latest revision, all discussions resolved, all commit checks successful.


Comments from Reviewable

@lioncash

This comment has been minimized.

Copy link
Member

lioncash commented Jun 17, 2016

Review status: all files reviewed at latest revision, 1 unresolved discussion, all commit checks successful.


Source/Core/VideoCommon/TextureCacheBase.h, line 118 [r4] (raw file):

      TCacheEntryBase(const TCacheEntryConfig& c) : config(c) {}
      virtual ~TCacheEntryBase();

Unnecessary newline.


Comments from Reviewable

@phire phire force-pushed the phire:lets_just_release_5_0_already branch from 974ace6 to 96ab76f Jun 17, 2016
@degasus

This comment has been minimized.

Copy link
Member

degasus commented Jun 17, 2016

By the way, in my opinion, this PR should delay the release by at least a week.

@phire

This comment has been minimized.

Copy link
Member Author

phire commented Jun 17, 2016

Reviewed 2 of 2 files at r5.
Review status: all files reviewed at latest revision, all discussions resolved.


Comments from Reviewable

@degasus

This comment has been minimized.

Copy link
Member

degasus commented Jun 17, 2016

Reviewed 2 of 2 files at r5.
Review status: all files reviewed at latest revision, all discussions resolved, some commit checks pending.


Comments from Reviewable

@dolphin-emu-bot dolphin-emu-bot merged commit 847cb54 into dolphin-emu:master Jun 17, 2016
12 checks passed
12 checks passed
code-review/reviewable 2 files reviewed
Details
default Very basic checks passed, handed off to Buildbot.
Details
lint Build succeeded on the Buildbot.
Details
pr-android Build succeeded on the Buildbot.
Details
pr-deb-dbg-x64 Build succeeded on the Buildbot.
Details
pr-deb-x64 Build succeeded on the Buildbot.
Details
pr-freebsd Build succeeded on the Buildbot.
Details
pr-osx-x64 Build succeeded on the Buildbot.
Details
pr-ubu-nogui-x64 Build succeeded on the Buildbot.
Details
pr-ubu-x64 Build succeeded on the Buildbot.
Details
pr-win-dbg-x64 Build succeeded on the Buildbot.
Details
pr-win-x64 Build succeeded on the Buildbot.
Details
@phire phire deleted the phire:lets_just_release_5_0_already branch Jun 17, 2016
@phire

This comment has been minimized.

Copy link
Member Author

phire commented Jun 17, 2016

Certainly, no releases this weekend.

@JMC47

This comment has been minimized.

Copy link
Contributor

JMC47 commented Jun 17, 2016

#free Dolphin 5.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
9 participants
You can’t perform that action at this time.