Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove spurious memory allocations in TextureCacheBase::SerializeTexture and DeserializeTexture #9641

Merged
merged 1 commit into from May 7, 2021

Conversation

lynlevenick
Copy link
Contributor

I'm doing a project with Dolphin which wants to save state incredibly frequently (~every frame). Reallocating texture_data in TextureCacheBase::SerializeTexture takes up a non-negligible amount of the serialization time in my environment. For my own project I might be able to implement a check to see if textures have changed in the last frame and avoid re-serializing them, but this is general enough to save everybody some fractions of a second when they save state - maybe TASers would appreciate it?

This PR fixes the issue by adding a method to PointerWrap which sets aside some space for the caller to read/write/verify, then using it to set aside one large chunk of space all at once in the texture cache. This should not change the save-state format.

Additionally, the new method is used when deserializing textures as well to avoid a single memory allocation/copy/deallocation, but that's of much less concern and could be removed pretty easily.

I'm not a huge fan of the method I went with here - all of the paths that hit DoExternal need to handle the mode of the PointerWrap, but at present there's no enforcement of that. I've personally verified that only read and write operations take place to the pointers, but that's no guard against future changes so it's pretty reasonable to say DoExternal is going too far/breaking abstraction. If so, let me know: I may be able to change this to refactor SerializeTexture to do one allocation instead of worst-case layers * levels copies/reallocations.

If save state performance is important, I've got a few other lower-impact changes which I could also try to contribute if desired.

@JMC47
Copy link
Contributor

JMC47 commented Apr 17, 2021

Savestate performance is always appreciated. You may have to squash that last commit unless you want it to be a separate commit for some reason.

@JosJuice
Copy link
Member

Yes. It's the kind of change that won't be noticed by most users but would be greatly appreciated by a small subset of users. If you have more optimizations you can make, I'm interested in seeing those as well.

All three commits should be squashed into one, not just the last two.

I agree that adding this kind of DoExternal function is a bit "risky" in terms of code design, but I don't really have a better suggestion for how to do it, and I think the performance benefit outweighs the downsides.

Source/Core/Common/ChunkFile.h Outdated Show resolved Hide resolved
Source/Core/VideoCommon/TextureCacheBase.cpp Outdated Show resolved Hide resolved
@lynlevenick
Copy link
Contributor Author

lynlevenick commented Apr 18, 2021

Addressed comments and squashed the commits together.

I'll open up some other changes as separate PRs once I've pulled them out of my branch.

edit: My other changes at present are mildly invasive - avoiding some GPU flushes at a memory tradeoff, adding a threadpool to saturate memory bandwidth - which might hit resource limits or perform worse on some platforms/configurations, so I'm not sure Dolphin would want them.

edit 2: Cleaned up the comment again (it was getting confused with my other branch, apologies).

texture serialization and deserialization used to involve many memory
allocations and deallocations, along with many copies to and from
those allocations. avoid those by reserving a memory region inside the
output and writing there directly, skipping the allocation and copy to
an intermediate buffer entirely.
@leoetlino leoetlino merged commit 2b632f6 into dolphin-emu:master May 7, 2021
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
8 participants