New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Disk Shader Caching #4923
Disk Shader Caching #4923
Conversation
Are there any changes to the UI? Is there an option to enable / disable disk shader caching in settings? |
Posting this message from discord here in case we forget: the introduction of vfs is unnecessary. Can implement a simple vector appender (or not at all) for this use case. |
|
||
auto shader = OGLProgram(); | ||
shader.handle = glCreateProgram(); | ||
glProgramParameteri(shader.handle, GL_PROGRAM_SEPARABLE, GL_TRUE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an extension. Did you make this optional? (Sorry if I missed some code) nvm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The disk cache only loads with separable shader objects enabled. I didn't wanna play around with caching full shader programs at this point in time ^^; Added a log in case they don't support it so theres some feedback. FYI I'm intentionally still saving the shaders to the transferable cache as i don't see a reason not too.
/// Loads current game's precompiled cache. Invalidates on failure. | ||
std::pair<ShaderDecompiledMap, ShaderDumpsMap> LoadPrecompiled(); | ||
|
||
/// Removes the transferable (and precompiled) cache file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment indicates the function name is bad. Maybe rename it to InvalidateAll?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I'm not sure I agree, but I'll change it to save on argument. I think it logically follows that if the Transferable cache is invalidated, then you have to invalidate the Precompiled as well. Renamed it anyway
@chris062689 Added ui screenshot (Its a single checkbox in graphics) @wwylele Addressed all concerns and fixed any known issues. Putting it on canary for user testing |
Just a nit, but can't we at least keep the checkbox labeling the same as it is in yuzu? |
how about unseparated shader cache? |
-P "${CMAKE_SOURCE_DIR}/CMakeModules/GenerateSCMRev.cmake" | ||
DEPENDS | ||
# WARNING! It was too much work to try and make a common location for this list, | ||
# so if you need to change it, please update CMakeModules/GenerateSCMRev.cmake as well |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this warning be in CMakeModules/GenerateSCMRev.cmake
as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eh, a new developer won't be changing that file before changing src/common so i don't see a reason too.
@@ -720,16 +720,15 @@ const std::string& GetUserPath(UserPath path) { | |||
SetUserPath(); | |||
return g_paths[path]; | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Accidental?
@@ -546,7 +546,7 @@ std::optional<std::string> GetCurrentDir() { | |||
#endif | |||
free(dir); | |||
return strDir; | |||
} | |||
} // namespace FileUtil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clang format always gets this wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry but it really wants it https://travis-ci.org/citra-emu/citra/jobs/637758338#L576
path.end()); | ||
return std::string(RemoveTrailingSlash(path)); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we able to use std::filesystem for these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we support mac sadly, so i don't want to deal with that. last i checked, mac had it under a different header, and it wasn't complete? but i could be wrong.
src/common/zstd_compression.cpp
Outdated
return decompressed; | ||
} | ||
|
||
} // namespace Common::Compression |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
newline
src/video_core/renderer_base.h
Outdated
@@ -6,8 +6,9 @@ | |||
|
|||
#include <memory> | |||
#include "common/common_types.h" | |||
#include "core/core.h" | |||
#include "core/frontend/emu_window.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have EmuWindow forward declared in this header already.
@@ -15,6 +15,7 @@ | |||
#include "common/microprofile.h" | |||
#include "common/scope_exit.h" | |||
#include "common/vector_math.h" | |||
#include "core/core.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this included for?
LOG_ERROR(Render_OpenGL, "Failed to load transferable raw entry - skipping"); | ||
return {}; | ||
} | ||
transferable.insert({entry.GetUniqueIdentifier(), {}}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
emplace
// Save VS to the disk cache if its a new shader | ||
if (result) { | ||
auto& disk_cache = impl->disk_cache; | ||
ProgramCode program_code{setup.program_code.begin(), setup.program_code.end()}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this equivalent to just ProgramCode program_code{setup.program_code};
namespace OpenGL { | ||
|
||
class ShaderDiskCacheOpenGL; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neither of these appear to used in this header.
@weihuoya I'm aware, but any android specific changes aren't a part of this PR and will be published with the rest of the android changes. |
Hello, I have a weird issue on Linux that I think is related to this PR, at least my bisection seems to say so as it does not happen on Canary with building at the previous commit. But disabling disk caching in the options changes nothing. When I switch virtual desktops, Citra's performance drop to probably about the same as if I disable the hardware renderer, obviously I cannot see the FPS counter but the audio is super slow/broken apart/jittery. Now interestingly just alt-tabbing out of Citra does not cause that issue, only switching virtual desktops. I've quickly tried with melonDS and The Dragon's Trap and I don't see that issue there, so I don't think it's my system's fault, but not completely sure either. Specs wise: log: https://gist.github.com/John-Gee/9d38443f53575e1f19f9940d414b2d0d I'll be happy to add any log/information needed or jump on chat if that's more helpful. Thank you! |
Based on: If I understand it correctly, I'm guessing the issue is from the 4940 PR, not this one, but since they are merged, it's hard for me to try either way. |
@John-Gee I'm going to be merging #4940 since replicating the issue you mentioned is hard for me since i develop on windows and that PR is holding up a lot of upcoming changes. I'd still like to take a look at fixing that at some point, so please just open a new issue with your findings so we can track it down. |
Known issues:
|
Did a first round of reviewing,
but I probably need to do a second one to look closer at some logic
src/common/zstd_compression.cpp
Outdated
namespace Common::Compression { | ||
|
||
std::vector<u8> CompressDataZSTD(const u8* source, std::size_t source_size, s32 compression_level) { | ||
compression_level = std::clamp(compression_level, 1, ZSTD_maxCLevel()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
magical number 1, is there no
ZSTD_minCLevel()`? is there no level 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Funny enough there isn't in the version we used, but the recently added it upstream so i'm updating and using it here now.
|
||
namespace OpenGL { | ||
|
||
using ShaderCacheVersionHash = std::array<u8, 64>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
magic number
explicit ShaderDiskCacheRaw(u64 unique_identifier, ProgramType program_type, | ||
RawShaderConfig config, ProgramCode program_code); | ||
ShaderDiskCacheRaw(); | ||
~ShaderDiskCacheRaw(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
= default should be in header not in cpp. Same for other ctors/dtors in this file
std::string GetBaseDir() const; | ||
|
||
/// Get current game's title id as u64 | ||
u64 GetProgramID(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its not actually const. it has to load the program id if it exists from AppLoader.
std::string GetTitleID(); | ||
|
||
template <typename T> | ||
bool SaveArrayToPrecompiled(const T* data, std::size_t length) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this doesn't need a template and could just be const u8* data
Same for the function below
std::vector<std::size_t> load_raws_index; | ||
// Loads both decompiled and precompiled shaders from the cache. If either one is missing for | ||
const auto LoadPrecompiledWorker = | ||
[&](std::size_t begin, std::size_t end, const std::vector<ShaderDiskCacheRaw>& raws, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
begin is unused
this can probably be made nicer (and maybe faster) by using iterators, same for the other lambdas below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I made them lambdas is for threaded shader loading (which was a feature that I removed). This is also why begin is unused. I don't mind changing it back, but i think its more readable this way personally so you can see which blocks are what. Just my opinion though.
|
||
compilation_failed = false; | ||
|
||
const auto LoadTransferable = [&](std::size_t begin, std::size_t end, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unused begin
I've addressed all feedback and fixed the sanitize mul issue. I plan to merge this tomorrow so that we can start testing some other things. Thanks for the feedback! |
Based on the excellent work from Rodrigo, this brings over much of the framework that he wrote but with obvious changes to support 3DS shaders instead.
Design:
There are two caches with different reasons to wipe the cache. The transferrable cache is meant to be device agnostic, and store everything from the PICA state needed to generate the shaders. For FS and VS, we store the entire PICA registers, as its simpler to handle storing the same configuration for both types. Additionally, for VS, we store the shader program and shader swizzle data from memory as well. As this cache should be essentially unchanged from version to version, it is only invalidated when either 1) a manual version bump occurs or 2) the cache is corrupted and a shader fails to build
The precompiled cache stores both generated GLSL and the resulting binary cache. This cache is only built during the game launch when the transferrable cache is loaded. If this cache is present, we will read all of the elements from the transferrable cache, and attempt to match up and "inject" the precompiled shaders into the in-memory ShaderCaches. If this cache is present, it should speed up loading as we don't need to run a full shader compile for all of the shaders. This cache is invalidated in the following circumstances: 1) Precompiled shader binary is rejected by the driver 2) any of the long list of files related to shader generation is changed (causing the generated version hash in cmake to not match) 3) the cache is missing both the decompiled glsl and the precompiled binary.
Currently there is NOT a loading screen as I want people to test it out first and focus on the actual implementation. I would rather avoid a loading screen entirely and instead do async compilation on a separate context, but if we can't work that out in time before this gets merged, I can tack on the loading screen I made for yuzu.
UI Screenshot

This change is