Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disk Shader Caching #4923

Merged
merged 19 commits into from Jan 18, 2020
Merged

Disk Shader Caching #4923

merged 19 commits into from Jan 18, 2020

Conversation

jroweboy
Copy link
Contributor

@jroweboy jroweboy commented Sep 7, 2019

Based on the excellent work from Rodrigo, this brings over much of the framework that he wrote but with obvious changes to support 3DS shaders instead.

Design:
There are two caches with different reasons to wipe the cache. The transferrable cache is meant to be device agnostic, and store everything from the PICA state needed to generate the shaders. For FS and VS, we store the entire PICA registers, as its simpler to handle storing the same configuration for both types. Additionally, for VS, we store the shader program and shader swizzle data from memory as well. As this cache should be essentially unchanged from version to version, it is only invalidated when either 1) a manual version bump occurs or 2) the cache is corrupted and a shader fails to build

The precompiled cache stores both generated GLSL and the resulting binary cache. This cache is only built during the game launch when the transferrable cache is loaded. If this cache is present, we will read all of the elements from the transferrable cache, and attempt to match up and "inject" the precompiled shaders into the in-memory ShaderCaches. If this cache is present, it should speed up loading as we don't need to run a full shader compile for all of the shaders. This cache is invalidated in the following circumstances: 1) Precompiled shader binary is rejected by the driver 2) any of the long list of files related to shader generation is changed (causing the generated version hash in cmake to not match) 3) the cache is missing both the decompiled glsl and the precompiled binary.

Currently there is NOT a loading screen as I want people to test it out first and focus on the actual implementation. I would rather avoid a loading screen entirely and instead do async compilation on a separate context, but if we can't work that out in time before this gets merged, I can tack on the loading screen I made for yuzu.

UI Screenshot
citra-qt_2019-09-09_19-51-34


This change is Reviewable

@flamesage
Copy link
Member

@flamesage flamesage commented Sep 7, 2019

Are there any changes to the UI?

Is there an option to enable / disable disk shader caching in settings?
Do we have an option in the game list to open the Disk Shader cache folder like in yuzu?

@wwylele
Copy link
Member

@wwylele wwylele commented Sep 8, 2019

Posting this message from discord here in case we forget: the introduction of vfs is unnecessary. Can implement a simple vector appender (or not at all) for this use case.


auto shader = OGLProgram();
shader.handle = glCreateProgram();
glProgramParameteri(shader.handle, GL_PROGRAM_SEPARABLE, GL_TRUE);
Copy link
Member

@wwylele wwylele Sep 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an extension. Did you make this optional? (Sorry if I missed some code) nvm

Copy link
Contributor Author

@jroweboy jroweboy Sep 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The disk cache only loads with separable shader objects enabled. I didn't wanna play around with caching full shader programs at this point in time ^^; Added a log in case they don't support it so theres some feedback. FYI I'm intentionally still saving the shaders to the transferable cache as i don't see a reason not too.

/// Loads current game's precompiled cache. Invalidates on failure.
std::pair<ShaderDecompiledMap, ShaderDumpsMap> LoadPrecompiled();

/// Removes the transferable (and precompiled) cache file.
Copy link
Member

@wwylele wwylele Sep 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment indicates the function name is bad. Maybe rename it to InvalidateAll?

Copy link
Contributor Author

@jroweboy jroweboy Sep 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I'm not sure I agree, but I'll change it to save on argument. I think it logically follows that if the Transferable cache is invalidated, then you have to invalidate the Precompiled as well. Renamed it anyway 👍

@jroweboy
Copy link
Contributor Author

@jroweboy jroweboy commented Sep 10, 2019

@chris062689 Added ui screenshot (Its a single checkbox in graphics)

@wwylele Addressed all concerns and fixed any known issues. Putting it on canary for user testing 👍

@FearlessTobi
Copy link
Contributor

@FearlessTobi FearlessTobi commented Sep 10, 2019

Just a nit, but can't we at least keep the checkbox labeling the same as it is in yuzu?

@weihuoya
Copy link
Contributor

@weihuoya weihuoya commented Sep 12, 2019

how about unseparated shader cache?
separated shader doesn't work well on android device.

WaterFace added a commit to WaterFace/citra that referenced this issue Sep 19, 2019
-P "${CMAKE_SOURCE_DIR}/CMakeModules/GenerateSCMRev.cmake"
DEPENDS
# WARNING! It was too much work to try and make a common location for this list,
# so if you need to change it, please update CMakeModules/GenerateSCMRev.cmake as well
Copy link
Contributor

@BreadFish64 BreadFish64 Nov 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this warning be in CMakeModules/GenerateSCMRev.cmake as well?

Copy link
Contributor Author

@jroweboy jroweboy Nov 28, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eh, a new developer won't be changing that file before changing src/common so i don't see a reason too.

@@ -720,16 +720,15 @@ const std::string& GetUserPath(UserPath path) {
SetUserPath();
return g_paths[path];
}

Copy link
Contributor

@BreadFish64 BreadFish64 Nov 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accidental?

@@ -546,7 +546,7 @@ std::optional<std::string> GetCurrentDir() {
#endif
free(dir);
return strDir;
}
} // namespace FileUtil
Copy link
Contributor

@BreadFish64 BreadFish64 Nov 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clang format always gets this wrong.

Copy link
Contributor Author

@jroweboy jroweboy Jan 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

path.end());
return std::string(RemoveTrailingSlash(path));
}

Copy link
Contributor

@BreadFish64 BreadFish64 Nov 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we able to use std::filesystem for these?

Copy link
Contributor Author

@jroweboy jroweboy Nov 28, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we support mac sadly, so i don't want to deal with that. last i checked, mac had it under a different header, and it wasn't complete? but i could be wrong.

return decompressed;
}

} // namespace Common::Compression
Copy link
Contributor

@BreadFish64 BreadFish64 Nov 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

newline

@@ -6,8 +6,9 @@

#include <memory>
#include "common/common_types.h"
#include "core/core.h"
#include "core/frontend/emu_window.h"
Copy link
Contributor

@BreadFish64 BreadFish64 Nov 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have EmuWindow forward declared in this header already.

@@ -15,6 +15,7 @@
#include "common/microprofile.h"
#include "common/scope_exit.h"
#include "common/vector_math.h"
#include "core/core.h"
Copy link
Contributor

@BreadFish64 BreadFish64 Nov 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this included for?

LOG_ERROR(Render_OpenGL, "Failed to load transferable raw entry - skipping");
return {};
}
transferable.insert({entry.GetUniqueIdentifier(), {}});
Copy link
Contributor

@BreadFish64 BreadFish64 Nov 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

emplace

// Save VS to the disk cache if its a new shader
if (result) {
auto& disk_cache = impl->disk_cache;
ProgramCode program_code{setup.program_code.begin(), setup.program_code.end()};
Copy link
Contributor

@BreadFish64 BreadFish64 Nov 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this equivalent to just ProgramCode program_code{setup.program_code};

namespace OpenGL {

class ShaderDiskCacheOpenGL;

Copy link
Contributor

@BreadFish64 BreadFish64 Nov 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neither of these appear to used in this header.

@jroweboy
Copy link
Contributor Author

@jroweboy jroweboy commented Nov 28, 2019

@weihuoya I'm aware, but any android specific changes aren't a part of this PR and will be published with the rest of the android changes.

@John-Gee
Copy link

@John-Gee John-Gee commented Dec 5, 2019

Hello,

I have a weird issue on Linux that I think is related to this PR, at least my bisection seems to say so as it does not happen on Canary with building at the previous commit. But disabling disk caching in the options changes nothing.

When I switch virtual desktops, Citra's performance drop to probably about the same as if I disable the hardware renderer, obviously I cannot see the FPS counter but the audio is super slow/broken apart/jittery. Now interestingly just alt-tabbing out of Citra does not cause that issue, only switching virtual desktops.

I've quickly tried with melonDS and The Dragon's Trap and I don't see that issue there, so I don't think it's my system's fault, but not completely sure either.

Specs wise:
I have a 4790k with a Radeon RX 580, running on Arch Linux 64b, using Mesa for the driver, and my DE is KDE's Plasma 5.17.4.

log: https://gist.github.com/John-Gee/9d38443f53575e1f19f9940d414b2d0d
The game I've tried in this log is Fire Emblem Echoes, but I've just tried with the much simpler (I think) Cave Story and it was the same.

I'll be happy to add any log/information needed or jump on chat if that's more helpful.

Thank you!

@John-Gee
Copy link

@John-Gee John-Gee commented Dec 6, 2019

Based on:
#4940 (comment)

If I understand it correctly, I'm guessing the issue is from the 4940 PR, not this one, but since they are merged, it's hard for me to try either way.

@jroweboy
Copy link
Contributor Author

@jroweboy jroweboy commented Dec 15, 2019

@John-Gee I'm going to be merging #4940 since replicating the issue you mentioned is hard for me since i develop on windows and that PR is holding up a lot of upcoming changes. I'd still like to take a look at fixing that at some point, so please just open a new issue with your findings so we can track it down.

@jroweboy
Copy link
Contributor Author

@jroweboy jroweboy commented Dec 16, 2019

Known issues:

  • The cache doesn't take into account accurate mul setting, which can cause some weird situations with games that rely on it, but have a cache that wasn't compiled with it on. Currently investigating what the best route for fixing that is
  • A small number of games have a large number of shaders so load times are exasperated on those. Tomorrow I plan on looking at a fix for that, and if it doesn't look likely, then I can port my load screen over from yuzu.

Copy link
Contributor

@B3n30 B3n30 left a comment

Did a first round of reviewing,

but I probably need to do a second one to look closer at some logic

namespace Common::Compression {

std::vector<u8> CompressDataZSTD(const u8* source, std::size_t source_size, s32 compression_level) {
compression_level = std::clamp(compression_level, 1, ZSTD_maxCLevel());
Copy link
Contributor

@B3n30 B3n30 Jan 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

magical number 1, is there no ZSTD_minCLevel()`? is there no level 0?

Copy link
Contributor Author

@jroweboy jroweboy Jan 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Funny enough there isn't in the version we used, but the recently added it upstream so i'm updating and using it here now.


namespace OpenGL {

using ShaderCacheVersionHash = std::array<u8, 64>;
Copy link
Contributor

@B3n30 B3n30 Jan 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

magic number

explicit ShaderDiskCacheRaw(u64 unique_identifier, ProgramType program_type,
RawShaderConfig config, ProgramCode program_code);
ShaderDiskCacheRaw();
~ShaderDiskCacheRaw();
Copy link
Contributor

@B3n30 B3n30 Jan 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

= default should be in header not in cpp. Same for other ctors/dtors in this file

std::string GetBaseDir() const;

/// Get current game's title id as u64
u64 GetProgramID();
Copy link
Contributor

@B3n30 B3n30 Jan 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const?

Copy link
Contributor Author

@jroweboy jroweboy Jan 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its not actually const. it has to load the program id if it exists from AppLoader.

std::string GetTitleID();

template <typename T>
bool SaveArrayToPrecompiled(const T* data, std::size_t length) {
Copy link
Contributor

@B3n30 B3n30 Jan 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this doesn't need a template and could just be const u8* data
Same for the function below

std::vector<std::size_t> load_raws_index;
// Loads both decompiled and precompiled shaders from the cache. If either one is missing for
const auto LoadPrecompiledWorker =
[&](std::size_t begin, std::size_t end, const std::vector<ShaderDiskCacheRaw>& raws,
Copy link
Contributor

@B3n30 B3n30 Jan 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

begin is unused

this can probably be made nicer (and maybe faster) by using iterators, same for the other lambdas below

Copy link
Contributor Author

@jroweboy jroweboy Jan 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I made them lambdas is for threaded shader loading (which was a feature that I removed). This is also why begin is unused. I don't mind changing it back, but i think its more readable this way personally so you can see which blocks are what. Just my opinion though.


compilation_failed = false;

const auto LoadTransferable = [&](std::size_t begin, std::size_t end,
Copy link
Contributor

@B3n30 B3n30 Jan 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unused begin

@jroweboy
Copy link
Contributor Author

@jroweboy jroweboy commented Jan 16, 2020

I've addressed all feedback and fixed the sanitize mul issue. I plan to merge this tomorrow so that we can start testing some other things. Thanks for the feedback!

@jroweboy jroweboy merged commit e74a402 into citra-emu:master Jan 18, 2020
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants