New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Savestates: Use LZ4 algorithm for faster decompression #12217
Conversation
Please use a submodule for lz4 and break out adding it into its own commit. |
(I guess I could be convinced to have lz4 as not-submodule if you reduce it down to just the source + license without any of the extra files, it's pretty small and doesn't update all that often. But definitely break it out to its own commit.) |
There's a lot of tinkering in regards to the CMakeLists.txt that I wouldn't say I'm comfortable safely trimming excess code from, so I will adjust this to be a submodule. |
Does this apply to the dolphin/Source/Core/Core/State.cpp Lines 98 to 99 in da6339a
I'd hope that that's in an uncompressed header since that's the main thing that determines incompatibility (if the exact emulator version doesn't match it'll still try to load, but if the state version doesn't match it'll reject it). I'm not sure if it actually is or not though. |
The check against |
42c3820
to
4d31bd4
Compare
Sorry for the hectic rebases. OK, this is now a submodule, split into a separate commit. I've swapped from |
Hmm. I'd prefer it if there was a backwards-compatible way of doing things (in that trying to load the savestate on an older version gives a suitable error message). I think this could theoretically be done by either LZO-compressing only the version number or by disabling compression on the version number entirely (since a size of 0 in the header is treated as not compressed). I think we could have whatever data we want after it detects an incorrect version, but I'm not 100% sure. There are two reserved fields in the header, so it would be possible to use them for a new format identifier or something. I'm not entirely sure what that would look like and it would require testing against older versions to make sure things work; maybe the format this PR currently uses gives a good enough error message. One other thing: Dolphin currently includes Zstd since the rvz format uses it. This is supposed to be very fast for decompression; it might be better to use it instead of adding a new algorithm. If LZ4 is faster or about the same in practice though then an extra dependency is probably fine. |
If I understand correctly, lzo decompresses in chunks of 65536 bytes. To retrieve the version number you'd have to decompress that entire block, which seems wasteful.
Storing the revision in the header would be a good idea, but there'd definitely need to be discussion on how we would define this.
They are in the same family of compression algorithms, but lz4 outperforms zstd in terms of decompression speeds. I don't have exact metrics, but here's one source: https://indico.fnal.gov/event/16264/contributions/36466/attachments/22610/28037/Zstd__LZ4.pdf |
Rebased because I realized I had applied CMake-related changes in the second commit accidentally. This should be properly separated now. |
That should be possible, but make sure it's after the version string, not just after the version number.
Then you could disable compression. The part of the file that needs to be readable by old Dolphin versions is only 20 bytes or so. |
In case you missed this in the conversation you resolved, your CMake definitions are a bit wrong in general and don't work correctly with a system-installed lz4, I've fixed this here: AdmiralCurtiss@8f82639 |
Thank you immensely for doing this. I'll apply these changes on my next rebase |
17290b4
to
745d1d2
Compare
Updated my commits as follows:
I guess the last thing to debate is if we want to reserve empty space after the version info for future changes... I believe we don't need to do this. Any new data occurring after the version info and before the compressed data would result in the following:
|
c395b8c
to
9b75a46
Compare
Alrighty, I've appended the version information to the original header. Note that because the version string is a
Note that, like I said in my previous comment, I do not include the compressed size in the extended header. Instead I continuously read bytes so long as we have not read I appreciate any feedback, as I imagine with changes this large, there are a few improvements I could make that I don't immediately see. |
Updated to include @AdmiralCurtiss latest suggestions.
|
Updated to remove |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lots of small things... This is looking pretty good in terms of general structure to me now though.
Updated to address all suggestions that I marked as resolved! Thanks for the thorough review. I left a reply to clarify the remaining comments from @AdmiralCurtiss |
Updated to remove the unnecessary comment I had left. I guess the only thing I'd like to clear up that remains is whether to keep the |
Updated again! Thank you for the continued reviews. All suggestions should now be handled. Also I changed a few error messages from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems fine now.
After many years, updating zstd to the latest upstream version will become attractive enough. According to official data, the encoding and decoding performance of the latest version of zstd has been objectively improved. |
Compared to LZO, LZ4 has near-identical compression ratio and compression time. However, LZ4 boasts substantial decompression time improvement. While compression time doesn't matter much for TASers, repeated savestate loading requires decompression to be fast, in order to improve efficiency during TAS sessions. While their website shows benchmarks that produce a 67% speedup in decompression time, in practice I found that for savestates it produces typical speedups of about 72%. This improvement is larger when the Internal Resolution is turned up higher, as this increases the typical savestate filesize.
LZ4 can compress at most
LZ4_MAX_INPUT_SIZE
~= 2.11GB at a time. To handle the upper-bound, I compressLZ4_MAX_INPUT_SIZE
at a time, looping each time so long as we have more bytes to compress. This scenario is very unlikely to affect anyone, as I had to increase my Internal Resolution to about 20x for this to occur on my end.Note that Netplay also uses LZO for packet transfer of the memcard between players. Given that I have no experience with this feature, and to prevent possible regression from limited testing due to my lack of experience, I've chosen to leave Netplay untouched. Note that if we change Netplay to use LZ4, then the LZO dependency can be removed from the project.
Support has been kept for forwards/backwards compatibility. These changes support reading the version info for legacy LZO savestates in order to show the user what version the incompatible state was made on. Loading new LZ4 states on older versions of Dolphin results in the version string still being shown to the user in an error message.
These changes aim to future-proof the ability the add more uncompressed data to the header. We can adjust the size of
StateExtendedHeader
by incrementingStateExtendedBaseHeader.payload_offset
. Thus we can support the serialization of any new fields, including char arrays.