Unroll wildCopy32 into wildCopy64 #1407

Open
wants to merge 1 commit into
base: dev
Nicoshev
Contributor

@Nicoshev Nicoshev commented May 12, 2024

Hello, I hope you are doing well

I would like to propose unrolling wildCopy32 into wildCopy64, a function that copies memory in 64-byte chunks.
The intent is to improve branch prediction by halving the number of loop branches executed during the copy.
FASTLOOP_SAFE_DISTANCE is already defined as 64, so the control flow needs only minimal changes.
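
To make the idea concrete, here is a minimal sketch of the two loop shapes. It is not the code in this PR: the function names, signatures, and the returned iteration count are hypothetical and exist only for illustration. Both functions assume non-overlapping source and destination buffers and enough slack after `dst_end` to absorb the overshoot.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch, not the library's implementation.
 * Copies in 32-byte steps: one loop branch per 32 bytes.
 * May write up to 31 bytes past dst_end, so the caller must leave slack.
 * Returns the number of loop iterations (illustration only). */
static size_t wild_copy32_sketch(unsigned char *dst, const unsigned char *src,
                                 unsigned char *dst_end)
{
    size_t iterations = 0;
    assert(dst <= dst_end);
    do {
        memcpy(dst,      src,      16);
        memcpy(dst + 16, src + 16, 16);
        dst += 32;
        src += 32;
        iterations++;
    } while (dst < dst_end);
    return iterations;
}

/* Same copy, unrolled once more: one loop branch per 64 bytes.
 * May write up to 63 bytes past dst_end, which is why a safe
 * distance of at least 64 bytes matters for this variant. */
static size_t wild_copy64_sketch(unsigned char *dst, const unsigned char *src,
                                 unsigned char *dst_end)
{
    size_t iterations = 0;
    assert(dst <= dst_end);
    do {
        memcpy(dst,      src,      16);
        memcpy(dst + 16, src + 16, 16);
        memcpy(dst + 32, src + 32, 16);
        memcpy(dst + 48, src + 48, 16);
        dst += 64;
        src += 64;
        iterations++;
    } while (dst < dst_end);
    return iterations;
}
```

For a 100-byte copy, the 32-byte loop in this sketch runs four iterations and the 64-byte loop runs two. Fewer branches do not guarantee higher throughput on their own, so the effect still has to be measured per compiler and CPU.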

I tested it in two scenarios:

Scenario 1:
CPU: Ryzen 5900X
Compilers: gcc-11, clang-8.
Files: enwik9, each file within the silesia corpus, a custom memory dump retrieved from my PC.
Results: Decompression throughput improved in all combinations of files and compilers.

Scenario 2:
CPU: i5-1245U
Compilers: gcc-[9-12], clang-[11-14].
Files: enwik9, 'xml' from the silesia corpus, the custom memory dump retrieved from my PC.
Results: Decompression throughput improved in all combinations of files and compilers.

The improvement percentage varies depending on the file, compiler and CPU measured.

The data suggest that the gain from the change outweighs any instruction-alignment effects.
Hopefully, your test scenarios will show similar results.

There are other unrolling optimizations that could be made, but it is probably better to test them one at a time.
I'll propose them once a decision has been made on whether to merge this one.

Cheers,
Nicolas
