Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add basic binary operators (and, or, xor, not) to IO::Buffer. #5893

Merged
merged 2 commits into from May 9, 2022

Conversation

ioquatix
Copy link
Member

@ioquatix ioquatix commented May 8, 2022

This introduces both the operators (&, |, ^ and ~) as well as in-place methods: and!, or!, xor!, not!.

string = 'Hello World'
buffer = IO::Buffer.for(string)
mask = IO::Buffer.for('abcd')

pp buffer.xor!(mask).xor!(mask)

@ioquatix ioquatix merged commit cea34bd into ruby:master May 9, 2022
@ioquatix ioquatix deleted the io-buffer-xor branch May 9, 2022 05:19
memory_xor(unsigned char * restrict output, unsigned char * restrict base, size_t size, unsigned char * restrict mask, size_t mask_size)
{
for (size_t offset = 0; offset < size; offset += 1) {
output[offset] = base[offset] ^ mask[offset % mask_size];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

% mask_size is likely quite slower than a bit mask (if the size is power of two), or condition with a separate mask_index resetting to 0 if >= mask_size.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the assembly and it looks to generate fairly efficient SSE/AVX instructions, so I trust the compiler is doing the right things here. In any case, simple and correct is a good first implementation, and later on we can unroll it by hand if performance is an issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not about unrolling. It's about a modulo operation is as expensive as division (many cycles), and a lot more than + - * & ! ^.

Copy link
Member Author

@ioquatix ioquatix May 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I played around with this: https://godbolt.org/z/7MEfsrz9z - change -Os to -O3 and you see the compiler will unroll the loop and load the data in chunks. My understanding is that by unrolling the loop carefully, some division operations can be skipped.

If you know how we can count the number of divs being performed, that would be awesome, otherwise, I'm fine with slow but correct to start with, and figuring out a faster implementation later as an improvement. If you think adding a branch to the inner loop will ultimately be faster, let's do that? But at that point I think we need to benchmark it. It's entirely possible the compiler would do a worse job at unrolling a loop with an if statement.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fairly confident it does not remove the division operations, it could only do that if it did the branch I suggest, and there are plenty of divl and divq left in that assembly.
Maybe it does a few less, but it could only do that without branches if it can make sure the mask_size is e.g. > 1, and it cannot know that.

I agree, we should benchmark it, then it should be clear. With the explicit condition there would be 0 division left, but the extra branch. It should still unroll fine.
The best would be if we could have the mask_size as a constant in the source, this should be possible by doing an early check and call with a constant for the mask_size we care about + force inline this function.

schneems pushed a commit to schneems/ruby that referenced this pull request May 23, 2022
schneems pushed a commit to schneems/ruby that referenced this pull request Jul 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
2 participants