Enhanced Load Masking for Prefixes and Suffixes #29

Closed
ashvardanian opened this issue Oct 17, 2023 · 0 comments
Comments

@ashvardanian
Owner

SimSIMD relies mostly on unaligned loads. Where AVX-512 is available, masked loads handle the tail elements, so no sequential scalar loop is needed. A faster, though more advanced, scheme is worth exploring. If we assume that whenever any byte of a 64-byte cache line belongs to a vector, the entire cache line is accessible, we can switch to aligned loads exclusively. Every load then fetches a complete cache line. Bytes in the first and last lines that fall outside the vector (the prefix and the suffix) have to be masked out, so some of the work is wasted, but unaligned loads are avoided entirely.
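To make the idea concrete, here is a minimal portable sketch, not SimSIMD code. It sums the bytes of a range while only ever reading whole 64-byte-aligned blocks, and uses a 64-bit mask to drop the prefix and suffix bytes. The names `lane_mask` and `sum_bytes_aligned` are made up for illustration, and `memcpy` stands in for the aligned vector load. On AVX-512 the same mask could presumably be passed as a `__mmask64` to `_mm512_maskz_mov_epi8` after a `_mm512_load_si512`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { cache_line_k = 64 };

/* Hypothetical helper: mask with bits [first, last) set, first <= last <= 64. */
static uint64_t lane_mask(unsigned first, unsigned last) {
    assert(first <= last && last <= 64);
    /* Shifting a 64-bit value by 64 is undefined, so handle that case apart. */
    uint64_t below_last = last == 64 ? ~0ull : ((1ull << last) - 1);
    uint64_t below_first = first == 64 ? ~0ull : ((1ull << first) - 1);
    return below_last & ~below_first;
}

/* Hypothetical kernel: sums bytes in [data, data + length), touching memory
 * only in whole aligned cache lines. Assumes that every cache line that
 * overlaps the range is fully readable. */
static uint64_t sum_bytes_aligned(uint8_t const *data, size_t length) {
    uintptr_t begin = (uintptr_t)data;
    uintptr_t end = begin + length;
    /* Round the start down to the cache line boundary. */
    uintptr_t line = begin & ~(uintptr_t)(cache_line_k - 1);
    uint64_t sum = 0;
    for (; line < end; line += cache_line_k) {
        /* Prefix: bytes before `begin` in the first line. */
        unsigned first = begin > line ? (unsigned)(begin - line) : 0;
        /* Suffix: bytes at or after `end` in the last line. */
        unsigned last = end - line < cache_line_k ? (unsigned)(end - line) : cache_line_k;
        uint64_t mask = lane_mask(first, last);
        uint8_t block[cache_line_k];
        memcpy(block, (void const *)line, cache_line_k); /* stand-in for one aligned load */
        for (unsigned i = 0; i < cache_line_k; ++i)
            if ((mask >> i) & 1) sum += block[i];
    }
    return sum;
}
```

Only the first and last iterations produce a partial mask; every line in between gets an all-ones mask, so a real kernel could peel those two iterations and run the middle unmasked.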

Reading past the ends of the buffer will be flagged by sanitizers such as AddressSanitizer and ThreadSanitizer. To avoid those reports, the affected functions should be marked with the following attributes: __attribute__((no_sanitize_address)) and __attribute__((no_sanitize_thread)).
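A rough sketch of how the attributes might be applied, assuming GCC or Clang. The `NO_SANITIZE` macro and the `load_cache_line` helper are hypothetical names, not part of SimSIMD. The copy is written as a plain loop rather than `memcpy`, since sanitizer runtimes may intercept `memcpy` calls regardless of the caller's attributes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical macro: expands to both attributes on GCC and Clang,
 * and to nothing on other compilers. */
#if defined(__GNUC__) || defined(__clang__)
#define NO_SANITIZE __attribute__((no_sanitize_address)) __attribute__((no_sanitize_thread))
#else
#define NO_SANITIZE
#endif

/* Hypothetical helper: copies the whole 64-byte cache line that contains
 * `ptr` into `line`, including bytes outside the caller's buffer.
 * Assumes the entire cache line is readable. */
NO_SANITIZE static void load_cache_line(void const *ptr, uint8_t line[64]) {
    assert(ptr != NULL && line != NULL);
    /* Round the address down to the 64-byte boundary. */
    uint8_t const *base = (uint8_t const *)((uintptr_t)ptr & ~(uintptr_t)63);
    for (size_t i = 0; i < 64; ++i) line[i] = base[i];
}
```

The attributes only suppress instrumentation inside the annotated function, so the over-read has to stay in that function and not leak into helpers that are still instrumented.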
