NEON64: enc: convert full encoding loop to inline assembly #98

Closed
aklomp opened this issue Jul 20, 2022 · 0 comments
aklomp commented Jul 20, 2022

Convert the full encoding loop to an inline assembly implementation for compilers that can use inline assembly.

The motivation for this change is issue #96: when optimization is turned off on recent versions of clang, the encoding table is sometimes not loaded into sequential registers. This happens despite taking pains to ensure that the compiler uses an explicit set of registers for the load (v8-v11).

This leaves us with few options besides rewriting the full encoding loop in inline assembly. Only then can we be certain that the correct registers are used. Thankfully, aarch64 assembly is not very difficult to write by hand.
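To illustrate why the register numbers matter: the four-register form of the `tbl` instruction takes its 64-byte table from consecutively numbered vector registers, so the table has to sit in something like v8-v11 as a unit. The following is only a minimal sketch of that idea, not the actual encoding loop. The function name `enc_lookup16`, the choice of v12 as a scratch register, and the scalar fallback for non-aarch64 builds are all assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

// Standard Base64 alphabet; only the first 64 bytes are used as the table.
static const char enc_table[] =
	"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Hypothetical helper: translate 16 six-bit indices into Base64 characters.
static void
enc_lookup16 (const uint8_t idx[16], uint8_t out[16])
{
#if defined(__aarch64__) && (defined(__GNUC__) || defined(__clang__))
	// Naming v8-v11 inside the assembly text pins the table to sequential
	// registers regardless of optimization level. The clobber list tells
	// the compiler which registers the block overwrites.
	__asm__ volatile (
		"ld1 {v8.16b, v9.16b, v10.16b, v11.16b}, [%[tbl]]         \n\t"
		"ldr q12, [%[src]]                                        \n\t"
		"tbl v12.16b, {v8.16b, v9.16b, v10.16b, v11.16b}, v12.16b \n\t"
		"str q12, [%[dst]]                                        \n\t"
		:
		: [tbl] "r" (enc_table), [src] "r" (idx), [dst] "r" (out)
		: "v8", "v9", "v10", "v11", "v12", "memory"
	);
#else
	// Portable fallback (assumption for this sketch). Like tbl, it yields
	// zero for indices outside the 64-byte table.
	for (size_t i = 0; i < 16; i++) {
		out[i] = (idx[i] < 64) ? (uint8_t) enc_table[idx[i]] : 0;
	}
#endif
}
```

Because the load and the lookup live in the same assembly block, the compiler has no opportunity to place the table anywhere else. A real loop would load the table once outside the loop and keep it resident, which is the motivation for converting the whole loop rather than a single instruction.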

Fixes #96.
Closes #97.
