Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

aarch64: Implement I128 Loads and Stores #2985

Merged
merged 2 commits into from
Jun 17, 2021

Conversation

afonso360
Copy link
Contributor

Hey, this implements loading and storing i128 values in the aarch64 backend

@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:aarch64 Issues related to AArch64 backend. labels Jun 14, 2021
Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This looks good -- I very much like the clean generalization of the amode selection.

Just one request for a few more tests below but otherwise LGTM.

; run: %i128_stack_store_load_big_offset(0, -1) == true
; run: %i128_stack_store_load_big_offset(0x01234567_89ABCDEF, 0xFEDCBA98_76543210) == true
; run: %i128_stack_store_load_big_offset(0x06060606_06060606, 0xA00A00A0_0A00A00A) == true
; run: %i128_stack_store_load_big_offset(0xC0FFEEEE_DECAFFFF, 0xDECAFFFF_C0FFEEEE) == true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add some test cases (compile-tests probably) that do basic load.i128 / store.i128 as well? At least right now, stack_store / stack_load expand to use these under the hood, but it's conceivable (especially if we try to address the codegen improvement ideas you suggest, i.e. treat sp-relative addresses differently) that this might not be true later.

Copy link
Contributor Author

@afonso360 afonso360 Jun 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm glad this needed revisiting, because the previous code had a bug that only showed up with load.i128!

I've added the separate load.i128/store.i128 run tests, but i'm using stack_addr to get a valid address. I think this may be a problem in the future, where we may just end up optimizing that pattern(stack_addr+load) into a stack_load.

Do we have a better way of getting a R+W address?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks for adding this (and I'm glad it found an issue)!

I think this is fine -- there isn't, afaik, a way to allocate some other memory to read/write in a run-test (there's no runtime to call to dynamically allocate, for instance). It's possible we might try to optimize stack_addr + load/store in the future (to use sp directly) but if we do that, we'll see test failures on these compile tests and then work out another solution at that point. Compile-test-only coverage is probably fine at that point, IMHO.

The previous address calculation code had a bug where we tried to
add offsets into a temporary register before defining it, causing
the regalloc to complain.
Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

; run: %i128_stack_store_load_big_offset(0, -1) == true
; run: %i128_stack_store_load_big_offset(0x01234567_89ABCDEF, 0xFEDCBA98_76543210) == true
; run: %i128_stack_store_load_big_offset(0x06060606_06060606, 0xA00A00A0_0A00A00A) == true
; run: %i128_stack_store_load_big_offset(0xC0FFEEEE_DECAFFFF, 0xDECAFFFF_C0FFEEEE) == true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks for adding this (and I'm glad it found an issue)!

I think this is fine -- there isn't, afaik, a way to allocate some other memory to read/write in a run-test (there's no runtime to call to dynamically allocate, for instance). It's possible we might try to optimize stack_addr + load/store in the future (to use sp directly) but if we do that, we'll see test failures on these compile tests and then work out another solution at that point. Compile-test-only coverage is probably fine at that point, IMHO.

@cfallin cfallin merged commit de1edd4 into bytecodealliance:main Jun 17, 2021
@afonso360 afonso360 deleted the aarch64-i128-load-store branch June 25, 2021 10:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cranelift:area:aarch64 Issues related to AArch64 backend. cranelift Issues related to the Cranelift code generator
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants