Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

aarch64 add basic i128 bit ops #2959

Merged
merged 3 commits into from
Jun 9, 2021

Conversation

afonso360
Copy link
Contributor

Hey, some more ops implemented for i128 support.

I think the shifts can be reduced by a few instructions (especially the last csel), but I'm not seeing how right now.

I also didn't implement support for immlogic, changing it to support multiple registers didn't seem like a simple change.

This adds support for:

  • bnot
  • band
  • bor
  • bxor
  • band_not
  • bor_not
  • bxor_not
  • ishl
  • ushr
  • sshr

v5, v6 = isplit v4
return v5, v6
}
; run: %ishl_i128_i8(0x01010101_01010101, 0x01010101_01010101, 2) == [0x04040404_04040404, 0x04040404_04040404]
Copy link
Contributor Author

@afonso360 afonso360 Jun 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cfallin I pretty much copied the testing code from x64/shift-i128-run.clif. And added a few more test cases where my implementation had issues.

Do you think it would be a good idea to merge run tests so that they could be multi arch? I've been thinking about something along the lines of:

testfiles/runtests/i128-bitops.clif:

test run
target x86_64
target aarch64
target s390x

function %rest_of_the_tests() {}

I think we should be able to do this for all runtests? This way we could reuse all of these test cases for all arches.

The down side of this is that we may have to be more granular with the testfiles. For example this file right here fails for x86_64 because some of the bit ops fail. So we would have to split this into shift_run and bitops_run or something along those lines.

What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, sharing run-tests makes a lot of sense! In general I actually want to try to shift tests from golden-code (compile tests) to golden-output (run tests) as this makes our suite more robust against cross-cutting backend changes, such as regalloc optimizations; so there are additional benefits, aside from sharing across architectures.

Could you be more specific about the failures you're seeing on x86_64, though? That's somewhat concerning -- unimplemented opcode, incorrect result, or something else?

Copy link
Contributor Author

@afonso360 afonso360 Jun 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Ill try to make a PR changing this soon.

About x86_64:
BandNot is failling on this assert.

BorNot / BxorNot / Cls (not on this PR yet) are not implemented at all.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, OK, for tests like those for BandNot where we don't have an implementation on x86_64, we can just omit the target line in the test file, perhaps with a comment saying "not yet implemented on $PLATFORM".

@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:aarch64 Issues related to AArch64 backend. labels Jun 2, 2021
Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

I'll plan to merge your runtests-consolidation PR (#2964) first, then I think this should be rebased over it, if I'm not mistaken?

@afonso360
Copy link
Contributor Author

afonso360 commented Jun 9, 2021

Yeah, sure, merge that first, and then I'll update this

Currently we just basically use a two instruction version of the same i64 ops.
IMMLogic doesn't really support multiple register inputs, so its left as a TODO for future optimizations.
@afonso360 afonso360 force-pushed the aarch64-i128-bit-ops branch 3 times, most recently from 8383d71 to 988ddfd Compare June 9, 2021 21:47
@cfallin cfallin merged commit caa85c2 into bytecodealliance:main Jun 9, 2021
@afonso360 afonso360 deleted the aarch64-i128-bit-ops branch June 10, 2021 13:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cranelift:area:aarch64 Issues related to AArch64 backend. cranelift Issues related to the Cranelift code generator
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants