[codegen] legalize imul for 64-bit and 128-bit operands #1109
Conversation
@bjorn3 this is exactly what I needed! Thanks.
On x86 you can directly use x86_umulx/x86_smulx to get both the low and high bits at once.
Is it just me, or does it need to be more complex? compiler-builtins uses 18 lines to implement this: https://github.com/rust-lang/compiler-builtins/blob/462b73c1fe1f67a62223a3ccf830f02a2571c016/src/int/mul.rs#L7-L26
// TODO(ryzokuken): explore the perf diff w/ x86_umulx and consider having a
// separate legalization for x86.
narrow.legalize(
    def!(a = imul.I64(x, y)),
Could you please also implement this and further i64 -> i32 narrowing legalizations for i128?
I could legalize `imul.ty` in terms of `imul.ty_half`, which is good, but I guess I'll need to legalize `umulhi` for I64 for that too?
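A minimal Rust sketch of the narrowing discussed here, assuming the usual schoolbook decomposition: the low half of the result is a half-width multiply, and the high half combines a `umulhi`-style high product with the two cross products. The function names `umulhi32` and `mul64_via_halves` are made up for illustration and are not Cranelift APIs; the sketch only shows why a half-width `umulhi` is needed.

```rust
/// High 32 bits of the full 64-bit product of two u32 values.
/// Stand-in for a half-width `umulhi` (hypothetical helper, not a Cranelift API).
fn umulhi32(x: u32, y: u32) -> u32 {
    ((x as u64 * y as u64) >> 32) as u32
}

/// Wrapping 64-bit multiply expressed only with 32-bit values.
fn mul64_via_halves(x: u64, y: u64) -> u64 {
    // Split both operands into low and high 32-bit halves.
    let (xl, xh) = (x as u32, (x >> 32) as u32);
    let (yl, yh) = (y as u32, (y >> 32) as u32);

    // Low half of the result: plain half-width multiply.
    let lo = xl.wrapping_mul(yl);

    // High half: carry-out of the low product plus both cross products.
    // The xh * yh term only affects bits 64 and above, so it is dropped.
    let hi = umulhi32(xl, yl)
        .wrapping_add(xl.wrapping_mul(yh))
        .wrapping_add(xh.wrapping_mul(yl));

    // Concatenate the halves back into a 64-bit value.
    ((hi as u64) << 32) | (lo as u64)
}
```

For any inputs, `mul64_via_halves(x, y)` should agree with `x.wrapping_mul(y)`; the same shape applies one level up, with 64-bit halves of a 128-bit value.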
For x86_64 it is already implemented: https://github.com/CraneStation/cranelift/blob/6ed06a1e8e428aedb7adfd825b131abb9a7a3d15/cranelift-codegen/meta/src/isa/x86/encodings.rs#L725
32bit support is not needed for me at least for now.
So `umulhi` is defined for x86_64. I guess it's okay to add the legalization since it's not defined for any other ISA anyway...
Force-pushed from 6f8d8aa to 8be64f4.
Force-pushed from 8be64f4 to ab41f63.
Could you please add a run test with very large 128bit ints, say 0x98fe985354ab06f2347ac4503f1e24 and 0x42e1f3054ca7432f606ba453589ef89? It should return 0xa363ce3b6849f307be2044b2742ebd44. They have to be very large, so that no parts are zero, and 128bit, so that the legalizations actually get used on 64bit archs. Tested using:

    fn main() {
        println!("{:x}", 0x98fe985354ab06f2347ac4503f1e24u128.wrapping_mul(0x42e1f3054ca7432f606ba453589ef89u128));
    }

Note: I just typed some random numbers and letters to generate those numbers.
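As a rough cross-check of what such a run test exercises, the sketch below splits the two suggested operands into 64-bit halves, multiplies them using only 64-bit pieces plus a high-product step, and compares against Rust's native `u128::wrapping_mul`. The helpers `split`, `concat`, `umulhi64`, and `mul128_via_halves` are illustrative names and not Cranelift instructions; `umulhi64` uses a widening `u128` multiply purely as a stand-in for a 64-bit `umulhi`.

```rust
/// The two operands suggested in the review comment.
const A: u128 = 0x98fe985354ab06f2347ac4503f1e24;
const B: u128 = 0x42e1f3054ca7432f606ba453589ef89;

/// Split a 128-bit value into (low, high) 64-bit halves.
fn split(v: u128) -> (u64, u64) {
    (v as u64, (v >> 64) as u64)
}

/// Rebuild a 128-bit value from (low, high) 64-bit halves.
fn concat(lo: u64, hi: u64) -> u128 {
    ((hi as u128) << 64) | (lo as u128)
}

/// High 64 bits of the full product of two u64 values.
/// Assumption: models a 64-bit `umulhi`; the widening multiply cannot overflow.
fn umulhi64(x: u64, y: u64) -> u64 {
    ((x as u128 * y as u128) >> 64) as u64
}

/// Wrapping 128-bit multiply built from 64-bit halves.
fn mul128_via_halves(x: u128, y: u128) -> u128 {
    let (xl, xh) = split(x);
    let (yl, yh) = split(y);
    let lo = xl.wrapping_mul(yl);
    let hi = umulhi64(xl, yl)
        .wrapping_add(xl.wrapping_mul(yh))
        .wrapping_add(xh.wrapping_mul(yl));
    concat(lo, hi)
}
```

Both operands have non-zero low and high halves, so every term of the decomposition contributes, which is the property the comment asks for.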
Force-pushed from ab41f63 to 92cdc8d.
iconst.i64 + iconcat.i128 could be used instead of iconst.i128 for now.
@bjorn3
It will probably work when leaving
Immediate values are limited to 64bit as 128bit would drastically increase memory usage. You could use the non imm variant of
@bjorn3 looks like this will now work as soon as
Force-pushed from 1b18bf7 to 10c4993.
Thanks, looking nice!
Force-pushed from 10c4993 to 71d2f9b.
Add a legalization that legalizes imul.I64 for 32-bit ISAs and imul.I128 for 64-bit (and subsequently 32-bit) ISAs. Refs: bnjbvr/cranelift-x86#4
Force-pushed from 71d2f9b to c9f5ec9.
Add a legalization that legalizes imul.I64 for 32-bit ISAs.
Refs: bnjbvr/cranelift-x86#4
/cc @wingo @caitp
@bnjbvr PTAL.
There's one small problem though: there's no existing way to handle carry output in `imul`. Good news? According to https://www.felixcloutier.com/x86/imul, `imul` does set the appropriate flags on x86-32 at least, so I can just add an alternate `imul_ifcout` instruction, I guess? wdyt?
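To make the proposal concrete, here is a hedged Rust model of what a multiply with a carry/overflow output could return. It assumes the semantics documented for x86 `imul`, where CF and OF are set when the truncated result differs from the full signed product. The name `imul_with_overflow` is hypothetical and does not describe an existing Cranelift instruction or the eventual shape of `imul_ifcout`.

```rust
/// Hypothetical model of a 32-bit signed multiply that also reports whether
/// the full product failed to fit in 32 bits (the condition under which
/// x86 `imul` sets CF and OF).
fn imul_with_overflow(x: i32, y: i32) -> (i32, bool) {
    // Compute the exact product in a wider type; i32 * i32 always fits in i64.
    let wide = x as i64 * y as i64;
    // Truncate to the operand width, as the instruction's result would be.
    let truncated = wide as i32;
    // Overflow is signalled when sign-extending the truncated result
    // does not reproduce the exact product.
    (truncated, truncated as i64 != wide)
}
```

This matches the behavior of Rust's built-in `i32::overflowing_mul`, which can serve as a reference when testing such an instruction.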