Join GitHub today
GitHub is home to over 31 million developers working together to host and review code, manage projects, and build software together.
Sign upwidening_mul #2417
Conversation
strake
added some commits
Apr 24, 2018
This comment has been minimized.
This comment has been minimized.
le-jzr
commented
Apr 24, 2018
|
I think we could come up with a better name for the method, but otherwise this is something I'd very much like to see added. Given that on some architectures (x86 to name one), multiplication always produces result twice the width of the operands, it seems rather inefficient to not provide a way to use the higher half. |
This comment has been minimized.
This comment has been minimized.
|
@le-jzr We can also choose an efficient-as-possible implementation on architectures where it doesn't and save users the bother. Some name ideas:
|
This comment has been minimized.
This comment has been minimized.
|
For reference, here is the generic definition from my u2size crate: #[inline(always)]
fn carrying_mul(x: usize, y: usize) -> (usize, usize) {
let n = 0usize.count_zeros() >> 1;
let halves = |x| (x >> n, x & (!0 >> n));
let ((x1, x0), (y1, y0)) = (halves(x), halves(y));
let (z2, z0) = (x1 * y1, x0 * y0);
let z1 = (x1 + x0) * (y1 + y0) - z2 - z0;
let (w, c) = usize::overflowing_add(z0, z1 << n);
(z2 + z1 >> n + c as u32, w)
}The emitted machine code for this is quite heinous. |
This comment has been minimized.
This comment has been minimized.
|
Having never heard of this operation before, but immediately understanding why it would be useful, I think the name needs to have some form of the word "wide" in it.
|
This comment has been minimized.
This comment has been minimized.
le-jzr
commented
Apr 24, 2018
|
After a bit of though, I see the logic behind |
This comment has been minimized.
This comment has been minimized.
tspiteri
commented
Apr 24, 2018
•
|
@strake I know it's not really the point, but couldn't you do something like this? #[inline(always)]
fn carrying_mul(x: usize, y: usize) -> (usize, usize) {
let usize_bits = 0usize.count_zeros();
if usize_bits <= 32 {
let z = (x as u64) * (y as u64);
((z >> usize_bits) as usize, z as usize)
} else if usize_bits <= 64 {
let z = (x as u128) * (y as u128);
((z >> usize_bits) as usize, z as usize)
} else {
// fallback
let n = usize_bits >> 1;
let halves = |x| (x >> n, x & (!0 >> n));
let ((x1, x0), (y1, y0)) = (halves(x), halves(y));
let (z2, z0) = (x1 * y1, x0 * y0);
let z1 = (x1 + x0) * (y1 + y0) - z2 - z0;
let (w, c) = usize::overflowing_add(z0, z1 << n);
(z2 + z1 >> n + c as u32, w)
}
} |
This comment has been minimized.
This comment has been minimized.
devonhollowood
commented
Apr 25, 2018
|
Maybe instead of returning a tuple, it would make sense to return a struct? So e.g. |
This comment has been minimized.
This comment has been minimized.
|
I have only seen this called "widening multiply", but mostly in low-level (compiler/ISA) contexts. |
This comment has been minimized.
This comment has been minimized.
|
I do wonder why this is a |
This comment has been minimized.
This comment has been minimized.
|
The broader rust project already has this in compiler-builtins (https://github.com/rust-lang-nursery/compiler-builtins/blob/master/src/int/mul.rs#L7), so I'd personally be happy to have it in core instead. A quick playground experiment shows the ASM is simpler for returning But I wonder if this really wants to be ternary instead, like fn mul_with_carry(self, multiplier: Self, carry: Self) -> (Self, Self) |
This comment has been minimized.
This comment has been minimized.
|
Despite what the drawbacks section says, `a as u128 * b as u128` results in
ideal code for widening u64 multiplication (with or without optimisations,
provided overflow checks are disabled.) Very similar results are obtained
if result is split into two parts afterwards.
|
This comment has been minimized.
This comment has been minimized.
tspiteri
commented
Apr 25, 2018
|
Regarding output tuple order, I think it could be made more obvious using the method name. For example |
This comment has been minimized.
This comment has been minimized.
|
To me this operation falls in the same category as leading/trailing zeros, rotates, bit reverse, saturation, etc. in that it
For these operations I'm in favor of providing them in the Rust standard library, both to offer a canonical way to write out these important primitives and to make it easier for the compiler to generate good code reliably (i.e., no need to pattern match a particular way to implement the operation). |
joshtriplett
added
the
T-libs
label
Apr 25, 2018
This comment has been minimized.
This comment has been minimized.
|
@joshtriplett the @nagisa yes, it does for me too, but i as a crate author worry whether it will remain so on other targets — will it generate such code on a 32-bit machine? a 128-bit machine? is Rust willing to preclude ever supporting 128-bit targets (e.g. RV128I)? I would prefer to have a method which means efficient double-wide multiplication rather than rely on the compiler to figure out what i mean and myself to not mess it up every time. @rkruppe thanks for verbalizing these points — yes, the notion is to have a canonical form so neither the compiler nor a future reader needs to guess what is going on.
|
This comment has been minimized.
This comment has been minimized.
|
Personally I think that the return type should be the next size up of an integer type. I wouldn't mind creating |
This comment has been minimized.
This comment has been minimized.
|
@clarcharr I'd love to have |
This comment has been minimized.
This comment has been minimized.
JonSeverinsson
commented
Apr 26, 2018
|
As the order of the high and low half of the result tuple is not obvious suggests to me that a tuple is probably not the best return type. The obvious solution would be a struct with two fields named "high" and "low", but I'm wondering if simply returning a wider integer type is not simpler (though that means i128/u128 will unfortunately lack the method). I.e. add the following functions (and the equivalent for unsigned integers) to the standard library.
|
This comment has been minimized.
This comment has been minimized.
|
@JonSeverinsson Not sure there is much of a point in doing that because you can simply cast to the wider type and perform the multiplication. |
This comment has been minimized.
This comment has been minimized.
le-jzr
commented
Apr 26, 2018
|
Returning twice as wide type instead of a tuple has the issue that it wouldn't work on |
This comment has been minimized.
This comment has been minimized.
|
I'd be fine with returning either a tuple or a struct. As @le-jzr noted, returning twice-as-wide type is impossible for |
This comment has been minimized.
This comment has been minimized.
|
Does it make sense to define fn widening_mul(self, other: isize) -> (isize, usize); |
This comment has been minimized.
This comment has been minimized.
|
@kennytm absolutely imho. |
This comment has been minimized.
This comment has been minimized.
I took a stab at the simplest version of that using the proposed method, and I think it ends up being something like this: fn scalar_mul_assign(big: &mut Vec<u8>, scalar: u8) {
let mut carry = 0;
for i in big.iter_mut() {
let (low, high) = i.wide_mul(scalar);
let (low, overflowed) = low.overflowing_add(carry);
*i = low;
carry = high + if overflowed { 1 } else { 0 };
}
if carry > 0 {
big.push(carry);
}
}That's not great, and I doubt LLVM will do a great job with it. So I think it'd be good to have the carry built-in, so it can be something like pub fn scalar_mul_assign2(big: &mut Vec<u8>, scalar: u8) {
let mut carry = 0;
for i in big.iter_mut() {
let (x, c) = i.mul_with_carry(scalar, carry);
*i = x;
carry = c;
}
if carry > 0 {
big.push(carry);
}
}Since one can always pass 0 as the carry, which LLVM will fold away easily. |
This comment has been minimized.
This comment has been minimized.
Pratyush
commented
May 6, 2018
|
+1, would make writing certain cryptographic code very simple |
This comment has been minimized.
This comment has been minimized.
|
One interesting use case for this might be a more efficient version of the
I mean, |
This comment has been minimized.
This comment has been minimized.
|
ping have we reached consensus on the name |
Centril
added
A-arithmetic
A-inherent-impl
A-primitive
labels
Nov 22, 2018
This comment has been minimized.
This comment has been minimized.
sm-Fifteen
commented
Nov 30, 2018
|
I'd like to point out that the apfloat component of the official librustc has had a |
This comment has been minimized.
This comment has been minimized.
TheIronBorn
commented
Nov 30, 2018
|
The |
This comment has been minimized.
This comment has been minimized.
|
@sm-Fifteen It is no coincidence I wrote that function as a non-allocating bigint multiplication helper when I ported APFloat from C++ and #2417 (comment) - so don't count my "vote" for |
strake
changed the title
carrying_mul
widening_mul
Dec 5, 2018
strake
force-pushed the
strake:carrying_mul
branch
from
c420699
to
329e58b
Dec 5, 2018
This comment has been minimized.
This comment has been minimized.
|
Consensus for name seems to be |
This comment has been minimized.
This comment has been minimized.
fstirlitz
commented
Jan 19, 2019
|
What's the rationale for having |
This comment has been minimized.
This comment has been minimized.
|
@fstirlitz One use case would be intermediate results that may overflow, e.g. rationals. While the end result of x*y/z might fit in a |
This comment has been minimized.
This comment has been minimized.
|
@fstirlitz I had 2 motivating use cases for
|
This comment has been minimized.
This comment has been minimized.
lu-zero
commented
Jan 29, 2019
|
Would be good to have the variants that return the wider type directly, since it has direct uses for a number of common dsp operations. |
This comment has been minimized.
This comment has been minimized.
|
@lu-zero |
strake commentedApr 24, 2018
•
edited
View