Description
Background and motivation
On Arm64, there is already a MultiplyHigh intrinsic that returns only the high part of an integer multiply. On x86/x64, mul and imul calculate the high part and low part at the same time and return the results in 2 registers; however, there is no intrinsic that gives access to the high part of the result.
At present, the only way to get the high part of an integer multiply on x86/x64 is to use Math.BigMul and discard the low part of the result. BigMul is not currently optimized, although there is an open proposal (#58263) for a multiply intrinsic that returns both halves of the result in a tuple.
I would like to propose a cross-platform way of accessing this functionality, as Math.MulHigh.
API Proposal
namespace System;

public static class Math
{
    public static int MulHigh(int left, int right);
    public static uint MulHigh(uint left, uint right);
    public static long MulHigh(long left, long right);
    public static ulong MulHigh(ulong left, ulong right);
}
API Usage
ulong hi = Math.MulHigh(a, b);
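To make the intended semantics concrete, here is a minimal sketch of what the four overloads would return, written against APIs that exist today. The class name MulHighSketch is hypothetical and is not part of the proposal; the 32-bit overloads widen to 64 bits and shift, and the 64-bit overloads fall back to Math.BigMul with the low half discarded. This only illustrates the expected results, it says nothing about the code the JIT would generate.

```csharp
using System;

// Hypothetical helper class, for illustration only; the proposal itself
// would put these methods on System.Math.
public static class MulHighSketch
{
    // Signed 32-bit: widen to 64 bits, multiply, keep the upper 32 bits.
    // The shift on long is arithmetic, so the sign is preserved.
    public static int MulHigh(int left, int right)
        => (int)(((long)left * right) >> 32);

    // Unsigned 32-bit: same idea with a logical shift on ulong.
    public static uint MulHigh(uint left, uint right)
        => (uint)(((ulong)left * right) >> 32);

    // Signed 64-bit: Math.BigMul returns the high half and writes the
    // low half to the out parameter, which is discarded here.
    public static long MulHigh(long left, long right)
        => Math.BigMul(left, right, out long _);

    // Unsigned 64-bit: same as above for ulong.
    public static ulong MulHigh(ulong left, ulong right)
        => Math.BigMul(left, right, out ulong _);
}
```

For example, MulHighSketch.MulHigh(0xFFFFFFFFu, 0xFFFFFFFFu) yields 0xFFFFFFFE, and MulHighSketch.MulHigh(-1, 1) yields -1 because the full signed product is -1 and its upper half is all ones.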
Alternative Designs
Most C compilers recognize a large multiply followed by a shift right as a smaller multiply high. Something like:
int64 hi = (int64)(((int128)int64 * int64) >> 64);
int32 hi = (int32)(((int64)int32 * int32) >> 32);
With the new Int128 type coming to .NET, this could be a possibility.
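As a rough sketch of what that pattern would look like in C#, assuming a runtime where System.Int128 and System.UInt128 are available (.NET 7 or later), the 64-bit cases could be written as below. The class name Int128MulHighSketch is hypothetical. Whether the JIT would recognize this shape and emit a single multiply is exactly the open question of this alternative; the code only shows that the result is the same.

```csharp
using System;

// Hypothetical helper class, for illustration only.
public static class Int128MulHighSketch
{
    // Widen to 128 bits, multiply, then shift the upper 64 bits down.
    // The shift on Int128 is arithmetic, so the sign is preserved.
    public static long MulHigh(long left, long right)
        => (long)(((Int128)left * right) >> 64);

    // Unsigned variant using UInt128 and a logical shift.
    public static ulong MulHigh(ulong left, ulong right)
        => (ulong)(((UInt128)left * right) >> 64);
}
```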
#58263 proposes a tuple-returning multiply that would return both halves of the result. It would be possible to simply ignore the lower half, as long as the JIT could optimize this.
Or Math.BigMul could be optimized to use mul/imul, and the JIT could be made to recognize a discard of the low-part parameter, such that
ulong hi = Math.BigMul(a, b, out _);
would generate optimal code.
Risks
No response