<__msvc_int128.hpp>: use __umulh on ARM64/ARM64EC (#6184) #6281

Open
Adesh4477 wants to merge 1 commit into microsoft:main from Adesh4477:feat/int128-arm64-umulh

Conversation

Adesh4477 commented May 11, 2026

On ARM64, _UMul128 falls through to the Knuth fallback because _STL_128_INTRINSICS is x64-only. ARM64 has umulh as a single instruction, so the 128-bit product can be computed in two ops
instead of ~thirty. This patch adds an #elif branch that uses __umulh for the high half and a regular 64-bit * for the low half.
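
For readers who want to see the two approaches side by side, here is a minimal standalone sketch. It is not the code from the patch: the names umul128_sketch, umul128_portable, and SKETCH_HAS_UMULH are hypothetical, and the portable branch is only an approximation of a base-2^32 schoolbook multiply. The assumption is that __umulh is available from <intrin.h> when building with MSVC for ARM64 or ARM64EC; everywhere else the sketch uses the portable branch.

```cpp
#include <cstdint>

// Assumption: MSVC targeting ARM64/ARM64EC declares __umulh in <intrin.h>.
#if defined(_MSC_VER) && (defined(_M_ARM64) || defined(_M_ARM64EC))
#include <intrin.h>
#define SKETCH_HAS_UMULH 1
#else
#define SKETCH_HAS_UMULH 0
#endif

// Hypothetical portable fallback: schoolbook multiply on 32-bit halves.
// Returns the low 64 bits of a * b and stores the high 64 bits in `high`.
constexpr std::uint64_t umul128_portable(std::uint64_t a, std::uint64_t b, std::uint64_t& high) noexcept {
    const std::uint64_t a_lo = a & 0xFFFFFFFFu;
    const std::uint64_t a_hi = a >> 32;
    const std::uint64_t b_lo = b & 0xFFFFFFFFu;
    const std::uint64_t b_hi = b >> 32;

    const std::uint64_t p0 = a_lo * b_lo; // contributes at bit 0
    const std::uint64_t p1 = a_lo * b_hi; // contributes at bit 32
    const std::uint64_t p2 = a_hi * b_lo; // contributes at bit 32
    const std::uint64_t p3 = a_hi * b_hi; // contributes at bit 64

    // Middle column: at most 3 * (2^32 - 1), so it cannot overflow 64 bits.
    const std::uint64_t mid = (p0 >> 32) + (p1 & 0xFFFFFFFFu) + (p2 & 0xFFFFFFFFu);

    high = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
    return a * b; // unsigned wraparound yields the low 64 bits
}

// Hypothetical wrapper mirroring the shape of the change described above.
inline std::uint64_t umul128_sketch(std::uint64_t a, std::uint64_t b, std::uint64_t& high) noexcept {
#if SKETCH_HAS_UMULH
    high = __umulh(a, b); // high 64 bits of the 128-bit product
    return a * b;         // low 64 bits via a plain 64-bit multiply
#else
    return umul128_portable(a, b, high);
#endif
}
```

Both branches should produce the same high/low pair for any inputs, which is what makes the intrinsic path a drop-in replacement at run time.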

Tested locally with a microbenchmark on a Snapdragon X Elite (5M random uint64 pairs * 5 reps):
Knuth fallback : ~82 ms (~3.27 ns/op)
__umulh path : ~27 ms (~1.08 ns/op)
Speedup : ~3.03x

Per the issue author, x64 is intentionally not modified -- _umul128 remains preferable there.

Fixes #6184

Copilot AI review requested due to automatic review settings May 11, 2026 10:41
Adesh4477 requested a review from a team as a code owner May 11, 2026
github-project-automation (bot) moved this to Initial Review in STL Code Reviews May 11, 2026

Copilot AI left a comment

Pull request overview

Adds an ARM64/ARM64EC runtime fast-path for _Base128::_UMul128() in <__msvc_int128.hpp> by computing the high 64 bits via __umulh() and the low 64 bits via a normal 64-bit multiply, avoiding the existing Knuth base-2^32 fallback in non-constant-evaluated code.

Changes:

  • Add an ARM64/ARM64EC __umulh-based implementation for the high half of the 128-bit product.
  • Keep the existing constexpr/Knuth fallback for constant evaluation and other targets.
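
The split between constant evaluation and run time summarized above can be sketched as follows. This is an illustration under stated assumptions, not the header's actual code: mulhi_sketch, mulhi_constexpr, and SKETCH_USE_UMULH are hypothetical names, and the sketch uses the compiler builtin __builtin_is_constant_evaluated() (available in MSVC, GCC, and Clang) where the real header may use a different mechanism. Only the high half is shown, since the low half is a plain 64-bit multiply on every path.

```cpp
#include <cstdint>

// Assumption: MSVC targeting ARM64/ARM64EC declares __umulh in <intrin.h>.
#if defined(_MSC_VER) && (defined(_M_ARM64) || defined(_M_ARM64EC))
#include <intrin.h>
#define SKETCH_USE_UMULH 1
#else
#define SKETCH_USE_UMULH 0
#endif

// Hypothetical constexpr-friendly high-half multiply using 32-bit halves.
constexpr std::uint64_t mulhi_constexpr(std::uint64_t a, std::uint64_t b) noexcept {
    const std::uint64_t a_lo = a & 0xFFFFFFFFu;
    const std::uint64_t a_hi = a >> 32;
    const std::uint64_t b_lo = b & 0xFFFFFFFFu;
    const std::uint64_t b_hi = b >> 32;

    const std::uint64_t p1 = a_lo * b_hi;
    const std::uint64_t p2 = a_hi * b_lo;

    // Carry out of the middle 32-bit column into the high 64 bits.
    const std::uint64_t carry = (((a_lo * b_lo) >> 32) + (p1 & 0xFFFFFFFFu) + (p2 & 0xFFFFFFFFu)) >> 32;

    return a_hi * b_hi + (p1 >> 32) + (p2 >> 32) + carry;
}

// Hypothetical dispatcher: the intrinsic is used only at run time, because
// intrinsics are not usable in constant expressions.
constexpr std::uint64_t mulhi_sketch(std::uint64_t a, std::uint64_t b) noexcept {
#if SKETCH_USE_UMULH
    if (!__builtin_is_constant_evaluated()) {
        return __umulh(a, b);
    }
#endif
    return mulhi_constexpr(a, b);
}

// Constant evaluation always takes the portable path:
// (2^64 - 1) * 2 = 2^65 - 2, whose high 64 bits are 1.
static_assert(mulhi_sketch(0xFFFFFFFFFFFFFFFFull, 2) == 1, "high half of (2^64 - 1) * 2");
```

The design point is that a single constexpr function can serve both contexts, so callers do not need to know which path was taken.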

Comment thread stl/inc/__msvc_int128.hpp
AlexGuteniev (Contributor) left a comment

You'd also need to use power words in the PR description to link the PR to the issue.
Please edit the description to include "Fixes #6184", "Resolves #6184", or an equivalent form.

Comment thread stl/inc/__msvc_int128.hpp Outdated
Comment thread stl/inc/__msvc_int128.hpp
StephanTLavavej added the labels performance (Must go faster), ARM64 (Related to the ARM64 architecture), and ARM64EC (I can't believe it's not x64!) May 11, 2026
Adds an ARM64/ARM64EC fast path to _Base128::_UMul128 that uses the
__umulh intrinsic for the high 64 bits and a plain 64-bit multiply for
the low 64 bits, in place of the Knuth-base-2^32 fallback.

Microbench on Snapdragon X Elite (5M random uint64 pairs * 5 reps):
  Knuth fallback : ~82 ms (~3.27 ns/op)
  __umulh path   : ~27 ms (~1.08 ns/op)
  Speedup        : ~3.03x

Disassembly collapses from ~30 ops (incl. /GS cookie push) to 4 ops
(umulh / mul / str / ret).

_STL_128_INTRINSICS is intentionally not enabled for ARM64; that macro
also gates _addcarry_u64, _subborrow_u64, __shiftleft128, __shiftright128,
and _udiv128/_div128, which have no direct single-instruction ARM64
equivalents and are out of scope for this change.

Per the issue author, x64 is intentionally not modified -- _umul128 remains
preferable there.
Adesh4477 force-pushed the feat/int128-arm64-umulh branch from d4723b9 to 00f4ead May 12, 2026 03:41
Labels

  • ARM64EC: I can't believe it's not x64!
  • ARM64: Related to the ARM64 architecture
  • performance: Must go faster

Projects

Status: Initial Review

Development

Successfully merging this pull request may close these issues.

<__msvc_int128.hpp>: investigate using __mulh and __umulh on ARM64 and ARM64EC

4 participants