sayantn (Contributor) commented Oct 4, 2025

This PR adds an alignment parameter to the SIMD intrinsics simd_masked_load and simd_masked_store. The parameter is the alignment, in bytes, of the ptr parameter. It generalizes the previous signature so that under-aligned and over-aligned pointers can be expressed.

The main motivation is stdarch: most vector loads there are either fully aligned (to the vector size) or unaligned (byte-aligned), and the previous signature cannot express either case.

I introduced a const parameter instead of a const-generic parameter because portable-simd uses pointers aligned to the element type, so it needs to pass align_of::<T>() as the alignment. That is not possible with const-generic parameters without generic_const_exprs (GCE).
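
To illustrate the constraint, here is a minimal sketch that uses ordinary functions as stand-ins for the intrinsic. The names load_with_const_generic, load_with_value_param and element_aligned_load are hypothetical and are not the real intrinsic signatures. The sketch only shows that align_of::<T>() can be passed as a value argument from generic code, while using it as a const-generic argument is rejected on stable Rust.

```rust
use std::mem::align_of;

// Hypothetical stand-in: alignment supplied as a const-generic parameter.
fn load_with_const_generic<const ALIGN: usize>() -> usize {
    ALIGN
}

// Hypothetical stand-in: alignment supplied as an ordinary value argument.
fn load_with_value_param(align: usize) -> usize {
    align
}

// A generic caller, similar in shape to what portable-simd would need.
// Passing align_of::<T>() as a value works for any T.
fn element_aligned_load<T>() -> usize {
    load_with_value_param(align_of::<T>())
}

// The const-generic equivalent does not compile without generic_const_exprs:
//
// fn element_aligned_load_generic<T>() -> usize {
//     load_with_const_generic::<{ align_of::<T>() }>()
//     // error: generic parameters may not be used in const operations
// }

fn main() {
    // A literal works fine as a const-generic argument.
    assert_eq!(load_with_const_generic::<16>(), 16);
    // A type-dependent alignment has to go through the value parameter.
    assert_eq!(element_aligned_load::<u8>(), 1);
    assert_eq!(element_aligned_load::<u64>(), align_of::<u64>());
}
```

A literal alignment works in either form; only the type-dependent alignment forces the value-parameter form.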

Alternatives

Using a const-generic parameter, with 0 having the special meaning of "use the element type's alignment". This would cover the common case of element-type alignment and still offer enough flexibility for stdarch.
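
A rough sketch of how this alternative could resolve the alignment, assuming a hypothetical helper resolve_align (not part of this PR or of any existing API). Because the const-generic argument is always a literal here, generic_const_exprs is not needed:

```rust
use std::mem::align_of;

// Hypothetical helper: ALIGN == 0 is the sentinel for "element type's
// alignment"; any other value is taken as the explicit byte alignment.
fn resolve_align<T, const ALIGN: usize>() -> usize {
    if ALIGN == 0 {
        align_of::<T>()
    } else {
        ALIGN
    }
}

fn main() {
    // Sentinel: falls back to the element alignment of the target.
    assert_eq!(resolve_align::<u8, 0>(), 1);
    assert_eq!(resolve_align::<i32, 0>(), align_of::<i32>());
    // Explicit over-aligned and unaligned (byte-aligned) cases, as in stdarch.
    assert_eq!(resolve_align::<i32, 16>(), 16);
    assert_eq!(resolve_align::<i32, 1>(), 1);
}
```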

cc @workingjubilee @RalfJung @BoxyUwU

rustbot (Collaborator) commented Oct 4, 2025

The Miri subtree was changed

cc @rust-lang/miri

Portable SIMD is developed in its own repository. If possible, consider making this change to rust-lang/portable-simd instead.

cc @calebzulawski, @programmerjake

Some changes occurred to the platform-builtins intrinsics. Make sure the
LLVM backend as well as portable-simd gets adapted for the changes.

cc @antoyo, @GuillaumeGomez, @bjorn3, @calebzulawski, @programmerjake

Some changes occurred in compiler/rustc_codegen_ssa

cc @WaffleLapkin

Some changes occurred to the intrinsics. Make sure the CTFE / Miri interpreter
gets adapted for the changes, if necessary.

cc @rust-lang/miri, @RalfJung, @oli-obk, @lcnr

rustbot added labels on Oct 4, 2025: A-LLVM (Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.); S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties.); T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.); T-libs (Relevant to the library team, which will review and decide on the PR/issue.)
rustbot (Collaborator) commented Oct 4, 2025

r? @lcnr

rustbot has assigned @lcnr.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

  let default = i32x4::splat(0);
  let mask = i32x4::from_array([!0, !0, !0, 0]);
- let vals = unsafe { intrinsics::simd_masked_load(mask, buf.as_ptr(), default) };
+ let vals = unsafe { intrinsics::simd_masked_load(mask, buf.as_ptr(), default, 4) };
Member

i32 doesn't always have alignment 4, so this should use align_of::<i32>()

Contributor Author

Oh! Didn't know that, is that true for all primitive types too?

Member

Basically all primitive types, though 1-byte types must have alignment 1 due to Rust's rule that size is always a multiple of alignment.

Member

e.g. on avr-none, integers/floats have alignment 1, and on msp430-none-elf most types have alignment 2.
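
A small sketch of the point made in this thread: query the alignment with align_of rather than hard-coding 4. The helper element_align is a hypothetical name used only for illustration. The assertions check the two properties mentioned above (size is a multiple of alignment, and 1-byte types have alignment 1) and deliberately make no claim about the value of align_of::<i32>(), since it is target dependent.

```rust
use std::mem::{align_of, size_of};

// Hypothetical helper: the alignment to pass for a pointer to elements of
// type T, queried from the target instead of being hard-coded.
fn element_align<T>() -> u32 {
    align_of::<T>() as u32
}

fn main() {
    let align = element_align::<i32>();

    // Alignments are always powers of two.
    assert!(align.is_power_of_two());
    // Size is always a multiple of alignment.
    assert_eq!(size_of::<i32>() % align_of::<i32>(), 0);
    // Consequently, a 1-byte type can only have alignment 1.
    assert_eq!(element_align::<u8>(), 1);

    // The printed value depends on the target the program is built for.
    println!("align_of::<i32>() on this target = {align}");
}
```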

// The fourth argument is the alignment, must be a power of two integer constant
let alignment = bx
.const_to_opt_u128(args[3].immediate(), false)
.expect("typeck should have ensure that this is a const");
Member

Maybe I'm missing it because I'm not familiar with typeck's handling of intrinsics, but I do not see where you actually add a typeck check that this is actually a constant. It looks like you're just telling typeck that it's a parameter of u32.

Member

Yeah the expect message here is wrong.

For simd_insert/extract/shuffle, we have some ad-hoc checks in typeck that ensure this. But for intrinsics it's also fine to just ICE when they are used wrong. Rust has no concept of const arguments so this is a bad hack anyway.
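
For reference, the requirement stated in the diff comment (a power-of-two integer constant) can be sketched as a standalone check. This is not the compiler's code: validate_alignment is a hypothetical helper, and its Option<u128> input only mimics the shape of what the diff obtains from const_to_opt_u128. It reports errors as values, whereas the codegen path in this PR would ICE instead.

```rust
use std::convert::TryFrom;

// Hypothetical check of a raw alignment operand.
// `None` models "the operand was not a compile-time constant".
fn validate_alignment(raw: Option<u128>) -> Result<u32, String> {
    let value = raw.ok_or_else(|| String::from("alignment must be a constant"))?;
    let align = u32::try_from(value)
        .map_err(|_| format!("alignment {value} does not fit in u32"))?;
    // is_power_of_two() is false for 0, so 0 is rejected as well.
    if align.is_power_of_two() {
        Ok(align)
    } else {
        Err(format!("alignment {align} is not a power of two"))
    }
}

fn main() {
    assert_eq!(validate_alignment(Some(8)), Ok(8));
    assert!(validate_alignment(Some(3)).is_err());
    assert!(validate_alignment(Some(0)).is_err());
    assert!(validate_alignment(None).is_err());
}
```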

- // CHECK: call void @llvm.masked.store.v4p0.p0(<4 x ptr> {{.*}}, ptr {{.*}}, i32 {{.*}}, <4 x i1> [[B]])
- simd_masked_store(mask, pointer, values)
+ // CHECK: call void @llvm.masked.store.v4p0.p0(<4 x ptr> {{.*}}, ptr {{.*}}, i32 8, <4 x i1> [[B]])
+ simd_masked_store(mask, pointer, values, 8)
Member

this looks like it assumes the alignment of a pointer is 8, but it should be fine even if that alignment is wrong.

Contributor Author

I am not assuming anything here; the test only checks that the alignment is propagated to LLVM.

RalfJung (Member) commented Oct 5, 2025

If we only need normally-aligned and unaligned loads, IMO it'd be better to just have a const generic boolean indicating which of them we want for any particular operation. That avoids ad-hoc hacks such as const parameters in intrinsics.
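
A minimal sketch of what such a boolean selector could look like, under the assumption that "normally-aligned" means aligned to the element type (the comment does not spell this out). The function pointer_align is a hypothetical name, not an existing or proposed API:

```rust
use std::mem::align_of;

// Hypothetical selector: ALIGNED = true means the pointer has the element
// type's alignment (an assumption about "normally-aligned"); ALIGNED = false
// means the pointer is only byte-aligned.
fn pointer_align<T, const ALIGNED: bool>() -> usize {
    if ALIGNED {
        align_of::<T>()
    } else {
        1
    }
}

fn main() {
    assert_eq!(pointer_align::<i32, true>(), align_of::<i32>());
    assert_eq!(pointer_align::<i32, false>(), 1);
    assert_eq!(pointer_align::<u8, true>(), 1);
}
```

Note that a two-state selector like this cannot express the vector-size alignment that the PR description mentions for stdarch.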


// The fourth argument is the alignment, must be a power of two integer constant
let alignment = bx
.const_to_opt_u128(args[3].immediate(), false)
Member

Why not use a const generic? That is a lot easier to implement in some other codegen backends.

programmerjake (Member)

> If we only need normally-aligned and unaligned loads, IMO it'd be better to just have a const generic boolean indicating which of them we want for any particular operation. That avoids ad-hoc hacks such as const parameters in intrinsics.

For portable-simd, I think we should default to element-level alignment, since I expect that to be more efficient than unaligned ops on some targets (GPUs? maybe RISC-V V?).

7 participants