Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an intrinsic for ptr::from_raw_parts(_mut) #123840

Merged
merged 11 commits into from Apr 21, 2024

Conversation

scottmcm
Copy link
Member

Fixes #123174
cc @CAD97 @saethlin
r? @cjgillot

As suggested in #123190 (comment), this adds a new AggregateKind::RawPtr for creating a pointer from its data pointer and its metadata.

That means that slice::from_raw_parts and friends no longer need to hard-code pointer layout into libcore, and because it no longer does union hacks the MIR is shorter and more amenable to optimizations.

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Apr 12, 2024
@rustbot
Copy link
Collaborator

rustbot commented Apr 12, 2024

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

This PR changes MIR

cc @oli-obk, @RalfJung, @JakobDegen, @davidtwco, @celinval, @vakaras

Some changes occurred to the CTFE / Miri engine

cc @rust-lang/miri

This PR changes Stable MIR

cc @oli-obk, @celinval, @ouz-a

_5 = ((*_1).1: usize);
StorageLive(_6);
_6 = _4 as *const () (PtrToPtr);
_7 = *const [u8] from (_6, _5);
Copy link
Member Author

@scottmcm scottmcm Apr 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For comparison, on nightly this line is

        _9 = std::ptr::metadata::PtrComponents::<[u8]> { data_pointer: _7, metadata: _3 };
        _8 = std::ptr::metadata::PtrRepr::<[u8]> { const_ptr: move _9 };
        _6 = (_8.0: *const [u8]);

if you put the threshold high enough to see it https://rust.godbolt.org/z/baT3bqEze

// SAFETY: Accessing the value from the `PtrRepr` union is safe since *const T
// and PtrComponents<T> have the same memory layouts. Only std can make this
// guarantee.
unsafe { PtrRepr { components: PtrComponents { data_pointer, metadata } }.const_ptr }
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly another PR to add an intrinsic for https://doc.rust-lang.org/nightly/std/ptr/fn.metadata.html will be needed before this can delete PtrRepr and PtrComponents. But the scroll bar was long enough already that I didn't want to do that in this change.

@rust-log-analyzer

This comment has been minimized.

@@ -910,6 +911,8 @@ impl<'body, 'tcx> VnState<'body, 'tcx> {
}
// Do not track unions.
AggregateKind::Adt(_, _, _, _, Some(_)) => return None,
// FIXME: Do the extra work to GVN `from_raw_parts`
AggregateKind::RawPtr(..) => return None,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started down the path of an AggregateTy::RawPtr(ty), then decided that since GVN couldn't track the old code because of the union, I didn't need to shave this yak too right now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

@rustbot
Copy link
Collaborator

rustbot commented Apr 12, 2024

Some changes occurred in compiler/rustc_codegen_cranelift

cc @bjorn3

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@scottmcm
Copy link
Member Author

Pulling something out of a review comment to provide some context:

and RawPtr also works for thin pointers it seems

Yeah, since ptr::from_raw_parts might not be inlined it needed to handle the unit-metadata cases too -- or at least I thought that was better than trying to do it with specialization in libcore.

InstSimplify will replace it with just a PtrToPtr cast if it does inline enough to be able to see that the metadata is (), though.

@rust-log-analyzer

This comment has been minimized.

Copy link
Contributor

@cjgillot cjgillot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for investigating this @scottmcm!

// raw pointer rvalues are checked in the MIR validator,
// and since they only deal in things that have no lifetimes,
// there's nothing that needs checking here.
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If RawPtr is only created by an intrinsic, this should bug!.

@@ -717,6 +717,24 @@ impl<'a, 'tcx, Bx: BuilderMethods<'a, 'tcx>> FunctionCx<'a, 'tcx, Bx> {
OperandRef { val: OperandValue::Immediate(static_), layout }
}
mir::Rvalue::Use(ref operand) => self.codegen_operand(bx, operand),
mir::Rvalue::Aggregate(box mir::AggregateKind::RawPtr(..), ref fields) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: should this appear in an earlier commit?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first few commits here aren't split up super great, just checkpoints where it built and the tests pass. The tests "passing" for them wasn't terribly meaningful, though, since without an intrinsic for it and without the lowering using it, then the code wasn't actually ever running.

LMK if you're fine leaving it as-is or would rather I just merge the first few.

@@ -36,6 +36,7 @@ impl<'tcx> MirPass<'tcx> for InstSimplify {
ctx.simplify_bool_cmp(&statement.source_info, rvalue);
ctx.simplify_ref_deref(&statement.source_info, rvalue);
ctx.simplify_len(&statement.source_info, rvalue);
ctx.simplify_ptr_aggregate(&statement.source_info, rvalue);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could the new opt be in a standalone commit, to see its effect on tests?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/// *Both* operands must always be included, even the unit value if this is
/// creating a thin pointer. If you're just converting between thin pointers,
/// you may want an [`Rvalue::Cast`] with [`CastKind::PtrToPtr`] instead.
RawPtr(Ty<'tcx>, Mutability),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In terms of generality, should this support only raw pointers, or borrows too? Or to be left as a follow-up experiment?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now I think that Rvalue::Ref is the only thing that makes references (well, I guess Transmutes too), so I'd rather leave it on just raw pointers, for now, with the &*_2 dance for turning it back into a reference. Especially since I don't want to worry about making sure that whatever SB/TB tagging is needed is emitting in Miri for things.

It's also nice in the library for the intrinsic to be able to be safe, which it can't if it makes references. That could be done with separate intrinsics so the raw pointer one can still be safe, but that's again making the size of the change bigger again, so I think not doing it in this PR, at least, is best.

@@ -910,6 +911,8 @@ impl<'body, 'tcx> VnState<'body, 'tcx> {
}
// Do not track unions.
AggregateKind::Adt(_, _, _, _, Some(_)) => return None,
// FIXME: Do the extra work to GVN `from_raw_parts`
AggregateKind::RawPtr(..) => return None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

@scottmcm scottmcm force-pushed the aggregate-kind-rawptr branch 3 times, most recently from bf66fec to 1712f5c Compare April 14, 2024 19:48
@scottmcm
Copy link
Member Author

StorageDead(_3);
StorageLive(_5);
_5 = _4 as *const () (PtrToPtr);
_0 = _4 as *const u32 (PtrToPtr);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a serendipitous bonus, this PR means that using byte_add on a thin pointer is no longer worse in MIR than casting to *const u8 and offsetting yourself, as with unit metadata there's no longer any union dances needed, just the straight-forward casting.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, even better -- with #123949 it also drops the irrelevant cast to *const ().

#[rustc_intrinsic_must_be_overridden]
#[cfg(not(bootstrap))]
pub const fn aggregate_raw_ptr<P: AggregateRawPtr<D, Metadata = M>, D, M>(_data: D, _meta: M) -> P {
// No fallback because `libcore` doesn't want to know the layout
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// No fallback because `libcore` doesn't want to know the layout
// To implement a fallback we'd have to assume the layout, but the whole point of this intrinsic is to not do that.

We do want to know the layout, we just can't :p

@@ -147,6 +157,30 @@ impl<'tcx> InstSimplifyContext<'tcx, '_> {
}
}

/// Transform "Aggregate(RawPtr, \[p, ()\])" ==> "Cast(PtrToPtr, p)".
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This escaping is really unfortunate. Is it possible to use backticks for code formatting instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I was copying the other two method headers that also used ", so I've updated all three of them to backticks.

@scottmcm
Copy link
Member Author

@bors r=cjgillot

@bors
Copy link
Contributor

bors commented Apr 21, 2024

📌 Commit 5e785b1 has been approved by cjgillot

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 21, 2024
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Apr 21, 2024
…cjgillot

Add an intrinsic for `ptr::from_raw_parts(_mut)`

Fixes rust-lang#123174
cc `@CAD97` `@saethlin`
r? `@cjgillot`

As suggested in rust-lang#123190 (comment), this adds a new `AggregateKind::RawPtr` for creating a pointer from its data pointer and its metadata.

That means that `slice::from_raw_parts` and friends no longer need to hard-code pointer layout into `libcore`, and because it no longer does union hacks the MIR is shorter and more amenable to optimizations.
@matthiaskrgr
Copy link
Member

@bors r-
#124237 (comment)
proly fixes the ice? could you add a test?

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Apr 21, 2024
@scottmcm
Copy link
Member Author

Huh, curious that it still passes CI. I'll take a look.

@matthiaskrgr
Copy link
Member

ah must have been a different PR, the rollup without this one also failed, but the passing test was something with RefTarget which threw me off 😅

@bors r=cjgillot

@bors
Copy link
Contributor

bors commented Apr 21, 2024

📌 Commit 5e785b1 has been approved by cjgillot

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Apr 21, 2024
bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 21, 2024
…iaskrgr

Rollup of 5 pull requests

Successful merges:

 - rust-lang#123840 (Add an intrinsic for `ptr::from_raw_parts(_mut)`)
 - rust-lang#124224 (cleanup: unnecessary clone during lower generics args)
 - rust-lang#124229 (Add gnullvm targets to manifest)
 - rust-lang#124231 (remove from reviewers)
 - rust-lang#124235 (Move some tests)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 67872e7 into rust-lang:master Apr 21, 2024
12 checks passed
@rustbot rustbot added this to the 1.79.0 milestone Apr 21, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Apr 21, 2024
Rollup merge of rust-lang#123840 - scottmcm:aggregate-kind-rawptr, r=cjgillot

Add an intrinsic for `ptr::from_raw_parts(_mut)`

Fixes rust-lang#123174
cc `@CAD97` `@saethlin`
r? `@cjgillot`

As suggested in rust-lang#123190 (comment), this adds a new `AggregateKind::RawPtr` for creating a pointer from its data pointer and its metadata.

That means that `slice::from_raw_parts` and friends no longer need to hard-code pointer layout into `libcore`, and because it no longer does union hacks the MIR is shorter and more amenable to optimizations.
@scottmcm scottmcm deleted the aggregate-kind-rawptr branch April 21, 2024 23:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

UB Check blocks MIR inlining of Vec::deref
9 participants