Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove the memcpy-on-equal-ptrs assumption #118265

Merged
merged 1 commit into from Nov 29, 2023
Merged

Conversation

RalfJung
Copy link
Member

One of the libc we support, musl, defines memcpy with restrict pointers. This in fact matches the definition in the C standard. Calling that memcpy with overlapping pointers is clearly UB, who knows what the compiler did when optimizing this memcpy -- it certainly assumed source and destination to be disjoint.

Lucky enough, it does not seem like we actually need this assumption that memcpy(p, p, n) is always allowed. clang and GCC need it since they use memcpy to compile C assignments, but we use memmove for similar code. There are no known cases where LLVM introduces calls to memcpy on equal pointers itself. (And if there were, that would be a soundness bug in rustc due to the musl issue mentioned above.)

This does mean we must make sure to never call the LLVM memcpy builtin on equal ranges even though the LangRef says that is allowed. Currently that is the case so we just need to make sure it remains the case. :) Cc @rust-lang/opsem @rust-lang/wg-llvm

@rustbot
Copy link
Collaborator

rustbot commented Nov 24, 2023

r? @cuviper

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Nov 24, 2023
@the8472
Copy link
Member

the8472 commented Nov 25, 2023

There are no known cases where LLVM introduces calls to memcpy on equal pointers itself.

A black swan has not been observed, therefore all swans are white? Or do we have stronger confirmation that LLVM won't do this?

@RalfJung
Copy link
Member Author

I didn't say it's great but I don't know any better way than asking @nikic. 🤷 LLVM does not even document anywhere what it assumes about the libc memcpy, so it's hard to say anything with reasonable amounts of certainty.

That said, the main reason I am proposing this is that if we do see a black swan, then the bit of text we have here doesn't actually help. We are unsound on musl. So I want to have the text removed from the docs to make it clear that this would be a soundness bug.

@bjorn3
Copy link
Member

bjorn3 commented Nov 27, 2023

In the StatementKind::Assign implementation I can only find memcpy, not memmove:

base::memcpy_ty(bx, dest.llval, dest.align, r, source_align, dest.layout, flags)

@bjorn3
Copy link
Member

bjorn3 commented Nov 27, 2023

Looks like rustc indeed emits llvm.memcpy for your example, but LLVM itself turns it into memmove: https://godbolt.org/z/3PKP47z9a

@RalfJung
Copy link
Member Author

RalfJung commented Nov 27, 2023

In the StatementKind::Assign implementation I can only find memcpy, not memmove:

Our semantics for Assign are somewhat subtle. I'd like to fix that eventually but oh well... also see #68364.

Basically, if the assignment is of a type with Scalar or ScalarPair layout (and the scalars must be initialized), then the two sides may overlap. The LLVM backend will implement this as a load+store. For other types MIR Assign already requires disjointness (and Miri checks that) so we can use memcpy without relying on the "equal pointer" clause.

@RalfJung
Copy link
Member Author

Looks like rustc indeed emits llvm.memcpy for your example, but LLVM itself turns it into memmove: https://godbolt.org/z/3PKP47z9a

The MIR we generate has a temporary. This gets turned into LLVM IR with two memcpy to and from the temporary, which apparently LLVM optimizes into a memmove that skips the temporary. So all is good, as far as I can see.

//! source and target pointer are equal, the function is assumed to not be UB.
//! (Note that these are standard assumptions among compilers:
//! the `n` parameter is 0, the function is assumed to not be UB, even if the pointers are NULL or
//! dangling. (Note that making extra assumptions about these functions is common among compilers:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While removing the "equal" part, you added "NULL or dangling" -- isn't that a new assumption? Or are we already living by that elsewhere?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a clarification. We already have "if the n parameter is 0, the function is assumed to not be UB", which was already meant to include all pointers, including NULL or dangling. I just made it more explicit.

@cuviper
Copy link
Member

cuviper commented Nov 28, 2023

@bors r+ rollup

@bors
Copy link
Contributor

bors commented Nov 28, 2023

📌 Commit 7304220 has been approved by cuviper

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 28, 2023
bors added a commit to rust-lang-ci/rust that referenced this pull request Nov 29, 2023
…iaskrgr

Rollup of 7 pull requests

Successful merges:

 - rust-lang#116839 (Implement thread parking for xous)
 - rust-lang#118265 (remove the memcpy-on-equal-ptrs assumption)
 - rust-lang#118269 (Unify `TraitRefs` and `PolyTraitRefs` in `ValuePairs`)
 - rust-lang#118394 (Remove HIR opkinds)
 - rust-lang#118398 (Add proper cfgs in std)
 - rust-lang#118419 (Eagerly return `ExprKind::Err` on `yield`/`await` in wrong coroutine context)
 - rust-lang#118422 (Fix coroutine validation for mixed panic strategy)

r? `@ghost`
`@rustbot` modify labels: rollup
bors added a commit to rust-lang-ci/rust that referenced this pull request Nov 29, 2023
…iaskrgr

Rollup of 7 pull requests

Successful merges:

 - rust-lang#116839 (Implement thread parking for xous)
 - rust-lang#118265 (remove the memcpy-on-equal-ptrs assumption)
 - rust-lang#118269 (Unify `TraitRefs` and `PolyTraitRefs` in `ValuePairs`)
 - rust-lang#118394 (Remove HIR opkinds)
 - rust-lang#118398 (Add proper cfgs in std)
 - rust-lang#118419 (Eagerly return `ExprKind::Err` on `yield`/`await` in wrong coroutine context)
 - rust-lang#118422 (Fix coroutine validation for mixed panic strategy)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 92a74e4 into rust-lang:master Nov 29, 2023
11 checks passed
@rustbot rustbot added this to the 1.76.0 milestone Nov 29, 2023
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Nov 29, 2023
Rollup merge of rust-lang#118265 - RalfJung:memcpy, r=cuviper

remove the memcpy-on-equal-ptrs assumption

One of the libc we support, musl, [defines `memcpy` with `restrict` pointers](https://git.musl-libc.org/cgit/musl/tree/src/string/memcpy.c#n5). This in fact matches the definition in the C standard. Calling that `memcpy` with overlapping pointers is clearly UB, who knows what the compiler did when optimizing this `memcpy` -- it certainly assumed source and destination to be disjoint.

Lucky enough, it does not seem like we actually need this assumption that `memcpy(p, p, n)` is always allowed. clang and GCC need it since they use `memcpy` to compile C assignments, but [we use memmove for similar code](https://godbolt.org/z/bcW85WYcM). There are no known cases where LLVM introduces calls to memcpy on equal pointers itself. (And if there were, that would be a soundness bug in rustc due to the musl issue mentioned above.)

This does mean we must make sure to never call the LLVM `memcpy` builtin on equal ranges even though the LangRef says that is allowed. Currently that is the case so we just need to make sure it remains the case. :) Cc `@rust-lang/opsem` `@rust-lang/wg-llvm`
@RalfJung RalfJung deleted the memcpy branch December 1, 2023 10:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants