Clarify Semantics of Assignment in MIR #97567

JakobDegen · 2022-05-30T23:12:59Z

This PR clarifies the semantics of assignments in MIR to match what I suggested here (cc #68364).

The first commit updates documentation to reflect this change.
The second commit updates const eval to use the correct evaluation order. It does not yet update it to also retag the LHS place as unique. It is unclear how to implement this and likely needs a significant refactor, and so might be better left to future work.
The third commit adds utilities to rustc_middle to handle these semantics soundly.
The fourth commit updates InstCombine to be sound in the face of these semantics. I was not able to add a test for this, because without destprop or a similar function pass, there are too many temporaries to set this off. I briefly went through all the other passes, and I did not find anything that might not comply with this (except for dest prop, but that's being fixed elsewhere).

@RalfJung expressed some concerns that these semantics might be too subtle/footgunny, and we should instead have an explicit flag on StatementKind::Assign that indicates which version of the semantics the statement gets. This is a tradeoff - it would indeed simplify things for people trying to reason about the semantics. However, it would also be a larger refactor and we'd have to decide whether/how not to support it in borrowck, support it in codegen, etc. Seeing that at least for now the amount of complexity this introduces seems to be minimal, I am personally against such a change until we have more evidence of this causing actual problems.

r? @oli-obk

rust-highfive · 2022-05-30T23:13:02Z

Some changes occured to the CTFE / Miri engine

cc @rust-lang/miri

bors · 2022-05-31T23:59:17Z

☔ The latest upstream changes (presumably #97582) made this pull request unmergeable. Please resolve the merge conflicts.

RalfJung · 2022-06-02T13:22:57Z

compiler/rustc_const_eval/src/interpret/step.rs

                // Avoid recomputing the layout
                let op = self.eval_operand(operand, Some(dest.layout))?;
                self.copy_op(&op, &dest)?;
            }

            BinaryOp(bin_op, box (ref left, ref right)) => {
-                let layout = binop_left_homogeneous(bin_op).then_some(dest.layout);
-                let left = self.read_immediate(&self.eval_operand(left, layout)?)?;
+                let left = self.read_immediate(&self.eval_operand(left, None)?)?;


This needs careful benchmarking; the perf gain from that optimization was quite noticeable in the past.

RalfJung · 2022-06-02T13:24:10Z

compiler/rustc_const_eval/src/interpret/step.rs

+                // FIXME: Here, in `Aggregate`, and in `Repeat`, the various `dest = eval_place`s
+                // should also be retagging the pointer to ensure that it does not overlap with the
+                // RHS.
+                let dest = self.eval_place(place)?;
                // Avoid recomputing the layout
                let op = self.eval_operand(operand, Some(dest.layout))?;
                self.copy_op(&op, &dest)?;


Note that copy_op makes a non-overlapping assumption. Is that still what you want?

RalfJung · 2022-06-02T13:25:25Z

compiler/rustc_const_eval/src/interpret/step.rs

@@ -157,26 +147,29 @@ impl<'mir, 'tcx: 'mir, M: Machine<'mir, 'tcx>> InterpCx<'mir, 'tcx, M> {
        rvalue: &mir::Rvalue<'tcx>,
        place: mir::Place<'tcx>,
    ) -> InterpResult<'tcx> {
-        let dest = self.eval_place(place)?;


I don't like how this makes the evaluation of the LHS and RHS all tangled up. For some rvalues you do the LHS first, for some the LHS; it all looks like a big mess.

IMO we should just stick to left-to-right evaluation order. I don't see the benefit from changing this, and it surely has a significant complexity cost.

RalfJung · 2022-06-02T13:27:08Z

@RalfJung expressed some concerns that these semantics might be too subtle/footgunny, and we should instead have an explicit flag on StatementKind::Assign that indicates which version of the semantics the statement gets. This is a tradeoff - it would indeed simplify things for people trying to reason about the semantics.

I am not concerned about people reasoning about the semantics. I am concerned about optimizations that replace one Rvalue by a seemingly equivalent one, and don't realize that this changes assignment semantics from may-overlap to must-not-overlap! For example, replacing x+0 by x is no longer valid if Use assumes non-overlap but BinOp does not.

So, I continue to hold my objection here.

RalfJung · 2022-06-02T13:30:28Z

compiler/rustc_middle/src/mir/mod.rs

+    /// not overlap with any memory computed as a part of the `Rvalue`. Relatedly, the evaluation
+    /// order is `compute lhs place -> compute rhs value -> store value in place`. The plan is for
+    /// the rule about the evaluation order to justify the rule about the non-overlappingness via
+    /// the aliasing model.


Oh wait, having the eval order depend on the rvalue is deliberate? That thought didn't even occur to me when reading the step.rs changes.

I think this is a very bad idea. It makes justifying rvalue transformations a total disaster.

RalfJung · 2022-06-02T13:35:23Z

It does not yet update it to also retag the LHS place as unique. It is unclear how to implement this and likely needs a significant refactor, and so might be better left to future work.

This is a crucial part of the plan though. So I think we should have a plan for how to do this before changing anything here.

We can sketch it out in MiniRust first, that should be easier. (And that will also reinforce my dislike for pattern-matching on the RHS expression to determine the evaluation order and overlap condition, since it will show very clearly how non-compositional that is.) I hope to add the retagging APIs there next week, but it might also have to wait until the end of the month.

RalfJung · 2022-06-02T15:49:02Z

FWIW I also feel like we are getting close to the point where we really cannot progress without being able to directly write MIR for our tests, and feed that to particular transformations (and Miri). Just imagine LLVM without the ability to write LLVM IR testcases -- it'd be a disaster.

apiraino · 2022-06-30T13:04:08Z

Seems this is waiting on author to incorporate suggestions. Feel free to request a review when ready, thanks!

@rustbot author

JakobDegen · 2022-07-03T00:32:47Z

Yeah. I'm going to try and investigate what we need to change to just fully get the "left to right" evaluation order in all cases

JakobDegen · 2022-07-28T00:18:29Z

I'm closing this in favor of having consistent semantics anywhere, will try and make that change in the near-ish future

JakobDegen added 4 commits May 29, 2022 17:29

Define evaluation order of rvalues and StatementKind::Assign

9b6b689

Update evaluation order of assignments in const eval

a50fc36

Add utilities to soundly handling overlapping statements.

51a19e1

Make InstCombine compliant with the assignment non-overlapping rules.

b7e7a44

rust-highfive assigned oli-obk May 30, 2022

rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label May 30, 2022

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 30, 2022

RalfJung reviewed Jun 2, 2022

View reviewed changes

rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 30, 2022

JakobDegen closed this Jul 28, 2022

JakobDegen deleted the assign-semantics branch August 10, 2022 00:31

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Clarify Semantics of Assignment in MIR #97567

Clarify Semantics of Assignment in MIR #97567

JakobDegen commented May 30, 2022

rust-highfive commented May 30, 2022

bors commented May 31, 2022

RalfJung Jun 2, 2022

RalfJung Jun 2, 2022

RalfJung Jun 2, 2022

RalfJung commented Jun 2, 2022

RalfJung Jun 2, 2022

RalfJung commented Jun 2, 2022 •

edited

Loading

RalfJung commented Jun 2, 2022

apiraino commented Jun 30, 2022

JakobDegen commented Jul 3, 2022

JakobDegen commented Jul 28, 2022

Clarify Semantics of Assignment in MIR #97567

Clarify Semantics of Assignment in MIR #97567

Conversation

JakobDegen commented May 30, 2022

rust-highfive commented May 30, 2022

bors commented May 31, 2022

RalfJung Jun 2, 2022

Choose a reason for hiding this comment

RalfJung Jun 2, 2022

Choose a reason for hiding this comment

RalfJung Jun 2, 2022

Choose a reason for hiding this comment

RalfJung commented Jun 2, 2022

RalfJung Jun 2, 2022

Choose a reason for hiding this comment

RalfJung commented Jun 2, 2022 • edited Loading

RalfJung commented Jun 2, 2022

apiraino commented Jun 30, 2022

JakobDegen commented Jul 3, 2022

JakobDegen commented Jul 28, 2022

RalfJung commented Jun 2, 2022 •

edited

Loading