Settle execution order uncertainty for `+=` #28160
Comments
nikomatsakis added the I-nominated, I-needs-decision, and T-lang labels on Sep 1, 2015
See also #27868.
I think this is a degenerate case, and should be at least linted against.
I'm surprised this is allowed by the borrow checker (regardless of non-lexical lifetimes).
, but then modified here
despite being borrowed.
@petrochenkov
@llogiq
There is no borrow.
@llogiq I mean, the effective signature of
Ah, I see. You're right,
The following code prints
fn main() {
    let mut a;
    a += { a = 2; 3 };
    println!("{:?}", a);
}
Does it add the address of &a to the value?
Edit: erroneous example removed.
@llogiq no. 3 is getting added to uninitialized memory. It's a repeating byte (0x1d) followed by a 0x1d + 3 byte. http://is.gd/Q0Hhr3
IMO
AddAssign::add_assign(&mut self, rhs);
where AddAssign is some trait added in the future to overload
a = 0;
AddAssign::add_assign(&mut a, &{ a = 22; 2 });
@nagisa Yes, that could be a valid desugaring, but no, the borrow doesn't have to happen before There can be no memory unsafety here, even if borrowck allows
However, there's an evaluation bug in trans here (looking at @tomaka's example): the LHS must not be read before evaluating the RHS, but the current implementation disobeys that rule and can cause UB AFAICT. EDIT: clarified some bits here and there.
Currently borrowck evaluates
fn main() {
    let a = Box::new(2);
    let mut b = 2;
    *{ *a; &mut b } //~ ERROR use of moved value
    +=
    { drop(a); 1 };
}
Unfortunately, evaluating
let x: &mut [u8] = calculate();
x[i] = x[j];
As it would translate into
Here the borrow of
One way to fix this (suggested by @eddyb) is to move the lvalue evaluation (including the bounds check) to after the RHS computation, which relies on lvalue evaluations not being able to invalidate borrows. We also thought of potentially packing coercions into the lvalue evaluation. This could also make code like
Heh.
fn report(a: &Vec<i32>) {
println!("{:?} {:?} {:?} {:?}", a.len(), a.capacity(), a[0], &a[0] as *const _);
}
fn main() -> () {
let mut a = vec![0];
report(&a);
let xxx = &a[0] as *const _;
println!("{:?} {:?}", xxx, unsafe { *xxx });
a[0] += {
report(&a);
a.push(1);
report(&a);
a.push(2);
report(&a);
3
};
report(&a);
println!("{:?} {:?}", xxx, unsafe { *xxx });
}
1 1 0 0x7f4386024008
0x7f4386024008 0
1 1 0 0x7f4386024008
2 2 0 0x7f4386024008
3 4 0 0x7f4386023000
3 4 0 0x7f4386023000
0x7f4386024008 3
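One way to keep the result independent of when the indexed place is resolved is to evaluate the right-hand side into a local first. The following is a minimal sketch; the helper name `add_after_pushes` is hypothetical and not taken from the thread.

```rust
/// Hypothetical helper: performs the pushes first, then the indexed update.
fn add_after_pushes(a: &mut Vec<i32>) {
    // Evaluate the right-hand side completely; the pushes may reallocate `a`.
    let rhs = {
        a.push(1);
        a.push(2);
        3
    };
    // Only now resolve `a[0]`, so the index refers to the current buffer.
    a[0] += rhs;
}

fn main() {
    let mut a = vec![0];
    add_after_pushes(&mut a);
    println!("{:?}", a);
}
```

Because the place `a[0]` is looked up only after both pushes, a reallocation in between cannot leave the update pointing at the old buffer.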
Since several people voiced their disagreement on the need for supporting assignment in the RHS, as a counterexample I have this piece of code written this week (coincidentally) which was miscompiled (
The relevant bit:
dirty |= match event {
    E::KeyboardInput(ElementState::Pressed, _, Some(key)) => {
        dirty |= root.dispatch(&ui::event::KeyDown(key));
        for e in key_tracker.down(key) {
            dirty |= root.dispatch(&e);
        }
        false
    }
    ...
};
And the workaround looks like this:
dirty |= match event {
    E::KeyboardInput(ElementState::Pressed, _, Some(key)) => {
        let mut dirty = root.dispatch(&ui::event::KeyDown(key));
        for e in key_tracker.down(key) {
            dirty |= root.dispatch(&e);
        }
        dirty
    }
    ...
};
Is this a use case we want to support or not? If there is an easy way to trigger borrowck errors on such use cases (or warnings, but that seems less likely), can we get a crater run to estimate the prevalence of LHS reads/writes in the RHS of
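The workaround pattern (accumulate into a local inside the arm, then merge once into the outer flag) can be sketched in a self-contained form. The `Event` enum, the `dispatch` function, and the `handle` helper below are hypothetical stand-ins, not the types from the snippet above.

```rust
/// Hypothetical stand-in for the event type used in the thread.
enum Event {
    KeyDown(u32),
    Other,
}

/// Hypothetical dispatcher: reports whether handling the key changed anything.
fn dispatch(key: u32) -> bool {
    key % 2 == 0
}

/// Folds every dispatch result into a local, then merges once into `dirty`.
fn handle(dirty: &mut bool, event: Event, repeats: &[u32]) {
    let changed = match event {
        Event::KeyDown(key) => {
            let mut changed = dispatch(key);
            for &k in repeats {
                changed |= dispatch(k);
            }
            changed
        }
        Event::Other => false,
    };
    // The outer flag is read only after the right-hand side is final.
    *dirty |= changed;
}

fn main() {
    let mut dirty = false;
    handle(&mut dirty, Event::KeyDown(3), &[5, 8]);
    println!("{}", dirty);
}
```

Written this way, the right-hand side never touches the outer flag, so the result does not depend on whether the compiler reads the left-hand side before or after evaluating the `match`.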
@arielb1 You could argue that bounds-checked indexing is not "pure" and thus cannot avoid being borrowed, in some way, unlike other lvalues.
On IRC I came up with this example which seems to make the smarter schemes fall apart:
let mut boxed = Box::new(vec![...]);
(*boxed).push({
    mem::replace(&mut boxed, Box::new(vec![])).len()
})
Do we need to differentiate between
This is annoying because overloaded deref can be both. We know about the bugs with
One proposal that mostly maintains LTR and handles

Definitions

An lvalue expression is basically the current rustc lvalue expression.

Unresolved question: do we include overloaded index/deref in index/deref lvalues? This makes more code compile but could be confusing in some cases.

Evaluating an expression "to an lvalue" is just evaluating all the rvalue-expressions in it. Note that after an expression is evaluated to an lvalue, finalizing its evaluation can still read memory and possibly run user code (if we allow overloaded derefs/indexing).

Evaluating an expression "to an rvalue" is just standard evaluation.

To evaluate an expression with a receiver, that is "LHS = RHS", "LHS OP= RHS", "LHS.foo(ARG1, ARG2)", first the pre-final-autoref receiver is evaluated to an lvalue, then the other operands are evaluated to rvalues, then the receiver evaluation is finalized and autoref-ed if needed.

Unresolved question: should this happen with by-value-self taking methods too? Provide an example for and against.

Examples

Simple assignment

x[I] = x[J];
// equiv
let i = I;
let rhs = x[J];
x[I] = rhs;

Simple function call

a.b.f(a.b.g(), a.b.h())
// equiv
let arg0 = a.b.g();
let arg1 = a.b.h();
a.b.f(arg0, arg1); // potentially with overloaded autoderef

Changing receiver

(*boxed).push({
    mem::replace(&mut boxed, Box::new(vec![])).len()
})
// equiv
let arg0 = mem::replace(&mut boxed, Box::new(vec![])).len();
Vec::push(&mut *boxed, arg0); // this can be surprising, I guess.

Changing receiver, deep

let y = Vec::new();
(***boxed).get({boxed=&&&y; 42})
// equiv
let y = Vec::new();
let ix = {boxed=&&&y; 42};
<[u8]>::get(&<Vec<u8> as Deref>::deref(&***boxed), ix)

Cutting your own receiver

let boxed: Box<&[u8]> = get();
boxed.get({drop(boxed); 4})
// equiv
let boxed: Box<&[u8]> = get();
let ix = { drop(boxed); 4 };
<[u8]>::get(&**boxed) //~ ERROR use of moved value

Cutting your own receiver, by-value

let a: &mut [u32] = &mut [1,2];
let b: &mut [u32] = &mut [3,4];
let c;
a.get_mut({a=b; c=a; 1})
// equiv
let ix = {a=b; c=a; 1};
<&mut [u32]>::get_mut(a, ix) //~ ERROR use of moved value

Chained function call

a.b().c().d()
// equiv
let t0 = a.b(); // this is an rvalue
let t1 = t0.c();
let t2 = t1.d();

Pushing length

x[ix()].push(x[0].len());
// equiv
let index = ix();
let arg0 = x[0].len();
x[index].push(arg0);

Pushing length, via function

x.get_mut().push(x[0].len());
// equiv
let t0 = x.get_mut();
let len = x[0].len(); //~ ERROR cannot borrow
t0.push(len);

Simple assign-op

dirty |= SOMETHING_MODIFYING_DIRTY;
// equiv
let rhs = SOMETHING_MODIFYING_DIRTY;
dirty = dirty | rhs;

Advantages

Most code should compile and do something sane.

Disadvantages

Indexing/deref is handled somewhat differently from normal methods. If we don't allow overloaded deref/indexing, they can behave differently from primitive deref/indexing. If we do, their order-of-evaluation can be somewhat confusing.

The handling of coercions needs to be thought about: if they can occur in the middle of an lvalue, they can break it. It would be nice if someone who knows them could provide an example.

This also changes order-of-evaluation, which could subtly break existing code.

Coercions

Could someone help me there (@eddyb, you know these best)

Lvalues and deref

With the borrow checker, overloaded deref as well as deref of This is actually not totally precise: references can be "leaked". A by-value Dereference of
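Two of the proposal's "equiv" desugarings can be written out by hand as ordinary Rust. This is only a sketch of the hoisted forms, under the assumption of concrete element types; the function names `push_first_len` and `push_old_len` are hypothetical, and the sketch makes no claim about which un-hoisted originals a given compiler accepts.

```rust
use std::mem;

/// "Pushing length": read `x[0].len()` before mutably indexing `x`.
fn push_first_len(x: &mut Vec<Vec<usize>>, index: usize) {
    let arg0 = x[0].len();
    x[index].push(arg0);
}

/// "Changing receiver": the argument swaps out the box before the push runs.
fn push_old_len(mut boxed: Box<Vec<usize>>) -> Box<Vec<usize>> {
    let arg0 = mem::replace(&mut boxed, Box::new(vec![])).len();
    // The receiver is resolved last, so this pushes onto the replacement vector.
    boxed.push(arg0);
    boxed
}

fn main() {
    let mut x = vec![vec![1, 2], vec![]];
    push_first_len(&mut x, 1);
    println!("{:?}", x);

    let out = push_old_len(Box::new(vec![7, 8, 9]));
    println!("{:?}", out);
}
```

The second function illustrates why the proposal calls this case surprising: the length comes from the old vector, while the push lands in the new one.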
Application

When should the "evaluate lvalue's rvalue components first, then lvalue" rule be applied? Only to method calls and assignment ops? Everywhere an lvalue is required? Does this include the pseudo-lvalue of
@arielb1 I think the "Simple assignment" should read
@nikomatsakis (replying here because discourse refuses to work)
I don't really have a problem with a "typically LTR, receiver and assignment
You can't have the trivial desugaring anyway -
@arielb1 I don't think changing the order of assignment for receiver is a
I want to treat method receivers like assignment LHS-es to fix the
nikomatsakis referenced this issue on Sep 17, 2015: Primitive binops are translated wrong when the LHS is mutated #27054 (Closed)
triage: P-medium
rust-highfive added the P-medium label and removed the I-nominated and I-needs-decision labels on Oct 15, 2015
bstrie added the I-unsound 💥 label on Oct 24, 2015
Given that commenters here have demonstrated undefined behavior, I'm tagging this with I-soundness. Personally I'd like to keep the rules as simple as possible even if it means a trivial amount of breakage (which is permissible from fixing soundness flaws). If the operators can be desugared to method calls as you'd expect everywhere else, that would be ideal.
This will not be unsound with the MIR, just cause more compile-time errors.
@arielb1, good to hear, but I'd prefer not to remove the label until we start trans-ing from MIR, which won't be for a while yet. :)
nrc added the A-mir label on Nov 19, 2015
Now that MIR presumably has had to deal with this, what was the decision?
Actually this has not been formally decided. (I mean, we've always had the impl doing something in any case, and it still does -- but is it the right thing?) I'm pretty dubious about changing the order of execution just to allow for fewer errors though -- I think we should stick to left-to-right whenever possible (
@nikomatsakis Huh, I'm very surprised Only overloaded augmented assignment seems to have LTR ordering in MIR.
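The difference between primitive and overloaded compound assignment can be observed by logging when each operand is evaluated. The sketch below reflects the order the Rust Reference documents for current compilers (right operand first when both operand types are primitives, left operand first otherwise), which is not necessarily what the compiler did at the time of this comment. `Wrapper`, `place`, `value`, and the two `*_order` functions are hypothetical names introduced for illustration.

```rust
use std::cell::RefCell;
use std::ops::AddAssign;

/// Hypothetical newtype so that `+=` goes through the `AddAssign` trait.
struct Wrapper(i32);

impl AddAssign for Wrapper {
    fn add_assign(&mut self, rhs: Wrapper) {
        self.0 += rhs.0;
    }
}

/// Records that the left-hand place was evaluated, then hands it back.
fn place<'a, T>(log: &RefCell<Vec<&'static str>>, target: &'a mut T) -> &'a mut T {
    log.borrow_mut().push("lhs");
    target
}

/// Records that the right-hand value was evaluated, then hands it back.
fn value<T>(log: &RefCell<Vec<&'static str>>, v: T) -> T {
    log.borrow_mut().push("rhs");
    v
}

/// Both operands are `i32`, so the built-in operator is used.
fn primitive_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    let mut x = 0i32;
    *place(&log, &mut x) += value(&log, 2);
    log.into_inner()
}

/// The operands are `Wrapper`, so this is a call to `AddAssign::add_assign`.
fn overloaded_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    let mut w = Wrapper(0);
    *place(&log, &mut w) += value(&log, Wrapper(2));
    log.into_inner()
}

fn main() {
    println!("primitive:  {:?}", primitive_order());
    println!("overloaded: {:?}", overloaded_order());
}
```

The Reference itself advises against writing code that depends on this order, so a logging harness like this is best treated as a diagnostic rather than something to build on.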
@nikomatsakis If I had to give one reason for evaluating assignments LTR it'd be that it enables RVO.
@nikomatsakis I agree with @eddyb: LTR is better. It's how the rest of the language works, so it's less confusing. Also, I'd like it if
nrc referenced this issue on Oct 20, 2016: Inconsistent evaluation order for assignment operations #27868 (Open)
See this internals thread: https://internals.rust-lang.org/t/settling-execution-order-for/4253
Putting all the examples into the playground now suggests soundness issues have been resolved. Maybe remove the label, since now it's just a matter of figuring out what the right execution order is? https://is.gd/2xjOGo (prints 5 rather than the uninitialised memory it did previously)
nikomatsakis removed the I-unsound 💥 label on Mar 2, 2017
I'm not sure that we have much leeway to change this at this point anyhow. Maybe a little. =)
It could use someone to kind of draw together all the considerations and make a summary of current status, at minimum.
Mark-Simulacrum added the C-tracking-issue label on Jul 22, 2017
Triage, @nikomatsakis any updates?
nikomatsakis commented Sep 1, 2015
When translating something like `a += b` to MIR, an uncertainty arose about what the semantics of `a = 0; a += { a = 22; 2 };` ought to be. Should the resulting value of `a` be 24 or 2, basically? The current MIR yields 24, on the basis that this is more-or-less what will happen in the case of overloading as well (presuming non-lexical lifetimes, since I think otherwise you get a borrowck error).
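The two candidate answers correspond to two evaluation orders, which can be compared side by side. The concrete expression below (`a` starting at 0, with a right-hand side that assigns 22 and yields 2) is an assumed instance chosen to match the 24-or-2 question, and the function names are hypothetical. The first function relies on how current compilers handle primitive operands; the second spells out the read-the-left-side-first order by hand.

```rust
/// Compound assignment on a primitive whose right-hand side also assigns to `a`.
#[allow(unused_assignments)] // the initial `0` is overwritten before it is read
fn rhs_evaluated_first() -> i32 {
    let mut a = 0;
    a += { a = 22; 2 };
    a
}

/// The alternative order, written explicitly: read `a`, then run the RHS.
#[allow(unused_assignments)] // the `22` is overwritten before it is read
fn lhs_read_first() -> i32 {
    let mut a = 0;
    let lhs = a;
    let rhs = { a = 22; 2 };
    a = lhs + rhs;
    a
}

fn main() {
    println!("rhs first: {}", rhs_evaluated_first());
    println!("lhs first: {}", lhs_read_first());
}
```

With a current stable compiler the first function returns 24 and the second returns 2, which are the two outcomes the question distinguishes.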