Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Coercing &mut to *const should not create a shared reference #56604

Open
RalfJung opened this Issue Dec 7, 2018 · 16 comments

Comments

Projects
None yet
4 participants
@RalfJung
Copy link
Member

RalfJung commented Dec 7, 2018

It has long been a rule in Rust that you must not mutate through a shared reference, or a raw pointer obtained from a shared reference.

Unfortunately, that rule currently forbids the following code:

fn direct_mut_to_const_raw() {
    let x = &mut 0;
    let y: *const i32 = x;
    unsafe { *(y as *mut i32) = 1; }
    assert_eq!(*x, 1);
}

The reason for this is that coercing &mut T to *const T implicitly first creates a shared reference and then coerces that to *const T, meaning y in the example above is technically a raw pointer obtained from a shared reference.

We should fix our coercion logic to no longer create this intermediate shared reference.

See #56161 for how we uncovered this problem.

Cc @eddyb @nikomatsakis

@RalfJung

This comment has been minimized.

Copy link
Member Author

RalfJung commented Dec 7, 2018

I believe the relevant code might be

coerce_mutbls(mt_a.mutbl, mutbl_b)?;
// Although references and unsafe ptrs have the same
// representation, we still register an Adjust::DerefRef so that
// regionck knows that the region for `a` must be valid here.
if is_ref {
self.unify_and(a_unsafe, b, |target| {
vec![Adjustment {
kind: Adjust::Deref(None),
target: mt_a.ty
}, Adjustment {
kind: Adjust::Borrow(AutoBorrow::RawPtr(mutbl_b)),
target
}]
})
} else if mt_a.mutbl != mutbl_b {
self.unify_and(a_unsafe, b, simple(Adjust::MutToConstPointer))
} else {
self.unify_and(a_unsafe, b, identity)
}

but I am not sure because this is type checking. I have no idea where the code that decides about lowering of coercions to reborrows/casts lives.

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented Dec 7, 2018

I have no idea where the code that decides about lowering of coercions to reborrows/casts lives.

What do you mean? That code is generating a deref and a borrow, each of those gets turned into the equivalent MIR later, and the result is as if the user wrote &* explicitly.

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented Dec 7, 2018

@nikomatsakis I believe the comment above if is_ref { has been outdated for quite a while (and makes no sense for NLL), can you confirm?

@RalfJung

This comment has been minimized.

Copy link
Member Author

RalfJung commented Dec 7, 2018

@eddyb that code generates "adjustments". I have no idea what that means. I found no use of AutoBorrow::RawPtr that would turn this into casts or reborrows or so.

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented Dec 7, 2018

Oh I got confused as to what this is doing, I missed the AutoBorrow::RawPtr bit, looks like "borrow" and "cast reference to raw pointer" are fused into one adjustment. The lowering is here, btw:

Adjust::Borrow(AutoBorrow::RawPtr(m)) => {
// Convert this to a suitable `&foo` and
// then an unsafe coercion. Limit the region to be just this
// expression.
let region = ty::ReScope(region::Scope {
id: hir_expr.hir_id.local_id,
data: region::ScopeData::Node
});
let region = cx.tcx.mk_region(region);
expr = Expr {
temp_lifetime,
ty: cx.tcx.mk_ref(region,
ty::TypeAndMut {
ty: expr.ty,
mutbl: m,
}),
span,
kind: ExprKind::Borrow {
region,
borrow_kind: m.to_borrow_kind(),
arg: expr.to_ref(),
},
};
let cast_expr = Expr {
temp_lifetime,
ty: adjustment.target,
span,
kind: ExprKind::Cast { source: expr.to_ref() }
};

So looking again at the testcase, y as *mut i32 still goes through the reference->raw pointer coercion logic, but it does so for coercing &mut T to *mut T which does a mutable reborrow instead of an immutable reborrow (and then there's a separate *mut T to *const T coercion).

Changing AutoBorrow::RawPtr(mutbl_b) to AutoBorrow::RawPtr(mt_a.mutbl) would work, but you then also need to push Adjust::MutToConstPointer to that vec![...], if mt_a.mutbl != mutbl_b.

EDIT: this may be backwards incompatible if the mutable reference being coerced was also already immutably borrowed, since you'd be introducing a mutable reborrow.

Maybe we can avoid reborrows altogether here? But I'd leave that to @nikomatsakis.

@RalfJung

This comment has been minimized.

Copy link
Member Author

RalfJung commented Dec 7, 2018

Changing AutoBorrow::RawPtr(mutbl_b) to AutoBorrow::RawPtr(mt_a.mutbl) would work, but you then also need to push Adjust::MutToConstPointer to that vec![...], if mt_a.mutbl != mutbl_b.

I did the first but not the last and tests seem to still pass...^^

But yeah, there is a compatibility problem with outstanding shared references. Ouch.

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented Dec 7, 2018

Without Adjust::MutToConstPointer the MIR may end up malformed (using *mut T where *const T is expected), I'm surprised you don't get any errors!

@RalfJung

This comment has been minimized.

Copy link
Member Author

RalfJung commented Dec 7, 2018

using *mut T where *const T is expected

Yeah I figured. run-pass and ui tests all pass, so either the change did nothing or nothing checks the MIR^^

For the backwards compatibility issue:
I think my inclination is that we don't want to reborrow on a cast-to-raw. That would also make testing Stacked Borrows in miri easier, these implicit unavoidable reborrows are a pain. :P We'd have to make sure though that the borrow checker understands that after a (x: &mut T) as *mut T, x is still there -- it hasn't been moved away.

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented Dec 7, 2018

I guess the tricky bit with making it a cast is that what we have to work with is Operand::{Move,Copy}.

Instead, we could just add to the MIR a sort of "borrow to raw pointer" (a lot like AutoBorrow::RawPtr, really), and then we just need to make sure the old borrowck doesn't treat even a mutable AutoBorrow::RawPtr like a real borrow.

@RalfJung

This comment has been minimized.

Copy link
Member Author

RalfJung commented Dec 8, 2018

"borrow to raw pointer"

You mean like rust-lang/rfcs#2582? :D

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented Dec 8, 2018

@RalfJung Yupp, that's what I mean, that seems like the perfect solution here.

@RalfJung RalfJung closed this Dec 9, 2018

@RalfJung RalfJung reopened this Dec 9, 2018

@RalfJung

This comment has been minimized.

Copy link
Member Author

RalfJung commented Dec 9, 2018

(Sorry, that was the wrong button.)

Notice that the lint discussed in rust-lang/rfcs#2582, at least when implemented on the MIR, would actually flag these reborrows that are in our way now: The new reference is created just to turn it into a raw pointer, it has no other use.

@RalfJung

This comment has been minimized.

Copy link
Member Author

RalfJung commented Dec 12, 2018

Instead, we could just add to the MIR a sort of "borrow to raw pointer"

Following this idea, I have updated miri to treat the case of "escaping a ptr to be usable for raw accesses" as just another reborrow: Both trigger a retag; the retag after a cast-to-raw is a bit special because it also retags raw pointers.

So, I am convinced now that the best way forward is to entirely ditch ref-to-raw casts, and encode them all as reborrow-to-raw.

@RalfJung

This comment has been minimized.

Copy link
Member Author

RalfJung commented Dec 21, 2018

@RalfJung

This comment has been minimized.

Copy link
Member Author

RalfJung commented Mar 29, 2019

This is actually more subtle than I thought. The thing that I did not know when discovering this problem is that the borrow checker treats *const and *mut different. The following is safe code:

let x = &mut 0;
let shared = &*x;
let y = x as *const i32; // if we use *mut here instead, this stops compiling
let _val = *shared;

IOW, the borrow checker considers a cast to a const raw pointer as just a read access. Also see rust-lang/unsafe-code-guidelines#106 and Cc @matthewjasper who brought this to my attention.

This means that if we pull through with the plan to make x as *const i32 (where x is a mutable reference) behave like x as *mut i32 as *const i32, that code will stop compiling! Ouch.

So, everybody in rust-lang/unsafe-code-guidelines#106 (including me and @nikomatsakis but also in particular @SimonSapin) seemed to feel that the following code should be allowed (that's from the OP here):

// example 1
let x = &mut 0;
let y: *const i32 = x;
unsafe { *(y as *mut i32) = 1; }

But how do we all feel about this code?

// example 2
let x = &mut 0;
let shared = &*x;
let y: *const i32 = x;
let _val = *shared;
unsafe { *(y as *mut i32) = 1; }

Here, we cannot possibly maintain the idea that *const vs *mut does not matter, because the code stops compiling if we use *mut instead.

Does that mean that we are fine with example 2 being UB? And if it does, does that change our position about example 1?

My opinion

Personally I am less sure now about example 1, and feel maybe it should actually be UB, after all. The reason for this is that it leads to the cleanest model, with fewer cases to consider. I will go into several possible models for a bit here.

Clean and strict

The IMO most clean model (with the usual caveat that we'll not know how clean this ends up being until it got implemented and tested) is that the behavior of cast-to-raw depends only on the type of the raw pointer (mut vs const). We would have a rule that "writing through a raw pointer that was originally created as a *const is UB" (except for UnsafeCell). This is nice because it would replace/subsume the current rule that "writing through a raw pointer created from a shared reference is UB" (UnsafeCell yada yada), since shared references can only be cast to *const -- so it would be very consistent in that regard.

Basically, x as *const T would be the same as &*x as *const T, including all quirks around UnsafeCell.

However, this would rule out the example in the OP, and run counter the idea (that I repeated a lot) that *const vs *mut should not matter.

Extrapolate from current behavior

Currently, we accept example 1 if it gets changed to let y: *const i32 = x as *mut i32;. Applying the same rewrite to example 2 does not work, the borrow checker rejects that. Stacked Borrows ignores *const vs *mut and just checks whether the location is currently frozen; if it is, then the resulting raw pointer cannot be used for writing. If we want to continue along this line, we do have to consider a &mut to *const cast as a primitive operation instead of being composed of two steps, &mut to *mut and then *mut to *const. So we'd have three reference-to-raw-pointer-casts as three distinct primitive operations:

  • &mut to *mut: Can always be written to (so memory gets unfrozen and a Raw tag pushed, in the model).
  • &mut to *const: Can be written to (except UnsafeCell) only if there are no outstanding shared references. If memory is not frozen, we push a Raw and you can write through this, but if memory is frozen we just allow read access.
  • & to *const: Cannot be written to (except UnsafeCell).

This would allow example 1. disallow example 2, and it would also disallow the following:

// example 2
let x = &mut 0;
let shared = &*x;
let y: *const i32 = x;
unsafe { *(y as *mut i32) = 1; }

shared gets never used again, but we cannot know that when y gets created so we cannot make y a writable pointer. This might seem strange because here, the rewrite to let y: *const i32 = x as *mut i32; actually works, but that is because the borrow checker "knows the future" and determines that shared does not get used again. We cannot do this in a dynamic semantics such as Stacked Borrows.

Two-phase

@matthewjasper suggested it might be possible to treat cast-to-const as something like a two-phase borrow. I don't think that makes me happy.^^

@matthewjasper

This comment has been minimized.

Copy link
Contributor

matthewjasper commented Mar 29, 2019

@matthewjasper suggested it might be possible to treat cast-to-const as something like a two-phase borrow. I don't think that makes me happy.^^

No, I'm was saying that a model that accepts both example 1 and 2 would effectively be modelling two-phase borrows. I can't see any way to allow this without complicating the model significantly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.