New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression from stable: pointer to usize conversion no longer compiles #54709

Open
mjbshaw opened this Issue Oct 1, 2018 · 43 comments

Comments

@mjbshaw
Contributor

mjbshaw commented Oct 1, 2018

The following code works with Rust 1.29.0:

union TransmuteHack<T: Copy, U: Copy> {
    from: T,
    to: U,
}

pub static VALUE: usize = 42;
pub static ADDRESS: usize = unsafe { TransmuteHack { from: &VALUE }.to };

But it fails with the current beta (and nightly):

error[E0080]: this static likely exhibits undefined behavior

 --> <source>:7:1

  |

7 | pub static ADDRESS: usize = unsafe { TransmuteHack { from: &VALUE }.to };

  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ type validation failed: encountered a pointer, but expected the type usize

  |

  = note: The rules on what exactly is undefined behavior aren't clear, so this check might be overzealous. Please open an issue on the rust compiler repository if you believe it should not be considered undefined behavior

error: aborting due to previous error

For more information about this error, try `rustc --explain E0080`.

Compiler returned: 1

I think this code should still be valid. I don't see any reason for this to be undefined behavior. This is a hack, yes, but it's very useful for FFI-related work.

@dtolnay

This comment has been minimized.

Show comment
Hide comment
@dtolnay

dtolnay Oct 1, 2018

Member

This check was added in #51361 so mentioning @oli-obk and @RalfJung. It looks quite intentional and justifiable to reject this code. The same code without the unsafe hack is already disallowed in 1.29.0.

pub static ADDRESS: usize = &VALUE as *const usize as usize;
error[E0018]: raw pointers cannot be cast to integers in statics
 --> src/lib.rs:7:29
  |
7 | pub static ADDRESS: usize = &VALUE as *const usize as usize;
  |                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I would expect this to be reconsidered after stabilizing #51910. Here is the relevant message on nightly:

error[E0658]: casting pointers to integers in statics is unstable (see issue #51910)
 --> src/lib.rs:7:29
  |
7 | pub static ADDRESS: usize = &VALUE as *const usize as usize;
  |                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = help: add #![feature(const_raw_ptr_to_usize_cast)] to the crate attributes to enable
Member

dtolnay commented Oct 1, 2018

This check was added in #51361 so mentioning @oli-obk and @RalfJung. It looks quite intentional and justifiable to reject this code. The same code without the unsafe hack is already disallowed in 1.29.0.

pub static ADDRESS: usize = &VALUE as *const usize as usize;
error[E0018]: raw pointers cannot be cast to integers in statics
 --> src/lib.rs:7:29
  |
7 | pub static ADDRESS: usize = &VALUE as *const usize as usize;
  |                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I would expect this to be reconsidered after stabilizing #51910. Here is the relevant message on nightly:

error[E0658]: casting pointers to integers in statics is unstable (see issue #51910)
 --> src/lib.rs:7:29
  |
7 | pub static ADDRESS: usize = &VALUE as *const usize as usize;
  |                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = help: add #![feature(const_raw_ptr_to_usize_cast)] to the crate attributes to enable
@oli-obk

This comment has been minimized.

Show comment
Hide comment
@oli-obk

oli-obk Oct 1, 2018

Contributor

Yea, this kind of constant is very dangerous. I personally don't think there can be a good enough use case to outweigh the problems with it.

You can always use a *const () for FFI, which also makes your intent clearer. Pointers are not integers and treating them like that will get you into trouble not just in Rust

Contributor

oli-obk commented Oct 1, 2018

Yea, this kind of constant is very dangerous. I personally don't think there can be a good enough use case to outweigh the problems with it.

You can always use a *const () for FFI, which also makes your intent clearer. Pointers are not integers and treating them like that will get you into trouble not just in Rust

@RalfJung

This comment has been minimized.

Show comment
Hide comment
@RalfJung

RalfJung Oct 1, 2018

Member

Agreed, this was a deliberate change in behavior (and we cratered it, I assume?). The union hack was more of a loophole that we closed.

Now, to be fair, this is a static and not a const. For const there are plenty of examples of such a "pointer disguising as an integer" causing problems. For static, however, the situation is less clear: Consts cannot refer to static, so the final value being bad for CTFE is not necessarily a big problem. And the final value is fine for run-time consumption. @oli-obk what do you think? Would it make sense to relax the validity check for statics? (The validity check needs an argument anyway to switch between checking const and runtime validity, because I want to use it in miri.)

Member

RalfJung commented Oct 1, 2018

Agreed, this was a deliberate change in behavior (and we cratered it, I assume?). The union hack was more of a loophole that we closed.

Now, to be fair, this is a static and not a const. For const there are plenty of examples of such a "pointer disguising as an integer" causing problems. For static, however, the situation is less clear: Consts cannot refer to static, so the final value being bad for CTFE is not necessarily a big problem. And the final value is fine for run-time consumption. @oli-obk what do you think? Would it make sense to relax the validity check for statics? (The validity check needs an argument anyway to switch between checking const and runtime validity, because I want to use it in miri.)

@oli-obk

This comment has been minimized.

Show comment
Hide comment
@oli-obk

oli-obk Oct 1, 2018

Contributor

A static can refer to another static's value though. and then we have the same problem.

Contributor

oli-obk commented Oct 1, 2018

A static can refer to another static's value though. and then we have the same problem.

@oli-obk

This comment has been minimized.

Show comment
Hide comment
@oli-obk

oli-obk Oct 1, 2018

Contributor

And I don't think we want to make referring to other statics be dependent on the other static's validity checks succeeding

Contributor

oli-obk commented Oct 1, 2018

And I don't think we want to make referring to other statics be dependent on the other static's validity checks succeeding

@RalfJung

This comment has been minimized.

Show comment
Hide comment
@RalfJung

RalfJung Oct 1, 2018

Member

Oh right, good point. Also, promoted code can refer to statics I assume. Same problem.

So, I think this issue should be closed as "this regression is a bugfix". The work-around is to no longer lie about the actual type of the data stored in that static, and make it e.g. of type *const usize.

Member

RalfJung commented Oct 1, 2018

Oh right, good point. Also, promoted code can refer to statics I assume. Same problem.

So, I think this issue should be closed as "this regression is a bugfix". The work-around is to no longer lie about the actual type of the data stored in that static, and make it e.g. of type *const usize.

@oli-obk oli-obk closed this Oct 1, 2018

@mjbshaw

This comment has been minimized.

Show comment
Hide comment
@mjbshaw

mjbshaw Oct 1, 2018

Contributor

A static can refer to another static's value though.

Can they? I thought a static could only refer to another static's address, not its value. Which is pretty different.

Contributor

mjbshaw commented Oct 1, 2018

A static can refer to another static's value though.

Can they? I thought a static could only refer to another static's address, not its value. Which is pretty different.

@oli-obk

This comment has been minimized.

Show comment
Hide comment
@oli-obk

oli-obk Oct 1, 2018

Contributor

Works just fine on stable: https://play.rust-lang.org/?gist=a07a6558a87a86b0ed2e23855f2b4c44&version=stable&mode=debug&edition=2015

This was implemented half a year ago by lifting a bunch of restrictions of the old const evaluator

Contributor

oli-obk commented Oct 1, 2018

Works just fine on stable: https://play.rust-lang.org/?gist=a07a6558a87a86b0ed2e23855f2b4c44&version=stable&mode=debug&edition=2015

This was implemented half a year ago by lifting a bunch of restrictions of the old const evaluator

@RalfJung

This comment has been minimized.

Show comment
Hide comment
@RalfJung

RalfJung Oct 1, 2018

Member

And here's an example of how the static it complains about can cause a problem. There's no reasonable way to implement "divide a pointer by 2" in CTFE.

Member

RalfJung commented Oct 1, 2018

And here's an example of how the static it complains about can cause a problem. There's no reasonable way to implement "divide a pointer by 2" in CTFE.

@mjbshaw

This comment has been minimized.

Show comment
Hide comment
@mjbshaw

mjbshaw Oct 1, 2018

Contributor

Works just fine on stable: https://play.rust-lang.org/?gist=a07a6558a87a86b0ed2e23855f2b4c44&version=stable&mode=debug&edition=2015

This was implemented half a year ago by lifting a bunch of restrictions of the old const evaluator

Interesting! I wasn't aware of those changes. Thanks for the demo.

And here's an example of how the static it complains about can cause a problem. There's no reasonable way to implement "divide a pointer by 2" in CTFE.

I guess I don't really see what's wrong in that example. It fails to compile (which is correct) with a reasonable error message in a reasonable location. Seems okay to me. I don't see why this has to be preemptively prevented by disabling the union transmute hack.

Contributor

mjbshaw commented Oct 1, 2018

Works just fine on stable: https://play.rust-lang.org/?gist=a07a6558a87a86b0ed2e23855f2b4c44&version=stable&mode=debug&edition=2015

This was implemented half a year ago by lifting a bunch of restrictions of the old const evaluator

Interesting! I wasn't aware of those changes. Thanks for the demo.

And here's an example of how the static it complains about can cause a problem. There's no reasonable way to implement "divide a pointer by 2" in CTFE.

I guess I don't really see what's wrong in that example. It fails to compile (which is correct) with a reasonable error message in a reasonable location. Seems okay to me. I don't see why this has to be preemptively prevented by disabling the union transmute hack.

@RalfJung

This comment has been minimized.

Show comment
Hide comment
@RalfJung

RalfJung Oct 1, 2018

Member

Things get more tricky if the static with the "bad" usize is in a different crate than the place where it is used.

Also, if you write e.g. &(STATIC / 2) in normal code, it will get promoted to 'static lifetime. That will cause trouble if the STATIC does not follow the rules for const safety.

Member

RalfJung commented Oct 1, 2018

Things get more tricky if the static with the "bad" usize is in a different crate than the place where it is used.

Also, if you write e.g. &(STATIC / 2) in normal code, it will get promoted to 'static lifetime. That will cause trouble if the STATIC does not follow the rules for const safety.

@oli-obk

This comment has been minimized.

Show comment
Hide comment
@oli-obk

oli-obk Oct 1, 2018

Contributor

Also, if you write e.g. &(STATIC / 2) in normal code, it will get promoted to 'static lifetime. That will cause trouble if the STATIC does not follow the rules for const safety.

That only works for constants, I hope. Accessing a static's value only works in statics.

But yes. Creating a bad constant and using it in runtime code can cause everything from weird errors to undefined behaviour if used in runtime places.

Contributor

oli-obk commented Oct 1, 2018

Also, if you write e.g. &(STATIC / 2) in normal code, it will get promoted to 'static lifetime. That will cause trouble if the STATIC does not follow the rules for const safety.

That only works for constants, I hope. Accessing a static's value only works in statics.

But yes. Creating a bad constant and using it in runtime code can cause everything from weird errors to undefined behaviour if used in runtime places.

@RalfJung

This comment has been minimized.

Show comment
Hide comment
@RalfJung

RalfJung Oct 1, 2018

Member

Still this is probably sth. that some team should look at. AFAIK we do not have a team decision on this so far, and it seems prudent to get one. So maybe we should reopen and tag for, eh, T-lang?

Member

RalfJung commented Oct 1, 2018

Still this is probably sth. that some team should look at. AFAIK we do not have a team decision on this so far, and it seems prudent to get one. So maybe we should reopen and tag for, eh, T-lang?

@mjbshaw

This comment has been minimized.

Show comment
Hide comment
@mjbshaw

mjbshaw Oct 1, 2018

Contributor

Creating a bad constant and using it in runtime code can cause everything from weird errors to undefined behaviour if used in runtime places.

Yes, but usize doesn't have any "bad" values (I'm assuming "bad" means undefined behavior values (e.g. a zero NonZeroUsize) or a trap representation (e.g. a signaling NaN)), does it?

Contributor

mjbshaw commented Oct 1, 2018

Creating a bad constant and using it in runtime code can cause everything from weird errors to undefined behaviour if used in runtime places.

Yes, but usize doesn't have any "bad" values (I'm assuming "bad" means undefined behavior values (e.g. a zero NonZeroUsize) or a trap representation (e.g. a signaling NaN)), does it?

@RalfJung

This comment has been minimized.

Show comment
Hide comment
@RalfJung

RalfJung Oct 1, 2018

Member

It doesn't at run-time (except for Uninit aka undef/poison, i.e., except for uninitialized data). It has for constant evaluation. See this blog post for some more thoughts on that.

Member

RalfJung commented Oct 1, 2018

It doesn't at run-time (except for Uninit aka undef/poison, i.e., except for uninitialized data). It has for constant evaluation. See this blog post for some more thoughts on that.

@mjbshaw

This comment has been minimized.

Show comment
Hide comment
@mjbshaw

mjbshaw Oct 2, 2018

Contributor

So, @RalfJung, after going through the blog post (you definitely understand this better than I do, so let me know if I've misunderstood this), I'm still not sure why this has to be disallowed for CTFE. The expression &VALUE is not a constant expression and should not be (and indeed cannot be) (fully) evaluated at compile time. The address of VALUE is determined by the linker (at best) or the executable loader at runtime. I don't think this expression should have all the strict const eval rules applied to it because it's not a constant expression.

Contributor

mjbshaw commented Oct 2, 2018

So, @RalfJung, after going through the blog post (you definitely understand this better than I do, so let me know if I've misunderstood this), I'm still not sure why this has to be disallowed for CTFE. The expression &VALUE is not a constant expression and should not be (and indeed cannot be) (fully) evaluated at compile time. The address of VALUE is determined by the linker (at best) or the executable loader at runtime. I don't think this expression should have all the strict const eval rules applied to it because it's not a constant expression.

@RalfJung

This comment has been minimized.

Show comment
Hide comment
@RalfJung

RalfJung Oct 2, 2018

Member

Every expression you write in the initializer of a static or const is a constant expression and evaluated at compile time. If the value of a static or const is a pointer, we emit appropriate relocations for LLVM to handle, but that doesn't change the fact that all this is a constant expression -- e.g. if you do ptr.wrapping_offset(4).wrapping_offset(8), we will compute the offset 12 at compile time (well, we would if we allowed those methods to be called, but the underlying engine can handle all that).

What is the problem with using a pointer type here? Pointers are not integers, and for CTFE this is apparent even by looking at what we send to LLVM (for pointers, we need relocations). This distinction is important to describe optimizing compilers even if you ignore constant evaluation, but it surfaces even more in const context.

Member

RalfJung commented Oct 2, 2018

Every expression you write in the initializer of a static or const is a constant expression and evaluated at compile time. If the value of a static or const is a pointer, we emit appropriate relocations for LLVM to handle, but that doesn't change the fact that all this is a constant expression -- e.g. if you do ptr.wrapping_offset(4).wrapping_offset(8), we will compute the offset 12 at compile time (well, we would if we allowed those methods to be called, but the underlying engine can handle all that).

What is the problem with using a pointer type here? Pointers are not integers, and for CTFE this is apparent even by looking at what we send to LLVM (for pointers, we need relocations). This distinction is important to describe optimizing compilers even if you ignore constant evaluation, but it surfaces even more in const context.

@oli-obk

This comment has been minimized.

Show comment
Hide comment
@oli-obk

oli-obk Oct 2, 2018

Contributor

The problem is that you can already do

let x: &'static u32 = &42;

on stable Rust since a year or so. We need to keep supporting it, so we're very careful to not allow any oddball features like pointer values with integer types. Since &SOME_CONSTANT is (almost) always promoted to a static, we also need to ensure that constants are somewhat sane (which they were by definition until miri came along).

It was an accident that we ever allwoed it.

Contributor

oli-obk commented Oct 2, 2018

The problem is that you can already do

let x: &'static u32 = &42;

on stable Rust since a year or so. We need to keep supporting it, so we're very careful to not allow any oddball features like pointer values with integer types. Since &SOME_CONSTANT is (almost) always promoted to a static, we also need to ensure that constants are somewhat sane (which they were by definition until miri came along).

It was an accident that we ever allwoed it.

@mjbshaw

This comment has been minimized.

Show comment
Hide comment
@mjbshaw

mjbshaw Oct 2, 2018

Contributor

Some context about my use case: Objective-C type encoding uses different encodings for integers vs pointers. Some FFI-related data structures use usize rather than *mut T (because they pack extra tag flags in the low two bits) (and to be clear: I'm just writing the Rust FFI bindings; I'm not the author of the native library that I'm interacting with).

I could use *mut T but it would mean I would have to special-case the Objective-C type encodings for each struct that does this, and I'd probably have to create a union (of usize and *mut T) so I can assign the correct pointer to the static variable (and later at run time twiddle the bits as needed).

I'll stop pushing for this, though.

Contributor

mjbshaw commented Oct 2, 2018

Some context about my use case: Objective-C type encoding uses different encodings for integers vs pointers. Some FFI-related data structures use usize rather than *mut T (because they pack extra tag flags in the low two bits) (and to be clear: I'm just writing the Rust FFI bindings; I'm not the author of the native library that I'm interacting with).

I could use *mut T but it would mean I would have to special-case the Objective-C type encodings for each struct that does this, and I'd probably have to create a union (of usize and *mut T) so I can assign the correct pointer to the static variable (and later at run time twiddle the bits as needed).

I'll stop pushing for this, though.

@oli-obk

This comment has been minimized.

Show comment
Hide comment
@oli-obk

oli-obk Oct 2, 2018

Contributor

so I can assign the correct pointer to the static variable (and later at run time twiddle the bits as needed).

Why can't you just use the *mut T and during your runtime twiddling convert to a usize, do the twiddling, and then convert back? That seems strictly more correct than storing a pointer in a usize just because you sometimes need to modify its bits direclty.

Contributor

oli-obk commented Oct 2, 2018

so I can assign the correct pointer to the static variable (and later at run time twiddle the bits as needed).

Why can't you just use the *mut T and during your runtime twiddling convert to a usize, do the twiddling, and then convert back? That seems strictly more correct than storing a pointer in a usize just because you sometimes need to modify its bits direclty.

@RalfJung

This comment has been minimized.

Show comment
Hide comment
@RalfJung

RalfJung Oct 2, 2018

Member

It might also be an option to turn this into a deny-by-default lint for cases when people really are asking for a footgun?

The interaction with promotion is ugly, though.

Member

RalfJung commented Oct 2, 2018

It might also be an option to turn this into a deny-by-default lint for cases when people really are asking for a footgun?

The interaction with promotion is ugly, though.

@mjbshaw

This comment has been minimized.

Show comment
Hide comment
@mjbshaw

mjbshaw Oct 5, 2018

Contributor

Why can't you just use the *mut T and during your runtime twiddling convert to a usize, do the twiddling, and then convert back?

That's what I would need to do. But storing the value in *mut T means that the type encoding for the struct can no longer be automatically derived, since *mut T and usize are encoded differently.

It might also be an option to turn this into a deny-by-default lint for cases when people really are asking for a footgun?

I personally would love for this to be a lint. It can be a bit of a footgun, yes, but low level FFI programming is full of footguns.

Contributor

mjbshaw commented Oct 5, 2018

Why can't you just use the *mut T and during your runtime twiddling convert to a usize, do the twiddling, and then convert back?

That's what I would need to do. But storing the value in *mut T means that the type encoding for the struct can no longer be automatically derived, since *mut T and usize are encoded differently.

It might also be an option to turn this into a deny-by-default lint for cases when people really are asking for a footgun?

I personally would love for this to be a lint. It can be a bit of a footgun, yes, but low level FFI programming is full of footguns.

@RalfJung

This comment has been minimized.

Show comment
Hide comment
@RalfJung

RalfJung Oct 5, 2018

Member

The footgun here, however, is of a different nature: If you ever write &(ADDRESS/2), that code will fail to compile because the compiler thinks it can execute it at compile-time, and when it finds out it cannot, it is too late.

Member

RalfJung commented Oct 5, 2018

The footgun here, however, is of a different nature: If you ever write &(ADDRESS/2), that code will fail to compile because the compiler thinks it can execute it at compile-time, and when it finds out it cannot, it is too late.

@mjbshaw

This comment has been minimized.

Show comment
Hide comment
@mjbshaw

mjbshaw Oct 6, 2018

Contributor

Forgive my ignorance, but why is that a problem? Failure to compile is exactly what I'd expect in that scenario.

Contributor

mjbshaw commented Oct 6, 2018

Forgive my ignorance, but why is that a problem? Failure to compile is exactly what I'd expect in that scenario.

@RalfJung

This comment has been minimized.

Show comment
Hide comment
@RalfJung

RalfJung Oct 6, 2018

Member

Do you really? This code compiles:

union TransmuteHack<T: Copy> {
    from: T,
    to: usize,
}

pub static VALUE: usize = 42;

fn main() {
    println!("{:?}", unsafe { TransmuteHack { from: &VALUE }.to } / 2);
}

But with promotion, this would not:

union TransmuteHack<T: Copy> {
    from: T,
    to: usize,
}

pub static VALUE: usize = 42;
pub static ADDRESS: usize = unsafe { TransmuteHack { from: &VALUE }.to };

fn main() {
    println!("{:?}", ADDRESS / 2);
}

Notice that println! adds an & in front of the arguments. So in the second case, we have &(ADDRESS/2), which gets promoted and then fails during CTFE. And working around this is quite hard actually, you have to write something like (|| ADDRESS)() / 2 to disable promotion.

@oli-obk Why does the last example compile on stable? Shouldn't it promote and then fail...?

EDIT: Seems like we don't promote uses of a static. Interesting.

Member

RalfJung commented Oct 6, 2018

Do you really? This code compiles:

union TransmuteHack<T: Copy> {
    from: T,
    to: usize,
}

pub static VALUE: usize = 42;

fn main() {
    println!("{:?}", unsafe { TransmuteHack { from: &VALUE }.to } / 2);
}

But with promotion, this would not:

union TransmuteHack<T: Copy> {
    from: T,
    to: usize,
}

pub static VALUE: usize = 42;
pub static ADDRESS: usize = unsafe { TransmuteHack { from: &VALUE }.to };

fn main() {
    println!("{:?}", ADDRESS / 2);
}

Notice that println! adds an & in front of the arguments. So in the second case, we have &(ADDRESS/2), which gets promoted and then fails during CTFE. And working around this is quite hard actually, you have to write something like (|| ADDRESS)() / 2 to disable promotion.

@oli-obk Why does the last example compile on stable? Shouldn't it promote and then fail...?

EDIT: Seems like we don't promote uses of a static. Interesting.

@RalfJung

This comment has been minimized.

Show comment
Hide comment
@RalfJung

RalfJung Oct 7, 2018

Member

@rust-lang/lang I think there is a decision to be made here. (I hope I am using these tags correctly...)

The problem is that there are some values that are safe but not const-safe (using the terminology from my blog post). Namely, &0 as *const _ as usize is a perfectly fine and safe usize at run-time, but not so at compile-time because there are integer operations that we cannot perform on it (e.g. division).

We have a check running on every constant and static making sure that it follows the invariants implied by its type, and this check currently rejects pointer values at integer type on the ground that using them in further constant computations can cause unexpected failure. In particular, things get nasty once promotion is involved: If we decide to promote but then later const evaluation fails, we just made a program not compile for no good reason. So we should only promote if we can be sure that all the values used in there are const-safe. (This is why promoting const fn is such a can of worms, and we will have the same issue with generic consts.)

In this issue, however, we have a usecase for actually storing such a "const-bad" value in a static. It seems that it is not possible to refer to statics in constants or promoteds, but just in other statics, making the "const-bad" values slightly less of a problem. It is still the case that if someone uses a "const-bad" static in what seems like a safe usage of a usize, they can get an error:

union TransmuteHack<T: Copy> {
    from: T,
    to: usize,
}

pub static VALUE: usize = 42;
pub static ADDRESS: usize = unsafe { TransmuteHack { from: &VALUE }.to };

pub static AD2: usize = ADDRESS/2; // Error: Cannot divide a pointer by 2

But that's not fundamentally different from other failures induced by a static not having the value any more that it used to have -- i.e., changing a pub static is semver-breaking.

So, as long as we decide we will never want to promote expressions depending on the value of a static, I think we could change the sanity checks on static (and on static only!) to check for "run-time safety" as opposed to "const-safatey", and hence accept pointer values stored in integers. Alternatively, we could keep the current check but on statics downgrade it to a lint that can be allowed. Either way, for const we still have to rule out pointer values at integer type because they might flow into promotion. @oli-obk please correct me if that is wrong.

Member

RalfJung commented Oct 7, 2018

@rust-lang/lang I think there is a decision to be made here. (I hope I am using these tags correctly...)

The problem is that there are some values that are safe but not const-safe (using the terminology from my blog post). Namely, &0 as *const _ as usize is a perfectly fine and safe usize at run-time, but not so at compile-time because there are integer operations that we cannot perform on it (e.g. division).

We have a check running on every constant and static making sure that it follows the invariants implied by its type, and this check currently rejects pointer values at integer type on the ground that using them in further constant computations can cause unexpected failure. In particular, things get nasty once promotion is involved: If we decide to promote but then later const evaluation fails, we just made a program not compile for no good reason. So we should only promote if we can be sure that all the values used in there are const-safe. (This is why promoting const fn is such a can of worms, and we will have the same issue with generic consts.)

In this issue, however, we have a usecase for actually storing such a "const-bad" value in a static. It seems that it is not possible to refer to statics in constants or promoteds, but just in other statics, making the "const-bad" values slightly less of a problem. It is still the case that if someone uses a "const-bad" static in what seems like a safe usage of a usize, they can get an error:

union TransmuteHack<T: Copy> {
    from: T,
    to: usize,
}

pub static VALUE: usize = 42;
pub static ADDRESS: usize = unsafe { TransmuteHack { from: &VALUE }.to };

pub static AD2: usize = ADDRESS/2; // Error: Cannot divide a pointer by 2

But that's not fundamentally different from other failures induced by a static not having the value any more that it used to have -- i.e., changing a pub static is semver-breaking.

So, as long as we decide we will never want to promote expressions depending on the value of a static, I think we could change the sanity checks on static (and on static only!) to check for "run-time safety" as opposed to "const-safatey", and hence accept pointer values stored in integers. Alternatively, we could keep the current check but on statics downgrade it to a lint that can be allowed. Either way, for const we still have to rule out pointer values at integer type because they might flow into promotion. @oli-obk please correct me if that is wrong.

@oli-obk

This comment has been minimized.

Show comment
Hide comment
@oli-obk

oli-obk Oct 8, 2018

Contributor

We should either error on them as we do now, or not lint at all. We should not add a lint to the compiler that we immediately expect to have unfixable false positives.

Contributor

oli-obk commented Oct 8, 2018

We should either error on them as we do now, or not lint at all. We should not add a lint to the compiler that we immediately expect to have unfixable false positives.

@RalfJung

This comment has been minimized.

Show comment
Hide comment
@RalfJung

RalfJung Oct 8, 2018

Member

If we only warn for runtime-bad data -- like a NULL value at reference type -- which unfixable false positives are you thinking of?

Member

RalfJung commented Oct 8, 2018

If we only warn for runtime-bad data -- like a NULL value at reference type -- which unfixable false positives are you thinking of?

@oli-obk

This comment has been minimized.

Show comment
Hide comment
@oli-obk

oli-obk Oct 8, 2018

Contributor

For these I don't see the point in warning. If having a runtime local of reference type with NULL as its value is UB, why isn't a static once the program runs?

Contributor

oli-obk commented Oct 8, 2018

For these I don't see the point in warning. If having a runtime local of reference type with NULL as its value is UB, why isn't a static once the program runs?

@RalfJung

This comment has been minimized.

Show comment
Hide comment
@RalfJung

RalfJung Oct 8, 2018

Member

Oh I see, you are saying if the info is any good we should make it a hard error. Makes sense.

Member

RalfJung commented Oct 8, 2018

Oh I see, you are saying if the info is any good we should make it a hard error. Makes sense.

@pietroalbini pietroalbini added this to Triaged in 1.30 regressions via automation Oct 11, 2018

@pietroalbini pietroalbini added this to the 1.30 milestone Oct 11, 2018

@pietroalbini pietroalbini modified the milestone: 1.30 Oct 11, 2018

@nikomatsakis

This comment has been minimized.

Show comment
Hide comment
@nikomatsakis

nikomatsakis Oct 11, 2018

Contributor

@RalfJung

So, as long as we decide we will never want to promote expressions depending on the value of a static

I didn't quite understand this part. You did say that "It seems that it is not possible to refer to statics in constants or promoteds", so I guess you mean that -- today -- everything should be fine, since the tainted value stays in "runtime land". (Also, since constants cannot refer to statics (e.g., const X: u32 = Y), so they can't sneak into a promoted indirectly that way.)

But, if we accept these sorts of statics, then in the future we could not lift those restrictions without the results depending on the precise value of the static?

Contributor

nikomatsakis commented Oct 11, 2018

@RalfJung

So, as long as we decide we will never want to promote expressions depending on the value of a static

I didn't quite understand this part. You did say that "It seems that it is not possible to refer to statics in constants or promoteds", so I guess you mean that -- today -- everything should be fine, since the tainted value stays in "runtime land". (Also, since constants cannot refer to statics (e.g., const X: u32 = Y), so they can't sneak into a promoted indirectly that way.)

But, if we accept these sorts of statics, then in the future we could not lift those restrictions without the results depending on the precise value of the static?

@oli-obk

This comment has been minimized.

Show comment
Hide comment
@oli-obk

oli-obk Oct 11, 2018

Contributor

It's an either or. Either we allow referring to statics in constants and promoteds or we allow creating statics with const-unsafe values. We can't have both.

allow referring to statics don't allow referring to statics
allow statics with compile-time bad values can turn runtime code into hard errors without it being a problem at runtime OK
check statics' initializer value like const values are checked OK OK (today)
Contributor

oli-obk commented Oct 11, 2018

It's an either or. Either we allow referring to statics in constants and promoteds or we allow creating statics with const-unsafe values. We can't have both.

allow referring to statics don't allow referring to statics
allow statics with compile-time bad values can turn runtime code into hard errors without it being a problem at runtime OK
check statics' initializer value like const values are checked OK OK (today)
@RalfJung

This comment has been minimized.

Show comment
Hide comment
@RalfJung

RalfJung Oct 11, 2018

Member

Indeed. However, the bottom-left corner of that square has some other quirks -- we have to be very careful in what we let you do with the statics. Taking the address is always okay, but that remains the case even if we allow bad statics! So what remains is read-access to immutable statics (their type must be Freeze). We cannot allow read-access to mutable statics (static mut, but also static _: Cell<T>) because they could read differently at run-time.

So we don't lose all that much here.

Member

RalfJung commented Oct 11, 2018

Indeed. However, the bottom-left corner of that square has some other quirks -- we have to be very careful in what we let you do with the statics. Taking the address is always okay, but that remains the case even if we allow bad statics! So what remains is read-access to immutable statics (their type must be Freeze). We cannot allow read-access to mutable statics (static mut, but also static _: Cell<T>) because they could read differently at run-time.

So we don't lose all that much here.

@oli-obk

This comment has been minimized.

Show comment
Hide comment
@oli-obk

oli-obk Oct 11, 2018

Contributor

Addresses are not ok if we ever want to allow dereferencing. or do we already do that? Dereferencing a reference to a static is just as bad as directly accessing it

Contributor

oli-obk commented Oct 11, 2018

Addresses are not ok if we ever want to allow dereferencing. or do we already do that? Dereferencing a reference to a static is just as bad as directly accessing it

@RalfJung

This comment has been minimized.

Show comment
Hide comment
@RalfJung

RalfJung Oct 11, 2018

Member

Ah right. I think it'd be fine to never allow dereferencing in promoteds, but for const that seems awfully restrictive.

Member

RalfJung commented Oct 11, 2018

Ah right. I think it'd be fine to never allow dereferencing in promoteds, but for const that seems awfully restrictive.

@mjbshaw

This comment has been minimized.

Show comment
Hide comment
@mjbshaw

mjbshaw Oct 11, 2018

Contributor

These comments have helped clarify some of my initial confusion as to why ADDRESS/2 would be problematic. I didn't realize that:

  1. Constant promotion is a one-way street. That is, once the compiler decides to promote something to a constant and evaluate it at compile time, the compiler won't fall back and "demote" it to a runtime value if it determines it can't actually compute the value at compile time.
  2. Constant promotion is done in non-constant statements and functions (i.e. regular fns, not const fns).

I don't really understand why either of these things are the case. Item 1 would be taken care of if the compiler could "demote" something it had previously promoted and just fall back to evaluating it at run time. Item 2 seems to be unnecessary? The constant propagation optimization pass already exists, so I'm not sure what additional performance benefits are to be gained by doing item 2, and I can't see any additional non-performance benefits that come from constant promotion in regular fns.

I'm just a user of Rust, though, and obviously lack much of the big-picture design efforts and goals that are ongoing right now, so forgive me if my reply is super naive or outright wrong.

Contributor

mjbshaw commented Oct 11, 2018

These comments have helped clarify some of my initial confusion as to why ADDRESS/2 would be problematic. I didn't realize that:

  1. Constant promotion is a one-way street. That is, once the compiler decides to promote something to a constant and evaluate it at compile time, the compiler won't fall back and "demote" it to a runtime value if it determines it can't actually compute the value at compile time.
  2. Constant promotion is done in non-constant statements and functions (i.e. regular fns, not const fns).

I don't really understand why either of these things are the case. Item 1 would be taken care of if the compiler could "demote" something it had previously promoted and just fall back to evaluating it at run time. Item 2 seems to be unnecessary? The constant propagation optimization pass already exists, so I'm not sure what additional performance benefits are to be gained by doing item 2, and I can't see any additional non-performance benefits that come from constant promotion in regular fns.

I'm just a user of Rust, though, and obviously lack much of the big-picture design efforts and goals that are ongoing right now, so forgive me if my reply is super naive or outright wrong.

@RalfJung

This comment has been minimized.

Show comment
Hide comment
@RalfJung

RalfJung Oct 11, 2018

Member

I'm not sure what additional performance benefits are to be gained by doing item 2

Promotion is not at all about performance. It's about letting you write

fn foo() -> &'static i32 { &42 } // note the lifetime!

I suggest you have a look at the RFC for some further motivation.

This also connects to item 1: The order of events is "we decide what to promote", "we do borrow checking", "we do const evaluation". Promoted things get the 'static lifetime, which is exploited by the borrow checker. If later const evaluation fails... well, bad luck, we already did borrow checking assuming it had 'static lifetime so we can't take that back.

Member

RalfJung commented Oct 11, 2018

I'm not sure what additional performance benefits are to be gained by doing item 2

Promotion is not at all about performance. It's about letting you write

fn foo() -> &'static i32 { &42 } // note the lifetime!

I suggest you have a look at the RFC for some further motivation.

This also connects to item 1: The order of events is "we decide what to promote", "we do borrow checking", "we do const evaluation". Promoted things get the 'static lifetime, which is exploited by the borrow checker. If later const evaluation fails... well, bad luck, we already did borrow checking assuming it had 'static lifetime so we can't take that back.

@aturon

This comment has been minimized.

Show comment
Hide comment
@aturon

aturon Oct 11, 2018

Member

@rfcbot fcp close

Member

aturon commented Oct 11, 2018

@rfcbot fcp close

@rfcbot

This comment has been minimized.

Show comment
Hide comment
@rfcbot

rfcbot Oct 11, 2018

Team member @aturon has proposed to close this. The next step is review by the rest of the tagged teams:

No concerns currently listed.

Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

rfcbot commented Oct 11, 2018

Team member @aturon has proposed to close this. The next step is review by the rest of the tagged teams:

No concerns currently listed.

Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@RalfJung

This comment has been minimized.

Show comment
Hide comment
@RalfJung

RalfJung Oct 11, 2018

Member

Just for context, the lang team is suggesting to close this because it got accidentally stabilized. This could still be a useful feature, but then it should go through the usual process for new features (i.e., an RFC), and not happen by accident.

Member

RalfJung commented Oct 11, 2018

Just for context, the lang team is suggesting to close this because it got accidentally stabilized. This could still be a useful feature, but then it should go through the usual process for new features (i.e., an RFC), and not happen by accident.

@nrc nrc removed the I-nominated label Oct 11, 2018

@mjbshaw

This comment has been minimized.

Show comment
Hide comment
@mjbshaw

mjbshaw Oct 12, 2018

Contributor

Promotion is not at all about performance. It's about letting you write

fn foo() -> &'static i32 { &42 } // note the lifetime!

I see; that makes sense. Thank you. Sorry I kept missing the significance of this.

Okay, well with that in mind I really can't argue against closing this. Thank you all (especially @RalfJung!) for looking into this thoroughly (and for your patience with me).

Contributor

mjbshaw commented Oct 12, 2018

Promotion is not at all about performance. It's about letting you write

fn foo() -> &'static i32 { &42 } // note the lifetime!

I see; that makes sense. Thank you. Sorry I kept missing the significance of this.

Okay, well with that in mind I really can't argue against closing this. Thank you all (especially @RalfJung!) for looking into this thoroughly (and for your patience with me).

@pnkfelix

This comment has been minimized.

Show comment
Hide comment
@pnkfelix

pnkfelix Oct 18, 2018

Member

discussed at compiler team meeting. FCP (to close) in process.

Member

pnkfelix commented Oct 18, 2018

discussed at compiler team meeting. FCP (to close) in process.

@rfcbot

This comment has been minimized.

Show comment
Hide comment
@rfcbot

rfcbot Oct 18, 2018

🔔 This is now entering its final comment period, as per the review above. 🔔

rfcbot commented Oct 18, 2018

🔔 This is now entering its final comment period, as per the review above. 🔔

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment