RFC for unsafe blocks in unsafe fn #2585

Open

@RalfJung
Member

RalfJung commented Nov 4, 2018

No longer treat the body of an unsafe fn as being an unsafe block. To avoid a breaking change, this is a warning now and may become an error in a future edition.
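As a rough illustration of the proposed behavior, here is a minimal sketch. It assumes the lint that later implemented this RFC in rustc, `unsafe_op_in_unsafe_fn`; the function `read_at` is a hypothetical example and not taken from the RFC text.

```rust
/// Reads the `index`-th element behind `ptr`.
///
/// # Safety
/// `ptr` must be valid for reads of at least `index + 1` consecutive `i32`s.
// With this lint denied, the body of the `unsafe fn` is no longer treated as
// one big unsafe block: unsafe operations need their own `unsafe { }`.
#[deny(unsafe_op_in_unsafe_fn)]
pub unsafe fn read_at(ptr: *const i32, index: usize) -> i32 {
    // SAFETY: the caller promised `ptr` is valid for `index + 1` reads.
    unsafe { *ptr.add(index) }
}
```

The `unsafe` on the signature states the obligation of the caller; the inner block marks where the function itself relies on that obligation.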

Rendered

Cc @rust-lang/wg-unsafe-code-guidelines

@RalfJung RalfJung referenced this pull request Nov 4, 2018

Open

Tracking issue for unsafe operations in const fn #55607

0 of 4 tasks complete
@Diggsey

Contributor

Diggsey commented Nov 4, 2018

If this were to be accepted, it would be much better to get the warning in before the 2018 epoch hits stable. If there's no time to get it in for 2018 then I don't think it should be accepted.

# Drawbacks
[drawbacks]: #drawbacks
This new warning will likely fire for the vast majority of `unsafe fn` out there.

@oli-obk

oli-obk Nov 4, 2018

Contributor

We can start as allow by default. The fact that const unsafe fn already behaves this way, and that Clippy can uplift this lint to warn, should already be enough to migrate large parts of the ecosystem.

Oh... you are already mentioning that below in the unresolved questions...

@RalfJung

RalfJung Nov 4, 2018

Member

Yeah, I had to follow the RFC structure, didn't I? ;)

@RalfJung

Member

RalfJung commented Nov 4, 2018

@Diggsey There will be another edition. And once we no longer warn about unsafe blocks in unsafe fn being redundant, we can indeed phase this in more smoothly e.g. with Clippy.

@burdges

burdges commented Nov 4, 2018

We needed this years ago, so the sooner the better.

# Prior art
[prior-art]: #prior-art
None that I am aware of: Other languages do not have `unsafe` blocks.

@Ixrec

Ixrec Nov 4, 2018

Contributor

C# has unsafe blocks in addition to unsafe methods. Though it's not super helpful since I'm not aware of the C# community ever discussing burden of proof issues like this RFC does, probably because 99.99% of the time the answer in that language is "unsafe just isn't worth it". I couldn't even find a C# style guide that mentions the existence of unsafe code, much less has guidelines for making it less dangerous.

@RalfJung

RalfJung Nov 5, 2018

Member

@Centril But this RFC is specifically about blocks and nesting of unsafe escape hatches. I do not think any of the examples you mention apply there, do they?

@Ixrec Thanks, I had no idea! And it looks like unsafe operations can be used freely in unsafe functions. :/

@Centril

Centril Nov 5, 2018

Contributor

@RalfJung not with that specificity no; the languages have the "block" form, e.g.:

x = unsafePerformIO $ do
    foo
    bar
    ...

what they lack is the unsafe function form.

@RalfJung

RalfJung Nov 5, 2018

Member

That's still quite different from Rust. It's just a normal function to the compiler, no checks for "unsafe operations" or so are performed. I do not see a close enough relation to this RFC.

@Centril

Centril Nov 5, 2018

Contributor

@RalfJung alright; fair enough. Let's leave this bit (the comment thread) open for interested readers who want to see the associated material linked. :)

[drawbacks]: #drawbacks
This new warning will likely fire for the vast majority of `unsafe fn` out there.

@Centril

Centril Nov 4, 2018

Contributor

Other possible drawbacks to list:

  1. It will become less ergonomic to write unsafe code (it's justified I think, but worth mentioning...).

  2. People might just do this:

unsafe fn frobnicate(x: T, y: U, ...) -> R {
    unsafe {
        ... // Actual code.
    }
}

and then nothing has been gained. I don't know what the risk of this is, but worth mentioning.

@RalfJung

RalfJung Nov 5, 2018

Member

I added (a variant of) 1.

For 2., I think something has been gained: It is now possible to incrementally improve this function's unsafe locality. Or maybe it is not worth it; then that has at least been explicitly documented in the code.

@Centril

Centril Nov 5, 2018

Contributor

Yeah I'm not entirely sure 2. is a drawback or not; I usually try to write the section as what someone else might think is a potential drawback (but not necessarily me) -- i.e. this is the section where I try to bring out my inner Dr. Phil / empathy =P

@RalfJung

RalfJung Nov 7, 2018

Member

The drawbacks section now says

Many unsafe fn are actually rather short (no more than 3 lines) and will
likely end up just being one large unsafe block. This change would make such functions less ergonomic to write.

That should cover 2, right?

@Centril

Centril Nov 7, 2018

Contributor

@RalfJung the concern is actually slightly different here; it is that people will just go and write one big unsafe { ... } when they shouldn't.

@mark-i-m

mark-i-m Nov 8, 2018

Contributor

Isn't that already possible today?

@Centril

Centril Nov 8, 2018

Contributor

@mark-i-m yes; sure -- the concern is that the change we do here might not have any noticeable effect because people could be lazy and...

@Centril

Contributor

Centril commented Nov 4, 2018

I've added T-dev-tools for the possible clippy lint for now. If no such clippy lint is proposed in the final version before final review of the RFC I'll remove that team.

@scottmcm

Member

scottmcm commented Nov 4, 2018

I think this RFC needs more examples of realistic code where this would help, and an explanation of why it helps in enough cases that it's worth the obvious pain in current cases. That seems especially true since "safe" code is just as suspect when it's around unsafe.

Another alternative: an ununsafe block (obvious strawman name) that disallows calling unsafe code again (that can be undone with another unsafe block, of course).

More generally, async fn puts an async block around the body (among other things), so it doesn't seem insane that unsafe fn puts an unsafe block around the body. Though we won't have effect polymorphism any time soon, is there some inconsistency here that should be fixed at a different level?

@mark-i-m

Contributor

mark-i-m commented Nov 5, 2018

@scottmcm There are some places in the language where you are forced to use unsafe fn. For example, SIMD or alternate calling conventions or implementing traits with unsafe functions.

Here are some examples from a project I worked on:

  1. Allocator trait: https://github.com/mark-i-m/os2/blob/47136c645878e0295142213bf63e03fe4e0bca45/kernel/memory/heap.rs#L26-L43
    You might notice that it's very non-obvious from the body of these implementations if they actually use any unsafe operations. IIRC, they don't.

  2. Weird calling conventions: https://github.com/mark-i-m/os2/blob/47136c645878e0295142213bf63e03fe4e0bca45/kernel/process/sched.rs#L158-L179
    This function is part of the context-switching code of an OS kernel. The stack is in a very weird state when this is called. In this case, the caller does have a proof obligation (it should only be called from a particular part of the kernel). It also happens that there are one or two patches of unsafe operations. It would be really nice to separate these concerns.
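To make the "forced to use unsafe fn" case concrete, here is a small sketch using the standard `GlobalAlloc` trait, whose methods are declared `unsafe fn`. The `Counting` wrapper is a hypothetical type, not the code linked above, and the `unsafe_op_in_unsafe_fn` lint is assumed to get the behavior the RFC proposes.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical allocator wrapper that counts allocations.
pub struct Counting {
    pub allocs: AtomicUsize,
}

unsafe impl GlobalAlloc for Counting {
    // The trait forces `unsafe fn`, but only one line needs unsafe powers.
    #[deny(unsafe_op_in_unsafe_fn)]
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // Safe bookkeeping, checked like ordinary safe code.
        self.allocs.fetch_add(1, Ordering::Relaxed);
        // SAFETY: the caller upholds the `GlobalAlloc::alloc` contract,
        // which is forwarded unchanged to the system allocator.
        unsafe { System.alloc(layout) }
    }

    #[deny(unsafe_op_in_unsafe_fn)]
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        // SAFETY: `ptr` came from `alloc` above with the same layout,
        // as required from the caller by `GlobalAlloc::dealloc`.
        unsafe { System.dealloc(ptr, layout) }
    }
}
```

The obligation on the caller and the unsafe operations in the body are now visible as two separate things.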

@sfackler

Member

sfackler commented Nov 5, 2018

Another concrete use case I find valuable:

Callback functions used when interacting with C libraries are almost always unsafe extern "C" fns, since they're usually passed raw pointers. However, the actual scope of unsafety in the implementations of the callbacks is commonly limited to casting those raw pointers into Rust references. Currently, that's not called out visually since the entire function body is already an unsafe block, but this RFC would enable more tightly scoped blocks.
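A sketch of that callback pattern, under the assumption that the `unsafe_op_in_unsafe_fn` lint provides the proposed behavior. `Counter` and `on_event` are hypothetical names; no real C library is involved.

```rust
use std::ffi::c_void;

/// Hypothetical state that a C library would hand back via a `void*`.
struct Counter {
    hits: u32,
}

/// # Safety
/// `user_data` must point to a live `Counter` that nobody else is accessing.
#[deny(unsafe_op_in_unsafe_fn)]
unsafe extern "C" fn on_event(user_data: *mut c_void, delta: u32) {
    // The only unsafe step: turning the raw pointer into a reference.
    // SAFETY: guaranteed by the caller, see the contract above.
    let counter: &mut Counter = unsafe { &mut *user_data.cast::<Counter>() };
    // From here on, this is ordinary safe Rust.
    counter.hits += delta;
}
```

The unsafe block covers just the cast, which is the part a reviewer needs to look at closely.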

@RalfJung

Member

RalfJung commented Nov 5, 2018

More generally, async fn puts an async block around the body (among other things), so it doesn't seem insane that unsafe fn puts an unsafe block around the body. Though we won't have effect polymorphism any time soon, is there some inconsistency here that should be fixed at a different level?

unsafe is not an effect and behaves nothing like an effect. :)

async says "this function is externally observable to not behave like a normal function, not even calling it works the normal way". unsafe just means "this is a normal function but you have some preconditions". unsafe can be discharged: By proving some things (just proving and checking, not actually doing anything!), you can make an unsafe function safe (think: get_unchecked vs get). This is impossible with effects. You cannot remove async from your function after proving some things about it or adding some assert!.
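The "discharging" of unsafe can be sketched as follows. `first_unchecked` and `first` are hypothetical helpers in the spirit of get_unchecked vs get, not standard library functions.

```rust
/// # Safety
/// `xs` must not be empty.
#[deny(unsafe_op_in_unsafe_fn)]
unsafe fn first_unchecked(xs: &[i32]) -> i32 {
    // SAFETY: the caller promised `xs` has at least one element.
    unsafe { *xs.get_unchecked(0) }
}

/// Safe wrapper: the precondition is checked here, so callers owe nothing.
fn first(xs: &[i32]) -> Option<i32> {
    if xs.is_empty() {
        None
    } else {
        // SAFETY: we just checked that `xs` is non-empty.
        Some(unsafe { first_unchecked(xs) })
    }
}
```

The check in `first` is what removes `unsafe` from the signature. Nothing comparable exists for `async`.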

There is some syntactical similarity between async and unsafe, but semantically speaking they are worlds apart.

@Centril

Contributor

Centril commented Nov 5, 2018

unsafe is not an effect and behaves nothing like an effect. :)

well... not everyone agrees (as you know) ;) https://internals.rust-lang.org/t/what-does-unsafe-mean/6696/2
cc @eternaleye

@RalfJung

Member

RalfJung commented Nov 5, 2018

Some further examples of longer unsafe fn that look like they would benefit from a more clear demarcation of where the danger is in there:

Basically any time your unsafe fn contains any non-trivial logic, the implicit unsafe block is not your friend.

However, I also have to admit that the vast majority of unsafe fn are less than 3 lines and just call another unsafe fn or perform a raw ptr deref or so. For all of them, this change would mostly be syntactic noise, which is unfortunate.

@RalfJung

Member

RalfJung commented Nov 5, 2018

well... not everyone agrees (as you know) ;)

I am aware. However, I gave my usual arguments above, and AFAIK they have not been refuted yet, so I will keep claiming that everyone who says unsafe is an effect is wrong. ;) In this particular case I think it is actually actively harmful to think of unsafe this way, because it emphasizes a focus on the "additional power" aspect of unsafe, instead of focusing on the "proof obligation" aspect. I think the latter is vastly more important, and the language agrees with me: If the focus was "additional power", we would not let you write unsafe blocks on a safe fn. If the focus was additional power, we would e.g. use unsafe to mark the presence of interior mutability and we would want a guarantee like "calling a safe fn will never write to shared data" (akin to "calling a non-async fn will never yield to another task"). We do not have this guarantee, because this is not what unsafe is for. It is not an effect. We could have an annotation for "writes to shared data", and I agree that would be an effect.
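The point about interior mutability can be seen in a few lines. This is only an illustration with a hypothetical `bump` function: it is entirely safe, contains no `unsafe` anywhere, and still writes through a shared reference.

```rust
use std::cell::Cell;

/// A safe fn taking a shared reference that nevertheless mutates the data.
fn bump(counter: &Cell<u32>) -> u32 {
    counter.set(counter.get() + 1);
    counter.get()
}
```

So "not marked unsafe" gives no guarantee of the form "never writes to shared data", which is what one would expect if unsafe tracked that as an effect.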

I think whoever claims that unsafe is an effect should formally define what you can then say about a function that is not marked unsafe, in terms of its observable behavior. Because that is what effects are for: making statements like "does not panic" or "does not allocate" or "does not yield". "Has been manually proven correct" is very, very different from that in that it can be hidden behind an abstraction barrier.

But anyway, this is getting off-topic. ;)

@SimonSapin

Contributor

SimonSapin commented Nov 5, 2018

I’m worried about the migration of existing code.

I’d like to see this RFC make it a requirement that rustfix / cargo fix need to fully support automating the necessary code changes, before this lint can warn by default.

But this is tricky: simply wrapping the entire body of a function into a new unsafe block sort of defeats the purpose of this change. On the other hand it would likely be very noisy to minimize the size of generated unsafe blocks as much as possible by wrapping each unsafe fn call (or other operation that needs it) separately. Finding a balance between those likely requires case-by-case subjective human judgment.

@RalfJung

Member

RalfJung commented Nov 5, 2018

But this is tricky: simply wrapping the entire body of a function into a new unsafe block sort of defeats the purpose of this change.

I wouldn't necessarily say so. It still provides benefits for new unsafe code written later, and it permits gradual migration of existing unsafe code.
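A sketch of what such a gradual migration might look like, assuming the `unsafe_op_in_unsafe_fn` lint. Both functions are hypothetical: the first is what a mechanical fix could produce, the second is a later manual refinement.

```rust
/// Step 1: mechanical migration, the whole body wrapped in one block.
///
/// # Safety
/// `ptr` must be valid for reads of `len` consecutive `u32`s.
#[deny(unsafe_op_in_unsafe_fn)]
unsafe fn sum_raw_coarse(ptr: *const u32, len: usize) -> u32 {
    unsafe {
        let mut total = 0u32;
        for i in 0..len {
            total = total.wrapping_add(*ptr.add(i));
        }
        total
    }
}

/// Step 2: the unsafe part narrowed to a single, documented operation.
///
/// # Safety
/// `ptr` must be non-null, aligned, and valid for reads of `len`
/// consecutive `u32`s for the duration of the call.
#[deny(unsafe_op_in_unsafe_fn)]
unsafe fn sum_raw_narrow(ptr: *const u32, len: usize) -> u32 {
    // SAFETY: exactly the contract stated above.
    let xs = unsafe { std::slice::from_raw_parts(ptr, len) };
    // Everything below is checked as safe code.
    xs.iter().fold(0u32, |acc, x| acc.wrapping_add(*x))
}
```

Step 1 keeps the code compiling without changing its meaning; step 2 can happen later, one function at a time.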

@SimonSapin

Contributor

SimonSapin commented Nov 5, 2018

Good point, my "sort of" only applies to existing code. I didn’t mean to argue against this RFC, I was only pondering the merits of different ways to deal with the migration. Sorry if I implied otherwise.

@RalfJung

Member

RalfJung commented Nov 5, 2018

It's okay. :) I will add something about migrations to the RFC text.

@newpavlov

newpavlov commented Nov 5, 2018

I wonder how many existing unsafe fns would simply wrap the whole function body in an unsafe block under the proposed change. If it's more than 90% (for my code I think it's true), I think it would be better to introduce some kind of ununsafe/safe block which turns safety checks back on for the wrapped code. I would hate for code like this to become common:

unsafe fn foo() {
    unsafe {
        // ..
    }
}

While treating unsafe fn as an effect is debatable (I agree with @RalfJung's argumentation here), I think that the current behaviour is consistent and easy to understand, whereas the snippet above can be somewhat surprising for new users.

@mark-i-m

Contributor

mark-i-m commented Nov 5, 2018

While treating unsafe fn as an effect is debatable (I agree with @RalfJung's argumentation here), I think that the current behaviour is consistent and easy to understand, whereas the snippet above can be somewhat surprising for new users.

I think it is only surprising because we have already enforced the wrong idea. Specifically, it is really easy to learn about unsafe fn and think of it as "this is a function with an unsafe body" as opposed to "this is a function with an unsafe caller". In other words, the fact that the body is implicitly wrapped in unsafe syntactically hides the important fact that the caller has some obligation. IMHO the distinction is important but subtle, and the current state of things doesn't help.

@newpavlov

newpavlov commented Nov 5, 2018

In other words, the fact that the body is implicitly wrapped in unsafe syntactically hides the important fact that the caller has some obligation.

How does it hide it if it's impossible to call such function outside of unsafe context?

I think that debating semantics comes second to finding how much of the unsafe fns will benefit from the proposed change. If for most of the functions it will just make code noisier, I doubt we should go with it, especially considering the migration churn.

@mark-i-m

Contributor

mark-i-m commented Nov 6, 2018

How does it hide it if it's impossible to call such function outside of unsafe context?

Fair point. I was thinking more from the perspective of a person writing code, especially a newbie.

I think that debating semantics comes second to finding how much of the unsafe fns will benefit from the proposed change. If for most of the functions it will just make code noisier, I doubt we should go with it, especially considering the migration churn.

I think I just have to disagree there... I think this change should be made because it's semantically correct. I'm pretty sure there will be a lot of diffs wrapping function bodies in unsafe {...}. I think it's very worth the one-time churn.

@pnkfelix

Member

pnkfelix commented Nov 7, 2018

Alternative I think should be added to the Alternatives section:

  • an unsafe fn with no unsafe blocks in its body gets an implicit unsafe around its body. I.e., most uses today keep compiling.
  • an unsafe fn with an unsafe block in its body (post macro expansion) does not get the implicit unsafe.

In other words, using an explicit unsafe allows one to specify the narrower region of code where unsafe operations are performed.

If the entirety of the unsafe fn body is made up of safe code, then one can add an empty unsafe {} to its start to statically assert this.

@vi

vi commented Nov 7, 2018

What if we make unsafe unsafe fn qqq() { ... } equivalent to unsafe fn qqq() { unsafe { ... } }?

This way short unsafe functions stay relatively nice.

@Nokel81

Contributor

Nokel81 commented Nov 7, 2018

I do wonder what the requirements for unsafe fn would now be, since it is "safe" to call a function whose entire body is "unsafe".

@oli-obk

Contributor

oli-obk commented Nov 7, 2018

Reading the comments, I think a valid alternative is an unsafe_to_call (bikesheddable keyword) function which makes the body safe. We can still have a lint for changing unsafe fns with no unsafety in the body to unsafe_to_call fn.

@burdges

burdges commented Nov 8, 2018

I think @pnkfelix's variant assumes unsafe fns necessarily contain some unsafe code, which appears false, à la Vec::set_len.
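For readers unfamiliar with the Vec::set_len situation, here is a sketch with a hypothetical fixed-size `Buf` type (this is not the real `Vec` implementation). The body of the unsafe fn performs no unsafe operation at all; it is unsafe to call because safe code elsewhere trusts the field it writes.

```rust
/// Hypothetical buffer with up to 8 bytes in use.
pub struct Buf {
    data: [u8; 8],
    len: usize,
}

impl Buf {
    pub fn new() -> Self {
        Buf { data: [0; 8], len: 0 }
    }

    /// # Safety
    /// `new_len` must be at most 8.
    #[deny(unsafe_op_in_unsafe_fn)]
    pub unsafe fn set_len(&mut self, new_len: usize) {
        // Just a field write: no unsafe block is needed in this body.
        self.len = new_len;
    }

    pub fn as_slice(&self) -> &[u8] {
        // SAFETY: every path that sets `len` keeps it at most 8,
        // which is exactly what `set_len` demands of its callers.
        unsafe { self.data.get_unchecked(..self.len) }
    }
}
```

Here the unsafe block lives in a safe function and the unsafe function has a safe body, so "unsafe to call" and "contains unsafe operations" are independent properties.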

I disagree with that snippet being surprising @newpavlov because I found the current incorrect semantics extremely surprising. In the book, unsafe fns are clearly explained as an fn that's unsafe to call, not as a convenience for writing large unsafe blocks.

Afaik, there is no good reason to provide any sugar for fns that want entirely unsafe bodies, but if anyone finds such a reason then the syntax should be an unsafe keyword between the declaration and body, so fn foo(&self) -> Bar unsafe { .. }.

@pnkfelix

Member

pnkfelix commented Nov 8, 2018

@burdges my proposal assumes that most unsafe fn bodies contain some unsafe code, but it does not assume that of all such bodies.

did you not see my statement that you would use an empty unsafe {} at the start to assert that the body is entirely safe code??

@ogoffart

ogoffart commented Nov 8, 2018

Just an idea: if the unsafe function contains no unsafe blocks, consider the whole function as unsafe (current behaviour).
If the unsafe function contains at least one unsafe block, unsafe operations are not allowed outside the unsafe blocks.

    // Ok: no unsafe block  (as currently)
    unsafe fn get_unchecked1(&self, i: usize) -> &T {
        let index = *self.mappings.get_unchecked(i);
        self.array.get_unchecked(index)
    }

    // Ok (currently "unnecessary `unsafe` block" warning )
    unsafe fn get_unchecked2(&self, i: usize) -> &T {
        unsafe {
            let index = *self.mappings.get_unchecked(i);
            self.array.get_unchecked(index)
        }
    }
   
    // Error (or warning): unsafe code outside of an unsafe block.
    // (currently a warning about an unnecessary unsafe block)
    unsafe fn get_unchecked3(&self, i: usize) -> &T {
        let index = *self.mappings.get_unchecked(i);
        unsafe { self.array.get_unchecked(index) }
    }

This keeps better compatibility, since code without unsafe blocks continues to work, and small unsafe functions do not need to add an unsafe block.

Edit: I just realized this is what @pnkfelix suggested earlier in #2585 (comment)

@pnkfelix

Member

pnkfelix commented Nov 8, 2018

@ogoffart sounds like an interesting alternative...

Update: I was trying to go for a friendly joke but it might be taken poorly. Let’s pretend I said “great minds think alike... and so do we two!”

@burdges

burdges commented Nov 8, 2018

I see now @pnkfelix so actually your variant gives a faster migration path. So the Reference-level explanation could become:

  1. [current text]
  2. [adopt your variant now and warn in clippy and in rustc nightly]
  3. [warn in rustc stable]
  4. [error in rustc in 2021 and clippy suggests removing your unsafe {} tags]

@Centril Centril assigned cramertj and unassigned RalfJung Nov 8, 2018

@gnzlbg

Contributor

gnzlbg commented Nov 11, 2018

+1. This is something that cargo fix could "trivially" fix, but the whole point of this change is that it should not do so automatically (maybe behind a feature flag).

@RalfJung

Member

RalfJung commented Nov 12, 2018

@pnkfelix

an unsafe fn with no unsafe blocks in its body gets an implicit unsafe around its body. I.e., most uses today keep compiling.

This is an interesting middle-ground suggestion! However, I do not very much like the part where you say

If the entirety of the unsafe fn body is made up of safe code, then one can add an empty unsafe {} to its start to statically assert this.

That seems... really awkward.

I see now @pnkfelix so actually your variant gives a faster migration path. So the Reference-level explanation could become:

I do not think this is about migration. Short unsafe functions are here to stay, and at least my understanding of @pnkfelix' proposal is that this is the final behavior, not some step on a migration path.


In terms of having ways to opt-in to the body being unsafe, note that the following is an alternative way of writing that, with a somewhat unconventional style (but style can change) and avoiding the rightward drift:

unsafe fn foo(...) -> ... { unsafe {
  // Code goes here
} }
@xfix

Contributor

xfix commented Nov 14, 2018

@pnkfelix

Worth noting that this still breaks backwards compatibility unless it would be just a warning or only work like this for new Rust editions.

You may ask: Rust warns about unsafe within unsafe fn, why would anyone put unsafe in unsafe fn? Yes, it does. Doesn't mean much.

Consider a macro internally using unsafe (like array-macro). If an unsafe function uses this macro, then it would be considered to have unsafe block in it and cause weird errors. The macro author cannot really hide the unsafety of it, as an internal implementation detail of it breaks the compilation. Yes, the problem exists today already with #[forbid(unsafe_code)], but this would make it worse.
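The macro concern can be sketched like this. `first_byte!` and `add_first` are hypothetical, and the code compiles under current rustc (the lints are allowed explicitly so the sketch builds quietly regardless of edition). Under the alternative discussed above, the unsafe block hidden inside the macro expansion would switch off the implicit unsafe body, and the `*ptr` line would stop compiling.

```rust
/// Hypothetical macro whose expansion contains an `unsafe` block.
macro_rules! first_byte {
    ($bytes:expr) => {{
        let bytes: &[u8] = $bytes;
        assert!(!bytes.is_empty());
        // SAFETY: non-emptiness was checked on the previous line.
        unsafe { *bytes.get_unchecked(0) }
    }};
}

/// # Safety
/// `ptr` must be valid for a read of one `u8`.
#[allow(unused_unsafe, unsafe_op_in_unsafe_fn)]
unsafe fn add_first(ptr: *const u8, bytes: &[u8]) -> u8 {
    // The author of this function never wrote `unsafe { }` here,
    // yet after macro expansion the body contains an unsafe block.
    let head = first_byte!(bytes);
    // This dereference relies on the implicit unsafe body.
    head.wrapping_add(*ptr)
}
```

The author of `add_first` has no way to tell from the call site that the macro changes how the rest of the body would be checked.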

@pnkfelix

Member

pnkfelix commented Nov 14, 2018

Yep, my alternative is definitely not 100% backward compat. I just figure it 1. avoids a lot of the breakage and (more importantly) 2. keeps the current set of small trivial wrappers around FFI calls just as small and trivial.

As for the awkwardness of adding an empty unsafe { } to unsafe functions with entirely safe bodies: do we have any statistics on how often such functions arise? A number of examples showing their existence have been given, but I would be surprised if they occur frequently enough for us to be optimizing the language design for them.

@Ekleog

Ekleog commented Nov 14, 2018

I would personally be very surprised if adding an unsafe{} block reduced the unsafety of my code. Even (especially?) if said block isn't empty, or when a macro comes with an inner unsafe{} block, making me not even see it.

@eternaleye

eternaleye commented Nov 14, 2018

I think @burdges makes a good point regarding "lifting" the unsafe for short functions with unsafe bodies. This leads to the following:

// Vec::set_len() is unsafe, but does not use any unsafe operations
unsafe fn set_len(&mut self) { ... }
// ioctl is and uses unsafe (syscall)
unsafe fn ioctl(...) unsafe { ... }
// blarg is unsafe and uses a little unsafe
unsafe fn blarg() { ... unsafe { ... } ... }
// bleh is safe and uses a little unsafe
fn bleh() { ... unsafe { ... } ... }
// fnord is safe-by-types but is just an unsafe call
fn fnord(f: Foo) unsafe { risky(f.0) }

This avoids double-bracing (and rightwards drift), while still being explicit.

@Nemo157

Contributor

Nemo157 commented Nov 15, 2018

@eternaleye where would you put the unsafe on the body when you have return types and/or where clauses?

fn foo(&mut self) -> i32 unsafe {
     unimplemented!()
}

fn bar<T>(&mut self, t: T)
where
    T: Iterator,
    T::Item: fmt::Debug,
unsafe {
    unimplemented!()
}
@mark-i-m

Contributor

mark-i-m commented Nov 15, 2018

Personally, I would like to avoid ending up in a situation where we can have the same keyword at the beginning or end of a fn header:

unsafe fn foo() { ... } // caller unsafe
fn foo() unsafe { ... } // implementor unsafe

Also, I don't think we should use unsafe anywhere in the header to denote that the body is unsafe. IMHO, the header should only talk about things a caller would care about. So far, rust has done a really great job at keeping that property, and I would hate to see it go.

I would rather use a new style convention: if the function is less than 5 lines, we put the unsafe { on the same line:

fn foo() { unsafe {
  blah_unsafe();
  ...
} }