Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove several usages of 'unsafe' #968

Merged
merged 4 commits into from Jul 17, 2019
Merged

Conversation

Aaron1011
Copy link
Contributor

@Aaron1011 Aaron1011 commented Jul 10, 2019

This PR removes several different usages of unsafe (specifically, UnsafeCell), and replaces them with safe code. While these changes technically add some amount of overhead (Copying a type instead of taking a reference, and runtime RefCell checks), it should be so small as to be invisible on any benchmark.

By replacing UnsafeCell with the safe Cell/UnsafeCell types, we ensure that any mistakes will lead to a panic! at worst, instead of undefined behavior.

Each commit modifies a single file, and describes the changes made.

The previous API was extremely dangerous - calling `get_ref()`
followed by `reset()` would trigger instant UB, without requiring
any `unsafe` blocks in the caller.

By making DateInner `Copy`, we can use a normal `Cell` instead
of an `UnsafeCell`. This makes it impossible to cause UB (or even panic)
with the API.
Also add explanation of the safety of the usage of `unsafe`
This ensures that a mistake in the usage of 'get_mut' will cause
a panic, not undefined behavior.
@fafhrd91
Copy link
Member

What is the point? What solves this Pr?

@Aaron1011
Copy link
Contributor Author

It's very easy to accidentally cause undefined behavior with the custom Cell struct - it doesn't even require an unsafe block in the caller. This PR eliminates several potential source of undefined behavior.

@fafhrd91
Copy link
Member

Could you show me potential UB?

@Aaron1011
Copy link
Contributor Author

Aaron1011 commented Jul 10, 2019

With the previous implementation:

let date = DateServiceInner::new()
let my_ref = date.get_ref();
date.reset()

is UB.

@fafhrd91
Copy link
Member

I guess everybody should switch to interpreted language otherwise we will die in ub

@fafhrd91 fafhrd91 closed this Jul 10, 2019
@ketsuban
Copy link

@fafhrd91 Why did you close this PR?

@fafhrd91
Copy link
Member

I don’t see reason for this change

@ketsuban
Copy link

The OP pretty clearly demonstrated the reason - it fixes sources of memory unsafety in safe Rust.

@nico-abram
Copy link

You asked for a demonstration of UB and they clearly provided it. At this point, I think the question should be, are there any reasons not to merge this? (i.e, has anyone run actual benchmarks on this?)

@JeanMertz
Copy link

I guess everybody should switch to interpreted language otherwise we will die in ub

That's a bit of a strange response? It's your project, so you can obviously do what you want, but I also agree with @nico-abram, people are using Rust because of its safety guarantees (amongst other things), so if there are no (or minimal) downsides to not using unsafe here, and there are clear upsides as demonstrated, then why not consider the improvements @Aaron1011 made?

@Shnatsel
Copy link

Memory safety and absence of undefined behaviour is Rust's core value proposition. If it's possible to cause UB using only safe Rust, that's automatically a deal-breaker, and must be fixed even at the expense of performance.

Nobody cares if your car is the fastest around if it has an unshielded nuclear reactor for the engine. And UB triggered from safe Rust is the equivalent of an unshielded nuclear reactor.

@Shnatsel
Copy link

@Aaron1011 could you elaborate on what kind of UB this is? Does this expose contents of uninitialized memory to the caller?

@Lokathor
Copy link
Contributor

one example was here: #968 (comment)

@Aaron1011
Copy link
Contributor Author

@Shnatsel: This code:

let date = DateServiceInner::new()
let my_ref = date.get_ref();
date.reset()

obtains an immutable reference to data via get_ref, and then modifies the pointed-to data via data.reset().

Per the UnsafeCell docs:

For example, this means that if you take the *mut T from an UnsafeCell and cast it to an &T, then the data in T must remain immutable (modulo any UnsafeCell data found within T, of course) until that reference's lifetime expires.

The current DateServiceInner lets you cause UB in safe code.

@apopiak
Copy link

apopiak commented Jul 16, 2019

@Aaron1011 making my_ref a dangling pointer I assume?

@fafhrd91
Copy link
Member

DateServiceInner is not even pub(crate), so example assume that I am brain dead

@Lokathor
Copy link
Contributor

Or that someone submits a PR, or forks and doesn't know exactly the special knowledge you have and assumes it's find since the unsafe keyword isn't there.

@fafhrd91
Copy link
Member

Reviews, no?

@CryZe
Copy link

CryZe commented Jul 17, 2019

Well it sounds like the unsafety should be more thoroughly documented there at least and then it may be is fine. Although it sounds like the functions should be marked as unsafe at well then. Doesn't hurt if it's not affecting the public API anyway.

@polarathene
Copy link

polarathene commented Jul 17, 2019

Reviews assume the reviewer has that prior knowledge and can pick it up. If you intend to always maintain the project and not hand it off to others in the future for whatever reasons, and that there is no chance of another user doing reviews instead(without such knowledge which is presumably not documented), then perhaps there is no concern, provided you do pick up on it?

actix is a great project and many of the users truly appreciate your work on it, yet at the same time Rust is often chosen for avoiding potential UB problems. Yes, this is internal specific, and yes if you're reviewing it there's a good chance you'll notice what appears to be safe code is using unsafe code behind the scenes and could cause a UB, but doesn't that add additional mental effort on your part for the actix code base and it's related projects you maintain regarding anywhere that unsafe is used?

You dismissed the PR on the grounds of your own confidence that while unsafe, it doesn't pose a real world problem to it's users. While appreciated, the concern being echo'd here is what is wrong with removing the possibility of causing UB? Is there any negative impact on your part in accepting the PR? Does it make the project more difficult to maintain, does it negatively impact performance? Or is it based on ego alone?

The latter will discourage contributions and trust from the community. If that's not important to you, no problem. It does seem a bit silly to make a fuss over not merging, vs removing the UB from happening(now the compiler prevents it, one less thing to mentally remain aware of when reviewing).


Looking over the PR, it makes it more clear what to expect and reduces the unsafe block scopes. That makes it more accessible to contributors, they can more confidently contribute code there without the chance of missing something and hoping you pick up on it during review.

@fafhrd91
Copy link
Member

I will review this Pr again in couple weeks, I don’t have any motivation to deal with open source and community at the moment.

@SRGOM
Copy link
Member

SRGOM commented Jul 17, 2019

@fafhrd91 OSS can be a very toxic community- I'll come out and say it ouright. You'll find a lot of complainers and moochers and choosing-beggars.

That said (and I'm acting like a lecturing beggar here because I like this framework but have never contributed) in this particular case it'd be better to just merge since the arguments are clear and not merging is bad, both technically and in terms of the implied contract of OSS.

I would also 100% stand by you even in the event where you just quietly stopped supporting the project or handed it over to somebody or formed a committe or whatever. But I do think, as a programmer that in this particular case, the original pull request does make sense.

Once again, lots of respect for (almost) single handedly creating probably the fastest web server program, in your own time (I believe, or even if it was Microsoft's time, they can share some respect).

@JeanMertz
Copy link

@fafhrd91 OSS can be a very toxic community

While I agree in general, I don't think there has been any toxicity in this thread, has there? I see people trying to educate others on why it's good to merge this PR (similar to your comment), but also being polite and understanding that this is someone's open-source work that we're all benefiting from.

Having said that, GitHub's negative drive-by emoji reactions are toxic in my opinion, for anyone who feels the same, here's a CSS snippet to make your GitHub browsing experience a lot less negative:

.comment-reactions button[value="THUMBS_DOWN unreact"] { display: none !important; }
.comment-reactions button[value="THUMBS_DOWN react"] { display: none !important; }
.comment-reactions button[value="CONFUSED unreact"] { display: none !important; }
.comment-reactions button[value="CONFUSED react"] { display: none !important; }

@awulkan
Copy link

awulkan commented Jul 17, 2019

@JeanMertz I think @SRGOM was talking about the Reddit thread that grew out of control a bit. I totally understand if @fafhrd91 feels disheartened after reading it. And I think it should have been handled in another way than it was.

A lot of people are very passionate about Rust and its ecosystem, and sometimes that passion turns into a pretty demoralizing experience for anyone on the receiving end.

Regardless if this PR is right or not, or if the response to it was in bad taste, @fafhrd91 has the final word on this project. And I don't think that trying to strong arm him into accepting a PR is the right solution. It's free open source software, and he has no obligation to do as anyone tells him, wether it's right or wrong.

@Xunjin
Copy link

Xunjin commented Jul 17, 2019

I found it funny (not really, just being sarcastic) that people confuses OSS with owning the other people projects, and they "demand" that X or Y PR just "have to" be merged because just don't follow their own vision on how it's suppose to be.

People from the thread mentioned by @awulkan (thank you for your wise words) careful how you approach situations, also learn how to deal with your passion about a rapidly growing Rust ecossystem that we all love. This kind of attitude (the ones who were just bashing the project) does not help at all.

@fafhrd91 keep up the great job you are doing here, and congrats for this amazing framework 😉

PS: Just to note, before we are developers, we are human beings, have to deal everyday with our own problems and still reserve (sometimes even sacrifice) our rest/family/social time to do Open Source Stuff (when not paid to do it full time), so please i kindly ask you, be reasonable 😅

@fafhrd91 fafhrd91 reopened this Jul 17, 2019
@fafhrd91
Copy link
Member

fafhrd91 commented Jul 17, 2019

before we continue I have couple questions to @Aaron1011 and others

  1. Does code that this Pr changes contain UB?
  2. Is it possible to trigger UB by using public api?
  3. Does this Pr solve any immediate bug in code?

@bvinc
Copy link

bvinc commented Jul 17, 2019

I'm just a bystander, but I just want to point out that there seems to be a disagreement about what unsafe is for. To quote withoutboats:

unsafe doesn't mark that "an unsafe operation occurs inside here"- it marks "I am garanteeing that I uphold the necessary invariants here."

To quote the Rustonomicon:

The unsafe keyword has two uses: to declare the existence of contracts the compiler can't check, and to declare that a programmer has checked that these contracts have been upheld.
No matter what, Safe Rust can't cause Undefined Behavior.

It appears that you have functions marked as safe that do not uphold the necessary invariants. This is a concern that's completely independent of bugs reachable by the public API.

An alternative solution to this concern, if you want to keep the behavior as is, would be to mark the unsafe functions as unsafe. It might be noisy, as you would have to mark each calling location unsafe, but it would be accurately modeling the transitive explosion of places where safe rust's guarantees aren't being upheld and can't be relied upon.

@fafhrd91
Copy link
Member

It is always unsafe on first level and then every next level must be marked as unsafe. Am I wrong?

I am actually do not understand where my code does not satisfy any of those quotes

@bvinc
Copy link

bvinc commented Jul 17, 2019

let date = DateServiceInner::new()
let my_ref = date.get_ref();
date.reset()

If the previous code is written in safe rust, and causing undefined behavior, that means that the unsafe code is not upholding the necessary invariants. Either get_ref or reset, or both, is actually an unsafe function. If the safety gaurantees of safe rust were upheld, it would be impossible to cause undefined behavior from safe rust code.

You can see examples of the standard library:

https://doc.rust-lang.org/std/vec/struct.Vec.html#method.set_len

Vec uses unsafe code. But the invairants are upheld and almost every method is marked as safe, and it would be impossible to cause undefined behavior from any of these methods. But there are exceptions. set_len can be used to cause undefined behavior. That method is marked as an unsafe function.

@fafhrd91
Copy link
Member

fafhrd91 commented Jul 17, 2019

@ketsuban @nico-abram @64 you were vocal about UB, could you answer my questions

@fafhrd91
Copy link
Member

fafhrd91 commented Jul 17, 2019

@bvinc should DateServiceInner::get_ref() be exposed as public to satisfy set_len example?

@Lokathor
Copy link
Contributor

It doesn't matter if it's public, it only matters if anyone could cause UB using only safe code. If they can, then some other code somewhere is using an unsafe block that it shouldn't.

public has nothing to do with it.

@bvinc
Copy link

bvinc commented Jul 17, 2019

I'm not sure why you're asking about it being public. All I'm saying is that if get_ref upholds all of the invariants of safe rust, it should be marked as safe. If it doesn't, it should be marked as an unsafe function. (Or alternatively, its behavior could be changed to be a safe function).

@fafhrd91
Copy link
Member

Ok, I guess I don’t have enough knowledge and not ready to use unsafe in rust. Thanks everyone.

@fafhrd91 fafhrd91 merged commit b36fdc4 into actix:master Jul 17, 2019
@ketsuban
Copy link

The only thing I was vocal about was not casually dismissing with "I don't agree with this" a pull request addressing unsafety. I mildly resent being dragged back when I'd unsubscribed because other more capable people had taken over the discussion (though I apologise for any role I played that resulted in your receiving personal abuse).

@64
Copy link

64 commented Jul 17, 2019

  1. Does code that this Pr changes contain UB?

No

  1. Is it possible to trigger UB by using public api?

No

  1. Does this Pr solve any immediate bug in code?

No

This code is a problem because it allows UB to be triggered through safe Rust, which goes against the entire purpose of safe Rust. While there is no immediate issue with this code in actix, it is best to remove it for a couple of reasons:

  1. Someone sends a PR to actix at some point down the line with some buggy usage of this code that you miss. Humans make mistakes, and in the case of a PR it can be easy to miss something in a review, especially if there's a huge diff or whatever. Admitting it's possible you can make a simple mistake is not the same thing as calling you braindead. There is just no reason to impose extra cognitive burden on yourself.
  2. Someone makes a fork, doesn't understand the invariants that they need to uphold with this code and causes UB because they figured 'hey, it's safe, the worst thing that can happen inside this function is a panic' - which is what everyone expects when they're using a safe API. You reviewing other people's forks is not an option.

If this particular code can be used to trigger UB in safe Rust then you have two options:

  1. Mark the functions as unsafe. Although this does not completely remove the possibility that someone forks and violates the invariants, or sends a PR which you miss, it greatly reduces the chances of this happening. Seeing the unsafe keyword in code is a massive flag that someone actually needs to investigate it properly - even if you miss it in a PR review, it's far easier for some random person who wants to audit the codebase to be able to just grep for unsafe and figure out that this is a problem.
  2. Remove the unsafety entirely (which is what this PR does). This is a better option in this scenario because there is (probably) no performance impact and it doesn't massively complicate the code either.

It is always unsafe on first level and then every next level must be marked as unsafe.

Correct. Because every level has to uphold the invariants or risk UB. If there's a point where no matter what, UB cannot happen (e.g you have a function which takes no arguments, and calls an unsafe function with some constant known good inputs), then you'd be fine to leave that function as not unsafe, because there is absolutely no way UB can happen through safe Rust code.

Just to really make the point clear: safe Rust means that no matter what, you cannot trigger UB. No matter what. Before this PR, you could trigger UB from safe Rust. So there is a problem.

This code should never have been unsafe in the first place. There is absolutely no reason not to merge this PR and I am happy that you have done so.

@SRGOM
Copy link
Member

SRGOM commented Jul 18, 2019

I think this could be a good place for @fafhrd91 to ask people (since it seems quite a few of them are watching) if they can help in maintaining this framework? OSS management can be hard and if @fafhrd91 needs help, maybe somebody can step up?

@Aaron1011 Aaron1011 deleted the fix/remove-unsafe branch July 18, 2019 01:44
@fafhrd91
Copy link
Member

fafhrd91 commented Jul 18, 2019

Project exists almost 2 year, any help always was/is welcomed, features/fixed/docs/maintenance anything. Unfortunately, capable people are busy or do small contributions and we left with those who can not do development

@ExpHP
Copy link

ExpHP commented Jul 19, 2019

Now hold on just a minute, here. I am as strongly against incorrect usage of unsafe as the next guy, but let's not send the wrong message.

People are stating as an absolute that if a function can cause UB in safe code, it must be marked unsafe—even if it is private to that module.

@Lokathor
It doesn't matter if it's public, it only matters if anyone could cause UB using only safe code. If they can, then some other code somewhere is using an unsafe block that it shouldn't.

public has nothing to do with it.

@bvinc
I'm not sure why you're asking about it being public. All I'm saying is that if get_ref upholds all of the invariants of safe rust, it should be marked as safe. If it doesn't, it should be marked as an unsafe function.

@64
Just to really make the point clear: safe Rust means that no matter what, you cannot trigger UB. No matter what. Before this PR, you could trigger UB from safe Rust. So there is a problem.

However, this contradicts the general consensus of the Rust community that the unit of encapsulation of unsafety in Rust is the module, not the function. This is to say, there is a specific line to be crossed when the lack of the unsafe keyword on a fn or trait becomes definitively wrong, and that line is when it has a visibility of pub(super) or greater.

We should not, and ultimately can not possibly require that safe code inside of a module is isolated from all possible UB due to unsafe blocks within that same module. Here's some examples of where that philosophy falls short:

  • Suppose you have a private field on a struct, and that modifying this field can cause UB in safe methods on the type. (e.g. Vec::len). You can't mark fields as unsafe. Rust will allow any safe code in the same module to read or modify it without an unsafe block.
  • Suppose you are forced to implement Clone for a type, but your Clone impl can cause UB under certain conditions. You cannot mark Clone::clone as unsafe.
  • Suppose that you have an RAII type whose destructor could potentially cause UB under certain conditions. You cannot mark a type's Drop glue as unsafe.

If you view the function as the unit of safety, then all of these clearly appear to be examples of UB in safe code. So are these all examples of things that must be forbidden? Of course not! If you view the module as the unit of unsafety, the solutions become obvious:

  • All code in the module that defines a field like Vec::len must be audited for how it uses that field, whether that code is safe or not. (more generally, if a module contains at least one unsafe block, then all of its code must be audited, period)
  • If a type has a Clone impl that can invoke UB, you need only ensure that it is used correctly within the module, and that the type is private to the module.
  • If a type has a Drop impl that can invoke UB, you must ensure that it is used correctly within the module, and that there is no manner in which code outside the module can obtain a type which (recursively) contains the dangerous type in question. (unless the outer type sets the appropriate conditions for safe destruction, or etc.)

All I'm trying to say is: @fafhrd91 asked a legitimate question.

Yes. Privacy does matter.

If all of these unsafe blocks did indeed rely on invariants that are correctly protected at the module barrier (I haven't read too deep into them yet), then these changes are an improvement not because the original code was wrong, but rather, because it was sloppy.

Many people (myself included) feel they have a responsibility to review a crate's implementation before using it. When a crate author uses unsafe needlessly and without documenting the conditions and reasons for safety, this makes our lives miserable, because we are not psychic, and, quite honestly, we cannot trust that you considered all of the ways that it can go wrong. We are all human, after all.

@Lokathor
Copy link
Contributor

Yes. I overstated it. Sorry about that.

The unsafe code in question didn't rely on any documented module invariant, and didn't perform any safety check at runtime, it just blindly hoped that you didn't do the wrong thing.

@RalfJung
Copy link

the unit of encapsulation of unsafety in Rust is the module, not the function

Shameless plug, for a post that goes into more details about this: https://www.ralfj.de/blog/2016/01/09/the-scope-of-unsafe.html

@actix actix locked and limited conversation to collaborators Jan 17, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet